* Re: Xen and safety certification, Minutes of the meeting on Apr 4th
@ 2018-04-06 14:13 Lars Kurth
  2018-04-06 17:01 ` Jarvis Roach
  2018-04-06 17:18 ` Artem Mygaiev
  0 siblings, 2 replies; 18+ messages in thread
From: Lars Kurth @ 2018-04-06 14:13 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Edgar E. Iglesias, Artem Mygaiev, davorin.mista, robin.randhawa, paul_luperto, Denys Balatsko, Rich Persaud, Jonathan Daugherty, anastassios.nanos, julien.grall, committers, Stewart Hildebrand, xen-devel, vfachin, Volodymyr Babchuk, mirela.simonovic, Jarvis Roach

Hi all,

adding a few more people who are/may be interested in safety certification, including committers (because item 1 would have an impact). Specifically: Rich Persaud, Paul Luperto, Jonathan Daugherty and Denys Balatsko.

There are a few loose ends and updates from other related threads that we should pull into this thread:

a) AGL Whitepaper
This is out as far as I can tell.
See https://docs.google.com/document/d/1HpYzClh0nDEocsUHb17X0DxiehsAbCgyWE-P2Wk_RNU/edit#
Thank you to Rich for driving this and to all the contributors from the Xen Community.

Related to this is the following item from the original minutes:
> AGL will select 2 hypervisors out of the list. Artem has already an
> out-of-the-box solution for AGL. Artem will chase up and make sure that
> Xen will be one of the two.

b) Genivi AMM Hypervisor Workshop, Apr 19
Artem and I will be speaking on various Xen related projects. I will send a draft PDF to this list later this week. Slots are short: 10 minutes + questions each slot.
See https://at.projects.genivi.org/wiki/display/DIRO/Hypervisor+Workshop+Team

c) Xen Specific Automotive Whitepaper
This was discussed during a) and I think it would be relatively easy to pull something together. It would be good if someone other than me could lead this. We have a lot of information already, but more ground-work on safety certification may help. Would there be a volunteer driving this?
It could be used as a vehicle to move some of the items discussed in the minutes along.

d) I also created https://wiki.xenproject.org/wiki/Category:Safety_Certification to start pulling material relevant to safety and context for it into one place. It's a little crude at this point in time and I expect this document to evolve and split into smaller parts.
It would be good if someone on this list could go over https://wiki.xenproject.org/wiki/Category:Safety_Certification#Automotive_Requirements and map the requirements to functionality we already have. This could then feed into c).

Any takers?

> Artem suggested to write a whitepaper about Xen real-time capabilities.
> Stefano volunteered to help.

I believe we have some gaps with regards to real-time requirements and that paper is aiming to highlight these.
@Artem: maybe this would be a suitable topic for the developer summit (amongst others). As a reminder: the CfP for the summit closes next Friday.

> I contacted Lars (CC'ed) who volunteered to help.

I am volunteering to act as a program/project manager for this activity, in particular to bootstrap it.

I think the only practicable way to make progress in this area is to set up some mechanism which allows us to make progress towards the goal of making it easier and cheaper to build safety certified variants of Xen. As a side-effect of this process we should get data to scope out the scale of the problem further, which should enable getting more vendors interested.

> The main topic of the meeting was certifications for Xen on ARM. The gap
> analysis document, mentioned in the previous call, is copyrighted. It
> might not be possible to relicense it. Regardless of the document, we
> started discussing the major work items and next steps.

@Stefano: Thanks for driving this discussion. I re-ordered some of the items to make it more palatable.

> 2) Create a subset of functions that need to go through certifications
> Next step: create a small Kconfig.
> We could use the Renesas Rcar as
> reference. We need a discussion about the features we need, for example
> real-time schedulers, do we need them or not?

@Stefano agreed to drive this.
The minimal configuration does impact 1 and 2, which is why I moved this first.

We should probably agree on a basic process, i.e.:
* Measure baseline size in KSLOC
* Remove some feature
* Measure reduction in KSLOC
and record the data somewhere.

> 1) Requirements to the code, a subset of MISRA for ASIL B
> Next step: get more information about requirements and publish it to
> xen-devel.

I see a few problems here:

* The MISRA C:2012 spec has to be bought and it is rather big (100s of pages), so I don't think it is practical to work from the spec.

* Some coding style patterns will likely be perceived as odd and unreasonable by community members: as some common code would be affected, we cannot treat this in isolation, say on ARM only. Although it is recognized that some of the coding style patterns may not make sense, compliance with MISRA is necessary and cannot normally be discussed away.

* PRQA has set up an environment and an initial MISRA compliance report for a Xen on ARM build
** The question is what (if anything) can be shared publicly
** The other open question is whether we can come to some sort of longer term agreement between the Xen Project and PRQA to use their tools
** As an aside, what PRQA have done would need to reflect what we do in step 2. We also want to minimize the work for PRQA: in other words, it has to be very simple to enable the minimal config coming out of task 2 such that PRQA can
** As far as I recall 90% of all MISRA violations come down to around 70 issues. A large number are in tools
** Also, I believe that MISRA compliance tools will likely lead to a large number of false positives, due to the distributed nature of Xen: process boundaries, kernel/user space boundaries, etc. would all lead to false positives, which somehow have to be managed.
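
As a rough illustration of the measure / remove / measure process suggested for item 2 above, the bookkeeping could look something like the sketch below. This is not an agreed tool or method: the function names, the feature name and the line counts are all hypothetical, and the counter is deliberately crude (it ignores string literals that happen to contain comment markers). A dedicated counter such as cloc or SLOCCount would be the more realistic choice for real numbers.

```python
"""Rough SLOC bookkeeping for a measure / remove / measure loop (sketch)."""


def count_sloc(source: str) -> int:
    """Count lines of C source that hold code, skipping blanks and comments.

    Crude on purpose: comment markers inside string literals are not
    recognised, so treat the result as an estimate.
    """
    sloc = 0
    in_block = False  # True while inside an unterminated /* ... */ comment
    for raw in source.splitlines():
        line = raw.strip()
        if in_block:
            if "*/" not in line:
                continue
            in_block = False
            line = line.split("*/", 1)[1].strip()
        # Remove every /* ... */ comment that starts on this line
        while "/*" in line:
            before, _, rest = line.partition("/*")
            if "*/" in rest:
                line = (before + " " + rest.split("*/", 1)[1]).strip()
            else:
                line = before.strip()
                in_block = True
                break
        # Remove a trailing // comment
        line = line.split("//", 1)[0].strip()
        if line:
            sloc += 1
    return sloc


def ksloc_delta(baseline_sloc: int, reduced_sloc: int) -> float:
    """Reduction in KSLOC after a feature was removed from the config."""
    return (baseline_sloc - reduced_sloc) / 1000.0


if __name__ == "__main__":
    # Hypothetical numbers and feature name, for illustration only
    baseline, after = 120_000, 112_500
    record = {"feature": "example-feature",
              "ksloc_removed": ksloc_delta(baseline, after)}
    print(record)
```

Recording one such entry per removed feature would give the data set mentioned above, whatever place is eventually chosen to store it.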
ACTION => Lars to follow up with Paul Luperto from PRQA

* An approach that may be manageable would be to look at the most common MISRA violations and work backwards from there.
** This would make the problem more manageable and mean people wouldn't have to read a long spec
** Discussing a small set of issues would give us a sense of whether/what type of disagreements there are and how we resolve them.
** We should prioritize based on:
   a) Address/discuss the most frequently occurring issues first
   b) Address/discuss issues in common code first

At the very least (and for now in the absence of the capability to check compliance), I would need someone who has access to MISRA compliance tools to drive such an effort.

> 3) Understand how to address dom0. FreeRTOS Dom0 sounds like a good
> solution.
> Next step: reach out to Dornerworks and/or others that worked with
> FreeRTOS on Xen before. Figure out whether FreeRTOS is actually a
> suitable solution and what needs to be done to run FreeRTOS as Dom0.

Some things to check at this stage:
a) I believe there is a safety certified version of FreeRTOS - I could not find much, except for https://www.freertos.org/FreeRTOS-Plus/Safety_Critical_Certified/SafeRTOS-Safety-Critical-Certification.shtml - which describes SafeRTOS, a commercial, safety certified and (mostly) API compliant version of FreeRTOS. Or am I missing something here?
b) There is a DomU capable version from Galois (Jonathan Daugherty CC'ed) - I don't know whether others also have such versions
c) There is a POSIX wrapper, which may be needed, but it is unclear what this would do to the FreeRTOS footprint
d) In other words, what we would have to do is to investigate whether it is possible to build a Dom0 capable FreeRTOS

I see several ways of approaching this:
a) A vendor (or group of vendors) on this list steps up
b) We go initially for a lower bar: i.e. we try to scope out and cost the creation of a Dom0 capable FreeRTOS and then look at how the work can get funded

A very good starting point would be to get a list of parties that are interested in having and using a FreeRTOS based Dom0 (regardless of how we get there). A show of hands would be good.

Some insights from anyone on the FreeRTOS/SafeRTOS relationship and politics would be good also. Unless there is a route from FreeRTOS upstream to the certified version, someone in our eco-system would have to safety certify FreeRTOS (which may not be such a big deal given the fairly small size of FreeRTOS).

> 4) Create artifacts, such as docs, fault analysis, prove fault tolerance,
> safety management docs, development processes.
> Next step: we need to bring in a company, a certification body, to guide
> us through the process.

We have companies such as Dornerworks on this list which are experienced with safety certification on Xen for some safety standards: it is not clear to me how much of this is transferable to automotive.

Here my understanding is that we need a certification partner like TÜV, MIRA or a company like Dornerworks who already have experience with Xen. By working with a partner experienced in certification, the overall cost of certification would be significantly reduced. The elephant in the room is funding and a business model (aka all the items listed in https://docs.google.com/document/d/1HpYzClh0nDEocsUHb17X0DxiehsAbCgyWE-P2Wk_RNU/edit section 4.1).
The reality is that organisations such as TÜV, MIRA, Dornerworks, ... will need to be paid by someone, which I think we need to park for now.

What I think are sensible goals for now are:
a) Establish a list of potential partners and start establishing contacts - such conversations would need to be led by a vendor, otherwise it will go nowhere. What would be good though is to have a shared (but possibly private) repository of how these conversations have gone.
b) Otherwise focus on tasks 1-3, which deal with some issues listed in https://www.slideshare.net/xen_com_mgr/art-certification, which is still very valid
c) Engage/work with other groups (AGL, Genivi, Linaro) who are also looking at this problem

It may be worth considering, in the mid-term, some sort of pilot around a small portion of the Xen codebase: the aim would be to gather data that helps establish what can be done in a collaborative FOSS environment.

Feedback/views are very welcome

Regards
Lars

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: Xen and safety certification, Minutes of the meeting on Apr 4th 2018-04-06 14:13 Xen and safety certification, Minutes of the meeting on Apr 4th Lars Kurth @ 2018-04-06 17:01 ` Jarvis Roach 2018-04-06 17:23 ` Lars Kurth 2018-04-06 17:32 ` Artem Mygaiev 2018-04-06 17:18 ` Artem Mygaiev 1 sibling, 2 replies; 18+ messages in thread From: Jarvis Roach @ 2018-04-06 17:01 UTC (permalink / raw) To: Lars Kurth, Stefano Stabellini Cc: Edgar E. Iglesias, Artem Mygaiev, davorin.mista, robin.randhawa, paul_luperto, Denys Balatsko, Rich Persaud, Jonathan Daugherty, anastassios.nanos, julien.grall, committers, Stewart Hildebrand, xen-devel, vfachin, Volodymyr Babchuk, mirela.simonovic > Hi all, > > adding a few more people who are/may be interested in safety certification, > including committers (because item 1 would have an impact). Specifically: > Rich Persaud, Paul Luperto, Jonathan Daugherty and Denys Balatsko. > > There are a few loose ends and updates from other/similar related threads > that we should pull into this thread: > > a) AGL Whitepaper > This is out as far as I can tell > See > https://docs.google.com/document/d/1HpYzClh0nDEocsUHb17X0DxiehsAb > CgyWE-P2Wk_RNU/edit# > Thank you to Rich for driving this and to all the contributors from the Xen > Community > > Related to this is the following item from the original minutes > > AGL will select 2 hypervisors out of the list. Artem has already an > > out-of-the-box solution for AGL. Artem will chase up and make sure > > that Xen will be one of the two. > > b) Genivi AMM Hypervisor Workshop, Apr 19 Artem and me will be > speaking on various Xen related projects. I will send a draft PDF to this list > later this week. > Slots are short: 10 minutes + questions each slot See > https://at.projects.genivi.org/wiki/display/DIRO/Hypervisor+Workshop+Te > am > > c) Xen Specific Automotive Whitepaper > This was discussed during a) and I think it would be relatively easy to pull > something together. 
It would be good if someone else, but me could lead > this. We have a lot of information already, but more ground-work on safety > certification may help. Would there be a volunteer driving this? I could be > used as a vehicle to move some of the items discussed in the minutes > along. > > d) I also created > https://wiki.xenproject.org/wiki/Category:Safety_Certification to start > pulling material relevant to safety and context for it into one place. > It's a little crude at this point in time and I expect this document to evolve > and split into smaller parts. > It would be good, if someone on this list could go over > https://wiki.xenproject.org/wiki/Category:Safety_Certification#Automotive > _Requirements and map the requirements to functionality we already have. > This could then feed into c. > > Any takers? > > > Artem suggested to write a whitepaper about Xen real-time capabilities. > > Stefano volunteered to help. > I believe we have some gaps with regards to real-time requirements and > that paper is aiming to highlight these. > @Artem: maybe this would be a suitable topic for the developer summit > (amongst others) As a reminder: the CfP for the summit closes next Friday > One of my engineers has highlighted the need to move Xen to use preemptive locks (similar to what was done with the Linux RT patch updates) before it can be considered hard real-time. Right now we've been pitching it as soft real-time. > > > I contacted Lars (CC'ed) who volunteered to help. > I am volunteering to act as a program/project manager for this activity. In > particular to bootstrap. > > I think the only practicable way to make progress in this area, is to set up > some mechanism which allow us to make progress towards the goal of > making it easier and cheaper to build safety certified variants of Xen. As a > side-effect of this process we should get data, to scope out the scale of the > problem further, that should enable getting more vendors interested. 
> > > The main topic of the meeting was certifications for Xen on ARM. The > > gap analysis document, mentioned in the previous call, is copyrighted. > > It might not be possible to relicense it. Regardless of the document, > > we started discussing the major work items and next steps. > > @Stefano: Thanks for driving this discussion I re-ordered some of the items, > to make it more palatable > > > 2) Create a subset of functions that need to go through certifications > > Next step: create a small Kconfig. We could use the Renesas Rcar as > > reference. We need a discussion about the features we need, for > > example real-time schedulers, do we need them or not? > Identifying this subset is very important. My recommendation would be to identify the very smallest subset to start with that supports a single, high value use case, which I would suggest is consolidation of Linux and real-time applications with mixed criticality, but not necessarily shared/PV I/O, onto a single processing cluster. Identifying the highest reasonable safety criticality to support would also be very helpful. At the Xen level, you might get away with just the null scheduler if VMs are pinned to their own cores (and jitter caused by contention on the bus and in the cache is acceptable). However, to do CAST-32a type scheduling (effectively time slicing the SoC between your VMs), an updated ARINC-653 scheduler would be needed. > > @Stefano agreed to drive this. > The minimal configuration does impact 1 and 2, which is why I moved this > first. > > We should probably agree a basic process: aka > * Measure baseline size in KSLOC > * Remove some feature > * Measure reduction in KSLOC > And record the data somewhere > > > 1) Requirements to the code, a subset of MISRA for ASIL B Next step: > > get more information about requirements and publish it to xen-devel. 
> > I see a few problems here: > > * The MISCRA 2012 spec has to be bought and it is rather big (100's of > pages): > so, I don't think it is practical to work from the spec > > * Some coding style patterns will likely be perceived as odd and > unreasonable by community members: as some common code would be > affected we cannot treat this in isolation say on ARM only. Although it is > recognized that some of the coding style patterns may not make sense, > compliance to MISRA is necessary and cannot normally be discussed away. > > * PRQA has set up an environment and initial MISRA compliance report for > a Xen on ARM build > ** The question is what (if anything) can be shared publicly > ** The other open question is whether we can come to some sort of longer > term agreement between the Xen Project and PRQA to use their tools > ** As an aside, what PRQA have done would need to reflect what we do in > step 2 is. We also want to minimize the work for PRQA: in other words, it > has to be very simple to enable the minimal config coming out of task 2 > such that PRQA can > ** As far as I recall 90% of all MISRA violations come down to around 70 > issues. A large number are in tools > ** Also, I believe that MISRA compliance tools will likely lead to a large > amount of false positives, due to the distributed nature of Xen: process > boundaries, kernel/user space boundaries, etc. would all lead to false > positives, which somehow have to be managed. > > ACTION => Lars to follow up with Paul Luperto from PRQA > > * An approach that may be manageable would be to look at the most > common MISRA violations and work backwards from there. > ** This would make the problem more manageable and mean people > wouldn't have to read a long spec > ** Discussing a small set of issues, would give us a sense of whether/what > type of disagreements there are and how we resolve them. 
> ** We should focus prioritize based on: > a) Address/discuss the most frequently occurring issues first > b) Address/discuss issues in common code first > > At the very least (and for now in absence of the capability to check > compliance), I would need someone who has access to MISRA compliance > tools, to drive such an effort. > > > 3) Understand how to address dom0. FreeRTOS Dom0 sounds like a good > > solution. > > Next step: reach out to Dornerworks and/or others that worked with > > FreeRTOS on Xen before. Figure out whether FreeRTOS is actually a > > suitable solution and what needs to be done to run FreeRTOS as Dom0. > > Some things to check at this stage: > a) I believe there is a safety certified version of FreeRTOS - I could not find > much, except for https://www.freertos.org/FreeRTOS- > Plus/Safety_Critical_Certified/SafeRTOS-Safety-Critical-Certification.shtml - > which describes SafeRTOS a commercial safety certified FreeRTOS and > (mostly) API compliant version of FreeRTOS. Or am I missing something > here? > b) There is a DomU capable version from Galois (Jonathan Docherty CC'ed) - > I don't know whether others also have such versions I ported the version of FreeRTOS that Xilinx distributes with their SDK to run as a domU on the ZUS+ in 2016 and round tripped the change set back to Richard Barry. I've also heard interest in running RTEMS as a guest OS. Since I do not think that a previously certified OS will be available for free, I see 3 general approaches wrt dom0: 1) Find and certify an open source OS. My guess is this will not be Linux due to code base size. POSIX support a plus. 2) Use a commercially available, previously certified OS for dom0. DW ported VxWorks to run on Xen in 2017 and uc/OS-III in 2016. 3) Go with a dom0-less solution; bootloader starts up the necessary VMs based on a static configuration. The XL toolstack in its current form will likely cause cert issues and will probably need to be stripped down and/or rewritten. 
Bootloader (U-Boot, GRUB, or whatever) will also need to be certified.

> c) There is a POSIX wrapper, which may be needed, but it is unclear what
> this would do to the FreeRTOS footprint
> d) In other words, what we would have to do is to investigate whether it is
> possible to build to a Dom0 capable FreeRTOS
>
> I see several ways of approaching this:
> a) A vendor (or groups of vendors) on this list steps up
> b) We go initially for a lower bar: aka we try and scope out and cost the
> creation of a Dom0 capable FreeRTOS and then look at how the work can
> get funded
>
> A very good starting point would be to get a list of parties that are
> interested in having and using a FreeRTOS based Dom0 (regardless of how
> we get there). A show of hands would be good.

DW is interested in participating in exploring ways to solve the dom0 problem (be it FreeRTOS or other approaches).

> Some insights from anyone on the FreeRTOS/SafeRTOS relationship and
> politics would be good also. Unless there is a route from FreeRTOS
> upstream to the certified version, someone in our eco-system would have
> to safety certify FreeRTOS (which may not be such a big deal given the fairly
> small size of FreeRTOS).
>

This link helps explain the relationship between FreeRTOS and SafeRTOS: https://www.highintegritysystems.com/safertos/upgrade-from-freertos-to-safertos/

>
> > 4) Create artifacts, such as docs, fault analysis, prove fault
> > tolerance, safety management docs, development processes.
> > Next step: we need to bring in a company, a certification body, to
> > guide us through the process.
>
> We have companies such as Dornerworks on this list which are experienced
> with safety certification on Xen for some safety standards: it is not clear to
> me how much of this is transferable to automotive.
>

Papers have shown that there is a lot of overlap between the artifacts and processes defined in different safety standards.
However, someone with more experience in automotive safety should speak to the concern of non-determinism/jitter in that market. Aviation certification authorities are practically rabid about it, and you have to go to great lengths to satisfy them (disable interrupts, flush cache between partitions, prove that silicon vendor's secret features are all disabled, etc) which might be overkill for automotive. > > Here my understanding is that we need a certification partner like TÜV, > MIRA or a company like Dornerworks who already have experience with > Xen. By working with a partner experienced in certification, the overall cost > of certification would be significantly reduced. The elephant in the room is > funding and a business model (aka all the items listed in > https://docs.google.com/document/d/1HpYzClh0nDEocsUHb17X0DxiehsAb > CgyWE-P2Wk_RNU/edit section 4.1). The reality is that organisations such > as TÜV, MIRA, Dornerworks, ... will need to be paid by someone. Which, I > think we need to park for now. > I wouldn't leave it parked too long. The issues of funding and remuneration will delay/derail progress more than all of the technical challenges combined. > > What I think are sensible goals for now are > a) Establish a list of potential partners and start establishing contacts - such > conversations would need to be led by a vendor, otherwise it will go > nowhere. What would be good though is to have a shared (but possibly > private) repository of how these conversations have gone. > b) Otherwise focus on tasks 1-3 which deal with some issues listed in > https://www.slideshare.net/xen_com_mgr/art-certification, which is still > very valid > c) Engage/work with with other groups (AGL, Genivi, Linaro) who are also > looking at this problem > > It may be worth in the mid-term to consider some sort of pilot around a > small portion of the Xen codebase: the aim would be to gather data that > helps establish what can be done in a collaborative FOSS environment. 
> > Feedback/views are very welcome > > Regards > Lars > > Cheers! -Jarvis
* Re: Xen and safety certification, Minutes of the meeting on Apr 4th
  2018-04-06 17:01 ` Jarvis Roach
@ 2018-04-06 17:23   ` Lars Kurth
  2018-04-06 17:32   ` Artem Mygaiev
  1 sibling, 0 replies; 18+ messages in thread
From: Lars Kurth @ 2018-04-06 17:23 UTC (permalink / raw)
  To: Jarvis Roach, Stefano Stabellini
  Cc: Edgar E. Iglesias, Artem Mygaiev, davorin.mista, robin.randhawa, paul_luperto, Denys Balatsko, Rich Persaud, Jonathan Daugherty, anastassios.nanos, julien.grall, committers, Stewart Hildebrand, xen-devel, vfachin, Volodymyr Babchuk, mirela.simonovic

Jarvis,

thanks for the valuable input.

On 06/04/2018, 19:01, "Jarvis Roach" <Jarvis.Roach@dornerworks.com> wrote:

    >
    > Here my understanding is that we need a certification partner like TÜV,
    > MIRA or a company like Dornerworks who already have experience with
    > Xen. By working with a partner experienced in certification, the overall cost
    > of certification would be significantly reduced. The elephant in the room is
    > funding and a business model (aka all the items listed in
    > https://docs.google.com/document/d/1HpYzClh0nDEocsUHb17X0DxiehsAbCgyWE-P2Wk_RNU/edit
    > section 4.1). The reality is that organisations such
    > as TÜV, MIRA, Dornerworks, ... will need to be paid by someone. Which, I
    > think we need to park for now.
    >

    I wouldn't leave it parked too long. The issues of funding and remuneration will delay/derail progress more than all of the technical challenges combined.

I agree. My expectation would be to first see whether we can make progress on 2-3 as this affects code size to be certified. Without knowing how much code we are looking at, it will be impossible to have any credible discussion about funding. I am not intending to delay this discussion: primarily looking at this from a critical path perspective.
Regards
Lars
* Re: Xen and safety certification, Minutes of the meeting on Apr 4th 2018-04-06 17:01 ` Jarvis Roach 2018-04-06 17:23 ` Lars Kurth @ 2018-04-06 17:32 ` Artem Mygaiev 2018-04-06 20:47 ` Stefano Stabellini 1 sibling, 1 reply; 18+ messages in thread From: Artem Mygaiev @ 2018-04-06 17:32 UTC (permalink / raw) To: Jarvis Roach, Lars Kurth, Stefano Stabellini Cc: Edgar E. Iglesias, davorin.mista, robin.randhawa, paul_luperto, Denys Balatsko, Rich Persaud, Jonathan Daugherty, anastassios.nanos, julien.grall, committers, Stewart Hildebrand, xen-devel, vfachin, Volodymyr Babchuk, mirela.simonovic Hi Jarvis On 06.04.18 20:01, Jarvis Roach wrote: >> Hi all, >> >> adding a few more people who are/may be interested in safety certification, >> including committers (because item 1 would have an impact). Specifically: >> Rich Persaud, Paul Luperto, Jonathan Daugherty and Denys Balatsko. >> >> There are a few loose ends and updates from other/similar related threads >> that we should pull into this thread: >> >> a) AGL Whitepaper >> This is out as far as I can tell >> See >> https://docs.google.com/document/d/1HpYzClh0nDEocsUHb17X0DxiehsAb >> CgyWE-P2Wk_RNU/edit# >> Thank you to Rich for driving this and to all the contributors from the Xen >> Community >> >> Related to this is the following item from the original minutes >>> AGL will select 2 hypervisors out of the list. Artem has already an >>> out-of-the-box solution for AGL. Artem will chase up and make sure >>> that Xen will be one of the two. >> >> b) Genivi AMM Hypervisor Workshop, Apr 19 Artem and me will be >> speaking on various Xen related projects. I will send a draft PDF to this list >> later this week. >> Slots are short: 10 minutes + questions each slot See >> https://at.projects.genivi.org/wiki/display/DIRO/Hypervisor+Workshop+Te >> am >> >> c) Xen Specific Automotive Whitepaper >> This was discussed during a) and I think it would be relatively easy to pull >> something together. 
It would be good if someone else, but me could lead >> this. We have a lot of information already, but more ground-work on safety >> certification may help. Would there be a volunteer driving this? I could be >> used as a vehicle to move some of the items discussed in the minutes >> along. >> >> d) I also created >> https://wiki.xenproject.org/wiki/Category:Safety_Certification to start >> pulling material relevant to safety and context for it into one place. >> It's a little crude at this point in time and I expect this document to evolve >> and split into smaller parts. >> It would be good, if someone on this list could go over >> https://wiki.xenproject.org/wiki/Category:Safety_Certification#Automotive >> _Requirements and map the requirements to functionality we already have. >> This could then feed into c. >> >> Any takers? >> >>> Artem suggested to write a whitepaper about Xen real-time capabilities. >>> Stefano volunteered to help. >> I believe we have some gaps with regards to real-time requirements and >> that paper is aiming to highlight these. >> @Artem: maybe this would be a suitable topic for the developer summit >> (amongst others) As a reminder: the CfP for the summit closes next Friday >> > > One of my engineers has highlighted the need to move Xen to use preemptive locks (similar to what was done with the Linux RT patch updates) before it can be considered hard real-time. Right now we've been pitching it as soft real-time. > >> >>> I contacted Lars (CC'ed) who volunteered to help. >> I am volunteering to act as a program/project manager for this activity. In >> particular to bootstrap. >> >> I think the only practicable way to make progress in this area, is to set up >> some mechanism which allow us to make progress towards the goal of >> making it easier and cheaper to build safety certified variants of Xen. 
As a >> side-effect of this process we should get data, to scope out the scale of the >> problem further, that should enable getting more vendors interested. >> >>> The main topic of the meeting was certifications for Xen on ARM. The >>> gap analysis document, mentioned in the previous call, is copyrighted. >>> It might not be possible to relicense it. Regardless of the document, >>> we started discussing the major work items and next steps. >> >> @Stefano: Thanks for driving this discussion I re-ordered some of the items, >> to make it more palatable >> >>> 2) Create a subset of functions that need to go through certifications >>> Next step: create a small Kconfig. We could use the Renesas Rcar as >>> reference. We need a discussion about the features we need, for >>> example real-time schedulers, do we need them or not? >> > > Identifying this subset is very important. My recommendation would be to identify the very smallest subset to start with that supports a single, high value use case, which I would suggest is consolidation of Linux and real-time applications with mixed criticality, but not necessarily shared/PV I/O, onto a single processing cluster. Identifying the highest reasonable safety criticality to support would also be very helpful. > Unfortunately in mixed criticality systems (at least in automotive) we see a lot of attention to performance and , so processing cluster partitioning may not be well accepted in the industry > At the Xen level, you might get away with just the null scheduler if VMs are pinned to their own cores (and jitter caused by contention on the bus and in the cache is acceptable). However, to do CAST-32a type scheduling (effectively time slicing the SoC between your VMs), an updated ARINC-653 scheduler would be needed. > We are now looking into RTDS as a possible solution for industrial or automotive domains. Also , from our experience bus/cache contention in systems with high load is actually an issue... 
Looking into that, too

>>
>> @Stefano agreed to drive this.
>> The minimal configuration does impact 1 and 2, which is why I moved this
>> first.
>>
>> We should probably agree a basic process: aka
>> * Measure baseline size in KSLOC
>> * Remove some feature
>> * Measure reduction in KSLOC
>> And record the data somewhere
>>
>>> 1) Requirements to the code, a subset of MISRA for ASIL B Next step:
>>> get more information about requirements and publish it to xen-devel.
>>
>> I see a few problems here:
>>
>> * The MISRA 2012 spec has to be bought and it is rather big (100's of
>> pages):
>> so, I don't think it is practical to work from the spec
>>
>> * Some coding style patterns will likely be perceived as odd and
>> unreasonable by community members: as some common code would be
>> affected we cannot treat this in isolation say on ARM only. Although it is
>> recognized that some of the coding style patterns may not make sense,
>> compliance to MISRA is necessary and cannot normally be discussed away.
>>
>> * PRQA has set up an environment and initial MISRA compliance report for
>> a Xen on ARM build
>> ** The question is what (if anything) can be shared publicly
>> ** The other open question is whether we can come to some sort of longer
>> term agreement between the Xen Project and PRQA to use their tools
>> ** As an aside, what PRQA have done would need to reflect what we do in
>> step 2 is. We also want to minimize the work for PRQA: in other words, it
>> has to be very simple to enable the minimal config coming out of task 2
>> such that PRQA can
>> ** As far as I recall 90% of all MISRA violations come down to around 70
>> issues. A large number are in tools
>> ** Also, I believe that MISRA compliance tools will likely lead to a large
>> amount of false positives, due to the distributed nature of Xen: process
>> boundaries, kernel/user space boundaries, etc. would all lead to false
>> positives, which somehow have to be managed.
>>
>> ACTION => Lars to follow up with Paul Luperto from PRQA
>>
>> * An approach that may be manageable would be to look at the most
>> common MISRA violations and work backwards from there.
>> ** This would make the problem more manageable and mean people
>> wouldn't have to read a long spec
>> ** Discussing a small set of issues, would give us a sense of whether/what
>> type of disagreements there are and how we resolve them.
>> ** We should focus prioritize based on:
>> a) Address/discuss the most frequently occurring issues first
>> b) Address/discuss issues in common code first
>>
>> At the very least (and for now in absence of the capability to check
>> compliance), I would need someone who has access to MISRA compliance
>> tools, to drive such an effort.
>>
>>> 3) Understand how to address dom0. FreeRTOS Dom0 sounds like a good
>>> solution.
>>> Next step: reach out to Dornerworks and/or others that worked with
>>> FreeRTOS on Xen before. Figure out whether FreeRTOS is actually a
>>> suitable solution and what needs to be done to run FreeRTOS as Dom0.
>>
>> Some things to check at this stage:
>> a) I believe there is a safety certified version of FreeRTOS - I could not find
>> much, except for
>> https://www.freertos.org/FreeRTOS-Plus/Safety_Critical_Certified/SafeRTOS-Safety-Critical-Certification.shtml -
>> which describes SafeRTOS a commercial safety certified FreeRTOS and
>> (mostly) API compliant version of FreeRTOS. Or am I missing something
>> here?
>> b) There is a DomU capable version from Galois (Jonathan Docherty CC'ed) -
>> I don't know whether others also have such versions
>
> I ported the version of FreeRTOS that Xilinx distributes with their SDK to run as a domU on the ZUS+ in 2016 and round tripped the change set back to Richard Barry.
> I've also heard interest in running RTEMS as a guest OS.
>

We've had experience in running QNX in domu, but that was not very welcomed by BB QSSL folks back then :) They don't really like OSS

> Since I do not think that a previously certified OS will be available for free, I see 3 general approaches wrt dom0:
> 1) Find and certify an open source OS. My guess is this will not be Linux due to code base size. POSIX support a plus.
> 2) Use a commercially available, previously certified OS for dom0. DW ported VxWorks to run on Xen in 2017 and uc/OS-III in 2016.
> 3) Go with a dom0-less solution; bootloader starts up the necessary VMs based on a static configuration.
>
> The XL toolstack in its current form will likely cause cert issues and will probably need to be stripped down and/or rewritten.
> Bootloader (U-Boot, GRUB, or whatever) will also need to be certified.
>

We'd like to explore both FreeRTOS in dom0 and dom0-less options. I think there were some patches while ago for dom0-less xen.

>> c) There is a POSIX wrapper, which may be needed, but it is unclear what
>> this would do to the FreeRTOS footprint
>> d) In other words, what we would have to do is to investigate whether it is
>> possible to build to a Dom0 capable FreeRTOS
>>
>> I see several ways of approaching this:
>> a) A vendor (or groups of vendors) on this list steps up
>> b) We go initially for a lower bar: aka we try and scope out and cost the
>> creation of a Dom0 capable FreeRTOS and then look at how the work can
>> get funded
>>
>> A very good starting point would be to get a list of parties that are
>> interested in having and using a FreeRTOS based Dom0 (regardless of how
>> we get there). A show of hands would be good.
>
> DW is interested in participating with exploring ways to solve the dom0 problem (be it FreeRTOS or other approaches).
>
>> Some insights from anyone on the FreeRTOS/SafeRTOS relationship and
>> politics would be good also.
>> Unless there is a route from FreeRTOS
>> upstream to the certified version, someone in our eco-system would have
>> to safety certify FreeRTOS (which may not be such a big deal given the fairly
>> small size of FreeRTOS).
>>
>
> This link helps explain the relationship between FreeRTOS and SafeRTOS:
> https://www.highintegritysystems.com/safertos/upgrade-from-freertos-to-safertos/
>
>>
>>> 4) Create artifacts, such as docs, fault analysis, prove fault
>>> tolerance, safety management docs, development processes.
>>> Next step: we need to bring in a company, a certification body, to
>>> guide us through the process.
>>
>> We have companies such as Dornerworks on this list which are experienced
>> with safety certification on Xen for some safety standards: it is not clear to
>> me how much of this is transferable to automotive.
>>
>
> Papers have shown that there is a lot of overlap between the artifacts and processes defined in different safety standards.
>
> However, someone with more experience in automotive safety should speak to the concern of non-determinism/jitter in that market. Aviation certification authorities are practically rabid about it, and you have to go to great lengths to satisfy them (disable interrupts, flush cache between partitions, prove that silicon vendor's secret features are all disabled, etc) which might be overkill for automotive.
>

Indeed, we need to analyze safety in different domains, but at least all derivatives from IEC 61508 (ISO 26262, etc.) have common baseline. I am not sure about medical/aerospace/military though - we do not have expertise there.

>>
>> Here my understanding is that we need a certification partner like TÜV,
>> MIRA or a company like Dornerworks who already have experience with
>> Xen. By working with a partner experienced in certification, the overall cost
>> of certification would be significantly reduced.
>> The elephant in the room is
>> funding and a business model (aka all the items listed in
>> https://docs.google.com/document/d/1HpYzClh0nDEocsUHb17X0DxiehsAbCgyWE-P2Wk_RNU/edit
>> section 4.1). The reality is that organisations such
>> as TÜV, MIRA, Dornerworks, ... will need to be paid by someone. Which, I
>> think we need to park for now.
>>
>
> I wouldn't leave it parked too long. The issues of funding and remuneration will delay/derail progress more than all of the technical challenges combined.
>
>>
>> What I think are sensible goals for now are
>> a) Establish a list of potential partners and start establishing contacts - such
>> conversations would need to be led by a vendor, otherwise it will go
>> nowhere. What would be good though is to have a shared (but possibly
>> private) repository of how these conversations have gone.
>> b) Otherwise focus on tasks 1-3 which deal with some issues listed in
>> https://www.slideshare.net/xen_com_mgr/art-certification, which is still
>> very valid
>> c) Engage/work with other groups (AGL, Genivi, Linaro) who are also
>> looking at this problem
>>
>> It may be worth in the mid-term to consider some sort of pilot around a
>> small portion of the Xen codebase: the aim would be to gather data that
>> helps establish what can be done in a collaborative FOSS environment.
>>
>> Feedback/views are very welcome
>>
>> Regards
>> Lars
>>
>>
>
> Cheers!
> -Jarvis
>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Xen and safety certification, Minutes of the meeting on Apr 4th
  2018-04-06 17:32 ` Artem Mygaiev
@ 2018-04-06 20:47 ` Stefano Stabellini
  2018-04-11 16:19 ` Artem Mygaiev
  ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Stefano Stabellini @ 2018-04-06 20:47 UTC (permalink / raw)
  To: Artem Mygaiev
  Cc: Edgar E. Iglesias, Lars Kurth, davorin.mista, Stefano Stabellini,
      paul_luperto, Denys Balatsko, Rich Persaud, Jonathan Daugherty,
      anastassios.nanos, julien.grall, robin.randhawa, committers,
      Stewart Hildebrand, xen-devel, vfachin, Volodymyr Babchuk,
      mirela.simonovic, Jarvis Roach

On Fri, 6 Apr 2018, Artem Mygaiev wrote:
> > > > 2) Create a subset of functions that need to go through certifications
> > > > Next step: create a small Kconfig. We could use the Renesas Rcar as
> > > > reference. We need a discussion about the features we need, for
> > > > example real-time schedulers, do we need them or not?
> > >
> >
> > Identifying this subset is very important. My recommendation would be to
> > identify the very smallest subset to start with that supports a single, high
> > value use case, which I would suggest is consolidation of Linux and
> > real-time applications with mixed criticality, but not necessarily shared/PV
> > I/O, onto a single processing cluster. Identifying the highest reasonable
> > safety criticality to support would also be very helpful.
> >
>
> Unfortunately in mixed criticality systems (at least in automotive) we see a
> lot of attention to performance and , so processing cluster partitioning may
> not be well accepted in the industry

Sorry, I didn't quite understand your comment. Are you saying that
statically partitioning a cluster into VMs, for example with
vcpu-pinning or the null scheduler, in a way to have a total number of
vcpus equal to the total number of pcpus, is not acceptable because it
leads to lower hardware utilization? We need nr_vcpus > nr_pcpus?
> > At the Xen level, you might get away with just the null scheduler if VMs are
> > pinned to their own cores (and jitter caused by contention on the bus and in
> > the cache is acceptable). However, to do CAST-32a type scheduling
> > (effectively time slicing the SoC between your VMs), an updated ARINC-653
> > scheduler would be needed.
> >
>
> We are now looking into RTDS as a possible solution for industrial or
> automotive domains. Also, from our experience bus/cache contention in systems
> with high load is actually an issue... Looking into that, too

Bus/cache contention is where issues can become very board specific. It
is also why we'll need to narrow down a small set of boards initially.

> > >
> > > @Stefano agreed to drive this.
> > > The minimal configuration does impact 1 and 2, which is why I moved this
> > > first.
> > >
> > > We should probably agree a basic process: aka
> > > * Measure baseline size in KSLOC
> > > * Remove some feature
> > > * Measure reduction in KSLOC
> > > And record the data somewhere

I am happy to drive the discussion. I was already planning to submit a
small kconfig and a LOC counter to the Xen build. I wrote down my name
on the wikipage next to this item. I understand that good real-time
support is critical in the provided configuration. I am happy to work
with others to help improve it.

> > > > 1) Requirements to the code, a subset of MISRA for ASIL B Next step:
> > > > get more information about requirements and publish it to xen-devel.
> > >
> > > I see a few problems here:
> > >
> > > * The MISRA 2012 spec has to be bought and it is rather big (100's of
> > > pages):
> > > so, I don't think it is practical to work from the spec
> > >
> > > * Some coding style patterns will likely be perceived as odd and
> > > unreasonable by community members: as some common code would be
> > > affected we cannot treat this in isolation say on ARM only.
> > > Although it is
> > > recognized that some of the coding style patterns may not make sense,
> > > compliance to MISRA is necessary and cannot normally be discussed away.
> > >
> > > * PRQA has set up an environment and initial MISRA compliance report for
> > > a Xen on ARM build
> > > ** The question is what (if anything) can be shared publicly
> > > ** The other open question is whether we can come to some sort of longer
> > > term agreement between the Xen Project and PRQA to use their tools
> > > ** As an aside, what PRQA have done would need to reflect what we do in
> > > step 2 is. We also want to minimize the work for PRQA: in other words, it
> > > has to be very simple to enable the minimal config coming out of task 2
> > > such that PRQA can
> > > ** As far as I recall 90% of all MISRA violations come down to around 70
> > > issues. A large number are in tools
> > > ** Also, I believe that MISRA compliance tools will likely lead to a large
> > > amount of false positives, due to the distributed nature of Xen: process
> > > boundaries, kernel/user space boundaries, etc. would all lead to false
> > > positives, which somehow have to be managed.
> > >
> > > ACTION => Lars to follow up with Paul Luperto from PRQA
> > >
> > > * An approach that may be manageable would be to look at the most
> > > common MISRA violations and work backwards from there.
> > > ** This would make the problem more manageable and mean people
> > > wouldn't have to read a long spec
> > > ** Discussing a small set of issues, would give us a sense of whether/what
> > > type of disagreements there are and how we resolve them.
> > > ** We should focus prioritize based on:
> > > a) Address/discuss the most frequently occurring issues first
> > > b) Address/discuss issues in common code first
> > >
> > > At the very least (and for now in absence of the capability to check
> > > compliance), I would need someone who has access to MISRA compliance
> > > tools, to drive such an effort.

I wrote "Lars" near this item in
https://wiki.xenproject.org/wiki/Safety_Certification_Challenges, just
as a reference to where the ball is at the moment.

> > > > 3) Understand how to address dom0. FreeRTOS Dom0 sounds like a good
> > > > solution.
> > > > Next step: reach out to Dornerworks and/or others that worked with
> > > > FreeRTOS on Xen before. Figure out whether FreeRTOS is actually a
> > > > suitable solution and what needs to be done to run FreeRTOS as Dom0.
> > >
> > > Some things to check at this stage:
> > > a) I believe there is a safety certified version of FreeRTOS - I could not
> > > find
> > > much, except for
> > > https://www.freertos.org/FreeRTOS-Plus/Safety_Critical_Certified/SafeRTOS-Safety-Critical-Certification.shtml
> > > -
> > > which describes SafeRTOS a commercial safety certified FreeRTOS and
> > > (mostly) API compliant version of FreeRTOS. Or am I missing something
> > > here?
> > > b) There is a DomU capable version from Galois (Jonathan Docherty CC'ed) -
> > > I don't know whether others also have such versions
> >
> > I ported the version of FreeRTOS that Xilinx distributes with their SDK to
> > run as a domU on the ZUS+ in 2016 and round tripped the change set back to
> > Richard Barry.
> > I've also heard interest in running RTEMS as a guest OS.
> >
>
> We've had experience in running QNX in domu, but that was not very welcomed by
> BB QSSL folks back then :) They don't really like OSS
>
> > Since I do not think that a previously certified OS will be available for
> > free, I see 3 general approaches wrt dom0:
> > 1) Find and certify an open source OS. My guess is this will not be Linux
> > due to code base size. POSIX support a plus.
> > 2) Use a commercially available, previously certified OS for dom0. DW ported
> > VxWorks to run on Xen in 2017 and uc/OS-III in 2016.
> > 3) Go with a dom0-less solution; bootloader starts up the necessary VMs
> > based on a static configuration.
> >
> > The XL toolstack in its current form will likely cause cert issues and will
> > probably need to be stripped down and/or rewritten.
> > Bootloader (U-Boot, GRUB, or whatever) will also need to be certified.
> >
>
> We'd like to explore both FreeRTOS in dom0 and dom0-less options. I think
> there were some patches while ago for dom0-less xen.

"Dom0-less" is a great name actually :-)

Up until now, we discussed this topic under the name of "create multiple
guests from device tree". There are no patches (as far as I know), but
it was submitted as the Xen on ARM project for Outreachy this year.
There are patches for a different project to setup shared memory regions
from the xl config file (no need for grant table or xenbus support).

> We plan to analyze efforts to port FreeRTOS as dom0 OS

Great! I think it makes sense to start from that. I wrote "Artem" down
in the wikipage
(https://wiki.xenproject.org/wiki/Safety_Certification_Challenges) as
the reference contact for the dom0 stuff. Keep us in the loop as Julien
and I are very interested in it.
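
As a rough illustration of the basic process quoted above (measure the baseline in KSLOC, remove a feature, measure again, record the data), here is a minimal Python sketch. It is not the LOC counter mentioned in this thread. The names `count_sloc`, `tree_sloc` and `reduction` are hypothetical, the comment stripping is deliberately naive (it ignores string literals), and it counts every `.c`/`.h` file in a tree rather than only the files a given Kconfig actually builds, which a real measurement would need to account for.

```python
import os
import re

# Naive C comment patterns; string literals containing "/*" or "//"
# are not handled, so treat the result as an approximation.
_BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
_LINE_COMMENT = re.compile(r"//[^\n]*")


def count_sloc(text):
    """Count non-blank lines that remain after removing C comments."""
    text = _BLOCK_COMMENT.sub("", text)
    text = _LINE_COMMENT.sub("", text)
    return sum(1 for line in text.splitlines() if line.strip())


def tree_sloc(root, exts=(".c", ".h")):
    """Sum count_sloc() over all files under root with the given suffixes."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(exts):
                path = os.path.join(dirpath, name)
                with open(path, errors="replace") as handle:
                    total += count_sloc(handle.read())
    return total


def reduction(baseline, reduced):
    """Return (lines_removed, percent_removed) between two measurements."""
    removed = baseline - reduced
    percent = 100.0 * removed / baseline if baseline else 0.0
    return removed, percent
```

One possible use would be to call `tree_sloc()` on the source tree before and after a feature is removed and to record the output of `reduction()` for each step, so that the data mentioned above ends up in one place.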
* Re: Xen and safety certification, Minutes of the meeting on Apr 4th
  2018-04-06 20:47 ` Stefano Stabellini
@ 2018-04-11 16:19 ` Artem Mygaiev
  2018-04-12 18:38 ` Praveen Kumar
  2018-05-08 0:11 ` Stefano Stabellini
  2 siblings, 0 replies; 18+ messages in thread
From: Artem Mygaiev @ 2018-04-11 16:19 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Edgar E. Iglesias, Lars Kurth, davorin.mista, robin.randhawa,
      paul_luperto, Denys Balatsko, Rich Persaud, Jonathan Daugherty,
      anastassios.nanos, julien.grall, committers, Stewart Hildebrand,
      xen-devel, vfachin, Volodymyr Babchuk, mirela.simonovic,
      Jarvis Roach

Hi Stefano

On 06.04.18 23:47, Stefano Stabellini wrote:
> On Fri, 6 Apr 2018, Artem Mygaiev wrote:
>>>>> 2) Create a subset of functions that need to go through certifications
>>>>> Next step: create a small Kconfig. We could use the Renesas Rcar as
>>>>> reference. We need a discussion about the features we need, for
>>>>> example real-time schedulers, do we need them or not?
>>>>
>>>
>>> Identifying this subset is very important. My recommendation would be to
>>> identify the very smallest subset to start with that supports a single, high
>>> value use case, which I would suggest is consolidation of Linux and
>>> real-time applications with mixed criticality, but not necessarily shared/PV
>>> I/O, onto a single processing cluster. Identifying the highest reasonable
>>> safety criticality to support would also be very helpful.
>>>
>>
>> Unfortunately in mixed criticality systems (at least in automotive) we see a
>> lot of attention to performance and , so processing cluster partitioning may
>> not be well accepted in the industry
>
> Sorry, I didn't quite understand your comment. Are you saying that
> statically partitioning a cluster into VMs, for example with
> vcpu-pinning or the null scheduler, in a way to have a total number of
> vcpus equal to the total number of pcpus, is not acceptable because it
> leads to lower hardware utilization?
> We need nr_vcpus > nr_pcpus?
>

Yep. In other words, OEMs want to use as much as possible of HW they have.

>
>>> At the Xen level, you might get away with just the null scheduler if VMs are
>>> pinned to their own cores (and jitter caused by contention on the bus and in
>>> the cache is acceptable). However, to do CAST-32a type scheduling
>>> (effectively time slicing the SoC between your VMs), an updated ARINC-653
>>> scheduler would be needed.
>>>
>>
>> We are now looking into RTDS as a possible solution for industrial or
>> automotive domains. Also, from our experience bus/cache contention in systems
>> with high load is actually an issue... Looking into that, too
>
> Bus/cache contention is where issues can become very board specific. It
> is also why we'll need to narrow down a small set of boards initially.
>

We'd like to do a bit more analysis before deciding... I am not very convinced with numbers yet.

>>> Since I do not think that a previously certified OS will be available for
>>> free, I see 3 general approaches wrt dom0:
>>> 1) Find and certify an open source OS. My guess is this will not be Linux
>>> due to code base size. POSIX support a plus.
>>> 2) Use a commercially available, previously certified OS for dom0. DW ported
>>> VxWorks to run on Xen in 2017 and uc/OS-III in 2016.
>>> 3) Go with a dom0-less solution; bootloader starts up the necessary VMs
>>> based on a static configuration.
>>>
>>> The XL toolstack in its current form will likely cause cert issues and will
>>> probably need to be stripped down and/or rewritten.
>>> Bootloader (U-Boot, GRUB, or whatever) will also need to be certified.
>>>
>>
>> We'd like to explore both FreeRTOS in dom0 and dom0-less options. I think
>> there were some patches while ago for dom0-less xen.
>
> "Dom0-less" is a great name actually :-)
>
> Up until now, we discussed this topic under the name of "create multiple
> guests from device tree".
> There are no patches (as far as I know), but
> it was submitted as the Xen on ARM project for Outreachy this year.
> There are patches for a different project to setup shared memory regions
> from the xl config file (no need for grant table or xenbus support).
>

Do you have anyone interested in taking this task?

>
>> We plan to analyze efforts to port FreeRTOS as dom0 OS
>
> Great! I think it makes sense to start from that. I wrote "Artem" down
> in the wikipage
> (https://wiki.xenproject.org/wiki/Safety_Certification_Challenges) as
> the reference contact for the dom0 stuff. Keep us in the loop as Julien
> and I are very interested in it.
>

Sure!

-- Artem
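
To make the nr_vcpus versus nr_pcpus question above concrete, here is a small Python sketch of two checks one might run on a planned VM layout. Everything in it is an assumption for illustration: the `VCpu` type and the function names are hypothetical, and the per-vCPU period and budget values are modelled loosely on the kind of parameters a real-time scheduler such as RTDS works with. The utilization test is only a necessary condition (total budget/period must not exceed the number of pCPUs); passing it does not prove schedulability, and it says nothing about the bus/cache contention raised in this thread.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class VCpu:
    """Hypothetical description of one vCPU's real-time reservation."""
    name: str
    period_us: int   # replenishment period in microseconds
    budget_us: int   # CPU time granted per period in microseconds


def fits_static_partitioning(vcpus, nr_pcpus):
    """True if every vCPU can be pinned to its own pCPU (1:1 layout)."""
    return len(vcpus) <= nr_pcpus


def total_utilization(vcpus):
    """Exact sum of budget/period over all vCPUs."""
    return sum((Fraction(v.budget_us, v.period_us) for v in vcpus),
               Fraction(0))


def passes_utilization_bound(vcpus, nr_pcpus):
    """Necessary (not sufficient) condition for sharing nr_pcpus cores."""
    return total_utilization(vcpus) <= nr_pcpus


# Three vCPUs on two pCPUs: too many for 1:1 pinning, but the summed
# reservations (0.4 + 0.8 + 0.8 = 2.0) still fit within two cores.
layout = [
    VCpu("rt-guest", period_us=10000, budget_us=4000),
    VCpu("linux-0", period_us=10000, budget_us=8000),
    VCpu("linux-1", period_us=10000, budget_us=8000),
]
```

With this layout, `fits_static_partitioning(layout, 2)` is false while `passes_utilization_bound(layout, 2)` is true, which is the oversubscribed case (nr_vcpus > nr_pcpus) that would need a time-sharing scheduler rather than static pinning.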
* Re: Xen and safety certification, Minutes of the meeting on Apr 4th
  2018-04-06 20:47 ` Stefano Stabellini
  2018-04-11 16:19 ` Artem Mygaiev
@ 2018-04-12 18:38 ` Praveen Kumar
  2018-05-08 0:11 ` Stefano Stabellini
  2 siblings, 0 replies; 18+ messages in thread
From: Praveen Kumar @ 2018-04-12 18:38 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Artem Mygaiev, Edgar E. Iglesias, davorin.mista, robin.randhawa,
      paul_luperto, Lars Kurth, Volodymyr Babchuk, julien.grall,
      mirela.simonovic, Stewart Hildebrand, Rich Persaud, committers,
      anastassios.nanos, xen-devel, Jonathan Daugherty, Denys Balatsko,
      vfachin, Jarvis Roach

Hi All,

>> We'd like to explore both FreeRTOS in dom0 and dom0-less options. I think
>> there were some patches while ago for dom0-less xen.
>
> "Dom0-less" is a great name actually :-)
>
> Up until now, we discussed this topic under the name of "create multiple
> guests from device tree". There are no patches (as far as I know), but
> it was submitted as the Xen on ARM project for Outreachy this year.
> There are patches for a different project to setup shared memory regions
> from the xl config file (no need for grant table or xenbus support).
>
>

I have been in discussion with Stefano over this topic and would be interested to take this up.

Regards,
~Praveen.
* Re: Xen and safety certification, Minutes of the meeting on Apr 4th
  2018-04-06 20:47 ` Stefano Stabellini
  2018-04-11 16:19 ` Artem Mygaiev
  2018-04-12 18:38 ` Praveen Kumar
@ 2018-05-08 0:11 ` Stefano Stabellini
  2018-05-08 13:39 ` Julien Grall
  2 siblings, 1 reply; 18+ messages in thread
From: Stefano Stabellini @ 2018-05-08 0:11 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Artem Mygaiev, Lars Kurth, davorin.mista, Edgar E. Iglesias,
      paul_luperto, Denys Balatsko, Rich Persaud, Jonathan Daugherty,
      anastassios.nanos, julien.grall, robin.randhawa, committers,
      Stewart Hildebrand, xen-devel, vfachin, Volodymyr Babchuk,
      mirela.simonovic, Jarvis Roach

On Fri, 6 Apr 2018, Stefano Stabellini wrote:
> > > > > 3) Understand how to address dom0. FreeRTOS Dom0 sounds like a good
> > > > > solution.
> > > > > Next step: reach out to Dornerworks and/or others that worked with
> > > > > FreeRTOS on Xen before. Figure out whether FreeRTOS is actually a
> > > > > suitable solution and what needs to be done to run FreeRTOS as Dom0.
> > > >
> > > > Some things to check at this stage:
> > > > a) I believe there is a safety certified version of FreeRTOS - I could not
> > > > find
> > > > much, except for
> > > > https://www.freertos.org/FreeRTOS-Plus/Safety_Critical_Certified/SafeRTOS-Safety-Critical-Certification.shtml
> > > > -
> > > > which describes SafeRTOS a commercial safety certified FreeRTOS and
> > > > (mostly) API compliant version of FreeRTOS. Or am I missing something
> > > > here?
> > > > b) There is a DomU capable version from Galois (Jonathan Docherty CC'ed) -
> > > > I don't know whether others also have such versions
> > >
> > > I ported the version of FreeRTOS that Xilinx distributes with their SDK to
> > > run as a domU on the ZUS+ in 2016 and round tripped the change set back to
> > > Richard Barry.
> > > I've also heard interest in running RTEMS as a guest OS.
> > >
> >
> > We've had experience in running QNX in domu, but that was not very welcomed by
> > BB QSSL folks back then :) They don't really like OSS

One more option (apparently taken by others) is to demonstrate that
after boot Dom0 cannot affect the system anymore. To do that, we would
have to get rid of Dom0 entirely after booting all domains, or,
deprivilege/restrict its possible effects on the system. Something like
turning Dom0 into a DomU after booting all the other guests.
This might actually be easier to achieve than "dom0-less" or using
FreeRTOS as dom0.
* Re: Xen and safety certification, Minutes of the meeting on Apr 4th
  2018-05-08 0:11 ` Stefano Stabellini
@ 2018-05-08 13:39 ` Julien Grall
  2018-05-08 15:49 ` Stefano Stabellini
  0 siblings, 1 reply; 18+ messages in thread
From: Julien Grall @ 2018-05-08 13:39 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Artem Mygaiev, Lars Kurth, davorin.mista, Edgar E. Iglesias,
      paul_luperto, Denys Balatsko, Jonathan Daugherty,
      Stewart Hildebrand, Rich Persaud, robin.randhawa, committers,
      anastassios.nanos, xen-devel, vfachin, Volodymyr Babchuk,
      mirela.simonovic, Jarvis Roach

Hi Stefano,

On 08/05/18 01:11, Stefano Stabellini wrote:
> On Fri, 6 Apr 2018, Stefano Stabellini wrote:
>>>>>> 3) Understand how to address dom0. FreeRTOS Dom0 sounds like a good
>>>>>> solution.
>>>>>> Next step: reach out to Dornerworks and/or others that worked with
>>>>>> FreeRTOS on Xen before. Figure out whether FreeRTOS is actually a
>>>>>> suitable solution and what needs to be done to run FreeRTOS as Dom0.
>>>>>
>>>>> Some things to check at this stage:
>>>>> a) I believe there is a safety certified version of FreeRTOS - I could not
>>>>> find
>>>>> much, except for
>>>>> https://www.freertos.org/FreeRTOS-Plus/Safety_Critical_Certified/SafeRTOS-Safety-Critical-Certification.shtml
>>>>> -
>>>>> which describes SafeRTOS a commercial safety certified FreeRTOS and
>>>>> (mostly) API compliant version of FreeRTOS. Or am I missing something
>>>>> here?
>>>>> b) There is a DomU capable version from Galois (Jonathan Docherty CC'ed) -
>>>>> I don't know whether others also have such versions
>>>>
>>>> I ported the version of FreeRTOS that Xilinx distributes with their SDK to
>>>> run as a domU on the ZUS+ in 2016 and round tripped the change set back to
>>>> Richard Barry.
>>>> I've also heard interest in running RTEMS as a guest OS.
>>>>
>>>
>>> We've had experience in running QNX in domu, but that was not very welcomed by
>>> BB QSSL folks back then :) They don't really like OSS
>
> One more option (apparently taken by others) is to demonstrate that
> after boot Dom0 cannot affect the system anymore.

Can you describe what you mean by "affecting the system anymore"?

> To do that, we would
> have to get rid of Dom0 entirely after booting all domains, or,
> deprivilege/restrict its possible effects on the system. Something like
> turning Dom0 into a DomU after booting all the other guests.
> This might actually be easier to achieve than "dom0-less" or using
> FreeRTOS as dom0.

Other than accessing the hypercalls, there are a few other ways for Dom0 to
affect the platform:
    - Dom0 by default has access to all the hardware but the one assigned
      to DomUs. That hardware may make it possible to affect the
      platform irreversibly (or even to reboot it).
    - Not all DMA-capable devices are today protected by an IOMMU

You probably can create something similar to the hardware domain as on x86
(i.e. all the hardware is owned by a separate domain other than Dom0), but then
it is only shifting the problem.

However, you surely need an entity to handle domain crash. You don't want to
reboot your platform (and therefore your safety critical domain) for a crashed
UI, right? So how is this going to be handled in your option?

Cheers,

-- 
Julien Grall
* Re: Xen and safety certification, Minutes of the meeting on Apr 4th
  2018-05-08 13:39 ` Julien Grall
@ 2018-05-08 15:49 ` Stefano Stabellini
  2018-05-10 4:55 ` Praveen Kumar
  0 siblings, 1 reply; 18+ messages in thread
From: Stefano Stabellini @ 2018-05-08 15:49 UTC (permalink / raw)
  To: Julien Grall
  Cc: Artem Mygaiev, Lars Kurth, davorin.mista, Stefano Stabellini,
      Edgar E. Iglesias, paul_luperto, Denys Balatsko,
      Jonathan Daugherty, anastassios.nanos, Rich Persaud,
      robin.randhawa, committers, Stewart Hildebrand, xen-devel,
      vfachin, Volodymyr Babchuk, mirela.simonovic, Jarvis Roach

On Tue, 8 May 2018, Julien Grall wrote:
> Hi Stefano,
>
> On 08/05/18 01:11, Stefano Stabellini wrote:
> > On Fri, 6 Apr 2018, Stefano Stabellini wrote:
> > > > > > > 3) Understand how to address dom0. FreeRTOS Dom0 sounds like a
> > > > > > > good
> > > > > > > solution.
> > > > > > > Next step: reach out to Dornerworks and/or others that worked with
> > > > > > > FreeRTOS on Xen before. Figure out whether FreeRTOS is actually a
> > > > > > > suitable solution and what needs to be done to run FreeRTOS as
> > > > > > > Dom0.
> > > > > >
> > > > > > Some things to check at this stage:
> > > > > > a) I believe there is a safety certified version of FreeRTOS - I
> > > > > > could not
> > > > > > find
> > > > > > much, except for
> > > > > > https://www.freertos.org/FreeRTOS-Plus/Safety_Critical_Certified/SafeRTOS-Safety-Critical-Certification.shtml
> > > > > > -
> > > > > > which describes SafeRTOS a commercial safety certified FreeRTOS and
> > > > > > (mostly) API compliant version of FreeRTOS. Or am I missing
> > > > > > something
> > > > > > here?
> > > > > > b) There is a DomU capable version from Galois (Jonathan Docherty > > > > > > CC'ed) - > > > > > > I don't know whether others also have such versions > > > > > > > > > > I ported the version of FreeRTOS that Xilinx distributes with their > > > > > SDK to > > > > > run as a domU on the ZUS+ in 2016 and round tripped the change set > > > > > back to > > > > > Richard Barry. > > > > > I've also heard interest in running RTEMS as a guest OS. > > > > > > > > > > > > > We've had experience in running QNX in domu, but that was not very > > > > welcomed by > > > > BB QSSL folks back then :) They dont really like OSS > > > > One more option (apparently taken by others) is to demonstrate that > > after boot Dom0 cannot affect the system anymore. > > Can you describe what you mean by "affecting the system anymore". I don't actually know: I have been told that this is a strategy pursued by other hypervisors. I guess we'll find out more details as we get more familiar with the certification requirements. > > To do that, we would > > have to get rid of Dom0 entirely after booting all domains, or, > > deprivilege/restrict its possible effects on the system. Something like > > turning Dom0 into a DomU after booting all the other guests. > > This might actually be easier to achieve than "dom0-less" or using > > FreeRTOS as dom0. > > Other than accessing the hypercall, there are few other way for Dom0 to affect > the platform: > - Dom0 by default has access to all the hardware but the one assigned > to DomUs. Those hardware may give the possibility to affect the > platform irreversibly (or even rebooting). > - Not all DMA-capable devices are today protected by an IOMMU > > You probably can create something similar to the hardware domain as on x86 > (i.e all the hardware is owned by a separate domain other than Dom0), but then > it is only shifting the problem. Yeah, you are right. It looks like turning Dom0 into a DomU is not good enough. 
Maybe for this option to be viable we would actually have to
terminate (or pause and never unpause?) dom0 after boot.

> However, you surely need an entity to handle domain crash. You don't want to
> reboot your platform (and therefore you safety critical domain) for a crashed
> UI, right? So how this is going to be handled in your option?

We need to understand the certification requirements better to know the
answer to this. I am guessing that UI crashes are not handled from the
certification point of view -- maybe we only need to demonstrate that
the system is not affected by them?
* Re: Xen and safety certification, Minutes of the meeting on Apr 4th 2018-05-08 15:49 ` Stefano Stabellini @ 2018-05-10 4:55 ` Praveen Kumar 2018-05-10 19:51 ` Stefano Stabellini 0 siblings, 1 reply; 18+ messages in thread From: Praveen Kumar @ 2018-05-10 4:55 UTC (permalink / raw) To: Stefano Stabellini Cc: Artem Mygaiev, Lars Kurth, Davorin Mista, Edgar E. Iglesias, paul_luperto, Volodymyr Babchuk, Rich Persaud, Mirela Simonović, Stewart Hildebrand, julien.grall, robin.randhawa, committers, anastassios.nanos, xen-devel, Jonathan Daugherty, Denys Balatsko, vfachin, Jarvis Roach Hi, On Tue, May 8, 2018 at 9:20 PM Stefano Stabellini <sstabellini@kernel.org> wrote: > On Tue, 8 May 2018, Julien Grall wrote: > > Hi Stefano, > > > > On 08/05/18 01:11, Stefano Stabellini wrote: > > > On Fri, 6 Apr 2018, Stefano Stabellini wrote: > > > > > > > > 3) Understand how to address dom0. FreeRTOS Dom0 sounds like a > > > > > > > > good > > > > > > > > solution. > > > > > > > > Next step: reach out to Dornerworks and/or others that worked with > > > > > > > > FreeRTOS on Xen before. Figure out whether FreeRTOS is actually a > > > > > > > > suitable solution and what needs to be done to run FreeRTOS as > > > > > > > > Dom0. > > > > > > > > > > > > > > Some things to check at this stage: > > > > > > > a) I believe there is a safety certified version of FreeRTOS - I > > > > > > > could not > > > > > > > find > > > > > > > much, except for https://www.freertos.org/FreeRTOS- > > > > > > > Plus/Safety_Critical_Certified/SafeRTOS-Safety-Critical-Certification.shtml > > > > > > > - > > > > > > > which describes SafeRTOS a commercial safety certified FreeRTOS and > > > > > > > (mostly) API compliant version of FreeRTOS. Or am I missing > > > > > > > something > > > > > > > here? 
> > > > > > > b) There is a DomU capable version from Galois (Jonathan Docherty > > > > > > > CC'ed) - > > > > > > > I don't know whether others also have such versions > > > > > > > > > > > > I ported the version of FreeRTOS that Xilinx distributes with their > > > > > > SDK to > > > > > > run as a domU on the ZUS+ in 2016 and round tripped the change set > > > > > > back to > > > > > > Richard Barry. > > > > > > I've also heard interest in running RTEMS as a guest OS. > > > > > > > > > > > > > > > > We've had experience in running QNX in domu, but that was not very > > > > > welcomed by > > > > > BB QSSL folks back then :) They dont really like OSS > > > > > > One more option (apparently taken by others) is to demonstrate that > > > after boot Dom0 cannot affect the system anymore. > > > > Can you describe what you mean by "affecting the system anymore". > I don't actually know: I have been told that this is a strategy pursued > by other hypervisors. I guess we'll find out more details as we get more > familiar with the certification requirements. > > > To do that, we would > > > have to get rid of Dom0 entirely after booting all domains, or, > > > deprivilege/restrict its possible effects on the system. Something like > > > turning Dom0 into a DomU after booting all the other guests. > > > This might actually be easier to achieve than "dom0-less" or using > > > FreeRTOS as dom0. > > > > Other than accessing the hypercall, there are few other way for Dom0 to affect > > the platform: > > - Dom0 by default has access to all the hardware but the one assigned > > to DomUs. Those hardware may give the possibility to affect the > > platform irreversibly (or even rebooting). > > - Not all DMA-capable devices are today protected by an IOMMU > > > > You probably can create something similar to the hardware domain as on x86 > > (i.e all the hardware is owned by a separate domain other than Dom0), but then > > it is only shifting the problem. > Yeah, you are right. 
> It looks like turning Dom0 into a DomU is not good
> enough. Maybe for this option to be viable we would actually have to
> terminate (or pause and never unpause?) dom0 after boot.

Just a thought!
How about keeping Dom0 in place, but giving the DomUs Dom0 privileges, with
restricted permissions on mission-critical resources? Then, if Dom0 crashes
for any reason, the best candidate among the existing DomUs takes over the
role of Dom0.

> > However, you surely need an entity to handle domain crash. You don't want to
> > reboot your platform (and therefore you safety critical domain) for a crashed
> > UI, right? So how this is going to be handled in your option?

> We need to understand the certification requirements better to know the
> answer to this. I am guessing that UI crashes are not handled from the
> certification point of view -- maybe we only need to demonstrate that
> the system is not affected by them?

Where can we find the details of the certification requirements?
* Re: Xen and safety certification, Minutes of the meeting on Apr 4th 2018-05-10 4:55 ` Praveen Kumar @ 2018-05-10 19:51 ` Stefano Stabellini 2018-05-12 17:38 ` Rich Persaud 2018-05-15 8:54 ` Artem Mygaiev 0 siblings, 2 replies; 18+ messages in thread From: Stefano Stabellini @ 2018-05-10 19:51 UTC (permalink / raw) To: Praveen Kumar Cc: Artem Mygaiev, Lars Kurth, Davorin Mista, Stefano Stabellini, Edgar E. Iglesias, paul_luperto, Volodymyr Babchuk, Rich Persaud, Mirela Simonović, Stewart Hildebrand, julien.grall, robin.randhawa, committers, anastassios.nanos, xen-devel, Jonathan Daugherty, Denys Balatsko, vfachin, Jarvis Roach On Thu, 10 May 2018, Praveen Kumar wrote: > > Yeah, you are right. It looks like turning Dom0 into a DomU is not good > > enough. Maybe for this option to be viable we would actually have to > > terminate (or pause and never unpause?) dom0 after boot. > > Just a thought ! > How about keeping Dom0 still be there, but DomUs given Dom0 privilege, with > restricted permission on mission critical resources ? And if anyhow Dom0 > crashes, > the best contended among the existing DomUs take the ownership of Dom0 ? I don't think this is easily doable, also it wouldn't solve the issue of removing dom0 from the system. But see below. > > > However, you surely need an entity to handle domain crash. You don't > want to > > > reboot your platform (and therefore you safety critical domain) for a > crashed > > > UI, right? So how this is going to be handled in your option? > > > We need to understand the certification requirements better to know the > > answer to this. I am guessing that UI crashes are not handled from the > > certification point of view -- maybe we only need to demonstrate that > > the system is not affected by them? > > Where can we find the certification requirements details ? Yes, I think we need to understand the requirements better to figure out the right way forward for Dom0. 
For instance, here is another idea: we could have Xen boot multiple
domains at boot time from device tree, as suggested in the dom0-less
approach. All of the domains booted from Xen are "mission-critical". The
first domain could still be dom0. Once booted, Dom0 can start other VMs,
however, Xen would restrict Dom0 from doing any operations affecting the
first set of mission-critical domains.

This way, we would get the flexibility of being able to start/stop
domains at run time, but at the same time we might still be able to
avoid certifications for Dom0, because Dom0 cannot affect the mission
critical applications.

Is this approach actually feasible? We need to read the requirements to
know. I am hoping Artem will chime in on this :-)
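To make the proposal above a little more concrete, here is a toy Python model (not Xen code) of the access rule it describes: domains created by the hypervisor at boot are marked mission-critical, Dom0 may still create and manage other domains at run time, and any control operation from Dom0 that targets a boot-time domain is refused. All names (`Domain`, `ToyHypervisor`, `control_op`) are hypothetical and exist only for illustration; in a real system the check would presumably sit in the hypervisor's permission checks for hypercalls.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    domid: int
    name: str
    boot_time: bool = False  # True if created by the hypervisor at boot


@dataclass
class ToyHypervisor:
    """Toy model only; none of these names are real Xen interfaces."""
    domains: dict = field(default_factory=dict)
    next_id: int = 0

    def _add(self, name, boot_time):
        dom = Domain(self.next_id, name, boot_time)
        self.domains[dom.domid] = dom
        self.next_id += 1
        return dom

    def boot(self, names):
        # Every domain in the (assumed) boot configuration is created by
        # the hypervisor itself and treated as mission-critical.
        return [self._add(n, boot_time=True) for n in names]

    def create_domain(self, caller, name):
        # Only Dom0 (domid 0) may create additional domains at run time.
        if caller.domid != 0:
            raise PermissionError("only dom0 may create domains")
        return self._add(name, boot_time=False)

    def control_op(self, caller, target, op):
        # Dom0 may manage domains it started, but never boot-time ones.
        if caller.domid != 0:
            raise PermissionError("only dom0 may issue control operations")
        if target.boot_time:
            raise PermissionError(
                f"{op} refused: {target.name} is a boot-time domain")
        if op == "destroy":
            del self.domains[target.domid]
        return True


if __name__ == "__main__":
    hyp = ToyHypervisor()
    dom0, rtos = hyp.boot(["dom0", "rtos"])
    ui = hyp.create_domain(dom0, "ui")
    hyp.control_op(dom0, ui, "destroy")        # allowed: run-time domain
    try:
        hyp.control_op(dom0, rtos, "destroy")  # refused: boot-time domain
    except PermissionError as err:
        print(err)
```

The model deliberately ignores the harder parts raised elsewhere in the thread, such as hardware access and DMA without an IOMMU; it only captures the "Dom0 cannot touch the first set of domains" rule.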
* Re: Xen and safety certification, Minutes of the meeting on Apr 4th 2018-05-10 19:51 ` Stefano Stabellini @ 2018-05-12 17:38 ` Rich Persaud 2018-05-15 8:54 ` Artem Mygaiev 1 sibling, 0 replies; 18+ messages in thread From: Rich Persaud @ 2018-05-12 17:38 UTC (permalink / raw) To: Stefano Stabellini Cc: Artem Mygaiev, Lars Kurth, Davorin Mista, Edgar E. Iglesias, dgdegra, paul_luperto, Denys Balatsko, julien.grall, Rich Persaud, vfachin, anastassios.nanos, Praveen Kumar, robin.randhawa, committers, Stewart Hildebrand, xen-devel, Jonathan Daugherty, Volodymyr Babchuk, Mirela Simonović, Jarvis Roach [-- Attachment #1.1: Type: text/plain, Size: 2516 bytes --] > On May 10, 2018, at 15:51, Stefano Stabellini <sstabellini@kernel.org> wrote: > > On Thu, 10 May 2018, Praveen Kumar wrote: >>> Yeah, you are right. It looks like turning Dom0 into a DomU is not good >>> enough. Maybe for this option to be viable we would actually have to >>> terminate (or pause and never unpause?) dom0 after boot. >> >> Just a thought ! >> How about keeping Dom0 still be there, but DomUs given Dom0 privilege, with >> restricted permission on mission critical resources ? And if anyhow Dom0 >> crashes, >> the best contended among the existing DomUs take the ownership of Dom0 ? > > I don't think this is easily doable, also it wouldn't solve the issue of > removing dom0 from the system. But see below. > > >>>> However, you surely need an entity to handle domain crash. You don't >> want to >>>> reboot your platform (and therefore you safety critical domain) for a >> crashed >>>> UI, right? So how this is going to be handled in your option? >> >>> We need to understand the certification requirements better to know the >>> answer to this. I am guessing that UI crashes are not handled from the >>> certification point of view -- maybe we only need to demonstrate that >>> the system is not affected by them? >> >> Where can we find the certification requirements details ? 
>
> Yes, I think we need to understand the requirements better to figure out
> the right way forward for Dom0.
>
> For instance, here is another idea: we could have Xen boot multiple
> domains at boot time from device tree, as suggested in the dom0-less
> approach. All of the domains booted from Xen are "mission-critical". The
> first domain could still be dom0. Once booted, Dom0 can start other VMs,
> however, Xen would restrict Dom0 from doing any operations affecting the
> first set of mission-critical domains.
>
> This way, we would get the flexibility of being able to start/stop
> domains at run time, but at the same time we might still be able to
> avoid certifications for Dom0, because Dom0 cannot affect the mission
> critical applications.
>
> Is this approach actually feasible? We need to read the requirements to
> know. I am hoping Artem will chime in on this :-)

Is any of the x86 hardware domain (non dom0) work applicable to Arm?
https://lists.xenproject.org/archives/html/xen-devel/2014-03/msg00314.html

Daniel is giving a talk on TCB reduction with a Xen hardware domain:
http://platformsecuritysummit.com/#degraaf

Rich
* Re: Xen and safety certification, Minutes of the meeting on Apr 4th 2018-05-10 19:51 ` Stefano Stabellini 2018-05-12 17:38 ` Rich Persaud @ 2018-05-15 8:54 ` Artem Mygaiev 2018-05-22 12:08 ` Jarvis Roach 1 sibling, 1 reply; 18+ messages in thread From: Artem Mygaiev @ 2018-05-15 8:54 UTC (permalink / raw) To: Stefano Stabellini, Praveen Kumar Cc: Davorin Mista, Lars Kurth, Edgar E. Iglesias, paul_luperto, Volodymyr Babchuk, Rich Persaud, Mirela Simonović, Stewart Hildebrand, julien.grall, robin.randhawa, committers, anastassios.nanos, xen-devel, Jonathan Daugherty, Denys Balatsko, vfachin, Jarvis Roach Hi Stefano On 10.05.18 22:51, Stefano Stabellini wrote: > On Thu, 10 May 2018, Praveen Kumar wrote: >>> Yeah, you are right. It looks like turning Dom0 into a DomU is not good >>> enough. Maybe for this option to be viable we would actually have to >>> terminate (or pause and never unpause?) dom0 after boot. >> >> Just a thought ! >> How about keeping Dom0 still be there, but DomUs given Dom0 privilege, with >> restricted permission on mission critical resources ? And if anyhow Dom0 >> crashes, >> the best contended among the existing DomUs take the ownership of Dom0 ? > > I don't think this is easily doable, also it wouldn't solve the issue of > removing dom0 from the system. But see below. > > >>>> However, you surely need an entity to handle domain crash. You don't >> want to >>>> reboot your platform (and therefore you safety critical domain) for a >> crashed >>>> UI, right? So how this is going to be handled in your option? >> >>> We need to understand the certification requirements better to know the >>> answer to this. I am guessing that UI crashes are not handled from the >>> certification point of view -- maybe we only need to demonstrate that >>> the system is not affected by them? >> >> Where can we find the certification requirements details ? 
>

ISO26262: https://www.iso.org/standard/51362.html
IEC61508: https://webstore.iec.ch/publication/5517

> Yes, I think we need to understand the requirements better to figure out
> the right way forward for Dom0.
>
> For instance, here is another idea: we could have Xen boot multiple
> domains at boot time from device tree, as suggested in the dom0-less
> approach. All of the domains booted from Xen are "mission-critical". The
> first domain could still be dom0. Once booted, Dom0 can start other VMs,
> however, Xen would restrict Dom0 from doing any operations affecting the
> first set of mission-critical domains.
>
> This way, we would get the flexibility of being able to start/stop
> domains at run time, but at the same time we might still be able to
> avoid certifications for Dom0, because Dom0 cannot affect the mission
> critical applications.

Such a dom0 shall have no access to the memory of mission-critical domains,
no HW access (SMMU, DVFS Power, etc.), and so on. EL3 software (OP-TEE or
similar on ARM) shall also be safety certified and not controlled from dom0.

>
> Is this approach actually feasible? We need to read the requirements to
> know. I am hoping Artem will chime in on this :-)
>

I think this approach is indeed feasible, if we can prove isolation and fault
tolerance for the FuSa parts of the system.
* Re: Xen and safety certification, Minutes of the meeting on Apr 4th 2018-05-15 8:54 ` Artem Mygaiev @ 2018-05-22 12:08 ` Jarvis Roach 2018-05-22 13:08 ` Artem Mygaiev 2018-05-22 17:50 ` Stefano Stabellini 0 siblings, 2 replies; 18+ messages in thread From: Jarvis Roach @ 2018-05-22 12:08 UTC (permalink / raw) To: Artem Mygaiev, Stefano Stabellini, Praveen Kumar Cc: Davorin Mista, Lars Kurth, Edgar E. Iglesias, paul_luperto, Volodymyr Babchuk, Rich Persaud, Mirela Simonović, Stewart Hildebrand, julien.grall, robin.randhawa, committers, anastassios.nanos, xen-devel, Jonathan Daugherty, Denys Balatsko, vfachin > Hi Stefano > > On 10.05.18 22:51, Stefano Stabellini wrote: > > On Thu, 10 May 2018, Praveen Kumar wrote: > >>> Yeah, you are right. It looks like turning Dom0 into a DomU is not > >>> good enough. Maybe for this option to be viable we would actually > >>> have to terminate (or pause and never unpause?) dom0 after boot. > >> > >> Just a thought ! > >> How about keeping Dom0 still be there, but DomUs given Dom0 > >> privilege, with restricted permission on mission critical resources ? > >> And if anyhow Dom0 crashes, the best contended among the existing > >> DomUs take the ownership of Dom0 ? > > > > I don't think this is easily doable, also it wouldn't solve the issue > > of removing dom0 from the system. But see below. > > > > > >>>> However, you surely need an entity to handle domain crash. You > >>>> don't > >> want to > >>>> reboot your platform (and therefore you safety critical domain) for > >>>> a > >> crashed > >>>> UI, right? So how this is going to be handled in your option? > >> > >>> We need to understand the certification requirements better to know > >>> the answer to this. I am guessing that UI crashes are not handled > >>> from the certification point of view -- maybe we only need to > >>> demonstrate that the system is not affected by them? > >> > >> Where can we find the certification requirements details ? 
> >
> ISO26262: https://www.iso.org/standard/51362.html
> IEC61508: https://webstore.iec.ch/publication/5517
>
> > Yes, I think we need to understand the requirements better to figure
> > out the right way forward for Dom0.
> >
> > For instance, here is another idea: we could have Xen boot multiple
> > domains at boot time from device tree, as suggested in the dom0-less
> > approach. All of the domains booted from Xen are "mission-critical".
> > The first domain could still be dom0. Once booted, Dom0 can start
> > other VMs, however, Xen would restrict Dom0 from doing any operations
> > affecting the first set of mission-critical domains.
> >

Does the first domain have to be dom0? Would it be possible to have domains boot in parallel (especially if allocated to separate CPU cores) such that a simple OS (like FreeRTOS) would complete booting before dom0/Linux? In other words, does the hypervisor have any dependencies on dom0 having performed certain functions (interrupt configuration, MMU table initialization, timers, etc.) before it can create and start additional VMs?

> > This way, we would get the flexibility of being able to start/stop
> > domains at run time, but at the same time we might still be able to
> > avoid certifications for Dom0, because Dom0 cannot affect the mission
> > critical applications.
> Such dom0 shall have no mission-critical domains memory access, no HW
> access (SMMU, DVFS Power, etc.), and so on. EL3 software (optee or similar
> on ARM) shall also be safety certified and not controlled from dom0
> >
> > Is this approach actually feasible? We need to read the requirements
> > to know. I am hoping Artem will chime in on this :-)
> >
>
> I think this approach is feasible indeed, if we can prove isolation and fault
> tolerance for FuSa parts of the system.
* Re: Xen and safety certification, Minutes of the meeting on Apr 4th 2018-05-22 12:08 ` Jarvis Roach @ 2018-05-22 13:08 ` Artem Mygaiev 2018-05-22 17:50 ` Stefano Stabellini 1 sibling, 0 replies; 18+ messages in thread From: Artem Mygaiev @ 2018-05-22 13:08 UTC (permalink / raw) To: Jarvis Roach, Stefano Stabellini, Praveen Kumar Cc: Davorin Mista, Lars Kurth, Edgar E. Iglesias, paul_luperto, Volodymyr Babchuk, Rich Persaud, Mirela Simonović, Stewart Hildebrand, julien.grall, robin.randhawa, committers, anastassios.nanos, xen-devel, Jonathan Daugherty, Denys Balatsko, vfachin Hello Jarvis On 22.05.18 15:08, Jarvis Roach wrote: >> Hi Stefano >> >> On 10.05.18 22:51, Stefano Stabellini wrote: >>> On Thu, 10 May 2018, Praveen Kumar wrote: >>>>> Yeah, you are right. It looks like turning Dom0 into a DomU is not >>>>> good enough. Maybe for this option to be viable we would actually >>>>> have to terminate (or pause and never unpause?) dom0 after boot. >>>> >>>> Just a thought ! >>>> How about keeping Dom0 still be there, but DomUs given Dom0 >>>> privilege, with restricted permission on mission critical resources ? >>>> And if anyhow Dom0 crashes, the best contended among the existing >>>> DomUs take the ownership of Dom0 ? >>> >>> I don't think this is easily doable, also it wouldn't solve the issue >>> of removing dom0 from the system. But see below. >>> >>> >>>>>> However, you surely need an entity to handle domain crash. You >>>>>> don't >>>> want to >>>>>> reboot your platform (and therefore you safety critical domain) for >>>>>> a >>>> crashed >>>>>> UI, right? So how this is going to be handled in your option? >>>> >>>>> We need to understand the certification requirements better to know >>>>> the answer to this. I am guessing that UI crashes are not handled >>>>> from the certification point of view -- maybe we only need to >>>>> demonstrate that the system is not affected by them? >>>> >>>> Where can we find the certification requirements details ? 
>>>
>> ISO26262: https://www.iso.org/standard/51362.html
>> IEC61508: https://webstore.iec.ch/publication/5517
>>
>>> Yes, I think we need to understand the requirements better to figure
>>> out the right way forward for Dom0.
>>>
>>> For instance, here is another idea: we could have Xen boot multiple
>>> domains at boot time from device tree, as suggested in the dom0-less
>>> approach. All of the domains booted from Xen are "mission-critical".
>>> The first domain could still be dom0. Once booted, Dom0 can start
>>> other VMs, however, Xen would restrict Dom0 from doing any operations
>>> affecting the first set of mission-critical domains.
>>>
>
> Does the first domain have to be dom0? Would it be possible to have domains boot in parallel (especially if allocated to separate CPU cores) such that a simple OS (like FreeRTOS) would complete booting before dom0/Linux? In other words, does the hypervisor have any dependencies on dom0 having performed certain functions (interrupt configuration, MMU table initialization, timers, etc.) before it can create and start additional VMs?
>

One of the options we actually have is to run FreeRTOS in dom0 (see earlier
emails in this thread).

>>> This way, we would get the flexibility of being able to start/stop
>>> domains at run time, but at the same time we might still be able to
>>> avoid certifications for Dom0, because Dom0 cannot affect the mission
>>> critical applications.
>
>> Such dom0 shall have no mission-critical domains memory access, no HW
>> access (SMMU, DVFS Power, etc.), and so on. EL3 software (optee or similar
>> on ARM) shall also be safety certified and not controlled from dom0
>
>>>
>>> Is this approach actually feasible? We need to read the requirements
>>> to know. I am hoping Artem will chime in on this :-)
>>
>
>>
>> I think this approach is feasible indeed, if we can prove isolation and fault
>> tolerance for FuSa parts of the system.
* Re: Xen and safety certification, Minutes of the meeting on Apr 4th
2018-05-22 12:08 ` Jarvis Roach
2018-05-22 13:08 ` Artem Mygaiev
@ 2018-05-22 17:50 ` Stefano Stabellini
1 sibling, 0 replies; 18+ messages in thread
From: Stefano Stabellini @ 2018-05-22 17:50 UTC (permalink / raw)
To: Jarvis Roach
Cc: Artem Mygaiev, Lars Kurth, Davorin Mista, Stefano Stabellini, Edgar E. Iglesias, paul_luperto, Volodymyr Babchuk, julien.grall, Rich Persaud, Mirela Simonović, Stewart Hildebrand, Praveen Kumar, robin.randhawa, committers, anastassios.nanos, xen-devel, Jonathan Daugherty, Denys Balatsko

On Tue, 22 May 2018, Jarvis Roach wrote:
> > > For instance, here is another idea: we could have Xen boot multiple
> > > domains at boot time from device tree, as suggested in the dom0-less
> > > approach. All of the domains booted from Xen are "mission-critical".
> > > The first domain could still be dom0. Once booted, Dom0 can start
> > > other VMs, however, Xen would restrict Dom0 from doing any operations
> > > affecting the first set of mission-critical domains.
> > >
>
> Does the first domain have to be dom0? Would it be possible to have domains boot in parallel (especially if allocated to separate CPU cores) such that a simple OS (like FreeRTOS) would complete booting before dom0/Linux? In other words, does the hypervisor have any dependencies on dom0 having performed certain functions (interrupt configuration, MMU table initialization, timers, etc.) before it can create and start additional VMs?
>

I don't think there are any dependencies except for xenstore and PV
protocol access. It should be possible to boot Dom0 and the other domain
in parallel, as long as the other domain is like a baremetal guest (no
PV network/disk access).
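The condition stated in the reply above can be written down as a small check. This is only a sketch in Python with made-up names (`GuestSpec`, `can_boot_without_dom0`); it assumes that the two dependencies named in the reply, xenstore and PV protocol backends, are the only ones, which the thread itself presents as a belief rather than an established fact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuestSpec:
    """Hypothetical description of what a guest expects from the platform."""
    name: str
    needs_xenstore: bool = False
    pv_devices: tuple = ()  # e.g. ("vif", "vbd") for PV network/disk


def can_boot_without_dom0(guest: GuestSpec) -> bool:
    # Assumption taken from the thread: xenstore and PV backends are the
    # only services that depend on Dom0, so a guest using neither of them
    # (a "baremetal-like" guest) does not have to wait for Dom0.
    return not guest.needs_xenstore and not guest.pv_devices


if __name__ == "__main__":
    rtos = GuestSpec("freertos")
    linux = GuestSpec("linux-ui", needs_xenstore=True,
                      pv_devices=("vif", "vbd"))
    print(can_boot_without_dom0(rtos))   # True
    print(can_boot_without_dom0(linux))  # False
```

A guest that fails the check is not excluded from the system; under this assumption it simply has to be started once Dom0 (or whichever domain provides xenstore and the backends) is up.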
* Re: Xen and safety certification, Minutes of the meeting on Apr 4th 2018-04-06 14:13 Xen and safety certification, Minutes of the meeting on Apr 4th Lars Kurth 2018-04-06 17:01 ` Jarvis Roach @ 2018-04-06 17:18 ` Artem Mygaiev 1 sibling, 0 replies; 18+ messages in thread From: Artem Mygaiev @ 2018-04-06 17:18 UTC (permalink / raw) To: Lars Kurth, Stefano Stabellini Cc: Edgar E. Iglesias, davorin.mista, robin.randhawa, paul_luperto, Denys Balatsko, Rich Persaud, Jonathan Daugherty, anastassios.nanos, julien.grall, committers, Stewart Hildebrand, xen-devel, vfachin, Volodymyr Babchuk, mirela.simonovic, Jarvis Roach On 06.04.18 17:13, Lars Kurth wrote: > Hi all, > > adding a few more people who are/may be interested in safety certification, including committers (because item 1 would have an impact). Specifically: Rich Persaud, Paul Luperto, Jonathan Daugherty and Denys Balatsko. > > There are a few loose ends and updates from other/similar related threads that we should pull into this thread: > > a) AGL Whitepaper > This is out as far as I can tell > See https://docs.google.com/document/d/1HpYzClh0nDEocsUHb17X0DxiehsAbCgyWE-P2Wk_RNU/edit# > Thank you to Rich for driving this and to all the contributors from the Xen Community > > Related to this is the following item from the original minutes >> AGL will select 2 hypervisors out of the list. Artem has already an >> out-of-the-box solution for AGL. Artem will chase up and make sure that >> Xen will be one of the two. > > b) Genivi AMM Hypervisor Workshop, Apr 19 > Artem and me will be speaking on various Xen related projects. I will send a draft PDF to this list later this week. > Slots are short: 10 minutes + questions each slot > See https://at.projects.genivi.org/wiki/display/DIRO/Hypervisor+Workshop+Team > > c) Xen Specific Automotive Whitepaper > This was discussed during a) and I think it would be relatively easy to pull something together. It would be good if someone else, but me could lead this. 
We have a lot of information already, but more ground-work on safety certification may help. Would there be a volunteer driving this? I could be used as a vehicle to move some of the items discussed in the minutes along. > As there's a lot of overlap with what has been done for AGL & GENIVI I'll be happy to drive this. > d) I also created https://wiki.xenproject.org/wiki/Category:Safety_Certification to start pulling material relevant to safety and context for it into one place. > It's a little crude at this point in time and I expect this document to evolve and split into smaller parts. > It would be good, if someone on this list could go over https://wiki.xenproject.org/wiki/Category:Safety_Certification#Automotive_Requirements and map the requirements to functionality we already have. This could then feed into c. > > Any takers? > I'll be happy to take this as well. >> Artem suggested to write a whitepaper about Xen real-time capabilities. >> Stefano volunteered to help. > I believe we have some gaps with regards to real-time requirements and that paper is aiming to highlight these. > @Artem: maybe this would be a suitable topic for the developer summit (amongst others) > As a reminder: the CfP for the summit closes next Friday > Yes, we plan to submit a talk re: RT based on the wp being prepared >> I contacted Lars (CC'ed) who volunteered to help. > I am volunteering to act as a program/project manager for this activity. In particular to bootstrap. > > I think the only practicable way to make progress in this area, is to set up some mechanism which allow us to make progress towards the goal of making it easier and cheaper to build safety certified variants of Xen. As a side-effect of this process we should get data, to scope out the scale of the problem further, that should enable getting more vendors interested. > >> The main topic of the meeting was certifications for Xen on ARM. The gap >> analysis document, mentioned in the previous call, is copyrighted. 
>> It might not be possible to relicense it. Regardless of the document, we
>> started discussing the major work items and next steps.
>
> @Stefano: Thanks for driving this discussion
> I re-ordered some of the items, to make it more palatable
>
>> 2) Create a subset of functions that need to go through certifications
>> Next step: create a small Kconfig. We could use the Renesas Rcar as
>> reference. We need a discussion about the features we need, for example
>> real-time schedulers, do we need them or not?
>
> @Stefano agreed to drive this.
> The minimal configuration does impact 1 and 2, which is why I moved this first.
>
> We should probably agree a basic process: aka
> * Measure baseline size in KSLOC
> * Remove some feature
> * Measure reduction in KSLOC
> And record the data somewhere
>
>> 1) Requirements to the code, a subset of MISRA for ASIL B
>> Next step: get more information about requirements and publish it to
>> xen-devel.
>
> I see a few problems here:
>
> * The MISRA 2012 spec has to be bought and it is rather big (100's of pages):
>   so, I don't think it is practical to work from the spec
>
> * Some coding style patterns will likely be perceived as odd and unreasonable
>   by community members: as some common code would be affected we cannot
>   treat this in isolation say on ARM only. Although it is recognized that some of
>   the coding style patterns may not make sense, compliance to MISRA is
>   necessary and cannot normally be discussed away.
>
> * PRQA has set up an environment and initial MISRA compliance report for a Xen on ARM build
> ** The question is what (if anything) can be shared publicly
> ** The other open question is whether we can come to some sort of longer term agreement between the Xen Project and PRQA to use their tools
> ** As an aside, what PRQA have done would need to reflect what we do in step 2.
We also want to minimize the work for PRQA: in other words, it has to be very simple to enable the minimal config coming out of task 2 such that PRQA can
> ** As far as I recall 90% of all MISRA violations come down to around 70 issues. A large number are in tools
> ** Also, I believe that MISRA compliance tools will likely lead to a large number of false positives, due to the distributed nature of Xen: process boundaries, kernel/user space boundaries, etc. would all lead to false positives, which somehow have to be managed.
>
> ACTION => Lars to follow up with Paul Luperto from PRQA
>
> * An approach that may be manageable would be to look at the most common MISRA violations and work backwards from there.
> ** This would make the problem more manageable and mean people wouldn't have to read a long spec
> ** Discussing a small set of issues would give us a sense of whether/what type of disagreements there are and how we resolve them.
> ** We should prioritize based on:
> a) Address/discuss the most frequently occurring issues first
> b) Address/discuss issues in common code first
>
> At the very least (and for now in absence of the capability to check compliance), I would need someone who has access to MISRA compliance tools to drive such an effort.
>
>> 3) Understand how to address dom0. FreeRTOS Dom0 sounds like a good
>> solution.
>> Next step: reach out to Dornerworks and/or others that worked with
>> FreeRTOS on Xen before. Figure out whether FreeRTOS is actually a
>> suitable solution and what needs to be done to run FreeRTOS as Dom0.
>
> Some things to check at this stage:
> a) I believe there is a safety certified version of FreeRTOS - I could not find much, except for https://www.freertos.org/FreeRTOS-Plus/Safety_Critical_Certified/SafeRTOS-Safety-Critical-Certification.shtml - which describes SafeRTOS, a commercial, safety certified and (mostly) API compatible version of FreeRTOS. Or am I missing something here?
> b) There is a DomU capable version from Galois (Jonathan Daugherty CC'ed) - I don't know whether others also have such versions
> c) There is a POSIX wrapper, which may be needed, but it is unclear what this would do to the FreeRTOS footprint
> d) In other words, what we would have to do is to investigate whether it is possible to build a Dom0 capable FreeRTOS
>
> I see several ways of approaching this:
> a) A vendor (or groups of vendors) on this list steps up
> b) We go initially for a lower bar: aka we try and scope out and cost the creation of a Dom0 capable FreeRTOS and then look at how the work can get funded
>
> A very good starting point would be to get a list of parties that are interested in having and using a FreeRTOS based Dom0 (regardless of how we get there). A show of hands would be good.
> Some insights from anyone on the FreeRTOS/SafeRTOS relationship and politics would be good also. Unless there is a route from FreeRTOS upstream to the certified version, someone in our eco-system would have to safety certify FreeRTOS (which may not be such a big deal given the fairly small size of FreeRTOS).
>
We plan to analyze the effort needed to port FreeRTOS as a dom0 OS.

>> 4) Create artifacts, such as docs, fault analysis, prove fault tolerance,
>> safety management docs, development processes.
>> Next step: we need to bring in a company, a certification body, to guide
>> us through the process.
>
> We have companies such as Dornerworks on this list which are experienced with safety certification on Xen for some safety standards: it is not clear to me how much of this is transferable to automotive.
>
> Here my understanding is that we need a certification partner like TÜV, MIRA or a company like Dornerworks who already have experience with Xen. By working with a partner experienced in certification, the overall cost of certification would be significantly reduced.
The elephant in the room is funding and a business model (aka all the items listed in https://docs.google.com/document/d/1HpYzClh0nDEocsUHb17X0DxiehsAbCgyWE-P2Wk_RNU/edit section 4.1). The reality is that organisations such as TÜV, MIRA, Dornerworks, ... will need to be paid by someone. I think we need to park that for now.
>
> What I think are sensible goals for now are
> a) Establish a list of potential partners and start establishing contacts - such conversations would need to be led by a vendor, otherwise it will go nowhere. What would be good though is to have a shared (but possibly private) repository of how these conversations have gone.
> b) Otherwise focus on tasks 1-3 which deal with some issues listed in https://www.slideshare.net/xen_com_mgr/art-certification, which is still very valid
> c) Engage/work with other groups (AGL, Genivi, Linaro) who are also looking at this problem
>
> In the mid-term it may be worth considering some sort of pilot around a small portion of the Xen codebase: the aim would be to gather data that helps establish what can be done in a collaborative FOSS environment.
>
> Feedback/views are very welcome
>
> Regards
> Lars
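
The basic process proposed for task 2 (measure baseline size in KSLOC, remove a feature, measure the reduction, record the data) could be sketched roughly as below. This is only an illustration under stated assumptions: the function names `count_sloc`, `tree_sloc` and `record_step` are made up for this example, the comment stripping is a crude approximation (it ignores comment markers inside string literals), and `tree_sloc` counts every `.c`/`.h` file under a directory rather than only the files actually compiled for a given Kconfig. A real measurement would more likely use an established line-counting tool restricted to the built objects.

```python
import re
from pathlib import Path

# Rough C comment patterns; string literals containing "//" or "/*" are not handled.
_BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
_LINE_COMMENT = re.compile(r"//[^\n]*")


def count_sloc(source: str) -> int:
    """Count lines that still contain text after comments are stripped."""
    stripped = _LINE_COMMENT.sub("", _BLOCK_COMMENT.sub("", source))
    return sum(1 for line in stripped.splitlines() if line.strip())


def tree_sloc(root: str, suffixes=(".c", ".h")) -> int:
    """Sum count_sloc over all files with the given suffixes below root."""
    total = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            total += count_sloc(path.read_text(errors="replace"))
    return total


def record_step(feature: str, before_sloc: int, after_sloc: int) -> dict:
    """Build one record for the 'remove some feature' step, in KSLOC."""
    return {
        "feature": feature,
        "before_ksloc": before_sloc / 1000,
        "after_ksloc": after_sloc / 1000,
        "reduction_ksloc": (before_sloc - after_sloc) / 1000,
    }
```

One record per removed feature, appended to a shared table or wiki page, would give the "record the data somewhere" part of the process.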
end of thread, other threads:[~2018-05-22 17:50 UTC | newest]

Thread overview: 18+ messages
2018-04-06 14:13 Xen and safety certification, Minutes of the meeting on Apr 4th Lars Kurth
2018-04-06 17:01 ` Jarvis Roach
2018-04-06 17:23 ` Lars Kurth
2018-04-06 17:32 ` Artem Mygaiev
2018-04-06 20:47 ` Stefano Stabellini
2018-04-11 16:19 ` Artem Mygaiev
2018-04-12 18:38 ` Praveen Kumar
2018-05-08 0:11 ` Stefano Stabellini
2018-05-08 13:39 ` Julien Grall
2018-05-08 15:49 ` Stefano Stabellini
2018-05-10 4:55 ` Praveen Kumar
2018-05-10 19:51 ` Stefano Stabellini
2018-05-12 17:38 ` Rich Persaud
2018-05-15 8:54 ` Artem Mygaiev
2018-05-22 12:08 ` Jarvis Roach
2018-05-22 13:08 ` Artem Mygaiev
2018-05-22 17:50 ` Stefano Stabellini
2018-04-06 17:18 ` Artem Mygaiev