* [PATCH v6 0/4] misc safety certification fixes
@ 2019-01-09 23:41 Stefano Stabellini
  2019-01-09 23:42 ` [PATCH v6 1/4] xen: introduce SYMBOL Stefano Stabellini
                   ` (3 more replies)
  0 siblings, 4 replies; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-09 23:41 UTC (permalink / raw)
  To: xen-devel; +Cc: andrew.cooper3, julien.grall, sstabellini, JBeulich

Hi all,

This version of the series addresses all of Jan's latest comments. The
principal change is to SYMBOL(), which now returns the native type
instead of unsigned long.

I would like to note that I believe this is not a good change. It would
be better, and more safety compliant, to have SYMBOL() return unsigned
long, as was done in v5 of the series.
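
To make the difference between the two return types concrete, here is a
minimal sketch that is not part of the series. SYMBOL follows the v6
definition from patch 1/4 (with __typeof__ so it also builds in strict
modes); SYMBOL_UL is a hypothetical stand-in for an unsigned long
variant and is not the v5 code. demo_table is an ordinary array standing
in for a linker-delimited region. With the integer variant the
subtraction is plain integer arithmetic; with the typed variant it is
still a pointer subtraction, just one whose operands the compiler cannot
trace back to their objects.

```c
#include <assert.h>
#include <stddef.h>

/* v6 shape, adapted from patch 1/4: the result is a typed pointer. */
#define SYMBOL(ptr)                                 \
  ({ unsigned long __ptr;                           \
    __asm__ ("" : "=r"(__ptr) : "0"(ptr));          \
    (__typeof__(*(ptr)) *) (__ptr); })

/* Hypothetical integer-returning shape (an assumption, not the v5 code). */
#define SYMBOL_UL(ptr)                              \
  ({ unsigned long __ptr;                           \
    __asm__ ("" : "=r"(__ptr) : "0"(ptr));          \
    __ptr; })

/* Stand-in for a linker-delimited table of four int entries. */
static int demo_table[4];

/* Pointer subtraction: the result is counted in elements. */
static ptrdiff_t demo_count_typed(void)
{
    return SYMBOL(demo_table + 4) - SYMBOL(demo_table);
}

/* Integer subtraction: the result is counted in bytes. */
static unsigned long demo_bytes_integer(void)
{
    return SYMBOL_UL(demo_table + 4) - SYMBOL_UL(demo_table);
}

/* Both views describe the same distance, in different units. */
void demo_check_return_types(void)
{
    assert(demo_count_typed() == 4);
    assert(demo_bytes_integer() == 4 * sizeof(int));
}
```

Callers of the integer variant have to divide by the element size
themselves or cast back to a pointer, which is the usability cost the
native-type variant avoids.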

Cheers,

Stefano


The following changes since commit 808cff4c2af66afd61973451aeb7e708732abf90:

  sched/credit2: remove stale comment (2019-01-09 15:46:05 +0100)

are available in the git repository at:

  http://xenbits.xenproject.org/git-http/people/sstabellini/xen-unstable.git certifications-6

for you to fetch changes up to 40ca1684137114698b13c18f2f1468e9bc75574f:

  xen/common: use SYMBOL when required (2019-01-09 15:36:21 -0800)

----------------------------------------------------------------
Stefano Stabellini (4):
      xen: introduce SYMBOL
      xen/arm: use SYMBOL when required
      xen/x86: use SYMBOL when required
      xen/common: use SYMBOL when required

 xen/arch/arm/alternative.c        |  4 ++--
 xen/arch/arm/arm32/livepatch.c    |  2 +-
 xen/arch/arm/arm64/livepatch.c    |  2 +-
 xen/arch/arm/device.c             |  6 +++---
 xen/arch/arm/livepatch.c          |  4 ++--
 xen/arch/arm/mm.c                 | 13 +++++++------
 xen/arch/arm/percpu.c             |  8 ++++----
 xen/arch/arm/platform.c           |  6 ++++--
 xen/arch/arm/setup.c              |  6 ++++--
 xen/arch/x86/alternative.c        |  3 ++-
 xen/arch/x86/efi/efi-boot.h       |  8 ++++----
 xen/arch/x86/percpu.c             |  8 ++++----
 xen/arch/x86/setup.c              |  8 +++++---
 xen/arch/x86/smpboot.c            |  5 +++--
 xen/common/kernel.c               |  8 ++++++--
 xen/common/lib.c                  |  3 ++-
 xen/common/schedule.c             |  6 ++++--
 xen/common/spinlock.c             |  4 +++-
 xen/common/version.c              |  6 +++---
 xen/common/virtual_region.c       | 12 ++++++------
 xen/drivers/vpci/vpci.c           |  2 +-
 xen/include/asm-arm/grant_table.h |  3 ++-
 xen/include/xen/compiler.h        | 10 ++++++++++
 xen/include/xen/kernel.h          | 24 ++++++++++++------------
 24 files changed, 95 insertions(+), 66 deletions(-)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-09 23:41 [PATCH v6 0/4] misc safety certification fixes Stefano Stabellini
@ 2019-01-09 23:42 ` Stefano Stabellini
  2019-01-10  2:40   ` Julien Grall
  2019-01-10  8:34   ` Jan Beulich
  2019-01-09 23:42 ` [PATCH v6 2/4] xen/arm: use SYMBOL when required Stefano Stabellini
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-09 23:42 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, sstabellini, wei.liu2, andrew.cooper3,
	julien.grall, JBeulich

Introduce a macro, SYMBOL, which is similar to RELOC_HIDE, but it is
meant to be used everywhere symbols such as _stext and _etext are used
in the code. It can take an array type as a parameter, in which case it
returns a pointer to the array's element type.

SYMBOL is needed when accessing symbols such as _stext and _etext
because the C standard treats both comparison and subtraction between
pointers to different objects as undefined behavior (see C Standard,
6.5.6 and 6.5.8 [ISO/IEC 9899:2011] and [1]). _stext, _etext, etc. are
all pointers to different objects from the ANSI C point of view.

To work around potential C compiler issues (which have actually
been found, see the comment on top of RELOC_HIDE in Linux), and to help
with certifications, let's introduce some syntactic sugar to be used in
following patches.
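
As an illustration of the array handling, here is a minimal standalone
sketch. It assumes GCC or a compatible compiler; demo_desc, demo_descs
and the helper functions are hypothetical names, not Xen code, and the
macro uses __typeof__ in place of typeof. RELOC_HIDE casts its result
to typeof(ptr), which is not a valid cast target when ptr is an array;
casting to a pointer to typeof(*(ptr)) works for arrays and pointers
alike.

```c
#include <assert.h>

/* Adapted from the definition added by this patch. */
#define SYMBOL(ptr)                                 \
  ({ unsigned long __ptr;                           \
    __asm__ ("" : "=r"(__ptr) : "0"(ptr));          \
    (__typeof__(*(ptr)) *) (__ptr); })

/* Hypothetical descriptor table standing in for a linker section. */
struct demo_desc { int kind; };
static const struct demo_desc demo_descs[3] = { { 1 }, { 2 }, { 3 } };

/* An array argument comes back as a pointer to its element type. */
static const struct demo_desc *demo_first_desc(void)
{
    return SYMBOL(demo_descs);
}

/* A pointer argument comes back with the same pointer type. */
static const struct demo_desc *demo_launder(const struct demo_desc *d)
{
    return SYMBOL(d);
}

/* The value is unchanged; only its origin is hidden from the compiler. */
void demo_check_symbol_types(void)
{
    assert(demo_first_desc() == &demo_descs[0]);
    assert(demo_launder(&demo_descs[2])->kind == 3);
}
```

The empty asm statement emits no instructions; it only forces the value
through a register so the optimizer cannot reason about which object the
resulting pointer belongs to.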

[1] https://wiki.sei.cmu.edu/confluence/display/c/ARR36-C.+Do+not+subtract+or+compare+two+pointers+that+do+not+refer+to+the+same+array

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
CC: JBeulich@suse.com
CC: andrew.cooper3@citrix.com
CC: wei.liu2@citrix.com
---
Changes in v6:
- drop acks
- don't use RELOC_HIDE for the implementation
- return native type from SYMBOL

Changes in v4:
- add acked-bys
- remove unneeded parenthesis

Changes in v3:
- improve commit message
- rename __symbol to SYMBOL to avoid name space violations

Changes in v2:
- do not cast return to char*
- move to common header
---
 xen/include/xen/compiler.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/xen/include/xen/compiler.h b/xen/include/xen/compiler.h
index ff6c0f5..d4c856c 100644
--- a/xen/include/xen/compiler.h
+++ b/xen/include/xen/compiler.h
@@ -99,6 +99,16 @@
     __asm__ ("" : "=r"(__ptr) : "0"(ptr));      \
     (typeof(ptr)) (__ptr + (off)); })
 
+/*
+ * Similar to RELOC_HIDE, but written to be used with symbols such as
+ * _stext and _etext to avoid undefined behavior comparing pointers to
+ * different objects. It can handle array types.
+ */
+#define SYMBOL(ptr)                               \
+  ({ unsigned long __ptr;                       \
+    __asm__ ("" : "=r"(__ptr) : "0"(ptr));      \
+    (typeof(*(ptr)) *) (__ptr); })
+
 #ifdef __GCC_ASM_FLAG_OUTPUTS__
 # define ASM_FLAG_OUT(yes, no) yes
 #else
-- 
1.9.1



* [PATCH v6 2/4] xen/arm: use SYMBOL when required
  2019-01-09 23:41 [PATCH v6 0/4] misc safety certification fixes Stefano Stabellini
  2019-01-09 23:42 ` [PATCH v6 1/4] xen: introduce SYMBOL Stefano Stabellini
@ 2019-01-09 23:42 ` Stefano Stabellini
  2019-01-10  8:41   ` Jan Beulich
  2019-01-09 23:42 ` [PATCH v6 3/4] xen/x86: " Stefano Stabellini
  2019-01-09 23:42 ` [PATCH v6 4/4] xen/common: " Stefano Stabellini
  3 siblings, 1 reply; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-09 23:42 UTC (permalink / raw)
  To: xen-devel
  Cc: andrew.cooper3, julien.grall, sstabellini, JBeulich, Stefano Stabellini

Use SYMBOL in comparisons and subtractions involving:

_start, _end, __init_begin, __init_end, _stext, _etext,
__alt_instructions, __alt_instructions_end, __per_cpu_start,
__per_cpu_data_end, _splatform, _eplatform, _sdevice, _edevice,
_asdevice, _aedevice

as required by the C standard [1].

M3CM: Rule-18.2: Subtraction between pointers shall only be applied to
pointers that address elements of the same array
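
The table walks changed in device.c and platform.c follow the pattern
sketched below. This is a standalone illustration under assumptions:
demo_devices is an ordinary array, and demo_sdevice/demo_edevice are
hypothetical stand-ins for the linker-provided _sdevice/_edevice
symbols. The macro is adapted from patch 1/4.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Adapted from patch 1/4 of this series. */
#define SYMBOL(ptr)                                 \
  ({ unsigned long __ptr;                           \
    __asm__ ("" : "=r"(__ptr) : "0"(ptr));          \
    (__typeof__(*(ptr)) *) (__ptr); })

/* Hypothetical descriptor type; the real one has more fields. */
struct demo_device_desc {
    int class;
    const char *name;
};

static const struct demo_device_desc demo_devices[] = {
    { 1, "uart" }, { 2, "gic" }, { 1, "timer" },
};

/* Stand-ins for the start and end symbols of the section. */
#define demo_sdevice demo_devices
#define demo_edevice (demo_devices + 3)

/* Return the name of the first descriptor of a class, or NULL. */
static const char *demo_find_first(int class)
{
    const struct demo_device_desc *desc;

    for ( desc = SYMBOL(demo_sdevice); desc != SYMBOL(demo_edevice); desc++ )
        if ( desc->class == class )
            return desc->name;

    return NULL;
}

void demo_check_device_walk(void)
{
    assert(strcmp(demo_find_first(2), "gic") == 0);
    assert(demo_find_first(9) == NULL);
}
```

Both the initial value and the loop bound go through SYMBOL, so the
compiler sees neither as derived from a particular array.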

[1] https://wiki.sei.cmu.edu/confluence/display/c/ARR36-C.+Do+not+subtract+or+compare+two+pointers+that+do+not+refer+to+the+same+array

QAVerify: 2761
Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
CC: JBeulich@suse.com
CC: andrew.cooper3@citrix.com
---
Changes in v6:
- more accurate commit message
- use new SYMBOL macro that returns the native type

Changes in v5:
- remove two spurious changes
- split into three patches
- remove SYMBOL() from derived variables

Changes in v4:
- only use SYMBOL where necessary, not "everywhere": comparisons and
  subtractions
- improve commit message
- remove some unnecessary casts
- fix some still unsafe casts
- extend checks to all symbols in xen/arch/x86/xen.lds.S and
  xen/arch/arm/xen.lds.S

Changes in v3:
- improve commit message
- no hard tabs
- rename __symbol to SYMBOL
- fix __end_vpci_array and __start_vpci_array
- avoid all comparisons between pointers: including (void *) casted
  returns from SYMBOL()
- remove useless casts to (unsigned long)

Changes in v2:
- cast return of SYMBOL to char* when required
- define __pa as unsigned long in is_kernel* functions
---
 xen/arch/arm/alternative.c        |  4 ++--
 xen/arch/arm/arm32/livepatch.c    |  2 +-
 xen/arch/arm/arm64/livepatch.c    |  2 +-
 xen/arch/arm/device.c             |  6 +++---
 xen/arch/arm/livepatch.c          |  4 ++--
 xen/arch/arm/mm.c                 | 13 +++++++------
 xen/arch/arm/percpu.c             |  8 ++++----
 xen/arch/arm/platform.c           |  6 ++++--
 xen/arch/arm/setup.c              |  6 ++++--
 xen/include/asm-arm/grant_table.h |  3 ++-
 10 files changed, 30 insertions(+), 24 deletions(-)

diff --git a/xen/arch/arm/alternative.c b/xen/arch/arm/alternative.c
index 52ed7ed..ae738a9 100644
--- a/xen/arch/arm/alternative.c
+++ b/xen/arch/arm/alternative.c
@@ -188,7 +188,7 @@ static int __apply_alternatives_multi_stop(void *unused)
         int ret;
         struct alt_region region;
         mfn_t xen_mfn = virt_to_mfn(_start);
-        paddr_t xen_size = _end - _start;
+        paddr_t xen_size = SYMBOL(_end) - SYMBOL(_start);
         unsigned int xen_order = get_order_from_bytes(xen_size);
         void *xenmap;
 
@@ -206,7 +206,7 @@ static int __apply_alternatives_multi_stop(void *unused)
         region.begin = __alt_instructions;
         region.end = __alt_instructions_end;
 
-        ret = __apply_alternatives(&region, xenmap - (void *)_start);
+        ret = __apply_alternatives(&region, xenmap - (void *)SYMBOL(_start));
         /* The patching is not expected to fail during boot. */
         BUG_ON(ret != 0);
 
diff --git a/xen/arch/arm/arm32/livepatch.c b/xen/arch/arm/arm32/livepatch.c
index 41378a5..6bf9132 100644
--- a/xen/arch/arm/arm32/livepatch.c
+++ b/xen/arch/arm/arm32/livepatch.c
@@ -56,7 +56,7 @@ void arch_livepatch_apply(struct livepatch_func *func)
     else
         insn = 0xe1a00000; /* mov r0, r0 */
 
-    new_ptr = func->old_addr - (void *)_start + vmap_of_xen_text;
+    new_ptr = func->old_addr - (void *)SYMBOL(_start) + vmap_of_xen_text;
     len = len / sizeof(uint32_t);
 
     /* PATCH! */
diff --git a/xen/arch/arm/arm64/livepatch.c b/xen/arch/arm/arm64/livepatch.c
index 2247b92..ec49877 100644
--- a/xen/arch/arm/arm64/livepatch.c
+++ b/xen/arch/arm/arm64/livepatch.c
@@ -43,7 +43,7 @@ void arch_livepatch_apply(struct livepatch_func *func)
     /* Verified in livepatch_verify_distance. */
     ASSERT(insn != AARCH64_BREAK_FAULT);
 
-    new_ptr = func->old_addr - (void *)_start + vmap_of_xen_text;
+    new_ptr = func->old_addr - (void *)SYMBOL(_start) + vmap_of_xen_text;
     len = len / sizeof(uint32_t);
 
     /* PATCH! */
diff --git a/xen/arch/arm/device.c b/xen/arch/arm/device.c
index 70cd6c1..bb209be 100644
--- a/xen/arch/arm/device.c
+++ b/xen/arch/arm/device.c
@@ -35,7 +35,7 @@ int __init device_init(struct dt_device_node *dev, enum device_class class,
     if ( !dt_device_is_available(dev) || dt_device_for_passthrough(dev) )
         return  -ENODEV;
 
-    for ( desc = _sdevice; desc != _edevice; desc++ )
+    for ( desc = SYMBOL(_sdevice); desc != SYMBOL(_edevice); desc++ )
     {
         if ( desc->class != class )
             continue;
@@ -56,7 +56,7 @@ int __init acpi_device_init(enum device_class class, const void *data, int class
 {
     const struct acpi_device_desc *desc;
 
-    for ( desc = _asdevice; desc != _aedevice; desc++ )
+    for ( desc = SYMBOL(_asdevice); desc != SYMBOL(_aedevice); desc++ )
     {
         if ( ( desc->class != class ) || ( desc->class_type != class_type ) )
             continue;
@@ -75,7 +75,7 @@ enum device_class device_get_class(const struct dt_device_node *dev)
 
     ASSERT(dev != NULL);
 
-    for ( desc = _sdevice; desc != _edevice; desc++ )
+    for ( desc = SYMBOL(_sdevice); desc != SYMBOL(_edevice); desc++ )
     {
         if ( dt_match_node(desc->dt_match, dev) )
             return desc->class;
diff --git a/xen/arch/arm/livepatch.c b/xen/arch/arm/livepatch.c
index 279d52c..9abcf56 100644
--- a/xen/arch/arm/livepatch.c
+++ b/xen/arch/arm/livepatch.c
@@ -27,7 +27,7 @@ int arch_livepatch_quiesce(void)
         return -EINVAL;
 
     text_mfn = virt_to_mfn(_start);
-    text_order = get_order_from_bytes(_end - _start);
+    text_order = get_order_from_bytes(SYMBOL(_end) - SYMBOL(_start));
 
     /*
      * The text section is read-only. So re-map Xen to be able to patch
@@ -78,7 +78,7 @@ void arch_livepatch_revert(const struct livepatch_func *func)
     uint32_t *new_ptr;
     unsigned int len;
 
-    new_ptr = func->old_addr - (void *)_start + vmap_of_xen_text;
+    new_ptr = func->old_addr - (void *)SYMBOL(_start) + vmap_of_xen_text;
 
     len = livepatch_insn_len(func);
     memcpy(new_ptr, func->opaque, len);
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 01ae2cc..15f5eae 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1084,8 +1084,8 @@ static void set_pte_flags_on_range(const char *p, unsigned long l, enum mg mg)
     ASSERT(!((unsigned long) p & ~PAGE_MASK));
     ASSERT(!(l & ~PAGE_MASK));
 
-    for ( i = (p - _start) / PAGE_SIZE; 
-          i < (p + l - _start) / PAGE_SIZE; 
+    for ( i = (p - SYMBOL(_start)) / PAGE_SIZE;
+          i < (p + l - SYMBOL(_start)) / PAGE_SIZE;
           i++ )
     {
         pte = xen_xenmap[i];
@@ -1122,12 +1122,12 @@ static void set_pte_flags_on_range(const char *p, unsigned long l, enum mg mg)
 void free_init_memory(void)
 {
     paddr_t pa = virt_to_maddr(__init_begin);
-    unsigned long len = __init_end - __init_begin;
+    unsigned long len = SYMBOL(__init_end) - SYMBOL(__init_begin);
     uint32_t insn;
     unsigned int i, nr = len / sizeof(insn);
     uint32_t *p;
 
-    set_pte_flags_on_range(__init_begin, len, mg_rw);
+    set_pte_flags_on_range(SYMBOL(__init_begin), len, mg_rw);
 #ifdef CONFIG_ARM_32
     /* udf instruction i.e (see A8.8.247 in ARM DDI 0406C.c) */
     insn = 0xe7f000f0;
@@ -1138,9 +1138,10 @@ void free_init_memory(void)
     for ( i = 0; i < nr; i++ )
         *(p + i) = insn;
 
-    set_pte_flags_on_range(__init_begin, len, mg_clear);
+    set_pte_flags_on_range(SYMBOL(__init_begin), len, mg_clear);
     init_domheap_pages(pa, pa + len);
-    printk("Freed %ldkB init memory.\n", (long)(__init_end-__init_begin)>>10);
+    printk("Freed %ldkB init memory.\n",
+           (long)(SYMBOL(__init_end)-SYMBOL(__init_begin))>>10);
 }
 
 void arch_dump_shared_mem_info(void)
diff --git a/xen/arch/arm/percpu.c b/xen/arch/arm/percpu.c
index 25442c4..da93502 100644
--- a/xen/arch/arm/percpu.c
+++ b/xen/arch/arm/percpu.c
@@ -6,7 +6,7 @@
 
 unsigned long __per_cpu_offset[NR_CPUS];
 #define INVALID_PERCPU_AREA (-(long)__per_cpu_start)
-#define PERCPU_ORDER (get_order_from_bytes(__per_cpu_data_end-__per_cpu_start))
+#define PERCPU_ORDER (get_order_from_bytes(SYMBOL(__per_cpu_data_end) - SYMBOL(__per_cpu_start)))
 
 void __init percpu_init_areas(void)
 {
@@ -22,8 +22,8 @@ static int init_percpu_area(unsigned int cpu)
         return -EBUSY;
     if ( (p = alloc_xenheap_pages(PERCPU_ORDER, 0)) == NULL )
         return -ENOMEM;
-    memset(p, 0, __per_cpu_data_end - __per_cpu_start);
-    __per_cpu_offset[cpu] = p - __per_cpu_start;
+    memset(p, 0, SYMBOL(__per_cpu_data_end) - SYMBOL(__per_cpu_start));
+    __per_cpu_offset[cpu] = p - SYMBOL(__per_cpu_start);
     return 0;
 }
 
@@ -37,7 +37,7 @@ static void _free_percpu_area(struct rcu_head *head)
 {
     struct free_info *info = container_of(head, struct free_info, rcu);
     unsigned int cpu = info->cpu;
-    char *p = __per_cpu_start + __per_cpu_offset[cpu];
+    char *p = SYMBOL(__per_cpu_start) + __per_cpu_offset[cpu];
     free_xenheap_pages(p, PERCPU_ORDER);
     __per_cpu_offset[cpu] = INVALID_PERCPU_AREA;
 }
diff --git a/xen/arch/arm/platform.c b/xen/arch/arm/platform.c
index 8eb0b6e..4f78d84 100644
--- a/xen/arch/arm/platform.c
+++ b/xen/arch/arm/platform.c
@@ -51,14 +51,16 @@ void __init platform_init(void)
     ASSERT(platform == NULL);
 
     /* Looking for the platform description */
-    for ( platform = _splatform; platform != _eplatform; platform++ )
+    for ( platform = SYMBOL(_splatform);
+          platform != SYMBOL(_eplatform);
+          platform++ )
     {
         if ( platform_is_compatible(platform) )
             break;
     }
 
     /* We don't have specific operations for this platform */
-    if ( platform == _eplatform )
+    if ( platform == SYMBOL(_eplatform) )
     {
         /* TODO: dump DT machine compatible node */
         printk(XENLOG_INFO "Platform: Generic System\n");
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 444857a..169c2ac 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -772,8 +772,10 @@ void __init start_xen(unsigned long boot_phys_offset,
 
     /* Register Xen's load address as a boot module. */
     xen_bootmodule = add_boot_module(BOOTMOD_XEN,
-                             (paddr_t)(uintptr_t)(_start + boot_phys_offset),
-                             (paddr_t)(uintptr_t)(_end - _start + 1), false);
+                             (paddr_t)(uintptr_t)(SYMBOL(_start) +
+                                                  boot_phys_offset),
+                             (paddr_t)(uintptr_t)(SYMBOL(_end) -
+                                                  SYMBOL(_start) + 1), false);
     BUG_ON(!xen_bootmodule);
 
     setup_pagetables(boot_phys_offset);
diff --git a/xen/include/asm-arm/grant_table.h b/xen/include/asm-arm/grant_table.h
index 816e3c6..4638713 100644
--- a/xen/include/asm-arm/grant_table.h
+++ b/xen/include/asm-arm/grant_table.h
@@ -31,7 +31,8 @@ void gnttab_mark_dirty(struct domain *d, mfn_t mfn);
  * enough space for a large grant table
  */
 #define gnttab_dom0_frames()                                             \
-    min_t(unsigned int, opt_max_grant_frames, PFN_DOWN(_etext - _stext))
+    min_t(unsigned int, opt_max_grant_frames,                            \
+          PFN_DOWN(SYMBOL(_etext) - SYMBOL(_stext)))
 
 #define gnttab_init_arch(gt)                                             \
 ({                                                                       \
-- 
1.9.1



* [PATCH v6 3/4] xen/x86: use SYMBOL when required
  2019-01-09 23:41 [PATCH v6 0/4] misc safety certification fixes Stefano Stabellini
  2019-01-09 23:42 ` [PATCH v6 1/4] xen: introduce SYMBOL Stefano Stabellini
  2019-01-09 23:42 ` [PATCH v6 2/4] xen/arm: use SYMBOL when required Stefano Stabellini
@ 2019-01-09 23:42 ` Stefano Stabellini
  2019-01-10  8:43   ` Jan Beulich
  2019-01-09 23:42 ` [PATCH v6 4/4] xen/common: " Stefano Stabellini
  3 siblings, 1 reply; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-09 23:42 UTC (permalink / raw)
  To: xen-devel
  Cc: andrew.cooper3, julien.grall, sstabellini, JBeulich, Stefano Stabellini

Use SYMBOL in comparisons and subtractions involving:

_start, _end, __2M_rwdata_start, __2M_rwdata_end, _stext, _etext,
__end_vpci_array, __start_vpci_array, _stextentry, _etextentry,
__trampoline_rel_start, __trampoline_rel_stop, __trampoline_seg_start,
__trampoline_seg_stop, __per_cpu_start, __per_cpu_data_end

as required by the C standard [1].

M3CM: Rule-18.2: Subtraction between pointers shall only be applied to
pointers that address elements of the same array
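
Two of the patterns touched here, a loop bounded by a relational
comparison (as in smpboot.c) and an element count obtained by
subtraction (as in NUM_VPCI_INIT), can be sketched in isolation as
follows. All demo_ and DEMO_ names are hypothetical stand-ins built from
ordinary arrays; the macro is adapted from patch 1/4.

```c
#include <assert.h>
#include <stddef.h>

/* Adapted from patch 1/4 of this series. */
#define SYMBOL(ptr)                                 \
  ({ unsigned long __ptr;                           \
    __asm__ ("" : "=r"(__ptr) : "0"(ptr));          \
    (__typeof__(*(ptr)) *) (__ptr); })

#define DEMO_PAGE_SIZE 4096

/* Stand-in for a page-aligned text region three pages long. */
static char demo_text[3 * DEMO_PAGE_SIZE];
#define demo_stextentry demo_text
#define demo_etextentry (demo_text + sizeof(demo_text))

/* Walk the region one page at a time, bounded by a '<' comparison. */
static unsigned int demo_count_pages(void)
{
    const char *ptr;
    unsigned int n = 0;

    for ( ptr = SYMBOL(demo_stextentry);
          ptr < SYMBOL(demo_etextentry);
          ptr += DEMO_PAGE_SIZE )
        n++;

    return n;
}

/* Stand-in for an array of init hooks delimited by two symbols. */
typedef int demo_init_t(void);
static demo_init_t *const demo_init_array[2] = { NULL, NULL };
#define DEMO_NUM_INIT (SYMBOL(demo_init_array + 2) - SYMBOL(demo_init_array))

void demo_check_bounds(void)
{
    assert(demo_count_pages() == 3);
    assert(DEMO_NUM_INIT == 2);
}
```

Because SYMBOL expands to a statement expression, a macro such as
DEMO_NUM_INIT can only be used inside a function body.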

[1] https://wiki.sei.cmu.edu/confluence/display/c/ARR36-C.+Do+not+subtract+or+compare+two+pointers+that+do+not+refer+to+the+same+array

QAVerify: 2761
Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
CC: JBeulich@suse.com
CC: andrew.cooper3@citrix.com
---
Changes in v6:
- more accurate commit message
- remove unneeded extra newline
- only use SYMBOL on problematic symbols in alternatives.c
- use new SYMBOL macro that returns the native type

Changes in v5:
- remove two spurious changes
- split into three patches
- remove SYMBOL() from derived variables
---
 xen/arch/x86/alternative.c  | 3 ++-
 xen/arch/x86/efi/efi-boot.h | 8 ++++----
 xen/arch/x86/percpu.c       | 8 ++++----
 xen/arch/x86/setup.c        | 8 +++++---
 xen/arch/x86/smpboot.c      | 5 +++--
 xen/drivers/vpci/vpci.c     | 2 +-
 6 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
index b8c819a..92c54eb 100644
--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -273,7 +273,8 @@ static int __init nmi_apply_alternatives(const struct cpu_user_regs *regs,
         /* Disable WP to allow patching read-only pages. */
         write_cr0(cr0 & ~X86_CR0_WP);
 
-        apply_alternatives(__alt_instructions, __alt_instructions_end);
+        apply_alternatives(SYMBOL(__alt_instructions),
+                           SYMBOL(__alt_instructions_end));
 
         write_cr0(cr0);
 
diff --git a/xen/arch/x86/efi/efi-boot.h b/xen/arch/x86/efi/efi-boot.h
index 5789d2c..8dcd981 100644
--- a/xen/arch/x86/efi/efi-boot.h
+++ b/xen/arch/x86/efi/efi-boot.h
@@ -111,12 +111,12 @@ static void __init relocate_trampoline(unsigned long phys)
         return;
 
     /* Apply relocations to trampoline. */
-    for ( trampoline_ptr = __trampoline_rel_start;
-          trampoline_ptr < __trampoline_rel_stop;
+    for ( trampoline_ptr = SYMBOL(__trampoline_rel_start);
+          trampoline_ptr < SYMBOL(__trampoline_rel_stop);
           ++trampoline_ptr )
         *(u32 *)(*trampoline_ptr + (long)trampoline_ptr) += phys;
-    for ( trampoline_ptr = __trampoline_seg_start;
-          trampoline_ptr < __trampoline_seg_stop;
+    for ( trampoline_ptr = SYMBOL(__trampoline_seg_start);
+          trampoline_ptr < SYMBOL(__trampoline_seg_stop);
           ++trampoline_ptr )
         *(u16 *)(*trampoline_ptr + (long)trampoline_ptr) = phys >> 4;
 }
diff --git a/xen/arch/x86/percpu.c b/xen/arch/x86/percpu.c
index 8be4ebd..920bb78 100644
--- a/xen/arch/x86/percpu.c
+++ b/xen/arch/x86/percpu.c
@@ -13,7 +13,7 @@ unsigned long __per_cpu_offset[NR_CPUS];
  * context of PV guests.
  */
 #define INVALID_PERCPU_AREA (0x8000000000000000L - (long)__per_cpu_start)
-#define PERCPU_ORDER get_order_from_bytes(__per_cpu_data_end - __per_cpu_start)
+#define PERCPU_ORDER get_order_from_bytes(SYMBOL(__per_cpu_data_end) - SYMBOL(__per_cpu_start))
 
 void __init percpu_init_areas(void)
 {
@@ -33,8 +33,8 @@ static int init_percpu_area(unsigned int cpu)
     if ( (p = alloc_xenheap_pages(PERCPU_ORDER, 0)) == NULL )
         return -ENOMEM;
 
-    memset(p, 0, __per_cpu_data_end - __per_cpu_start);
-    __per_cpu_offset[cpu] = p - __per_cpu_start;
+    memset(p, 0, SYMBOL(__per_cpu_data_end) - SYMBOL(__per_cpu_start));
+    __per_cpu_offset[cpu] = p - SYMBOL(__per_cpu_start);
 
     return 0;
 }
@@ -49,7 +49,7 @@ static void _free_percpu_area(struct rcu_head *head)
 {
     struct free_info *info = container_of(head, struct free_info, rcu);
     unsigned int cpu = info->cpu;
-    char *p = __per_cpu_start + __per_cpu_offset[cpu];
+    char *p = SYMBOL(__per_cpu_start) + __per_cpu_offset[cpu];
 
     free_xenheap_pages(p, PERCPU_ORDER);
     __per_cpu_offset[cpu] = INVALID_PERCPU_AREA;
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 06eb483..5c35826 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -972,7 +972,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
          * respective reserve_e820_ram() invocation below.
          */
         mod[mbi->mods_count].mod_start = virt_to_mfn(_stext);
-        mod[mbi->mods_count].mod_end = __2M_rwdata_end - _stext;
+        mod[mbi->mods_count].mod_end = SYMBOL(__2M_rwdata_end) - SYMBOL(_stext);
     }
 
     modules_headroom = bzimage_headroom(bootstrap_map(mod), mod->mod_end);
@@ -1067,7 +1067,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
              * data until after we have switched to the relocated pagetables!
              */
             barrier();
-            move_memory(e + XEN_IMG_OFFSET, XEN_IMG_OFFSET, _end - _start, 1);
+            move_memory(e + XEN_IMG_OFFSET, XEN_IMG_OFFSET,
+                        SYMBOL(_end) - SYMBOL(_start), 1);
 
             /* Walk initial pagetables, relocating page directory entries. */
             pl4e = __va(__pa(idle_pg_table));
@@ -1382,7 +1383,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     }
 #endif
 
-    xen_virt_end = ((unsigned long)_end + (1UL << L2_PAGETABLE_SHIFT) - 1) &
+    xen_virt_end = ((unsigned long)SYMBOL(_end) +
+                    (1UL << L2_PAGETABLE_SHIFT) - 1) &
                    ~((1UL << L2_PAGETABLE_SHIFT) - 1);
     destroy_xen_mappings(xen_virt_end, XEN_VIRT_START + BOOTSTRAP_MAP_BASE);
 
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 7d1226d..9b008b9 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -810,8 +810,9 @@ static int setup_cpu_root_pgt(unsigned int cpu)
     {
         const char *ptr;
 
-        for ( rc = 0, ptr = _stextentry;
-              !rc && ptr < _etextentry; ptr += PAGE_SIZE )
+        for ( rc = 0, ptr = SYMBOL(_stextentry);
+              !rc && ptr < SYMBOL(_etextentry);
+              ptr += PAGE_SIZE )
             rc = clone_mapping(ptr, rpt);
 
         if ( rc )
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index 82607bd..bc0cca6 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -33,7 +33,7 @@ struct vpci_register {
 #ifdef __XEN__
 extern vpci_register_init_t *const __start_vpci_array[];
 extern vpci_register_init_t *const __end_vpci_array[];
-#define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array)
+#define NUM_VPCI_INIT (SYMBOL(__end_vpci_array) - SYMBOL(__start_vpci_array))
 
 void vpci_remove_device(struct pci_dev *pdev)
 {
-- 
1.9.1



* [PATCH v6 4/4] xen/common: use SYMBOL when required
  2019-01-09 23:41 [PATCH v6 0/4] misc safety certification fixes Stefano Stabellini
                   ` (2 preceding siblings ...)
  2019-01-09 23:42 ` [PATCH v6 3/4] xen/x86: " Stefano Stabellini
@ 2019-01-09 23:42 ` Stefano Stabellini
  2019-01-10  8:49   ` Jan Beulich
  3 siblings, 1 reply; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-09 23:42 UTC (permalink / raw)
  To: xen-devel
  Cc: andrew.cooper3, julien.grall, sstabellini, JBeulich, Stefano Stabellini

Use SYMBOL in comparisons and subtractions involving:

_start, _end, _stext, _etext, _srodata, _erodata, _sinittext,
_einittext, __note_gnu_build_id_start, __note_gnu_build_id_end,
__lock_profile_start, __lock_profile_end, __initcall_start,
__initcall_end, __presmp_initcall_end, __ctors_start, __ctors_end,
__end_schedulers_array, __start_schedulers_array, __start_bug_frames,
__stop_bug_frames_0, __stop_bug_frames_1, __stop_bug_frames_2,
__stop_bug_frames_3

as required by the C standard [1].

M3CM: Rule-18.2: Subtraction between pointers shall only be applied to
pointers that address elements of the same array

Since we are changing the body of is_kernel_text and friends, take the
opportunity to remove the leading underscores from the local variable
names, which violate namespace rules.
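
The shape of the reworked range checks is sketched below as a
standalone example. demo_region, demo_start, demo_end and
demo_is_in_region are hypothetical stand-ins for the linker symbols and
the is_kernel* macros; the SYMBOL definition is adapted from patch 1/4.
The local is named p__ because identifiers that begin with two
underscores are reserved for the implementation.

```c
#include <assert.h>

/* Adapted from patch 1/4 of this series. */
#define SYMBOL(ptr)                                 \
  ({ unsigned long __ptr;                           \
    __asm__ ("" : "=r"(__ptr) : "0"(ptr));          \
    (__typeof__(*(ptr)) *) (__ptr); })

/* Stand-in for a region delimited by two linker symbols. */
static char demo_region[64];
#define demo_start demo_region
#define demo_end   (demo_region + sizeof(demo_region))

/* Half-open range check: start is inside the region, end is not. */
#define demo_is_in_region(p) ({                                     \
    const char *p__ = (const char *)(unsigned long)(p);             \
    (p__ >= SYMBOL(demo_start)) && (p__ < SYMBOL(demo_end));        \
})

void demo_check_region(void)
{
    assert(demo_is_in_region(&demo_region[0]));
    assert(demo_is_in_region(&demo_region[63]));
    assert(!demo_is_in_region(demo_region + sizeof(demo_region)));
}
```

The argument is evaluated once and stored in p__, so an expression with
side effects passed to the macro is not evaluated twice.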

[1] https://wiki.sei.cmu.edu/confluence/display/c/ARR36-C.+Do+not+subtract+or+compare+two+pointers+that+do+not+refer+to+the+same+array

QAVerify: 2761
Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
CC: JBeulich@suse.com
CC: andrew.cooper3@citrix.com
---
Changes in v6:
- more accurate commit message
- remove hard tabs
- remove leading underscores
- code style
- use SYMBOL only on the problematic bug_frames symbols
- use new SYMBOL macro that returns the native type

Changes in v5:
- remove two spurious changes
- split into three patches
- remove SYMBOL() from derived variables
---
 xen/common/kernel.c         |  8 ++++++--
 xen/common/lib.c            |  3 ++-
 xen/common/schedule.c       |  6 ++++--
 xen/common/spinlock.c       |  4 +++-
 xen/common/version.c        |  6 +++---
 xen/common/virtual_region.c | 12 ++++++------
 xen/include/xen/kernel.h    | 24 ++++++++++++------------
 7 files changed, 36 insertions(+), 27 deletions(-)

diff --git a/xen/common/kernel.c b/xen/common/kernel.c
index 5766a0f..ed913be 100644
--- a/xen/common/kernel.c
+++ b/xen/common/kernel.c
@@ -312,14 +312,18 @@ extern const initcall_t __initcall_start[], __presmp_initcall_end[],
 void __init do_presmp_initcalls(void)
 {
     const initcall_t *call;
-    for ( call = __initcall_start; call < __presmp_initcall_end; call++ )
+    for ( call = SYMBOL(__initcall_start);
+          call < SYMBOL(__presmp_initcall_end);
+          call++ )
         (*call)();
 }
 
 void __init do_initcalls(void)
 {
     const initcall_t *call;
-    for ( call = __presmp_initcall_end; call < __initcall_end; call++ )
+    for ( call = SYMBOL(__presmp_initcall_end);
+          call < SYMBOL(__initcall_end);
+          call++ )
         (*call)();
 }
 
diff --git a/xen/common/lib.c b/xen/common/lib.c
index 8ebec81..4e43ee5 100644
--- a/xen/common/lib.c
+++ b/xen/common/lib.c
@@ -497,7 +497,8 @@ extern const ctor_func_t __ctors_start[], __ctors_end[];
 void __init init_constructors(void)
 {
     const ctor_func_t *f;
-    for ( f = __ctors_start; f < __ctors_end; ++f )
+
+    for ( f = SYMBOL(__ctors_start); f < SYMBOL(__ctors_end); ++f )
         (*f)();
 
     /* Putting this here seems as good (or bad) as any other place. */
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index a957c5e..a81de40 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -67,8 +67,10 @@ DEFINE_PER_CPU(struct scheduler *, scheduler);
 /* Scratch space for cpumasks. */
 DEFINE_PER_CPU(cpumask_t, cpumask_scratch);
 
-extern const struct scheduler *__start_schedulers_array[], *__end_schedulers_array[];
-#define NUM_SCHEDULERS (__end_schedulers_array - __start_schedulers_array)
+extern const struct scheduler *__start_schedulers_array[],
+                              *__end_schedulers_array[];
+#define NUM_SCHEDULERS (SYMBOL(__end_schedulers_array) - \
+                        SYMBOL(__start_schedulers_array))
 #define schedulers __start_schedulers_array
 
 static struct scheduler __read_mostly ops;
diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
index 6bc52d7..ed1b2b0 100644
--- a/xen/common/spinlock.c
+++ b/xen/common/spinlock.c
@@ -474,7 +474,9 @@ static int __init lock_prof_init(void)
 {
     struct lock_profile **q;
 
-    for ( q = &__lock_profile_start; q < &__lock_profile_end; q++ )
+    for ( q = SYMBOL(&__lock_profile_start);
+          q < SYMBOL(&__lock_profile_end);
+          q++ )
     {
         (*q)->next = lock_profile_glb_q.elem_q;
         lock_profile_glb_q.elem_q = *q;
diff --git a/xen/common/version.c b/xen/common/version.c
index 223cb52..7414d2d 100644
--- a/xen/common/version.c
+++ b/xen/common/version.c
@@ -147,14 +147,14 @@ static int __init xen_build_init(void)
     int rc;
 
     /* --build-id invoked with wrong parameters. */
-    if ( __note_gnu_build_id_end <= &n[0] )
+    if ( SYMBOL(__note_gnu_build_id_end) <= &n[0] )
         return -ENODATA;
 
     /* Check for full Note header. */
-    if ( &n[1] >= __note_gnu_build_id_end )
+    if ( &n[1] >= SYMBOL(__note_gnu_build_id_end) )
         return -ENODATA;
 
-    sz = (void *)__note_gnu_build_id_end - (void *)n;
+    sz = (void *)SYMBOL(__note_gnu_build_id_end) - (void *)n;
 
     rc = xen_build_id_check(n, sz, &build_id_p, &build_id_len);
 
diff --git a/xen/common/virtual_region.c b/xen/common/virtual_region.c
index aa23918..d89c8f4e 100644
--- a/xen/common/virtual_region.c
+++ b/xen/common/virtual_region.c
@@ -103,13 +103,13 @@ void __init setup_virtual_regions(const struct exception_table_entry *start,
 {
     size_t sz;
     unsigned int i;
-    static const struct bug_frame *const __initconstrel bug_frames[] = {
-        __start_bug_frames,
-        __stop_bug_frames_0,
-        __stop_bug_frames_1,
-        __stop_bug_frames_2,
+    const struct bug_frame *bug_frames[] = {
+        SYMBOL(__start_bug_frames),
+        SYMBOL(__stop_bug_frames_0),
+        SYMBOL(__stop_bug_frames_1),
+        SYMBOL(__stop_bug_frames_2),
 #ifdef CONFIG_X86
-        __stop_bug_frames_3,
+        SYMBOL(__stop_bug_frames_3),
 #endif
         NULL
     };
diff --git a/xen/include/xen/kernel.h b/xen/include/xen/kernel.h
index 548b64d..8d204ac 100644
--- a/xen/include/xen/kernel.h
+++ b/xen/include/xen/kernel.h
@@ -66,27 +66,27 @@
 })
 
 extern char _start[], _end[], start[];
-#define is_kernel(p) ({                         \
-    char *__p = (char *)(unsigned long)(p);     \
-    (__p >= _start) && (__p < _end);            \
+#define is_kernel(p) ({                                             \
+    char *p__ = (char *)(unsigned long)(p);                         \
+    (p__ >= SYMBOL(_start)) && (p__ < SYMBOL(_end));                \
 })
 
 extern char _stext[], _etext[];
-#define is_kernel_text(p) ({                    \
-    char *__p = (char *)(unsigned long)(p);     \
-    (__p >= _stext) && (__p < _etext);          \
+#define is_kernel_text(p) ({                                        \
+    char *p__ = (char *)(unsigned long)(p);                         \
+    (p__ >= SYMBOL(_stext)) && (p__ < SYMBOL(_etext));              \
 })
 
 extern const char _srodata[], _erodata[];
-#define is_kernel_rodata(p) ({                  \
-    const char *__p = (const char *)(unsigned long)(p);     \
-    (__p >= _srodata) && (__p < _erodata);      \
+#define is_kernel_rodata(p) ({                                      \
+    const char *p__ = (const char *)(unsigned long)(p);             \
+    (p__ >= SYMBOL(_srodata)) && (p__ < SYMBOL(_erodata));          \
 })
 
 extern char _sinittext[], _einittext[];
-#define is_kernel_inittext(p) ({                \
-    char *__p = (char *)(unsigned long)(p);     \
-    (__p >= _sinittext) && (__p < _einittext);  \
+#define is_kernel_inittext(p) ({                                    \
+    char *p__ = (char *)(unsigned long)(p);                         \
+    (p__ >= SYMBOL(_sinittext)) && (p__ < SYMBOL(_einittext));      \
 })
 
 extern enum system_state {
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-09 23:42 ` [PATCH v6 1/4] xen: introduce SYMBOL Stefano Stabellini
@ 2019-01-10  2:40   ` Julien Grall
  2019-01-10  8:24     ` Jan Beulich
  2019-01-10 17:22     ` Stefano Stabellini
  2019-01-10  8:34   ` Jan Beulich
  1 sibling, 2 replies; 102+ messages in thread
From: Julien Grall @ 2019-01-10  2:40 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Stefano Stabellini, wei.liu2, andrew.cooper3, julien.grall,
	JBeulich, xen-devel



Hi,

Sorry for the formatting.

On Wed, 9 Jan 2019, 18:43 Stefano Stabellini, <sstabellini@kernel.org>
wrote:

> Introduce a macro, SYMBOL, which is similar to RELOC_HIDE, but it is
> meant to be used everywhere symbols such as _stext and _etext are used
> in the code. It can take an array type as a parameter, and it returns
> the same type.
>
> SYMBOL is needed when accessing symbols such as _stext and _etext
> because the C standard forbids both comparison and subtraction
> (see C Standard, 6.5.6 [ISO/IEC 9899:2011] and [1]) between pointers
> pointing to different objects. _stext, _etext, etc. are all pointers to
> different objects from the ANSI C point of view.
>

This does not make sense because you still return a pointer and therefore
the undefined behavior is still present.

I really don't believe this patch is going to make the MISRA tool happy.
Furthermore, IIRC, Linux returns unsigned long. So I would like to
understand why the trick is not needed for us...

At this stage, we should probably involve the MISRA folks (PRQA) to get a
better understanding of what is expected.

Cheers,


> To work around potential C compiler issues (which have actually
> been found, see the comment on top of RELOC_HIDE in Linux), and to help
> with certifications, let's introduce some syntactic sugar to be used in
> following patches.


> [1]
> https://wiki.sei.cmu.edu/confluence/display/c/ARR36-C.+Do+not+subtract+or+compare+two+pointers+that+do+not+refer+to+the+same+array
>
> Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
> CC: JBeulich@suse.com
> CC: andrew.cooper3@citrix.com
> CC: wei.liu2@citrix.com
> ---
> Changes in v6:
> - drop acks
> - don't use RELOC_HIDE for the implementation
> - return native type from SYMBOL
>
> Changes in v4:
> - add acked-bys
> - remove unneeded parenthesis
>
> Changes in v3:
> - improve commit message
> - rename __symbol to SYMBOL to avoid name space violations
>
> Changes in v2:
> - do not cast return to char*
> - move to common header
> ---
>  xen/include/xen/compiler.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/xen/include/xen/compiler.h b/xen/include/xen/compiler.h
> index ff6c0f5..d4c856c 100644
> --- a/xen/include/xen/compiler.h
> +++ b/xen/include/xen/compiler.h
> @@ -99,6 +99,16 @@
>      __asm__ ("" : "=r"(__ptr) : "0"(ptr));      \
>      (typeof(ptr)) (__ptr + (off)); })
>
> +/*
> + * Similar to RELOC_HIDE, but written to be used with symbols such as
> + * _stext and _etext to avoid undefined behavior comparing pointers to
> + * different objects. It can handle array types.
> + */
> +#define SYMBOL(ptr)                               \
> +  ({ unsigned long __ptr;                       \
> +    __asm__ ("" : "=r"(__ptr) : "0"(ptr));      \
> +    (typeof(*(ptr)) *) (__ptr); })
> +
>  #ifdef __GCC_ASM_FLAG_OUTPUTS__
>  # define ASM_FLAG_OUT(yes, no) yes
>  #else
> --
> 1.9.1
>
>

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-10  2:40   ` Julien Grall
@ 2019-01-10  8:24     ` Jan Beulich
  2019-01-10 17:29       ` Stefano Stabellini
  2019-01-10 17:22     ` Stefano Stabellini
  1 sibling, 1 reply; 102+ messages in thread
From: Jan Beulich @ 2019-01-10  8:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: Andrew Cooper, Julien Grall, Wei Liu, Stefano Stabellini, xen-devel

>>> On 10.01.19 at 03:40, <julien.grall@gmail.com> wrote:
> On Wed, 9 Jan 2019, 18:43 Stefano Stabellini, <sstabellini@kernel.org>
> wrote:
> 
>> Introduce a macro, SYMBOL, which is similar to RELOC_HIDE, but it is
>> meant to be used everywhere symbols such as _stext and _etext are used
>> in the code. It can take an array type as a parameter, and it returns
>> the same type.
>>
>> SYMBOL is needed when accessing symbols such as _stext and _etext
>> because the C standard forbids both comparison and subtraction
>> (see C Standard, 6.5.6 [ISO/IEC 9899:2011] and [1]) between pointers
>> pointing to different objects. _stext, _etext, etc. are all pointers to
>> different objects from the ANSI C point of view.
>>
> 
> This does not make sense because you still return a pointer and therefore
> the undefined behavior is still present.
> 
> I really don't believe this patch is going to make the MISRA tool happy.

Well, till now I've been assuming that no version of this series was
posted without being certain the changes achieve the goal (of
making that tool happy).

Jan




^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-09 23:42 ` [PATCH v6 1/4] xen: introduce SYMBOL Stefano Stabellini
  2019-01-10  2:40   ` Julien Grall
@ 2019-01-10  8:34   ` Jan Beulich
  2019-01-10 18:09     ` Stefano Stabellini
  1 sibling, 1 reply; 102+ messages in thread
From: Jan Beulich @ 2019-01-10  8:34 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Andrew Cooper, Julien Grall, Wei Liu, Stefano Stabellini, xen-devel

>>> On 10.01.19 at 00:42, <sstabellini@kernel.org> wrote:
> --- a/xen/include/xen/compiler.h
> +++ b/xen/include/xen/compiler.h
> @@ -99,6 +99,16 @@
>      __asm__ ("" : "=r"(__ptr) : "0"(ptr));      \
>      (typeof(ptr)) (__ptr + (off)); })
>  
> +/*
> + * Similar to RELOC_HIDE, but written to be used with symbols such as
> + * _stext and _etext to avoid undefined behavior comparing pointers to
> + * different objects. It can handle array types.
> + */
> +#define SYMBOL(ptr)                               \
> +  ({ unsigned long __ptr;                       \
> +    __asm__ ("" : "=r"(__ptr) : "0"(ptr));      \
> +    (typeof(*(ptr)) *) (__ptr); })

I'm sorry for thinking of this only now, but SYMBOL() is (no longer)
appropriate as a name here. I'd like to suggest SYMBOL_HIDE(), in
line with the other macro's name. With that and with
- the stray blank after the cast dropped
- the asm() formatting corrected; there are a number of blanks
  missing
- the name of the local variable corrected as per my original
  suggestion
- indentation corrected
- the use of underscores on "asm" and "typeof" brought in line
you may (re-)add
Acked-by: Jan Beulich <jbeulich@suse.com>

Furthermore I wonder why the cast is needed in the first place.
Doesn't

#define SYMBOL_HIDE(ptr) ({                   \
    __typeof__(*(ptr)) *ptr_;                 \
    __asm__ ( "" : "=r" (ptr_) : "0" (ptr) ); \
    ptr_; \
})

do the job as well?

Jan
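
For reference, a minimal stand-alone sketch of how the SYMBOL_HIDE() variant suggested above could be exercised outside Xen. It assumes GCC or Clang (statement expressions, __typeof__, extended asm). The image[] buffer and the helpers in_image(), image_size() and check_image_helpers() are hypothetical stand-ins for linker-provided symbols such as _stext/_etext and for macros like is_kernel_text(); they are not part of the patch. The empty asm only hides the origin of the pointer from the optimizer; whether that is enough for a MISRA checker is exactly the open question in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Macro as proposed in the review: no cast, result keeps the pointer type. */
#define SYMBOL_HIDE(ptr) ({                   \
    __typeof__(*(ptr)) *ptr_;                 \
    __asm__ ( "" : "=r" (ptr_) : "0" (ptr) ); \
    ptr_;                                     \
})

/* Hypothetical stand-in for a region delimited by linker symbols. */
static char image[64];

/* Range check in the style of is_kernel_text(), with both bounds hidden. */
static bool in_image(const void *p)
{
    const char *p_ = p;

    return p_ >= SYMBOL_HIDE(image) &&
           p_ < SYMBOL_HIDE(image + sizeof(image));
}

/* Size computation in the style of (__init_end - __init_begin). */
static long image_size(void)
{
    return SYMBOL_HIDE(image + sizeof(image)) - SYMBOL_HIDE(image);
}

/* Small self-check showing the intended use of the two helpers. */
static void check_image_helpers(void)
{
    assert(in_image(&image[0]));
    assert(!in_image(image + sizeof(image)));
    assert(image_size() == (long)sizeof(image));
}
```

Note that the array argument decays to a pointer when used as a register input operand, which is why the macro can be applied to symbols declared as `extern char _stext[]`.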




^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 2/4] xen/arm: use SYMBOL when required
  2019-01-09 23:42 ` [PATCH v6 2/4] xen/arm: use SYMBOL when required Stefano Stabellini
@ 2019-01-10  8:41   ` Jan Beulich
  2019-01-10 17:44     ` Stefano Stabellini
  0 siblings, 1 reply; 102+ messages in thread
From: Jan Beulich @ 2019-01-10  8:41 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, xen-devel

>>> On 10.01.19 at 00:42, <sstabellini@kernel.org> wrote:
> @@ -1138,9 +1138,10 @@ void free_init_memory(void)
>      for ( i = 0; i < nr; i++ )
>          *(p + i) = insn;
>  
> -    set_pte_flags_on_range(__init_begin, len, mg_clear);
> +    set_pte_flags_on_range(SYMBOL(__init_begin), len, mg_clear);
>      init_domheap_pages(pa, pa + len);
> -    printk("Freed %ldkB init memory.\n", (long)(__init_end-__init_begin)>>10);
> +    printk("Freed %ldkB init memory.\n",
> +           (long)(SYMBOL(__init_end)-SYMBOL(__init_begin))>>10);

I've noticed this only here, but I can't exclude I've overlooked other
instances: I think it would be really nice if you corrected formatting
at the same time (here: add the missing blanks).

> --- a/xen/arch/arm/percpu.c
> +++ b/xen/arch/arm/percpu.c
> @@ -6,7 +6,7 @@
>  
>  unsigned long __per_cpu_offset[NR_CPUS];
>  #define INVALID_PERCPU_AREA (-(long)__per_cpu_start)
> -#define PERCPU_ORDER (get_order_from_bytes(__per_cpu_data_end-__per_cpu_start))
> +#define PERCPU_ORDER (get_order_from_bytes(SYMBOL(__per_cpu_data_end) - SYMBOL(__per_cpu_start)))

Long line.

> @@ -37,7 +37,7 @@ static void _free_percpu_area(struct rcu_head *head)
>  {
>      struct free_info *info = container_of(head, struct free_info, rcu);
>      unsigned int cpu = info->cpu;
> -    char *p = __per_cpu_start + __per_cpu_offset[cpu];
> +    char *p = SYMBOL(__per_cpu_start) + __per_cpu_offset[cpu];
>      free_xenheap_pages(p, PERCPU_ORDER);
>      __per_cpu_offset[cpu] = INVALID_PERCPU_AREA;
>  }

As per above, adding the missing blank line would be quite nice on
this occasion.

> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -772,8 +772,10 @@ void __init start_xen(unsigned long boot_phys_offset,
>  
>      /* Register Xen's load address as a boot module. */
>      xen_bootmodule = add_boot_module(BOOTMOD_XEN,
> -                             (paddr_t)(uintptr_t)(_start + boot_phys_offset),
> -                             (paddr_t)(uintptr_t)(_end - _start + 1), false);
> +                             (paddr_t)(uintptr_t)(SYMBOL(_start) +
> +                                                  boot_phys_offset),
> +                             (paddr_t)(uintptr_t)(SYMBOL(_end) -
> +                                                  SYMBOL(_start) + 1), false);

Why do you need the double casts, i.e. why does (uintptr_t) alone not
suffice?

Jan




^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 3/4] xen/x86: use SYMBOL when required
  2019-01-09 23:42 ` [PATCH v6 3/4] xen/x86: " Stefano Stabellini
@ 2019-01-10  8:43   ` Jan Beulich
  2019-01-10 17:45     ` Stefano Stabellini
  0 siblings, 1 reply; 102+ messages in thread
From: Jan Beulich @ 2019-01-10  8:43 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, xen-devel

>>> On 10.01.19 at 00:42, <sstabellini@kernel.org> wrote:
> --- a/xen/arch/x86/percpu.c
> +++ b/xen/arch/x86/percpu.c
> @@ -13,7 +13,7 @@ unsigned long __per_cpu_offset[NR_CPUS];
>   * context of PV guests.
>   */
>  #define INVALID_PERCPU_AREA (0x8000000000000000L - (long)__per_cpu_start)
> -#define PERCPU_ORDER get_order_from_bytes(__per_cpu_data_end - __per_cpu_start)
> +#define PERCPU_ORDER get_order_from_bytes(SYMBOL(__per_cpu_data_end) - SYMBOL(__per_cpu_start))

Long line.

With this taken care of
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan




^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 4/4] xen/common: use SYMBOL when required
  2019-01-09 23:42 ` [PATCH v6 4/4] xen/common: " Stefano Stabellini
@ 2019-01-10  8:49   ` Jan Beulich
  2019-01-10 17:48     ` Stefano Stabellini
  0 siblings, 1 reply; 102+ messages in thread
From: Jan Beulich @ 2019-01-10  8:49 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, xen-devel

>>> On 10.01.19 at 00:42, <sstabellini@kernel.org> wrote:
> --- a/xen/common/version.c
> +++ b/xen/common/version.c
> @@ -147,14 +147,14 @@ static int __init xen_build_init(void)
>      int rc;
>  
>      /* --build-id invoked with wrong parameters. */
> -    if ( __note_gnu_build_id_end <= &n[0] )
> +    if ( SYMBOL(__note_gnu_build_id_end) <= &n[0] )
>          return -ENODATA;
>  
>      /* Check for full Note header. */
> -    if ( &n[1] >= __note_gnu_build_id_end )
> +    if ( &n[1] >= SYMBOL(__note_gnu_build_id_end) )
>          return -ENODATA;
>  
> -    sz = (void *)__note_gnu_build_id_end - (void *)n;
> +    sz = (void *)SYMBOL(__note_gnu_build_id_end) - (void *)n;

Now this is an instance where I wouldn't mind if you switched the
casts to (unsigned long).

> --- a/xen/common/virtual_region.c
> +++ b/xen/common/virtual_region.c
> @@ -103,13 +103,13 @@ void __init setup_virtual_regions(const struct exception_table_entry *start,
>  {
>      size_t sz;
>      unsigned int i;
> -    static const struct bug_frame *const __initconstrel bug_frames[] = {
> -        __start_bug_frames,
> -        __stop_bug_frames_0,
> -        __stop_bug_frames_1,
> -        __stop_bug_frames_2,
> +    const struct bug_frame *bug_frames[] = {

Please don't lose the second const.

> --- a/xen/include/xen/kernel.h
> +++ b/xen/include/xen/kernel.h
> @@ -66,27 +66,27 @@
>  })
>  
>  extern char _start[], _end[], start[];
> -#define is_kernel(p) ({                         \
> -    char *__p = (char *)(unsigned long)(p);     \
> -    (__p >= _start) && (__p < _end);            \
> +#define is_kernel(p) ({                                             \
> +    char *p__ = (char *)(unsigned long)(p);                         \
> +    (p__ >= SYMBOL(_start)) && (p__ < SYMBOL(_end));                \
>  })
>  
>  extern char _stext[], _etext[];
> -#define is_kernel_text(p) ({                    \
> -    char *__p = (char *)(unsigned long)(p);     \
> -    (__p >= _stext) && (__p < _etext);          \
> +#define is_kernel_text(p) ({                                        \
> +    char *p__ = (char *)(unsigned long)(p);                         \
> +    (p__ >= SYMBOL(_stext)) && (p__ < SYMBOL(_etext));              \
>  })
>  
>  extern const char _srodata[], _erodata[];
> -#define is_kernel_rodata(p) ({                  \
> -    const char *__p = (const char *)(unsigned long)(p);     \
> -    (__p >= _srodata) && (__p < _erodata);      \
> +#define is_kernel_rodata(p) ({                                      \
> +    const char *p__ = (const char *)(unsigned long)(p);             \

Just like here, in all of the other sibling macros you could easily
have switched p__ to be const char * as well.

With at least the bug_frames[] remark taken care of
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan
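
A minimal sketch of the cast change requested above for the `sz` computation. Arithmetic on `void *` is a GNU extension, so converting both operands to unsigned long turns the expression into plain integer subtraction. The helper name span_bytes() and the sample buffer are hypothetical, and the sketch assumes unsigned long is as wide as a pointer, as it is on Xen's targets.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: byte distance between two addresses, computed on
 * integers rather than with GNU-style void * arithmetic.
 */
static size_t span_bytes(const void *start, const void *end)
{
    return (unsigned long)end - (unsigned long)start;
}

/* Stand-in for a note section; not a real build-id note. */
static char sample_note[32];
```

In version.c the equivalent expression would take the hidden end symbol and `n` as its two operands.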




^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-10  2:40   ` Julien Grall
  2019-01-10  8:24     ` Jan Beulich
@ 2019-01-10 17:22     ` Stefano Stabellini
  1 sibling, 0 replies; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-10 17:22 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Stefano Stabellini, wei.liu2, andrew.cooper3,
	julien.grall, JBeulich, xen-devel


On Wed, 9 Jan 2019, Julien Grall wrote:
> Hi,
> Sorry for the formatting.
> 
> On Wed, 9 Jan 2019, 18:43 Stefano Stabellini, <sstabellini@kernel.org> wrote:
>       Introduce a macro, SYMBOL, which is similar to RELOC_HIDE, but it is
>       meant to be used everywhere symbols such as _stext and _etext are used
>       in the code. It can take an array type as a parameter, and it returns
>       the same type.
> 
>       SYMBOL is needed when accessing symbols such as _stext and _etext
>       because the C standard forbids both comparison and subtraction
>       (see C Standard, 6.5.6 [ISO/IEC 9899:2011] and [1]) between pointers
>       pointing to different objects. _stext, _etext, etc. are all pointers to
>       different objects from the ANSI C point of view.
> 
> 
> This does not make sense because you still return a pointer and therefore the undefined behavior is still present.
> 
> I really don't believe this patch is going to make the MISRA tool happy. Furthermore, IIRC, Linux returns unsigned long. So I
> would like to understand why the trick is not needed for us...
> 
> At this stage, we should probably involve the MISRA folks (PRQA) to get a better understanding of what is expected.

Julien, thanks for chiming in. Yes, I completely agree with you.

My thinking for the current changes is that they are better than nothing
as they clearly mark all problematic sites with "SYMBOL", even if the
implementation of SYMBOL might not be good enough. Then we double-check
with PRQA and others; once we get their feedback, we can still change the
return type of SYMBOL if we need to, and it will be easy to do at that
point (it only took me 1h yesterday to make the opposite change).



>       To work around potential C compiler issues (which have actually
>       been found, see the comment on top of RELOC_HIDE in Linux), and to help
>       with certifications, let's introduce some syntactic sugar to be used in
>       following patches.
> 
> 
>       [1] https://wiki.sei.cmu.edu/confluence/display/c/ARR36-C.+Do+not+subtract+or+compare+two+pointers+that+do+not+refer+to+the+same+array
> 
>       Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
>       CC: JBeulich@suse.com
>       CC: andrew.cooper3@citrix.com
>       CC: wei.liu2@citrix.com
>       ---
>       Changes in v6:
>       - drop acks
>       - don't use RELOC_HIDE for the implementation
>       - return native type from SYMBOL
> 
>       Changes in v4:
>       - add acked-bys
>       - remove unneeded parenthesis
> 
>       Changes in v3:
>       - improve commit message
>       - rename __symbol to SYMBOL to avoid name space violations
> 
>       Changes in v2:
>       - do not cast return to char*
>       - move to common header
>       ---
>        xen/include/xen/compiler.h | 10 ++++++++++
>        1 file changed, 10 insertions(+)
> 
>       diff --git a/xen/include/xen/compiler.h b/xen/include/xen/compiler.h
>       index ff6c0f5..d4c856c 100644
>       --- a/xen/include/xen/compiler.h
>       +++ b/xen/include/xen/compiler.h
>       @@ -99,6 +99,16 @@
>            __asm__ ("" : "=r"(__ptr) : "0"(ptr));      \
>            (typeof(ptr)) (__ptr + (off)); })
> 
>       +/*
>       + * Similar to RELOC_HIDE, but written to be used with symbols such as
>       + * _stext and _etext to avoid undefined behavior comparing pointers to
>       + * different objects. It can handle array types.
>       + */
>       +#define SYMBOL(ptr)                               \
>       +  ({ unsigned long __ptr;                       \
>       +    __asm__ ("" : "=r"(__ptr) : "0"(ptr));      \
>       +    (typeof(*(ptr)) *) (__ptr); })
>       +
>        #ifdef __GCC_ASM_FLAG_OUTPUTS__
>        # define ASM_FLAG_OUT(yes, no) yes
>        #else
>       --
>       1.9.1
> 
> 

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-10  8:24     ` Jan Beulich
@ 2019-01-10 17:29       ` Stefano Stabellini
  2019-01-10 18:46         ` Stewart Hildebrand
  2019-01-10 19:24         ` Julien Grall
  0 siblings, 2 replies; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-10 17:29 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Julien Grall, xen-devel

On Thu, 10 Jan 2019, Jan Beulich wrote:
> >>> On 10.01.19 at 03:40, <julien.grall@gmail.com> wrote:
> > On Wed, 9 Jan 2019, 18:43 Stefano Stabellini, <sstabellini@kernel.org>
> > wrote:
> > 
> >> Introduce a macro, SYMBOL, which is similar to RELOC_HIDE, but it is
> >> meant to be used everywhere symbols such as _stext and _etext are used
> >> in the code. It can take an array type as a parameter, and it returns
> >> the same type.
> >>
> >> SYMBOL is needed when accessing symbols such as _stext and _etext
> >> because the C standard forbids both comparison and subtraction
> >> (see C Standard, 6.5.6 [ISO/IEC 9899:2011] and [1]) between pointers
> >> pointing to different objects. _stext, _etext, etc. are all pointers to
> >> different objects from the ANSI C point of view.
> >>
> > 
> > This does not make sense because you still return a pointer and therefore
> > the undefined behavior is still present.
> > 
> > I really don't believe this patch is going to make the MISRA tool happy.
> 
> Well, till now I've been assuming that no version of this series was
> posted without being certain the changes achieve the goal (of
> making that tool happy).

No, Jan: unfortunately we cannot re-run the scanning tool against any
version of Xen we wish to :-(

We cannot know in advance if a set of changes will make the tool happy
or not.

If I knew that SYMBOL returning the native pointer type as in v6 is
sufficient to make the tool happy, I wouldn't be here arguing. We cannot
know for sure until we commit the changes and then ask PRQA to re-scan
against a more recent version of Xen. It is a heavy process, and for
this reason I preferred the safer of the two approaches.

Anyway, I would rather get something in, even if insufficient, than
nothing. So I'll address all your comments based on returning the
pointer type, and submit a new version. The bothersome changes are the
ones to the call sites, and I would like to get those in no matter the
implementation of SYMBOL.
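
To make the two approaches concrete, here is a minimal sketch of the alternative discussed above, in which the macro yields an unsigned long so that comparisons and subtraction are integer operations rather than pointer operations. This is not the v5 code: the name SYMBOL_ADDR(), the blob[] buffer and the helpers are hypothetical, and the sketch assumes GCC or Clang and an unsigned long as wide as a pointer.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical integer-returning variant, modelled on the asm trick used
 * by RELOC_HIDE: the address is laundered through an empty asm and handed
 * back as an unsigned long.
 */
#define SYMBOL_ADDR(ptr) ({                    \
    unsigned long addr_;                       \
    __asm__ ( "" : "=r" (addr_) : "0" (ptr) ); \
    addr_;                                     \
})

/* Stand-in for a region delimited by linker symbols. */
static char blob[64];

/* Range check performed entirely on integers. */
static bool addr_in_blob(unsigned long addr)
{
    return addr >= SYMBOL_ADDR(blob) &&
           addr < SYMBOL_ADDR(blob + sizeof(blob));
}

/* Region size as an integer difference, with no pointer subtraction. */
static unsigned long blob_bytes(void)
{
    return SYMBOL_ADDR(blob + sizeof(blob)) - SYMBOL_ADDR(blob);
}
```

The trade-off is at the call sites: with this variant every caller has to convert its own pointer to unsigned long before comparing, whereas the pointer-returning form leaves the call sites closer to the original code.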


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 2/4] xen/arm: use SYMBOL when required
  2019-01-10  8:41   ` Jan Beulich
@ 2019-01-10 17:44     ` Stefano Stabellini
  2019-01-11 10:52       ` Jan Beulich
  0 siblings, 1 reply; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-10 17:44 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, Julien Grall, Stefano Stabellini,
	Stefano Stabellini, xen-devel

On Thu, 10 Jan 2019, Jan Beulich wrote:
> >>> On 10.01.19 at 00:42, <sstabellini@kernel.org> wrote:
> > @@ -1138,9 +1138,10 @@ void free_init_memory(void)
> >      for ( i = 0; i < nr; i++ )
> >          *(p + i) = insn;
> >  
> > -    set_pte_flags_on_range(__init_begin, len, mg_clear);
> > +    set_pte_flags_on_range(SYMBOL(__init_begin), len, mg_clear);
> >      init_domheap_pages(pa, pa + len);
> > -    printk("Freed %ldkB init memory.\n", (long)(__init_end-__init_begin)>>10);
> > +    printk("Freed %ldkB init memory.\n",
> > +           (long)(SYMBOL(__init_end)-SYMBOL(__init_begin))>>10);
> 
> I've noticed this only here, but I can't exclude I've overlooked other
> instances: I think it would be really nice if you corrected formatting
> at the same time (here: add the missing blanks).

OK

I tend not to do cleanups together with meaningful changes, because
typically I find the resulting patch harder to review, but I am OK with
doing cleanups if you, the maintainer, ask for them.


> > --- a/xen/arch/arm/percpu.c
> > +++ b/xen/arch/arm/percpu.c
> > @@ -6,7 +6,7 @@
> >  
> >  unsigned long __per_cpu_offset[NR_CPUS];
> >  #define INVALID_PERCPU_AREA (-(long)__per_cpu_start)
> > -#define PERCPU_ORDER (get_order_from_bytes(__per_cpu_data_end-__per_cpu_start))
> > +#define PERCPU_ORDER (get_order_from_bytes(SYMBOL(__per_cpu_data_end) - SYMBOL(__per_cpu_start)))
> 
> Long line.

OK


> > @@ -37,7 +37,7 @@ static void _free_percpu_area(struct rcu_head *head)
> >  {
> >      struct free_info *info = container_of(head, struct free_info, rcu);
> >      unsigned int cpu = info->cpu;
> > -    char *p = __per_cpu_start + __per_cpu_offset[cpu];
> > +    char *p = SYMBOL(__per_cpu_start) + __per_cpu_offset[cpu];
> >      free_xenheap_pages(p, PERCPU_ORDER);
> >      __per_cpu_offset[cpu] = INVALID_PERCPU_AREA;
> >  }
> 
> As per above, adding the missing blank line would be quite nice on
> this occasion.

OK


> > --- a/xen/arch/arm/setup.c
> > +++ b/xen/arch/arm/setup.c
> > @@ -772,8 +772,10 @@ void __init start_xen(unsigned long boot_phys_offset,
> >  
> >      /* Register Xen's load address as a boot module. */
> >      xen_bootmodule = add_boot_module(BOOTMOD_XEN,
> > -                             (paddr_t)(uintptr_t)(_start + boot_phys_offset),
> > -                             (paddr_t)(uintptr_t)(_end - _start + 1), false);
> > +                             (paddr_t)(uintptr_t)(SYMBOL(_start) +
> > +                                                  boot_phys_offset),
> > +                             (paddr_t)(uintptr_t)(SYMBOL(_end) -
> > +                                                  SYMBOL(_start) + 1), false);
> 
> Why do you need the double casts, i.e. why does (uintptr_t) alone not
> suffice?

The original reason was just not to change the existing code outside of
adding SYMBOL :-)

But to answer your question, uintptr_t is the same size as char *, while
paddr_t is always 64-bit. The uintptr_t cast converts to an integer type;
the paddr_t cast converts to the right size. I don't think it is allowed to
convert from pointer to integer and change the integer size in a single cast.
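
A minimal sketch of the two-step conversion described above. The paddr_t typedef here is an assumption that mirrors the 64-bit type mentioned in the reply, and ptr_to_paddr() and sample_byte are hypothetical names. A direct pointer-to-paddr_t cast is an implementation-defined conversion rather than a hard error, but on targets with 32-bit pointers GCC warns about a cast from pointer to integer of different size, which the intermediate uintptr_t step avoids.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stand-in for Xen's paddr_t, taken to be 64-bit as stated. */
typedef uint64_t paddr_t;

/* Hypothetical helper spelling out the double cast from the patch. */
static paddr_t ptr_to_paddr(const void *p)
{
    /* Step 1: pointer to an integer of the same width. */
    uintptr_t va = (uintptr_t)p;

    /* Step 2: integer to integer; widens where uintptr_t is 32-bit. */
    return (paddr_t)va;
}

/* Any object will do as a sample address. */
static char sample_byte;
```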


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 3/4] xen/x86: use SYMBOL when required
  2019-01-10  8:43   ` Jan Beulich
@ 2019-01-10 17:45     ` Stefano Stabellini
  0 siblings, 0 replies; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-10 17:45 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, Julien Grall, Stefano Stabellini,
	Stefano Stabellini, xen-devel

On Thu, 10 Jan 2019, Jan Beulich wrote:
> >>> On 10.01.19 at 00:42, <sstabellini@kernel.org> wrote:
> > --- a/xen/arch/x86/percpu.c
> > +++ b/xen/arch/x86/percpu.c
> > @@ -13,7 +13,7 @@ unsigned long __per_cpu_offset[NR_CPUS];
> >   * context of PV guests.
> >   */
> >  #define INVALID_PERCPU_AREA (0x8000000000000000L - (long)__per_cpu_start)
> > -#define PERCPU_ORDER get_order_from_bytes(__per_cpu_data_end - __per_cpu_start)
> > +#define PERCPU_ORDER get_order_from_bytes(SYMBOL(__per_cpu_data_end) - SYMBOL(__per_cpu_start))
> 
> Long line.
> 
> With this taken care of
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
 
OK, thanks


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 4/4] xen/common: use SYMBOL when required
  2019-01-10  8:49   ` Jan Beulich
@ 2019-01-10 17:48     ` Stefano Stabellini
  0 siblings, 0 replies; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-10 17:48 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, Julien Grall, Stefano Stabellini,
	Stefano Stabellini, xen-devel

On Thu, 10 Jan 2019, Jan Beulich wrote:
> >>> On 10.01.19 at 00:42, <sstabellini@kernel.org> wrote:
> > --- a/xen/common/version.c
> > +++ b/xen/common/version.c
> > @@ -147,14 +147,14 @@ static int __init xen_build_init(void)
> >      int rc;
> >  
> >      /* --build-id invoked with wrong parameters. */
> > -    if ( __note_gnu_build_id_end <= &n[0] )
> > +    if ( SYMBOL(__note_gnu_build_id_end) <= &n[0] )
> >          return -ENODATA;
> >  
> >      /* Check for full Note header. */
> > -    if ( &n[1] >= __note_gnu_build_id_end )
> > +    if ( &n[1] >= SYMBOL(__note_gnu_build_id_end) )
> >          return -ENODATA;
> >  
> > -    sz = (void *)__note_gnu_build_id_end - (void *)n;
> > +    sz = (void *)SYMBOL(__note_gnu_build_id_end) - (void *)n;
> 
> Now this is an instance where I wouldn't mind if you switched the
> casts to (unsigned long).

OK


> > --- a/xen/common/virtual_region.c
> > +++ b/xen/common/virtual_region.c
> > @@ -103,13 +103,13 @@ void __init setup_virtual_regions(const struct exception_table_entry *start,
> >  {
> >      size_t sz;
> >      unsigned int i;
> > -    static const struct bug_frame *const __initconstrel bug_frames[] = {
> > -        __start_bug_frames,
> > -        __stop_bug_frames_0,
> > -        __stop_bug_frames_1,
> > -        __stop_bug_frames_2,
> > +    const struct bug_frame *bug_frames[] = {
> 
> Please don't lose the second const.

OK


> > --- a/xen/include/xen/kernel.h
> > +++ b/xen/include/xen/kernel.h
> > @@ -66,27 +66,27 @@
> >  })
> >  
> >  extern char _start[], _end[], start[];
> > -#define is_kernel(p) ({                         \
> > -    char *__p = (char *)(unsigned long)(p);     \
> > -    (__p >= _start) && (__p < _end);            \
> > +#define is_kernel(p) ({                                             \
> > +    char *p__ = (char *)(unsigned long)(p);                         \
> > +    (p__ >= SYMBOL(_start)) && (p__ < SYMBOL(_end));                \
> >  })
> >  
> >  extern char _stext[], _etext[];
> > -#define is_kernel_text(p) ({                    \
> > -    char *__p = (char *)(unsigned long)(p);     \
> > -    (__p >= _stext) && (__p < _etext);          \
> > +#define is_kernel_text(p) ({                                        \
> > +    char *p__ = (char *)(unsigned long)(p);                         \
> > +    (p__ >= SYMBOL(_stext)) && (p__ < SYMBOL(_etext));              \
> >  })
> >  
> >  extern const char _srodata[], _erodata[];
> > -#define is_kernel_rodata(p) ({                  \
> > -    const char *__p = (const char *)(unsigned long)(p);     \
> > -    (__p >= _srodata) && (__p < _erodata);      \
> > +#define is_kernel_rodata(p) ({                                      \
> > +    const char *p__ = (const char *)(unsigned long)(p);             \
> 
> Just like here, in all of the other sibling macros you could easily
> have switched p__ to be const char * as well.

OK
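
For illustration only, a range check in the style of is_kernel() with the
local declared as const char *. The buffer demo_image and the macro name
in_demo_image are hypothetical stand-ins; the SYMBOL() wrapping is left
out so the sketch stays focused on the types.

```c
#include <assert.h>

/* Hypothetical stand-in for the image bounded by _start and _end. */
static char demo_image[64];

/*
 * Range check in the style of is_kernel(), with a const-qualified local.
 * The SYMBOL() wrapping is left out here to keep the focus on the types.
 * Uses a GNU C statement expression.
 */
#define in_demo_image(p) ({                                          \
    const char *p__ = (const char *)(unsigned long)(p);              \
    (p__ >= demo_image) && (p__ < demo_image + sizeof(demo_image));  \
})

static void demo_range_check(void)
{
    const char *inside = &demo_image[10];   /* a pointer argument */

    assert(in_demo_image(inside));
    assert(in_demo_image((unsigned long)demo_image));  /* integers work too */
    assert(!in_demo_image(demo_image + sizeof(demo_image)));
}
```

Since the macro only reads through p__, the const qualifier costs nothing
and keeps the sibling macros consistent with is_kernel_rodata().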


> With at least the bug_frames[] remark taken care of
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
 
I'll make the changes above and add your reviewed-by

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-10  8:34   ` Jan Beulich
@ 2019-01-10 18:09     ` Stefano Stabellini
  0 siblings, 0 replies; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-10 18:09 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, xen-devel

On Thu, 10 Jan 2019, Jan Beulich wrote:
> >>> On 10.01.19 at 00:42, <sstabellini@kernel.org> wrote:
> > --- a/xen/include/xen/compiler.h
> > +++ b/xen/include/xen/compiler.h
> > @@ -99,6 +99,16 @@
> >      __asm__ ("" : "=r"(__ptr) : "0"(ptr));      \
> >      (typeof(ptr)) (__ptr + (off)); })
> >  
> > +/*
> > + * Similar to RELOC_HIDE, but written to be used with symbols such as
> > + * _stext and _etext to avoid undefined behavior comparing pointers to
> > + * different objects. It can handle array types.
> > + */
> > +#define SYMBOL(ptr)                               \
> > +  ({ unsigned long __ptr;                       \
> > +    __asm__ ("" : "=r"(__ptr) : "0"(ptr));      \
> > +    (typeof(*(ptr)) *) (__ptr); })
> 
> I'm sorry for thinking of this only now, but SYMBOL() is (no longer)
> appropriate as a name here. I'd like to suggest SYMBOL_HIDE(), in
> line with the other macro's name. With that and with
> - the stray blank after the cast dropped
> - the asm() formatting corrected; there are a number of blanks
>   missing
> - the name of the local variable corrected as per my original
>   suggestion
> - indentation corrected
> - the use of underscores on "asm" and "typeof" brought in line
> you may (re-)add
> Acked-by: Jan Beulich <jbeulich@suse.com>
>
> Furthermore I wonder why the cast is needed in the first place.
> Doesn't
> 
> #define SYMBOL_HIDE(ptr) ({                   \
>     __typeof__(*(ptr)) *ptr_;                 \
>     __asm__ ( "" : "=r" (ptr_) : "0" (ptr) ); \
>     ptr_; \
> })
> 
> do the job as well?

It works, but it goes even further in the direction of not fixing MISRA-C
compliance. Given that we have already established that we'll have to
check with the experts on this topic, I am OK with using this
implementation. Changing the implementation of SYMBOL_HIDE in the future
is easy if we need to. I added your Acked-by.
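
A minimal sketch of how the pointer-returning form quoted above behaves at
a call site. The macro body is the one from the review; everything named
demo_* is hypothetical. A single array stands in for the two linker
symbols so that the example itself stays well-defined, whereas in Xen the
bounds are distinct symbols such as _stext and _etext.

```c
#include <assert.h>

/* The pointer-returning form suggested in the review above (GNU C). */
#define SYMBOL_HIDE(ptr) ({                   \
    __typeof__(*(ptr)) *ptr_;                 \
    __asm__ ( "" : "=r" (ptr_) : "0" (ptr) ); \
    ptr_;                                     \
})

/* Hypothetical stand-ins for linker symbols such as _stext and _etext. */
static char demo_text[128];
static char *const demo_etext = demo_text + sizeof(demo_text);

/* True if p lies in [demo_text, demo_etext). */
static int in_demo_text(const char *p)
{
    return p >= SYMBOL_HIDE(demo_text) && p < SYMBOL_HIDE(demo_etext);
}

static void demo_symbol_hide(void)
{
    /* The macro yields the same address, typed as pointer-to-element. */
    char *start = SYMBOL_HIDE(demo_text);

    assert(start == demo_text);
    assert(in_demo_text(&demo_text[5]));
    assert(!in_demo_text(demo_etext));
}
```

The empty asm only hides the value's origin from the optimizer; the result
is still a C pointer, which is the property being debated in this thread.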

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-10 17:29       ` Stefano Stabellini
@ 2019-01-10 18:46         ` Stewart Hildebrand
  2019-01-10 19:03           ` Stefano Stabellini
  2019-01-11 10:35           ` Jan Beulich
  2019-01-10 19:24         ` Julien Grall
  1 sibling, 2 replies; 102+ messages in thread
From: Stewart Hildebrand @ 2019-01-10 18:46 UTC (permalink / raw)
  To: Stefano Stabellini, Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Julien Grall,
	Julien Grall, xen-devel

On Thursday, January 10, 2019 12:30 PM, Stefano Stabellini wrote:
> On Thu, 10 Jan 2019, Jan Beulich wrote:
> > >>> On 10.01.19 at 03:40, <julien.grall@gmail.com> wrote:
> > > On Wed, 9 Jan 2019, 18:43 Stefano Stabellini, <sstabellini@kernel.org>
> > > wrote:
> > >
> > >> Introduce a macro, SYMBOL, which is similar to RELOC_HIDE, but it is
> > >> meant to be used everywhere symbols such as _stext and _etext are used
> > >> in the code. It can take an array type as a parameter, and it returns
> > >> the same type.
> > >>
> > >> SYMBOL is needed when accessing symbols such as _stext and _etext
> > >> because the C standard forbids for both comparisons and substraction
> > >> (see C Standard, 6.5.6 [ISO/IEC 9899:2011] and [1]) between pointers
> > >> pointing to different objects. _stext, _etext, etc. are all pointers to
> > >> different objects from ANCI C point of view.
> > >>
> > >
> > > This does not make sense because you still return a pointer and therefore
> > > the undefined behavior is still present.
> > >
> > > I really don't believe this patch is going to make the MISRA tool happy.
> >
> > Well, till now I've been assuming that no version of this series was
> > posted without being certain the changes achieve the goal (of
> > making that tool happy).
> 
> No, Jan: unfortunately we cannot re-run the scanning tool against any
> version of Xen we wish to :-(
> 
> We cannot know in advance if a set of changes will make the tool happy
> or not.

Playing devil's advocate: even with all sorts of casting and inline assembly
to suppress static analysis tool warnings, we're still not addressing the
underlying rule violation. A pointer value that has been cast or fed through
some inline assembly is, at the end of the day, still a value that represents
an address in an object. And as soon as we subtract or compare that value
with one that represents another object, we start violating the MISRA rules
(this is my own rather strict interpretation for the purpose of making a
point; please feel free to disagree).

If all we really care about is making PRQA happy, I believe it does support
some sort of comment-based suppression. I've seen comments like
/* PRQA S 0487 */ or /* PRQA S 0488 */ in various codebases; I'm guessing
comments like this have something to do with suppressing these types of
warnings.

> 
> If I knew that SYMBOL returning the native point type as in v6 is
> sufficient to make the tool happy, I wouldn't be here arguing. We cannot
> know for sure until we commit the changes, then we ask PRQA to re-scan
> against a more recent version of Xen. It is an heavy process and for
> this reason I preferred the safer of the two approaches.
> 
> Anyway, I would rather get something in, even if insufficient, than
> nothing. So I'll address all your comments based on returning the
> pointer type, and submit a new version. The bothersome changes are the
> ones to the call sites, and I would like to get those in no matter the
> implementation of SYMBOL.

I agree, it would be nice to highlight everywhere we think we're in violation
of the "pointers to different objects" rules. Perhaps it would make it clearer
if we added a comment in the codebase to spell out the intent, which I'm
interpreting as roughly "This violates MISRA Rule 18.1, 18.2, or 18.3. We 
need to revisit this in the future to evaluate if we can avoid violating these
rules."

Or perhaps it would make it clearer to forget about SYMBOL altogether and
instead just add suppression comments.

Further, if we decide in an instance that we have no choice but to
subtract/compare pointers to different objects, then the MISRA rules I
mentioned are only *required* rules (not *mandatory*), which means we
are OK to violate them as long as we write up a deviation with the appropriate
justification and explanation of any undefined behavior and why it's OK to rely
on said undefined behavior.

-Stew

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-10 18:46         ` Stewart Hildebrand
@ 2019-01-10 19:03           ` Stefano Stabellini
  2019-01-11 10:35           ` Jan Beulich
  1 sibling, 0 replies; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-10 19:03 UTC (permalink / raw)
  To: Stewart Hildebrand
  Cc: Stefano Stabellini, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Julien Grall, Jan Beulich, xen-devel

On Thu, 10 Jan 2019, Stewart Hildebrand wrote:
> On Thursday, January 10, 2019 12:30 PM, Stefano Stabellini wrote:
> > On Thu, 10 Jan 2019, Jan Beulich wrote:
> > > >>> On 10.01.19 at 03:40, <julien.grall@gmail.com> wrote:
> > > > On Wed, 9 Jan 2019, 18:43 Stefano Stabellini, <sstabellini@kernel.org>
> > > > wrote:
> > > >
> > > >> Introduce a macro, SYMBOL, which is similar to RELOC_HIDE, but it is
> > > >> meant to be used everywhere symbols such as _stext and _etext are used
> > > >> in the code. It can take an array type as a parameter, and it returns
> > > >> the same type.
> > > >>
> > > >> SYMBOL is needed when accessing symbols such as _stext and _etext
> > > >> because the C standard forbids for both comparisons and substraction
> > > >> (see C Standard, 6.5.6 [ISO/IEC 9899:2011] and [1]) between pointers
> > > >> pointing to different objects. _stext, _etext, etc. are all pointers to
> > > >> different objects from ANCI C point of view.
> > > >>
> > > >
> > > > This does not make sense because you still return a pointer and therefore
> > > > the undefined behavior is still present.
> > > >
> > > > I really don't believe this patch is going to make the MISRA tool happy.
> > >
> > > Well, till now I've been assuming that no version of this series was
> > > posted without being certain the changes achieve the goal (of
> > > making that tool happy).
> > 
> > No, Jan: unfortunately we cannot re-run the scanning tool against any
> > version of Xen we wish to :-(
> > 
> > We cannot know in advance if a set of changes will make the tool happy
> > or not.
> 
> Playing devil's advocate: even with all sorts of casting and inline assembly
> to suppress static analysis tool warnings, we're still not addressing the
> underlying rule violation. A pointer value casted or fed through some inline
> assembly at the end of the day is still a value that represents an address
> in an object. And as soon as we subtract or compare that value to one
> that represents another object we start violating the MISRA rules (this is
> my own rather strict interpretation for the purpose of making a point - 
> please feel free to disagree). 

Yes, this seems to be Jan's point of view too, but I disagree: _start is
not a pointer to an object -- it is a linker-set memory address. Also,
if there are any doubts, it is certainly not a pointer to an object after
being converted to unsigned long in assembly. I don't think C compliance
could/should make any assumptions or guesses about asm-returned values. But
I am less convinced of this argument if we convert it back to another
C pointer, which is what the current implementation does.
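
To make the unsigned long alternative concrete, here is a sketch under
stated assumptions; it is not the v5 implementation. The macro name
SYMBOL_ADDR and the demo_* objects are hypothetical, and the code assumes
pointers and unsigned long have the same size, as they do on LP64 targets.
Once the address comes out of the asm as an integer, the comparison and
the subtraction are plain integer operations.

```c
#include <assert.h>

/* Hypothetical name; a sketch only, not the v5 implementation (GNU C). */
#define SYMBOL_ADDR(sym) ({                     \
    unsigned long addr_;                        \
    __asm__ ( "" : "=r" (addr_) : "0" (sym) );  \
    addr_;                                      \
})

/* Hypothetical stand-ins for linker symbols such as _start and _end. */
static char demo_image[256];
static char *const demo_end = demo_image + sizeof(demo_image);

/* Size of the region, computed purely on integers. */
static unsigned long demo_image_size(void)
{
    return SYMBOL_ADDR(demo_end) - SYMBOL_ADDR(demo_image);
}

/* True if addr lies in [demo_image, demo_end). */
static int in_demo_image(unsigned long addr)
{
    return addr >= SYMBOL_ADDR(demo_image) && addr < SYMBOL_ADDR(demo_end);
}

static void demo_symbol_addr(void)
{
    assert(demo_image_size() == sizeof(demo_image));
    assert(in_demo_image((unsigned long)&demo_image[1]));
    assert(!in_demo_image((unsigned long)demo_image + sizeof(demo_image)));
}
```

The cost of this form is at the call sites: anything that needs a pointer
again has to cast the integer back explicitly.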

BTW, in case Jan changes his mind, this is an alternative version of
v7 with SYMBOL_HIDE returning unsigned long:

http://xenbits.xenproject.org/git-http/people/sstabellini/xen-unstable.git certification-7-unsigned_long


> If all we really care about is making PRQA happy, I believe it does support
> some sort of comment-based suppression. I've seen comments like
> /* PRQA S 0487 */ or /* PRQA S 0488 */ in various codebases, I'm guessing
> comments like this have something to do with suppressing these types of
> warnings.

Interesting ... something to investigate.


> > If I knew that SYMBOL returning the native point type as in v6 is
> > sufficient to make the tool happy, I wouldn't be here arguing. We cannot
> > know for sure until we commit the changes, then we ask PRQA to re-scan
> > against a more recent version of Xen. It is an heavy process and for
> > this reason I preferred the safer of the two approaches.
> > 
> > Anyway, I would rather get something in, even if insufficient, than
> > nothing. So I'll address all your comments based on returning the
> > pointer type, and submit a new version. The bothersome changes are the
> > ones to the call sites, and I would like to get those in no matter the
> > implementation of SYMBOL.
> 
> I agree, it would be nice to highlight everywhere we think we're in violation
> of the "pointers to different objects" rules. Perhaps it would make it clearer
> if we added a comment in the codebase to spell out the intent, which I'm
> interpreting as roughly "This violates MISRA Rule 18.1, 18.2, or 18.3. We 
> need to revisit this in the future to evaluate if we can avoid violating these
> rules."

Yeah, that is basically why I would like the series to go in. Even if it
doesn't make the code compliant, at least SYMBOL_HIDE clearly marks the
non-compliant sites.


> Or perhaps it would make it clearer to forget about SYMBOL altogether and
> instead just add suppression comments.

We still need SYMBOL because of potential weird compiler issues, see the
comment on top of RELOC_HIDE in the Linux kernel.


> Further, if we decide in an instance that we have no choice but to
> subtract/compare pointers to different objects, then the MISRA rules I
> mentioned are only *required* rules (not *mandatory*), which means we
> are OK to violate them as long as we write up a deviation with the appropriate
> justification and explanation of any undefined behavior and why it's OK to rely
> on said undefined behavior.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-10 17:29       ` Stefano Stabellini
  2019-01-10 18:46         ` Stewart Hildebrand
@ 2019-01-10 19:24         ` Julien Grall
  2019-01-10 21:36           ` Stefano Stabellini
  1 sibling, 1 reply; 102+ messages in thread
From: Julien Grall @ 2019-01-10 19:24 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Julien Grall,
	Jan Beulich, xen-devel


On Thu, 10 Jan 2019, 12:29 Stefano Stabellini, <sstabellini@kernel.org>
wrote:

> On Thu, 10 Jan 2019, Jan Beulich wrote:
> > >>> On 10.01.19 at 03:40, <julien.grall@gmail.com> wrote:
> > > On Wed, 9 Jan 2019, 18:43 Stefano Stabellini, <sstabellini@kernel.org>
> > > wrote:
> > >
> > >> Introduce a macro, SYMBOL, which is similar to RELOC_HIDE, but it is
> > >> meant to be used everywhere symbols such as _stext and _etext are used
> > >> in the code. It can take an array type as a parameter, and it returns
> > >> the same type.
> > >>
> > >> SYMBOL is needed when accessing symbols such as _stext and _etext
> > >> because the C standard forbids for both comparisons and substraction
> > >> (see C Standard, 6.5.6 [ISO/IEC 9899:2011] and [1]) between pointers
> > >> pointing to different objects. _stext, _etext, etc. are all pointers
> to
> > >> different objects from ANCI C point of view.
> > >>
> > >
> > > This does not make sense because you still return a pointer and
> therefore
> > > the undefined behavior is still present.
> > >
> > > I really don't believe this patch is going to make the MISRA tool
> happy.
> >
> > Well, till now I've been assuming that no version of this series was
> > posted without being certain the changes achieve the goal (of
> > making that tool happy).
>
> No, Jan: unfortunately we cannot re-run the scanning tool against any
> version of Xen we wish to :-(
>
> We cannot know in advance if a set of changes will make the tool happy
> or not.
>
> If I knew that SYMBOL returning the native point type as in v6 is
> sufficient to make the tool happy, I wouldn't be here arguing. We cannot
> know for sure until we commit the changes, then we ask PRQA to re-scan
> against a more recent version of Xen. It is an heavy process and for
> this reason I preferred the safer of the two approaches.
>


> Anyway, I would rather get something in, even if insufficient, than
> nothing. So I'll address all your comments based on returning the
> pointer type, and submit a new version. The bothersome changes are the
> ones to the call sites, and I would like to get those in no matter the
> implementation of SYMBOL.


It is not only insufficient but wrong when you read the commit message. You
also were not convinced about this approach.

I understand that we need to commit so we can get the result from the PRQA
tool. However, we should have talked with people who know MISRA to
understand whether it could work.

You also didn't address my point on why Linux needs to go through unsigned
long.

So I don't think it is right to merge it without firmer grounds.

On that basis:

Nacked-by: Julien Grall <julien.grall@arm.com>

Cheers,

>

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-10 19:24         ` Julien Grall
@ 2019-01-10 21:36           ` Stefano Stabellini
  2019-01-10 23:31             ` Julien Grall
  0 siblings, 1 reply; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-10 21:36 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Jan Beulich, xen-devel

On Thu, 10 Jan 2019, Julien Grall wrote:
> On Thu, 10 Jan 2019, 12:29 Stefano Stabellini, <sstabellini@kernel.org> wrote:
>       On Thu, 10 Jan 2019, Jan Beulich wrote:
>       > >>> On 10.01.19 at 03:40, <julien.grall@gmail.com> wrote:
>       > > On Wed, 9 Jan 2019, 18:43 Stefano Stabellini, <sstabellini@kernel.org>
>       > > wrote:
>       > >
>       > >> Introduce a macro, SYMBOL, which is similar to RELOC_HIDE, but it is
>       > >> meant to be used everywhere symbols such as _stext and _etext are used
>       > >> in the code. It can take an array type as a parameter, and it returns
>       > >> the same type.
>       > >>
>       > >> SYMBOL is needed when accessing symbols such as _stext and _etext
>       > >> because the C standard forbids for both comparisons and substraction
>       > >> (see C Standard, 6.5.6 [ISO/IEC 9899:2011] and [1]) between pointers
>       > >> pointing to different objects. _stext, _etext, etc. are all pointers to
>       > >> different objects from ANCI C point of view.
>       > >>
>       > >
>       > > This does not make sense because you still return a pointer and therefore
>       > > the undefined behavior is still present.
>       > >
>       > > I really don't believe this patch is going to make the MISRA tool happy.
>       >
>       > Well, till now I've been assuming that no version of this series was
>       > posted without being certain the changes achieve the goal (of
>       > making that tool happy).
> 
>       No, Jan: unfortunately we cannot re-run the scanning tool against any
>       version of Xen we wish to :-(
> 
>       We cannot know in advance if a set of changes will make the tool happy
>       or not.
> 
>       If I knew that SYMBOL returning the native point type as in v6 is
>       sufficient to make the tool happy, I wouldn't be here arguing. We cannot
>       know for sure until we commit the changes, then we ask PRQA to re-scan
>       against a more recent version of Xen. It is an heavy process and for
>       this reason I preferred the safer of the two approaches.
> 
> 
> 
>       Anyway, I would rather get something in, even if insufficient, than
>       nothing. So I'll address all your comments based on returning the
>       pointer type, and submit a new version. The bothersome changes are the
>       ones to the call sites, and I would like to get those in no matter the
>       implementation of SYMBOL.
> 
> 
> It is not only insufficient but wrong when you read the commit message. You also were not convinced about this approach. 
> 
> I understand that we need to commit so we can get the result from the PRQA tool. However, we should have talked with people
> knowing MISRA to understand whether it could work.
> 
> You also didn't address my point on why Linux needs to go through unsigned long.
> 
> So I don't think it is right to merge it without more ground.
> 
> On that basis:
> 
> Nacked-by: Julien Grall <julien.grall@arm.com>

Hi Julien,

I understand your thinking well, and I am not happy with this approach either.

However, changing all the call sites to use SYMBOL, even if SYMBOL does
not do what you and I think it should, is still a valuable change to
have:

1) it clearly highlights all the related violations
2) it is a burdensome set of changes to maintain off-tree which will be
   difficult to rebase and will bitrot quickly
3) it will be simple to change the implementation of SYMBOL afterwards
   as needed
4) regardless of MISRA, we still have a problem with gcc and symbols
   like _start and _end, see the comment on top of RELOC_HIDE in linux
   (include/linux/compiler-gcc.h)

In fact, even not caring about C compliance, this series is still an
improvement, a fix to a potential compiler problem. On that basis alone,
I think it is a bad decision not to merge this series.


To answer your other questions: yes, we need more information about this
compliance issue and MISRA, this is a good reason for committing the
series so that we can have the tool do a re-scan. It is also a great way
to show the problem to a MISRA expert not familiar with Xen: "look at
the way SYMBOL is used through the code..."

I don't know why Linux is using unsigned long, I looked at the commit
messages and comments but there isn't an explanation. However, it just
makes sense to me. That is how I would have implemented the solution as
well. Jan's approach looks very much like a partial workaround to me.


In conclusion, I still agree with you and disagree with Jan, but it
would be good to make progress regardless:

- I think a series introducing the usage of SYMBOL through the code
  should go in 4.12 regardless of the implementation of SYMBOL
- even the bad implementation of SYMBOL would still help with the
  potential gcc problems mentioned by Linux, if not with certifications


For everybody's reference, I have pushed both versions of the series,
the one returning the native type, as asked by Jan:
http://xenbits.xenproject.org/git-http/people/sstabellini/xen-unstable.git certifications-7

And the one returning unsigned long, as Julien and I would like:
http://xenbits.xenproject.org/git-http/people/sstabellini/xen-unstable.git certifications-7-unsigned_long


Hoping we won't get stuck on this, regards,

Stefano

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-10 21:36           ` Stefano Stabellini
@ 2019-01-10 23:31             ` Julien Grall
  2019-01-11  2:14               ` Stefano Stabellini
  0 siblings, 1 reply; 102+ messages in thread
From: Julien Grall @ 2019-01-10 23:31 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Julien Grall,
	Jan Beulich, xen-devel


On Thu, 10 Jan 2019, 15:36 Stefano Stabellini, <sstabellini@kernel.org>
wrote:

> On Thu, 10 Jan 2019, Julien Grall wrote:
> > On Thu, 10 Jan 2019, 12:29 Stefano Stabellini, <sstabellini@kernel.org>
> wrote:
> >       On Thu, 10 Jan 2019, Jan Beulich wrote:
> >       > >>> On 10.01.19 at 03:40, <julien.grall@gmail.com> wrote:
> >       > > On Wed, 9 Jan 2019, 18:43 Stefano Stabellini, <
> sstabellini@kernel.org>
> >       > > wrote:
> >       > >
> >       > >> Introduce a macro, SYMBOL, which is similar to RELOC_HIDE,
> but it is
> >       > >> meant to be used everywhere symbols such as _stext and _etext
> are used
> >       > >> in the code. It can take an array type as a parameter, and it
> returns
> >       > >> the same type.
> >       > >>
> >       > >> SYMBOL is needed when accessing symbols such as _stext and
> _etext
> >       > >> because the C standard forbids for both comparisons and
> substraction
> >       > >> (see C Standard, 6.5.6 [ISO/IEC 9899:2011] and [1]) between
> pointers
> >       > >> pointing to different objects. _stext, _etext, etc. are all
> pointers to
> >       > >> different objects from ANCI C point of view.
> >       > >>
> >       > >
> >       > > This does not make sense because you still return a pointer
> and therefore
> >       > > the undefined behavior is still present.
> >       > >
> >       > > I really don't believe this patch is going to make the MISRA
> tool happy.
> >       >
> >       > Well, till now I've been assuming that no version of this series
> was
> >       > posted without being certain the changes achieve the goal (of
> >       > making that tool happy).
> >
> >       No, Jan: unfortunately we cannot re-run the scanning tool against
> any
> >       version of Xen we wish to :-(
> >
> >       We cannot know in advance if a set of changes will make the tool
> happy
> >       or not.
> >
> >       If I knew that SYMBOL returning the native point type as in v6 is
> >       sufficient to make the tool happy, I wouldn't be here arguing. We
> cannot
> >       know for sure until we commit the changes, then we ask PRQA to
> re-scan
> >       against a more recent version of Xen. It is an heavy process and
> for
> >       this reason I preferred the safer of the two approaches.
> >
> >
> >
> >       Anyway, I would rather get something in, even if insufficient, than
> >       nothing. So I'll address all your comments based on returning the
> >       pointer type, and submit a new version. The bothersome changes are
> the
> >       ones to the call sites, and I would like to get those in no matter
> the
> >       implementation of SYMBOL.
> >
> >
> > It is not only insufficient but wrong when you read the commit message.
> You also were not convinced about this approach.
> >
> > I understand that we need to commit so we can get the result from the
> PRQA tool. However, we should have talked with people
> > knowing MISRA to understand whether it could work.
> >
> > You also didn't address my point on why Linux needs to go through
> unsigned long.
> >
> > So I don't think it is right to merge it without more ground.
> >
> > On that basis:
> >
> > Nacked-by: Julien Grall <julien.grall@arm.com>
>
> Hi Julien,
>
> I well understanding your thinking, I am not happy with this approach.
>
> However, changing all the call sites to use SYMBOL, even if SYMBOL does
> not do what you and I think it should, is still a valuable change to
> have:
>
> 1) it clearly highlight all the related violations
> 2) it is a burdensome set of changes to maintain off-tree which will be
>    difficult to rebase and will bitrot quickly
> 3) it will be simple to change the implementation of SYMBOL afterwards
>    as needed
> 4) regardless of MISRA, we still have a problem with gcc and symbols
>    like _start and _end, see the comment on top of RELOC_HIDE in linux
>    (include/linux/compiler-gcc.h)


> In fact, even not caring about C compliance, this series is still an
> improvement, a fix to a potential compiler problem. On that basis alone,
> I think it is a bad decision not to merge this series.
>

From the different readings (see the link in one of my previous e-mails), I
don't believe it will help the compiler. I would be interested to know why
you think otherwise.

I am also worried that any change in the definition will require
investigating all the call sites. That is a recipe for missing one.

So it still feels to me that we are rushing the series.


>
> To answer your other questions: yes, we need more information about this
> compliance issue and MISRA, this is a good reason for committing the
> series so that we can have the tool do a re-scan. It is also a great way
> to show the problem to a MISRA expert not familiar with Xen: "look at
> the way SYMBOL is used through the code..."
>
> I don't know why Linux is using unsigned long, I looked at the commit
> messages and comments but there isn't an explanation. However, it just
> makes sense to me. That is how I would have implemented the solution as
> well. Jan's approach looks very much like a partial workaround to me.
>

In one of my previous e-mails I provided a URL giving a bit more explanation
of the problem.


>
> In conclusion, I still agree with you and disagree with Jan, but it
> would be good to make progress regardless:
>
> - I think a series introducing the usage of SYMBOL through the code
>   should go in 4.12 regardless of the implementation of SYMBOL
> - even the bad implementation of SYMBOL would still help with the
>   potential gcc problems mentioned by Linux, if not with certifications


See above. I clearly don't believe it is solving anything. It will actually
make any change of SYMBOL more difficult if you decide to switch to
unsigned long.


>
> For everybody's reference, I have pushed both versions of the series,
> the one returning the native type, as asked by Jan:
> http://xenbits.xenproject.org/git-http/people/sstabellini/xen-unstable.git
> certifications-7
>
> And the one returning unsigned long, as Julien and I would like:
> http://xenbits.xenproject.org/git-http/people/sstabellini/xen-unstable.git
> certifications-7-unsigned_long
>
>
> Hoping we won't get stuck on this, regards,
>
> Stefano


Cheers,

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-10 23:31             ` Julien Grall
@ 2019-01-11  2:14               ` Stefano Stabellini
  2019-01-11  6:52                 ` Juergen Gross
  2019-01-11 10:48                 ` Jan Beulich
  0 siblings, 2 replies; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-11  2:14 UTC (permalink / raw)
  To: jgross
  Cc: Stefano Stabellini, Stefano Stabellini, Wei Liu, Andrew Cooper,
	julien.grall, Julien Grall, Jan Beulich, xen-devel

Hi Juergen, Jan,

I spoke with Julien: we are both convinced that the unsigned long
solution is best. But Julien also did some research and he thinks that
Jan's version (returning pointer type) not only does not help with
MISRA-C, but also doesn't solve the potential GCC problem either. A
description of the GCC issue is available here:

https://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html?m=1

(Also keep in mind that Linux uses the unsigned long solution to solve
the GCC issue; deviating from it doesn't seem wise.)

I would like to ask for a freeze exception until Monday/Tuesday next
week when Julien will be back, and he and his team will be able to
provide more evidence that the unsigned long solution is correct, while
the other solution is not correct.

Cheers,

Stefano


P.S.
v7 of the series, SYMBOL_HIDE returning pointer type
http://xenbits.xenproject.org/git-http/people/sstabellini/xen-unstable.git certifications-7
v7 of the series, SYMBOL_HIDE returning unsigned long
http://xenbits.xenproject.org/git-http/people/sstabellini/xen-unstable.git certifications-7-unsigned_long



On Thu, 10 Jan 2019, Julien Grall wrote:
> On Thu, 10 Jan 2019, 15:36 Stefano Stabellini, <sstabellini@kernel.org> wrote:
>       On Thu, 10 Jan 2019, Julien Grall wrote:
>       > On Thu, 10 Jan 2019, 12:29 Stefano Stabellini, <sstabellini@kernel.org> wrote:
>       >       On Thu, 10 Jan 2019, Jan Beulich wrote:
>       >       > >>> On 10.01.19 at 03:40, <julien.grall@gmail.com> wrote:
>       >       > > On Wed, 9 Jan 2019, 18:43 Stefano Stabellini, <sstabellini@kernel.org>
>       >       > > wrote:
>       >       > >
>       >       > >> Introduce a macro, SYMBOL, which is similar to RELOC_HIDE, but it is
>       >       > >> meant to be used everywhere symbols such as _stext and _etext are used
>       >       > >> in the code. It can take an array type as a parameter, and it returns
>       >       > >> the same type.
>       >       > >>
>       >       > >> SYMBOL is needed when accessing symbols such as _stext and _etext
>       >       > >> because the C standard forbids both comparison and subtraction
>       >       > >> (see C Standard, 6.5.6 [ISO/IEC 9899:2011] and [1]) between pointers
>       >       > >> pointing to different objects. _stext, _etext, etc. are all pointers to
>       >       > >> different objects from an ANSI C point of view.
>       >       > >>
>       >       > >
>       >       > > This does not make sense because you still return a pointer and therefore
>       >       > > the undefined behavior is still present.
>       >       > >
>       >       > > I really don't believe this patch is going to make the MISRA tool happy.
>       >       >
>       >       > Well, till now I've been assuming that no version of this series was
>       >       > posted without being certain the changes achieve the goal (of
>       >       > making that tool happy).
>       >
>       >       No, Jan: unfortunately we cannot re-run the scanning tool against any
>       >       version of Xen we wish to :-(
>       >
>       >       We cannot know in advance if a set of changes will make the tool happy
>       >       or not.
>       >
>       >       If I knew that SYMBOL returning the native pointer type as in v6 is
>       >       sufficient to make the tool happy, I wouldn't be here arguing. We cannot
>       >       know for sure until we commit the changes, then we ask PRQA to re-scan
>       >       against a more recent version of Xen. It is a heavy process, and for
>       >       this reason I preferred the safer of the two approaches.
>       >
>       >
>       >
>       >       Anyway, I would rather get something in, even if insufficient, than
>       >       nothing. So I'll address all your comments based on returning the
>       >       pointer type, and submit a new version. The bothersome changes are the
>       >       ones to the call sites, and I would like to get those in no matter the
>       >       implementation of SYMBOL.
>       >
>       >
>       > It is not only insufficient but wrong when you read the commit message. You also were not convinced about this
>       > approach.
>       >
>       > I understand that we need to commit so we can get the result from the PRQA tool. However, we should have talked
>       > with people knowing MISRA to understand whether it could work.
>       >
>       > You also didn't address my point on why Linux needs to go through unsigned long.
>       >
>       > So I don't think it is right to merge it without more ground.
>       >
>       > On that basis:
>       >
>       > Nacked-by: Julien Grall <julien.grall@arm.com>
> 
>       Hi Julien,
> 
>       I understand your thinking well; I am not happy with this approach.
> 
>       However, changing all the call sites to use SYMBOL, even if SYMBOL does
>       not do what you and I think it should, is still a valuable change to
>       have:
> 
>       1) it clearly highlights all the related violations
>       2) it is a burdensome set of changes to maintain off-tree which will be
>          difficult to rebase and will bitrot quickly
>       3) it will be simple to change the implementation of SYMBOL afterwards
>          as needed
>       4) regardless of MISRA, we still have a problem with gcc and symbols
>          like _start and _end, see the comment on top of RELOC_HIDE in linux
>          (include/linux/compiler-gcc.h)
> 
> 
>       In fact, even not caring about C compliance, this series is still an
>       improvement, a fix to a potential compiler problem. On that basis alone,
>       I think it is a bad decision not to merge this series.
> 
> 
> From what I have read (see the link in one of my previous e-mails), I don't believe it will help the compiler. I would be
> interested to know why you think otherwise.
> 
> I am also worried that any change in the definition will require investigating all the call sites. That is a recipe for missing one.
> 
> So it still feels to me we are rushing the series.
> 
> 
> 
>       To answer your other questions: yes, we need more information about this
>       compliance issue and MISRA, this is a good reason for committing the
>       series so that we can have the tool do a re-scan. It is also a great way
>       to show the problem to a MISRA expert not familiar with Xen: "look at
>       the way SYMBOL is used through the code..."
> 
>       I don't know why Linux is using unsigned long, I looked at the commit
>       messages and comments but there isn't an explanation. However, it just
>       makes sense to me. That is how I would have implemented the solution as
>       well. Jan's approach looks very much like a partial workaround to me.
> 
> 
> In one of my previous e-mails I provided a URL giving a bit more explanation of the problem.
> 
> 
> 
>       In conclusion, I still agree with you and disagree with Jan, but it
>       would be good to make progress regardless:
> 
>       - I think a series introducing the usage of SYMBOL through the code
>         should go in 4.12 regardless of the implementation of SYMBOL
>       - even the bad implementation of SYMBOL would still help with the
>         potential gcc problems mentioned by Linux, if not with certifications
> 
> 
> See above. I clearly don't believe it is solving anything. It will actually make any change of SYMBOL more difficult if you
> decide to switch to unsigned long.
> 
> 
> 
>       For everybody's reference, I have pushed both versions of the series,
>       the one returning the native type, as asked by Jan:
>       http://xenbits.xenproject.org/git-http/people/sstabellini/xen-unstable.git certifications-7
> 
>       And the one returning unsigned long, as Julien and I would like:
>       http://xenbits.xenproject.org/git-http/people/sstabellini/xen-unstable.git certifications-7-unsigned_long
> 
> 
>       Hoping we won't get stuck on this, regards,


* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-11  2:14               ` Stefano Stabellini
@ 2019-01-11  6:52                 ` Juergen Gross
  2019-01-11 16:52                   ` Stefano Stabellini
  2019-01-11 10:48                 ` Jan Beulich
  1 sibling, 1 reply; 102+ messages in thread
From: Juergen Gross @ 2019-01-11  6:52 UTC (permalink / raw)
  To: Stefano Stabellini, JBeulich
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, julien.grall,
	Julien Grall, xen-devel

On 11/01/2019 03:14, Stefano Stabellini wrote:
> Hi Juergen, Jan,
> 
> I spoke with Julien: we are both convinced that the unsigned long
> solution is best. But Julien also did some research and he thinks that
> Jan's version (returning pointer type) not only does not help with
> MISRA-C, but also doesn't solve the potential GCC problem either. A
> description of the GCC issue is available here:
> 
> https://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html?m=1
> 
> (Also keep in mind that Linux uses the unsigned long solution to solve
> the GCC issue, deviating from it doesn't seem wise.)
> 
> I would like to ask for a freeze exception until Monday/Tuesday next
> week when Julien will be back, and he and his team will be able to
> provide more evidence that the unsigned long solution is correct, while
> the other solution is not correct.

I'm fine with the freeze exception in this case.

Reasoning:

The functional correctness of the patches is rather easy to verify. The
main risks are:

- syntactical/semantical correctness - handled by the compiler
- MISRA-C correctness - shouldn't be worse than without the patches

So the risk for the release seems to be rather low.


Juergen


* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-10 18:46         ` Stewart Hildebrand
  2019-01-10 19:03           ` Stefano Stabellini
@ 2019-01-11 10:35           ` Jan Beulich
  2019-01-11 17:01             ` Stefano Stabellini
  1 sibling, 1 reply; 102+ messages in thread
From: Jan Beulich @ 2019-01-11 10:35 UTC (permalink / raw)
  To: Stewart Hildebrand
  Cc: Stefano Stabellini, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Julien Grall, xen-devel

>>> On 10.01.19 at 19:46, <Stewart.Hildebrand@dornerworks.com> wrote:
> On Thursday, January 10, 2019 12:30 PM, Stefano Stabellini wrote:
>> On Thu, 10 Jan 2019, Jan Beulich wrote:
>> > >>> On 10.01.19 at 03:40, <julien.grall@gmail.com> wrote:
>> > > On Wed, 9 Jan 2019, 18:43 Stefano Stabellini, <sstabellini@kernel.org>
>> > > wrote:
>> > >
>> > >> Introduce a macro, SYMBOL, which is similar to RELOC_HIDE, but it is
>> > >> meant to be used everywhere symbols such as _stext and _etext are used
>> > >> in the code. It can take an array type as a parameter, and it returns
>> > >> the same type.
>> > >>
>> > >> SYMBOL is needed when accessing symbols such as _stext and _etext
>> > >> because the C standard forbids both comparison and subtraction
>> > >> (see C Standard, 6.5.6 [ISO/IEC 9899:2011] and [1]) between pointers
>> > >> pointing to different objects. _stext, _etext, etc. are all pointers to
>> > >> different objects from an ANSI C point of view.
>> > >>
>> > >
>> > > This does not make sense because you still return a pointer and therefore
>> > > the undefined behavior is still present.
>> > >
>> > > I really don't believe this patch is going to make the MISRA tool happy.
>> >
>> > Well, till now I've been assuming that no version of this series was
>> > posted without being certain the changes achieve the goal (of
>> > making that tool happy).
>> 
>> No, Jan: unfortunately we cannot re-run the scanning tool against any
>> version of Xen we wish to :-(
>> 
>> We cannot know in advance if a set of changes will make the tool happy
>> or not.
> 
> Playing devil's advocate: even with all sorts of casting and inline assembly
> to suppress static analysis tool warnings, we're still not addressing the
> underlying rule violation. A pointer value cast or fed through some inline
> assembly at the end of the day is still a value that represents an address
> in an object. And as soon as we subtract or compare that value to one
> that represents another object we start violating the MISRA rules (this is
> my own rather strict interpretation for the purpose of making a point - 
> please feel free to disagree). 

I agree, but I don't see how to solve this. Linker-generated symbols
are - like all symbols originating outside of C files - not necessarily
following the C spec anyway. Here, the *_end labels really don't
mark any objects.

> If all we really care about is making PRQA happy, I believe it does support
> some sort of comment-based suppression. I've seen comments like
> /* PRQA S 0487 */ or /* PRQA S 0488 */ in various codebases, I'm guessing
> comments like this have something to do with suppressing these types of
> warnings.

I have to admit that I'm opposed to comments: We've got some
to please Coverity. We've got others to make certain editors
work for certain people. How many more are we going to gain?

Jan




* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-11  2:14               ` Stefano Stabellini
  2019-01-11  6:52                 ` Juergen Gross
@ 2019-01-11 10:48                 ` Jan Beulich
  2019-01-11 18:04                   ` Stefano Stabellini
  1 sibling, 1 reply; 102+ messages in thread
From: Jan Beulich @ 2019-01-11 10:48 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Julien Grall, xen-devel

>>> On 11.01.19 at 03:14, <sstabellini@kernel.org> wrote:
> Hi Juergen, Jan,
> 
> I spoke with Julien: we are both convinced that the unsigned long
> solution is best. But Julien also did some research and he thinks that
> Jan's version (returning pointer type) not only does not help with
> MISRA-C, but also doesn't solve the potential GCC problem either. A
> description of the GCC issue is available here:
> 
> https://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html?m=1

I've read through it, and besides not agreeing with some of the
author's arguments I wasn't able to spot where it tells me why/how
the suggested approach doesn't solve the problem.

> (Also keep in mind that Linux uses the unsigned long solution to solve
> the GCC issue, deviating from it doesn't seem wise.)

Which specific gcc issue (that is not solved by retaining type)?

Jan




* Re: [PATCH v6 2/4] xen/arm: use SYMBOL when required
  2019-01-10 17:44     ` Stefano Stabellini
@ 2019-01-11 10:52       ` Jan Beulich
  2019-01-11 16:58         ` Stefano Stabellini
  0 siblings, 1 reply; 102+ messages in thread
From: Jan Beulich @ 2019-01-11 10:52 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, xen-devel

>>> On 10.01.19 at 18:44, <sstabellini@kernel.org> wrote:
> On Thu, 10 Jan 2019, Jan Beulich wrote:
>> >>> On 10.01.19 at 00:42, <sstabellini@kernel.org> wrote:
>> > @@ -1138,9 +1138,10 @@ void free_init_memory(void)
>> >      for ( i = 0; i < nr; i++ )
>> >          *(p + i) = insn;
>> >  
>> > -    set_pte_flags_on_range(__init_begin, len, mg_clear);
>> > +    set_pte_flags_on_range(SYMBOL(__init_begin), len, mg_clear);
>> >      init_domheap_pages(pa, pa + len);
>> > -    printk("Freed %ldkB init memory.\n", (long)(__init_end-__init_begin)>>10);
>> > +    printk("Freed %ldkB init memory.\n",
>> > +           (long)(SYMBOL(__init_end)-SYMBOL(__init_begin))>>10);
>> 
>> I've noticed this only here, but I can't exclude I've overlooked other
>> instances: I think it would be really nice if you corrected formatting
>> at the same time (here: add the missing blanks).
> 
> OK
> 
> I tend not to do cleanups together with meaningful changes, because
> typically I find the resulting patch harder to review, but I am OK with
> doing cleanups if you, the maintainer, ask for them.

Well, I wouldn't normally ask for more involved cleanups, but formatting
ones are surely okay in general. That's the best way after all to get rid
of formatting violations without dedicated (and often voluminous)
patches.


>> > --- a/xen/arch/arm/setup.c
>> > +++ b/xen/arch/arm/setup.c
>> > @@ -772,8 +772,10 @@ void __init start_xen(unsigned long boot_phys_offset,
>> >  
>> >      /* Register Xen's load address as a boot module. */
>> >      xen_bootmodule = add_boot_module(BOOTMOD_XEN,
>> > -                             (paddr_t)(uintptr_t)(_start + boot_phys_offset),
>> > -                             (paddr_t)(uintptr_t)(_end - _start + 1), false);
>> > +                             (paddr_t)(uintptr_t)(SYMBOL(_start) +
>> > +                                                  boot_phys_offset),
>> > +                             (paddr_t)(uintptr_t)(SYMBOL(_end) -
>> > +                                                  SYMBOL(_start) + 1), false);
>> 
>> Why do you need the double casts, i.e. why does (uintptr_t) alone not
>> suffice?
> 
> The original reason was just not to change the existing code outside of
> adding SYMBOL :-)
> 
> But to answer your question, uintptr_t is the same size as char*, while
> paddr_t is always 64bit. uintptr_t casts to integer type, paddr_t casts
> to the right size. I don't think it is allowed to change from pointer to
> integer and change integer size in a single cast.

Correct, but that's not what I've been asking for. Instead I'd like
to see the (paddr_t) casts dropped, at least if this was in code I'm
the maintainer for.
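
To make the cast chain under discussion concrete, here is a minimal sketch.
It assumes a 64-bit paddr_t, modelled with a local typedef rather than Xen's
own header, and the two helper names are hypothetical. It only illustrates
that the explicit (paddr_t) cast and the implicit widening conversion yield
the same value once the pointer has been converted to uintptr_t.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stand-in for Xen's paddr_t, taken here to be 64 bits wide. */
typedef uint64_t paddr_t;

/* Pointer -> pointer-sized integer -> paddr_t, with both casts spelled out. */
paddr_t ptr_to_paddr_explicit(const char *p)
{
    assert(sizeof(uintptr_t) <= sizeof(paddr_t));
    return (paddr_t)(uintptr_t)p;
}

/* Same conversion; the uintptr_t -> paddr_t widening happens implicitly. */
paddr_t ptr_to_paddr_implicit(const char *p)
{
    return (uintptr_t)p;
}
```

Either helper could feed a function that takes paddr_t arguments; the
difference is only whether the second conversion is visible at the call site.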

Jan




* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-11  6:52                 ` Juergen Gross
@ 2019-01-11 16:52                   ` Stefano Stabellini
  0 siblings, 0 replies; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-11 16:52 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Stefano Stabellini, Stefano Stabellini, Wei Liu, Andrew Cooper,
	julien.grall, Julien Grall, JBeulich, xen-devel

On Fri, 11 Jan 2019, Juergen Gross wrote:
> On 11/01/2019 03:14, Stefano Stabellini wrote:
> > Hi Juergen, Jan,
> > 
> > I spoke with Julien: we are both convinced that the unsigned long
> > solution is best. But Julien also did some research and he thinks that
> > Jan's version (returning pointer type) not only does not help with
> > MISRA-C, but also doesn't solve the potential GCC problem either. A
> > description of the GCC issue is available here:
> > 
> > https://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html?m=1
> > 
> > (Also keep in mind that Linux uses the unsigned long solution to solve
> > the GCC issue, deviating from it doesn't seem wise.)
> > 
> > I would like to ask for a freeze exception until Monday/Tuesday next
> > week when Julien will be back, and he and his team will be able to
> > provide more evidence that the unsigned long solution is correct, while
> > the other solution is not correct.
> 
> I'm fine with the freeze exception in this case.
> 
> Reasoning:
> 
> The functional correctness of the patches is rather easy to verify. The
> main risks are:
> 
> - syntactical/semantical correctness - handled by the compiler
> - MISRA-C correctness - shouldn't be worse than without the patches
> 
> So the risk for the release seems to be rather low.

Yes, it is exactly as you say.

Thank you!

- Stefano


* Re: [PATCH v6 2/4] xen/arm: use SYMBOL when required
  2019-01-11 10:52       ` Jan Beulich
@ 2019-01-11 16:58         ` Stefano Stabellini
  2019-01-14  9:23           ` Jan Beulich
  0 siblings, 1 reply; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-11 16:58 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, Julien Grall, Stefano Stabellini,
	Stefano Stabellini, xen-devel

On Fri, 11 Jan 2019, Jan Beulich wrote:
> >> > --- a/xen/arch/arm/setup.c
> >> > +++ b/xen/arch/arm/setup.c
> >> > @@ -772,8 +772,10 @@ void __init start_xen(unsigned long boot_phys_offset,
> >> >  
> >> >      /* Register Xen's load address as a boot module. */
> >> >      xen_bootmodule = add_boot_module(BOOTMOD_XEN,
> >> > -                             (paddr_t)(uintptr_t)(_start + boot_phys_offset),
> >> > -                             (paddr_t)(uintptr_t)(_end - _start + 1), false);
> >> > +                             (paddr_t)(uintptr_t)(SYMBOL(_start) +
> >> > +                                                  boot_phys_offset),
> >> > +                             (paddr_t)(uintptr_t)(SYMBOL(_end) -
> >> > +                                                  SYMBOL(_start) + 1), false);
> >> 
> >> Why you need the double casts, i.e. why does (uintptr_t) alone not
> >> suffice?
> > 
> > The original reason was just not to change the existing code outside of
> > adding SYMBOL :-)
> > 
> > But to answer your question, uintptr_t is the same size of char*, while
> > paddr_t is always 64bit. uintptr_t casts to integer type, paddr_t casts
> > to the right size. I don't think it is allowed to change from pointer to
> > integer and change integer size in a single cast.
> 
> Correct, but that's not what I've been asking for. Instead I'd like
> to see the (paddr_t) casts dropped, at least if this was in code I'm
> the maintainer for.

But add_boot_module takes paddr_t as arguments. Why would you want the
explicit cast dropped? Just to rely on the implicit cast? This way is
clearer, but either way is fine by me BTW.


* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-11 10:35           ` Jan Beulich
@ 2019-01-11 17:01             ` Stefano Stabellini
  0 siblings, 0 replies; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-11 17:01 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Julien Grall, Stewart Hildebrand, xen-devel

On Fri, 11 Jan 2019, Jan Beulich wrote:
> > If all we really care about is making PRQA happy, I believe it does support
> > some sort of comment-based suppression. I've seen comments like
> > /* PRQA S 0487 */ or /* PRQA S 0488 */ in various codebases, I'm guessing
> > comments like this have something to do with suppressing these types of
> > warnings.
> 
> I have to admit that I'm opposed to comments: We've got some
> to please Coverity. We've got others to make certain editors to
> work for certain people. How many more are we going to gain?

You have a good point. Also, I am not a fan of tool-specific tags
(Coverity, EMACS, etc.) in the code base. However, we could consider
adding our own comment labels, not to please PRQA, but to highlight
sites that we know are violating MISRA-C for one reason or another.
Something like:

/* M3CM: Rule-18.2 violation */


* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-11 10:48                 ` Jan Beulich
@ 2019-01-11 18:04                   ` Stefano Stabellini
  2019-01-11 18:53                     ` Stewart Hildebrand
  2019-01-14 10:11                     ` Jan Beulich
  0 siblings, 2 replies; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-11 18:04 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, xen-devel

On Fri, 11 Jan 2019, Jan Beulich wrote:
> >>> On 11.01.19 at 03:14, <sstabellini@kernel.org> wrote:
> > Hi Juergen, Jan,
> > 
> > I spoke with Julien: we are both convinced that the unsigned long
> > solution is best. But Julien also did some research and he thinks that
> > Jan's version (returning pointer type) not only does not help with
> > MISRA-C, but also doesn't solve the potential GCC problem either. A
> > description of the GCC issue is available here:
> > 
> > https://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html?m=1
> 
> I've read through it, and besides not agreeing with some of the
> author's arguments I wasn't able to spot where it tells me why/how
> the suggested approach doesn't solve the problem.
> 
> > (Also keep in mind that Linux uses the unsigned long solution to solve
> > the GCC issue, deviating from it doesn't seem wise.)
> 
> Which specific gcc issue (that is not solved by retaining type)?

I am hoping Julien and his team will be able to provide the more
decisive information next week for us to make a decision, but it looks
like the issue is not clear-cut and people on the GCC list disagree on
how it should be handled.


The C standard says that "Two pointers compare equal if and only if both
are null pointers, both are pointers to the same object (including a
pointer to an object and a subobject at its beginning) or function, both
are pointers to one past the last element of the same array object, or
one is a pointer to one past the end of one array object and the other
is a pointer to the start of a different array object that happens to
immediately follow the first array object in the address space."

In short, the compiler is free to return false in a pointer comparison
if it believes that the pointers point to different non-consecutive
objects.


See this LKML message for the concrete issue:

https://lkml.org/lkml/2016/6/25/77


See this comment from this thread
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61502 because it is
enlightening:

  Just because two pointers print the same and have the same bit-pattern 
  doesn't mean they need to compare equal

Also this:

  > --- Comment #14 from Keith Thompson <Keith.S.Thompson at gmail dot com> ---
  > The C standard requires that, if y "happens to immediately follow"
  > x in the address space, then a pointer just past the end of x shall
  > compare equal to a pointer to the beginning of y (C99 and C11 6.5.9p6).
  > 
  > How could I distinguish the current behavior of gcc from the behavior
  > of a hypothetical C compiler that violates that requirement? In
  > other words, in what sense does gcc actually obey that requirement?
  
  They are not distinguishable [...]

Finally continuing down the thread there is an example from the Linux
kernel itself:

  Apparently some folks use linker scripts to get a specific arrangement of objects.
  
  A fresh example is a problem in Linux -- https://lkml.org/lkml/2016/6/25/77 . A simplified example from http://pastebin.com/4Qc6pUAA :
  
  extern int __start[];
  extern int __end[];
   
  extern void bar(int *);
   
  void foo()
  {
      for (int *x = __start; x != __end; ++x)
          bar(x);
  }
  
  This is optimized into an infinite loop by gcc 7 at -O.

 
There is also a suggested workaround on the thread that uses inline
assembly like we do and casts to int*. Overall, reading the blog post and
the thread on the GCC bugzilla, I get the idea that comparing pointers
like we do can be unreliable.
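
A minimal sketch of that kind of pointer-returning workaround follows. The
helper names are hypothetical and this is not the SYMBOL macro from the
series; it assumes a GCC-compatible compiler. The empty asm statement makes
the compiler treat the pointer value as opaque, but the loop condition is
still a pointer comparison.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: pass a pointer through an empty asm statement so the
 * compiler can no longer tell which object it was derived from. */
int *hide_int_ptr(int *p)
{
    __asm__ ("" : "+r" (p));
    return p;
}

/* Count the elements in [start, end) using hidden pointers. */
size_t count_between(int *start, int *end)
{
    size_t n = 0;
    int *stop;

    assert(start != NULL && end != NULL);
    stop = hide_int_ptr(end);

    /* Note: x != stop is still a comparison between two pointers. */
    for (int *x = hide_int_ptr(start); x != stop; ++x)
        ++n;

    return n;
}
```

In real use start and end would be linker-provided symbols such as __start
and __end; the sketch takes them as parameters so that it stays
self-contained.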

The limitation of Jan's solution is that even if we go through an assembly
indirection, we are still comparing pointers. We are opening ourselves
up to trouble. The unsigned long solution looks safer; moreover, it
puts us in the same boat as the Linux kernel, which is as good as
it gets as a guarantee that compilers won't break this behavior.
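
For comparison, a minimal sketch of the unsigned long flavour is below. The
names symbol_addr and bytes_between are hypothetical, not the definitions
used by Xen or Linux, and the sketch assumes a GCC-compatible compiler and
that unsigned long is as wide as a pointer. The point is that every
comparison or subtraction after the conversion is plain integer arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: turn an address into an opaque unsigned long.
 * Assumption: unsigned long can hold a pointer value on the target. */
unsigned long symbol_addr(const void *p)
{
    unsigned long addr;

    assert(sizeof(unsigned long) >= sizeof(void *));
    addr = (unsigned long)p;
    /* The empty asm hides where the integer came from. */
    __asm__ ("" : "+r" (addr));
    return addr;
}

/* Size of the region [start, end), computed on integers, not on pointers. */
size_t bytes_between(const char *start, const char *end)
{
    return symbol_addr(end) - symbol_addr(start);
}
```

A caller that needs a pointer back has to cast explicitly, which is the cost
of this variant at the call sites.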

With the issue so unclear, do we feel confident enough to choose the
riskier solution of the two (returning pointer type)?


* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-11 18:04                   ` Stefano Stabellini
@ 2019-01-11 18:53                     ` Stewart Hildebrand
  2019-01-11 20:35                       ` Julien Grall
  2019-01-14 10:11                     ` Jan Beulich
  1 sibling, 1 reply; 102+ messages in thread
From: Stewart Hildebrand @ 2019-01-11 18:53 UTC (permalink / raw)
  To: Stefano Stabellini, Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Julien Grall, xen-devel

On Friday, January 11, 2019 1:04 PM, Stefano Stabellini wrote: 
> On Fri, 11 Jan 2019, Jan Beulich wrote:
> > >>> On 11.01.19 at 03:14, <sstabellini@kernel.org> wrote:
> > > Hi Juergen, Jan,
> > >
> > > I spoke with Julien: we are both convinced that the unsigned long
> > > solution is best. But Julien also did some research and he thinks that
> > > Jan's version (returning pointer type) not only does not help with
> > > MISRA-C, but also doesn't solve the potential GCC problem either. A
> > > description of the GCC issue is available here:
> > >
> > > https://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html?m=1
> >
> > I've read through it, and besides not agreeing with some of the
> > author's arguments I wasn't able to spot where it tells me why/how
> > the suggested approach doesn't solve the problem.
> >
> > > (Also keep in mind that Linux uses the unsigned long solution to solve
> > > the GCC issue, deviating from it doesn't seem wise.)
> >
> > Which specific gcc issue (that is not solved by retaining type)?
> 
> I am hoping Julien and his team will be able to provide the more
> decisive information next week for us to make a decision, but it looks
> like the issue is not clear-cut and people on the GCC list disagree on
> how it should be handled.
> 
> 
> The C standard says that "Two pointers compare equal if and only if both
> are null pointers, both are pointers to the same object (including a
> pointer to an object and a subobject at its beginning) or function, both
> are pointers to one past the last element of the same array object, or
> one is a pointer to one past the end of one array object and the other
> is a pointer to the start of a different array object that happens to
> immediately follow the first array object in the address space."
> 
> In short, the compiler is free to return false in a pointer comparison
> if it believes that the pointers point to different non-consecutive
> object.
> 
> 
> See this LKML message for the concrete issue:
> 
> https://lkml.org/lkml/2016/6/25/77
> 
> 
> See this comment from this thread
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61502 because it is
> enlightening:
> 
>   Just because two pointers print the same and have the same bit-pattern
>   doesn't mean they need to compare equal
> 
> Also this:
> 
>   > --- Comment #14 from Keith Thompson <Keith.S.Thompson at gmail dot com> ---
>   > The C standard requires that, if y "happens to immediately follow"
>   > x in the address space, then a pointer just past the end of x shall
>   > compare equal to a pointer to the beginning of y (C99 and C11 6.5.9p6).
>   >
>   > How could I distinguish the current behavior of gcc from the behavior
>   > of a hypothetical C compiler that violates that requirement? In
>   > other words, in what sense does gcc actually obey that requirement?
> 
>   They are not distinguishable [...]
> 
> Finally continuing down the thread there is an example from the Linux
> kernel itself:
> 
>   Apparently some folks use linker scripts to get a specific arrangement of objects.
> 
>   A fresh example is a problem in Linux -- https://lkml.org/lkml/2016/6/25/77 .
>   A simplified example from http://pastebin.com/4Qc6pUAA :
> 
>   extern int __start[];
>   extern int __end[];
> 
>   extern void bar(int *);
> 
>   void foo()
>   {
>       for (int *x = __start; x != __end; ++x)
>           bar(x);
>   }
> 
>   This is optimized into an infinite loop by gcc 7 at -O.
> 
> 
> There is also a suggested workaround on the thread that uses inline
> assembly like we do and casts to int*. Overall, reading the blog post and
> the thread on the GCC bugzilla, I get the idea that comparing pointers
> like we do can be unreliable.
> 
> The limitation of Jan's solution is that even if we go through an assembly
> indirection, we are still comparing pointers. We are opening ourselves
> up to trouble. The unsigned long solution looks safer; moreover, it
> puts us on the same bandwagon as the Linux kernel, which is as good as
> it gets as a guarantee that compilers won't break this behavior.
> 
> With the issue so unclear, do we feel confident enough to choose the
> more risky solution of the two (returning pointer type)?
> 

NO! Definitely not.

The issue seems to be that we are interpreting the linker-generated values
as pointer types, and then going on to do comparisons, subtractions, etc. on
those values. It seems that GCC's idea of what comprises a "pointer to an
object" does not take linker-generated values into account, then goes on to
make the assumption that two linker-provided values are "pointers to
different objects". Given the ambiguity of the C standard, one could probably
successfully argue that GCC did nothing wrong. I would argue that we are
relying on undefined behavior in a strict interpretation of the C standard.

The important message to take away from MISRA is that use of or reliance on
implementation-defined behavior should be understood and documented. In
fact that's the very first MISRA rule: Directive 1.1.

However, it would be even better to avoid having to rely on
implementation-defined behavior in the first place...
So here's a radical idea:

Why don't we change the type of _start so it's not a pointer type?
The MISRA rules in question (18.1 - 18.4) only pertain to pointer types,
so if _start isn't a pointer type, it should silence PRQA. Also, it seems like the
majority of references to _start/_end/etc. don't actually dereference the
value (i.e. they don't use the value as a pointer). For the cases where we
do need to dereference it (i.e. actually use it as a pointer), we could
introduce an explicit cast, which would also serve as a hint for "yes, I
really do know what I'm doing in regards to the whole pointers to different
objects issue".

Or as an alternative to the cast, introduce a new union type, with one member
as a sufficient-width integer type (for doing arithmetic, comparisons, etc.) and
another member as a pointer type. This would explicitly force us to consider
how exactly the value is being used each time it's referenced and choose the
appropriate interpretation.
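
A minimal sketch of what such a union might look like, assuming <stdint.h>
is available. The names sym_addr_t, sym_from_ptr and sym_in_range are made
up for illustration and are not existing Xen identifiers:

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical type: one linker-provided address, two explicit views. */
typedef union {
    uintptr_t val;   /* integer view: comparisons and arithmetic */
    char *ptr;       /* pointer view: only for actual dereferences */
} sym_addr_t;

/* Build the union from a pointer; the cast documents the conversion. */
static inline sym_addr_t sym_from_ptr(const void *p)
{
    sym_addr_t s;

    s.val = (uintptr_t)p;
    return s;
}

/* Range check done purely on the integer view: start <= p < end. */
static inline bool sym_in_range(sym_addr_t start, sym_addr_t end,
                                const void *p)
{
    uintptr_t v = (uintptr_t)p;

    return v >= start.val && v < end.val;
}
```

Note that reading .ptr after writing .val is type punning, and the result
depends on how the implementation represents pointers, so that use would
itself need to be documented.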

Stew
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-11 18:53                     ` Stewart Hildebrand
@ 2019-01-11 20:35                       ` Julien Grall
  2019-01-11 20:46                         ` Stewart Hildebrand
  0 siblings, 1 reply; 102+ messages in thread
From: Julien Grall @ 2019-01-11 20:35 UTC (permalink / raw)
  To: Stewart Hildebrand
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Jan Beulich, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 222 bytes --]

On Fri, 11 Jan 2019, 12:53 Stewart Hildebrand, <
Stewart.Hildebrand@dornerworks.com> wrote:

>
> Why don't we change the type of _start so it's not a pointer type?


Can you suggest a type that would be suitable?

Cheers,



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-11 20:35                       ` Julien Grall
@ 2019-01-11 20:46                         ` Stewart Hildebrand
  2019-01-11 21:37                           ` Stefano Stabellini
  0 siblings, 1 reply; 102+ messages in thread
From: Stewart Hildebrand @ 2019-01-11 20:46 UTC (permalink / raw)
  To: Julien Grall
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Jan Beulich, xen-devel

On Friday, January 11, 2019 3:36 PM, Julien Grall wrote:
> On Fri, 11 Jan 2019, 12:53 Stewart Hildebrand wrote:
> >
> > Why don't we change the type of _start so it's not a pointer type?
>
> Can you suggest a type that would be suitable?
>
> Cheers,

Yes. My opinion is that the "sufficient-width integer type" should be a
"uintptr_t" or "intptr_t", since those types by definition are *integer* types
wide enough to hold a value converted from a void pointer. While "unsigned
long" seems to work for Linux, the definition of that type doesn't provide the
same guarantee. Since uintptr_t is an *integer* type by definition (and not a
pointer type), my interpretation of the C standard is that
subtraction/comparison of uintptr_t types won't be subject to the potential
"pointer to object" issues in question.

If I had to choose between "uintptr_t" or "intptr_t" I guess I would choose
"uintptr_t" since that type is already used in various places in the Xen
codebase. And the Linux workaround is also using an unsigned integer type.
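
To make the idea concrete, here is a minimal sketch of a walk over a
region whose bounds are held as uintptr_t values, so the termination test
is an integer comparison rather than a pointer comparison. The function
name count_words is an assumption for illustration only, and the caller is
assumed to have obtained the two bounds by converting the start and end
addresses to uintptr_t:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count the int-sized slots in [start, end). The bounds are integers,
 * so the loop condition is an integer comparison, not a pointer one.
 */
static size_t count_words(uintptr_t start, uintptr_t end)
{
    size_t n = 0;

    for ( uintptr_t a = start; a < end; a += sizeof(int) )
        n++;

    return n;
}
```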

Stew

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-11 20:46                         ` Stewart Hildebrand
@ 2019-01-11 21:37                           ` Stefano Stabellini
  2019-01-14  3:45                             ` Stewart Hildebrand
  2019-01-15 11:46                             ` Julien Grall
  0 siblings, 2 replies; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-11 21:37 UTC (permalink / raw)
  To: Stewart Hildebrand
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Jan Beulich,
	xen-devel

On Fri, 11 Jan 2019, Stewart Hildebrand wrote:
> On Friday, January 11, 2019 3:36 PM, Julien Grall wrote:
> > On Fri, 11 Jan 2019, 12:53 Stewart Hildebrand wrote:
> > >
> > > Why don't we change the type of _start so it's not a pointer type?
> >
> > Can you suggest a type that would be suitable?
> >
> > Cheers,
> 
> Yes. My opinion is that the "sufficient-width integer type" should be a
> "uintptr_t" or "intptr_t", since those types by definition are *integer* types
> wide enough to hold a value converted from a void pointer. While "unsigned
> long" seems to work for Linux, the definition of that type doesn't provide the
> same guarantee. Since uintptr_t is an *integer* type by definition (and not a
> pointer type), my interpretation of the C standard is that
> subtraction/comparison of uintptr_t types won't be subject to the potential
> "pointer to object" issues in question.
> 
> If I had to choose between "uintptr_t" or "intptr_t" I guess I would choose
> "uintptr_t" since that type is already used in various places in the Xen
> codebase. And the Linux workaround is also using an unsigned integer type.

On changing the type of _start & friends: we cannot declare _start as
uintptr_t, because the linker won't be able to set the value. It needs to
be an array type. At that point it is basically a pointer, and it doesn't
matter whether it is a char[] or a uintptr_t[]. It won't help.

But instead of converting _start to unsigned long via SYMBOL_HIDE, we
could convert it to uintptr_t; it would be a trivial change on top of the
existing unsigned long series. I am not sure whether it is beneficial.
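
For reference, a rough sketch of that variant. SYMBOL_HIDE itself is
defined earlier in the series and is not reproduced here; the macro below,
HIDE_TO_UINTPTR, is a hypothetical stand-in modelled on the empty-asm
trick, and it relies on GCC/Clang extensions (statement expressions and
extended asm). region_size is likewise a made-up helper:

```c
#include <stdint.h>

/*
 * Hypothetical stand-in, not the series' SYMBOL_HIDE: convert the address
 * to an integer, then pass it through an empty asm so the compiler cannot
 * see where the value came from.
 */
#define HIDE_TO_UINTPTR(sym) ({                 \
    uintptr_t v_ = (uintptr_t)(sym);            \
    __asm__ ( "" : "+r" (v_) );                 \
    v_;                                         \
})

/* Size of a region delimited by two addresses, computed on integers. */
static inline uintptr_t region_size(const char *start, const char *end)
{
    return HIDE_TO_UINTPTR(end) - HIDE_TO_UINTPTR(start);
}
```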


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-11 21:37                           ` Stefano Stabellini
@ 2019-01-14  3:45                             ` Stewart Hildebrand
  2019-01-14 10:26                               ` Jan Beulich
  2019-01-15 11:46                             ` Julien Grall
  1 sibling, 1 reply; 102+ messages in thread
From: Stewart Hildebrand @ 2019-01-14  3:45 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Julien Grall, Jan Beulich, xen-devel

On Friday, January 11, 2019 4:38 PM, Stefano Stabellini wrote:
> On Fri, 11 Jan 2019, Stewart Hildebrand wrote:
> > On Friday, January 11, 2019 3:36 PM, Julien Grall wrote:
> > > On Fri, 11 Jan 2019, 12:53 Stewart Hildebrand wrote:
> > > >
> > > > Why don't we change the type of _start so it's not a pointer type?
> > >
> > > Can you suggest a type that would be suitable?
> > >
> > > Cheers,
> >
> > Yes. My opinion is that the "sufficient-width integer type" should be a
> > "uintptr_t" or "intptr_t", since those types by definition are *integer* types
> > wide enough to hold a value converted from a void pointer. While "unsigned
> > long" seems to work for Linux, the definition of that type doesn't provide the
> > same guarantee. Since uintptr_t is an *integer* type by definition (and not a
> > pointer type), my interpretation of the C standard is that
> > subtraction/comparison of uintptr_t types won't be subject to the potential
> > "pointer to object" issues in question.
> >
> > If I had to choose between "uintptr_t" or "intptr_t" I guess I would choose
> > "uintptr_t" since that type is already used in various places in the Xen
> > codebase. And the Linux workaround is also using an unsigned integer type.
> 
> On changing type of _start & friends: we cannot declare _start as
> uintptr_t, the linker won't be able to set the value. It needs to be an
> array type. At that point, it is basically a pointer, it doesn't matter
> if it is a char[] or uintptr_t[]. It won't help.

Ah, I see. OK. See [1] for further explanation of why this is the case. So
I guess we'll just have to work around that.

I don't mean a uintptr_t[], because that's an array/pointer type. I mean
"uintptr_t" the integer type. I recognize that there are risks of going
from a pointer type to an integer type, since any operations that relied
on pointer arithmetic have to be changed to account for integer
arithmetic.

So let's keep the linker-accessible variable as a type that works for the
linker (which really could be anything as long as you use the address, not
the value), but name it something else - a name that screams "DON'T USE ME
UNLESS YOU KNOW WHAT YOU'RE DOING". And then before the first use, copy
that value to "uintptr_t _start;".

The following is a quick proof of concept for aarch64. I changed the type
of _start and _end, and added code to copy the linker-assigned value to
_start and _end. Upon booting, I see the correct values:

(XEN) sizeof(uintptr_t): 8
(XEN) _start: 0x00200000
(XEN) _end:   0x00320d80

(please keep reading after the patch)

From: Stewart Hildebrand <stewart.hildebrand@dornerworks.com>
Date: Sun, 13 Jan 2019 21:10:43 -0500
Subject: [PATCH RFC] Proof of concept: change _start/_end to uintptr_t

Signed-off-by: Stewart Hildebrand <stewart.hildebrand@dornerworks.com>

---
 xen/include/xen/kernel.h |  3 ++-
 xen/arch/arm/xen.lds.S   |  4 ++--
 xen/arch/arm/setup.c     | 17 +++++++++++++++--
 xen/arch/arm/mm.c        |  4 ++--
 xen/Rules.mk             |  2 +-
 5 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/xen/include/xen/kernel.h b/xen/include/xen/kernel.h
index 548b64da9f..ec7d10bb75 100644
--- a/xen/include/xen/kernel.h
+++ b/xen/include/xen/kernel.h
@@ -65,7 +65,8 @@
 	1;                                      \
 })
 
-extern char _start[], _end[], start[];
+extern uintptr_t _start, _end;
+extern char start[];
 #define is_kernel(p) ({                         \
     char *__p = (char *)(unsigned long)(p);     \
     (__p >= _start) && (__p < _end);            \
diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S
index 1e72906477..c837dd534a 100644
--- a/xen/arch/arm/xen.lds.S
+++ b/xen/arch/arm/xen.lds.S
@@ -28,7 +28,7 @@ PHDRS
 SECTIONS
 {
   . = XEN_VIRT_START;
-  _start = .;
+  _start_linker_assigned_dont_use_me = .;
   .text : {
         _stext = .;            /* Text section */
        *(.text)
@@ -205,7 +205,7 @@ SECTIONS
        __per_cpu_data_end = .;
        __bss_end = .;
   } :text
-  _end = . ;
+  _end_linker_assigned_dont_use_me = . ;
 
 #ifdef CONFIG_DTB_FILE
   /* Section for the device tree blob (if any). */
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 444857a967..fe17a86384 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -726,6 +726,12 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
 
 size_t __read_mostly dcache_line_bytes;
 
+typedef char TYPE_DOESNT_MATTER;
+extern TYPE_DOESNT_MATTER _start_linker_assigned_dont_use_me,
+                          _end_linker_assigned_dont_use_me;
+
+uintptr_t _start, _end;
+
 /* C entry point for boot CPU */
 void __init start_xen(unsigned long boot_phys_offset,
                       unsigned long fdt_paddr,
@@ -770,10 +776,17 @@ void __init start_xen(unsigned long boot_phys_offset,
     printk("Command line: %s\n", cmdline);
     cmdline_parse(cmdline);
 
+    _start = (uintptr_t)&_start_linker_assigned_dont_use_me;
+    _end = (uintptr_t)&_end_linker_assigned_dont_use_me;
+
+    printk("sizeof(uintptr_t): %zu\n", sizeof(uintptr_t));
+    printk("_start: 0x%08" PRIxPTR "\n", _start);
+    printk("_end:   0x%08" PRIxPTR "\n", _end);
+
     /* Register Xen's load address as a boot module. */
     xen_bootmodule = add_boot_module(BOOTMOD_XEN,
-                             (paddr_t)(uintptr_t)(_start + boot_phys_offset),
-                             (paddr_t)(uintptr_t)(_end - _start + 1), false);
+        (paddr_t)(uintptr_t)(_start + boot_phys_offset * sizeof(char*)),
+        (paddr_t)(uintptr_t)(_end - _start + sizeof(char*)), false);
     BUG_ON(!xen_bootmodule);
 
     setup_pagetables(boot_phys_offset);
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 01ae2cccc0..b4fd0381d1 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1084,8 +1084,8 @@ static void set_pte_flags_on_range(const char *p, unsigned long l, enum mg mg)
     ASSERT(!((unsigned long) p & ~PAGE_MASK));
     ASSERT(!(l & ~PAGE_MASK));
 
-    for ( i = (p - _start) / PAGE_SIZE; 
-          i < (p + l - _start) / PAGE_SIZE; 
+    for ( i = (p - (const char *)_start) / PAGE_SIZE;
+          i < (p + l - (const char *)_start) / PAGE_SIZE;
           i++ )
     {
         pte = xen_xenmap[i];
diff --git a/xen/Rules.mk b/xen/Rules.mk
index a151b3f625..a05ceec1e5 100644
--- a/xen/Rules.mk
+++ b/xen/Rules.mk
@@ -54,7 +54,7 @@ CFLAGS += -fomit-frame-pointer
 endif
 
 CFLAGS += -nostdinc -fno-builtin -fno-common
-CFLAGS += -Werror -Wredundant-decls -Wno-pointer-arith
+CFLAGS += -Wredundant-decls -Wno-pointer-arith
 $(call cc-option-add,CFLAGS,CC,-Wvla)
 CFLAGS += -pipe -D__XEN__ -include $(BASEDIR)/include/xen/config.h
 CFLAGS-$(CONFIG_DEBUG_INFO) += -g
-- 
2.17.1



I'm not sure if start_xen() is the first actual use of _start/_end, but it
seemed good enough to verify that _start and _end were being assigned
properly. I removed -Werror due to "comparison between pointer and
integer" warnings. Additionally, since this is a switch from pointer
arithmetic to integer arithmetic, we need to add
"* sizeof(some_pointer_type)" in a few places. I only added this in one
place for the proof of concept, so as you might expect, it didn't finish
booting.
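
A sketch of how the two kinds of arithmetic differ, with made-up function
names: pointer subtraction yields a number of elements, whereas subtracting
the corresponding uintptr_t values yields a number of bytes on the usual
flat-address implementations. The ratio between the two is the size of the
pointed-to type, so for char data it is 1:

```c
#include <stddef.h>
#include <stdint.h>

/* Pointer arithmetic: the result is a number of elements. */
static ptrdiff_t elems_between(const int *lo, const int *hi)
{
    return hi - lo;
}

/* Integer arithmetic on the same addresses: the result is in bytes. */
static uintptr_t bytes_between(const int *lo, const int *hi)
{
    return (uintptr_t)hi - (uintptr_t)lo;
}
```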

Does it make sense to change the type of all variables that could be
considered "pointers to different objects"? If we intend to do any sort of
subtraction/comparison between them (i.e. violate MISRA rules and venture
into the territory of undefined behavior), then yes, they should all be
changed. I believe that's a small price to pay to take a step toward MISRA
conformance and not risk the compiler making certain optimizations that
could break the code.

> 
> But, instead of converting _start to unsigned long via SYMBOL_HIDE, we
> could convert it to uintptr_t instead, it would be a trivial change on
> top of the existing unsigned long series. Not sure if it is beneficial.

The difference is whether we want to rely on implementation-defined
behavior or not; in this case, on whether "unsigned long" is wide enough
to hold a pointer value. I recognize that in most implementations and
architectures it is, but it's still not strictly guaranteed by the
definition of "unsigned long" the way it is for "uintptr_t".

Lastly, also possibly of interest, while playing around with the proof of
concept, I did also manage to get GCC 7.3 to emit this warning (by
removing the "extern" declaration and adding back the array declaration):
setup.c:731:27: warning: array ‘_end_linker_assigned_dont_use_me’ assumed to have one element
                           _end_linker_assigned_dont_use_me[];
                           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Stew

[1] http://docs.adacore.com/live/wave/binutils-stable/html/ld/ld.html#Source-Code-Reference

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 2/4] xen/arm: use SYMBOL when required
  2019-01-11 16:58         ` Stefano Stabellini
@ 2019-01-14  9:23           ` Jan Beulich
  0 siblings, 0 replies; 102+ messages in thread
From: Jan Beulich @ 2019-01-14  9:23 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, xen-devel

>>> On 11.01.19 at 17:58, <sstabellini@kernel.org> wrote:
> On Fri, 11 Jan 2019, Jan Beulich wrote:
>> >> > --- a/xen/arch/arm/setup.c
>> >> > +++ b/xen/arch/arm/setup.c
>> >> > @@ -772,8 +772,10 @@ void __init start_xen(unsigned long boot_phys_offset,
>> >> >  
>> >> >      /* Register Xen's load address as a boot module. */
>> >> >      xen_bootmodule = add_boot_module(BOOTMOD_XEN,
>> >> > -                             (paddr_t)(uintptr_t)(_start + boot_phys_offset),
>> >> > -                             (paddr_t)(uintptr_t)(_end - _start + 1), false);
>> >> > +                             (paddr_t)(uintptr_t)(SYMBOL(_start) +
>> >> > +                                                  boot_phys_offset),
>> >> > +                             (paddr_t)(uintptr_t)(SYMBOL(_end) -
>> >> > +                                                  SYMBOL(_start) + 1), false);
>> >> 
>> >> Why you need the double casts, i.e. why does (uintptr_t) alone not
>> >> suffice?
>> > 
>> > The original reason was just not to change the existing code outside of
>> > adding SYMBOL :-)
>> > 
>> > But to answer your question, uintptr_t is the same size as char*, while
>> > paddr_t is always 64bit. uintptr_t casts to integer type, paddr_t casts
>> > to the right size. I don't think it is allowed to change from pointer to
>> > integer and change integer size in a single cast.
>> 
>> Correct, but that's not what I've been asking for. Instead I'd like
>> to see the (paddr_t) casts dropped, at least if this was in code I'm
>> the maintainer for.
> 
> But add_boot_module takes paddr_t as arguments. Why would you want the
> explicit cast dropped?

Yes. There are very, very many places where we assume the compiler
to do the right thing when changing width of integer types. I simply
see no reason why we would want to diverge here. If the compiler
warned about truncating conversions (like some others do), then
there would be a reason to maintain explicit down-casts, but here it's
an up-cast in all cases afaict, and iirc there's no undefined behavior,
implementation defined behavior, or the like associated with widening of
integer types.
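
A minimal sketch of the point, assuming paddr_t is a 64-bit unsigned type
as stated above; take_paddr is a made-up stand-in for a function such as
add_boot_module that takes a paddr_t argument:

```c
#include <stdint.h>

typedef uint64_t paddr_t;   /* assumption: 64-bit, as described above */

/* Made-up callee that takes a paddr_t parameter. */
static paddr_t take_paddr(paddr_t pa)
{
    return pa;
}

/* The uintptr_t argument is converted to paddr_t implicitly; no cast. */
static paddr_t pass_address(const void *p)
{
    return take_paddr((uintptr_t)p);
}
```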

Jan




^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-11 18:04                   ` Stefano Stabellini
  2019-01-11 18:53                     ` Stewart Hildebrand
@ 2019-01-14 10:11                     ` Jan Beulich
  2019-01-14 15:41                       ` Julien Grall
  1 sibling, 1 reply; 102+ messages in thread
From: Jan Beulich @ 2019-01-14 10:11 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Julien Grall, xen-devel

>>> On 11.01.19 at 19:04, <sstabellini@kernel.org> wrote:
> On Fri, 11 Jan 2019, Jan Beulich wrote:
>> >>> On 11.01.19 at 03:14, <sstabellini@kernel.org> wrote:
>> > Hi Juergen, Jan,
>> > 
>> > I spoke with Julien: we are both convinced that the unsigned long
>> > solution is best. But Julien also did some research and he thinks that
>> > Jan's version (returning pointer type) not only does not help with
>> > MISRA-C, but also doesn't solve the potential GCC problem either. A
>> > description of the GCC issue is available here:
>> > 
>> > 
>> > https://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html?m=1
>> 
>> I've read through it, and besides not agreeing with some of the
>> author's arguments I wasn't able to spot where it tells me why/how
>> the suggested approach doesn't solve the problem.
>> 
>> > (Also keep in mind that Linux uses the unsigned long solution to solve
>> > the GCC issue, deviating from it doesn't seem wise.)
>> 
>> Which specific gcc issue (that is not solved by retaining type)?
> 
> I am hoping Julien and his team will be able to provide the more
> decisive information next week for us to make a decision, but it looks
> like the issue is not clear-cut and people on the GCC list disagree on
> how it should be handled.
> 
> 
> The C standard says that "Two pointers compare equal if and only if both
> are null pointers, both are pointers to the same object (including a
> pointer to an object and a subobject at its beginning) or function, both
> are pointers to one past the last element of the same array object, or
> one is a pointer to one past the end of one array object and the other
> is a pointer to the start of a different array object that happens to
> immediately follow the first array object in the address space."
> 
> In short, the compiler is free to return false in a pointer comparison
> if it believes that the pointers point to different non-consecutive
> objects.

And it is this "it believes" which we undermine with the construct:
As long as the compiler can't prove two pointers point to different
objects, it can't eliminate the actual comparison. As soon as the
actual comparison is in place, we're fine code-wise. Whether that's
also fine MISRA-wise is a different thing, but as said before I
question the validity of demanding C standard compliance of
constructs originating from other than C (and perhaps not even
expressible in standard compliant C).
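
A sketch of the kind of construct meant here, under a hypothetical macro
name HIDE_PTR (the actual SYMBOL() definition is in patch 1 of the series
and is not reproduced): the pointer is passed through an empty asm, so the
compiler is not told which object the result points to, while the type is
retained. It uses GCC/Clang extensions and expects a pointer argument, not
an array; count_ints is a made-up helper:

```c
/* Hypothetical: launder a pointer through an empty asm, keeping its type. */
#define HIDE_PTR(p) ({                  \
    __typeof__(p) p_ = (p);             \
    __asm__ ( "" : "+r" (p_) );         \
    p_;                                 \
})

/* Count the elements in [start, end) using the laundered pointers. */
static inline unsigned long count_ints(int *start, int *end)
{
    unsigned long n = 0;

    for ( int *x = HIDE_PTR(start); x != HIDE_PTR(end); ++x )
        n++;

    return n;
}
```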

Jan




^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-14  3:45                             ` Stewart Hildebrand
@ 2019-01-14 10:26                               ` Jan Beulich
  2019-01-14 21:18                                 ` Stefano Stabellini
       [not found]                                 ` <7A8C0A4F020000EEB8D7C7D4@prv1-mh.provo.novell.com>
  0 siblings, 2 replies; 102+ messages in thread
From: Jan Beulich @ 2019-01-14 10:26 UTC (permalink / raw)
  To: Stewart Hildebrand, Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Julien Grall, xen-devel

>>> On 14.01.19 at 04:45, <Stewart.Hildebrand@dornerworks.com> wrote:
> So let's keep the linker-accessible variable as a type that works for the
> linker (which really could be anything as long as you use the address, not
> the value), but name it something else - a name that screams "DON'T USE ME
> UNLESS YOU KNOW WHAT YOU'RE DOING". And then before the first use, copy
> that value to "uintptr_t _start;".
> 
> The following is a quick proof of concept for aarch64. I changed the type
> of _start and _end, and added code to copy the linker-assigned value to
> _start and _end. Upon booting, I see the correct values:

Global symbols starting with underscores should already be shouting
enough. But what's worse - the whole idea of using array types is to
avoid the intermediate variables.

> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -726,6 +726,12 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
>  
>  size_t __read_mostly dcache_line_bytes;
>  
> +typedef char TYPE_DOESNT_MATTER;
> +extern TYPE_DOESNT_MATTER _start_linker_assigned_dont_use_me,
> +                          _end_linker_assigned_dont_use_me;

This and ...

> @@ -770,10 +776,17 @@ void __init start_xen(unsigned long boot_phys_offset,
>      printk("Command line: %s\n", cmdline);
>      cmdline_parse(cmdline);
>  
> +    _start = (uintptr_t)&_start_linker_assigned_dont_use_me;
> +    _end = (uintptr_t)&_end_linker_assigned_dont_use_me;

... this violates what the symbol names say. And if you want to
avoid issues, you'd want to keep out of C files uses of those
symbols altogether anyway, and you easily can: In any
assembly file, have

_start:	.long _start_linker_assigned_dont_use_me
_end:	.long _end_linker_assigned_dont_use_me

In particular, they don't need to be runtime initialized, saving
you from needing to set them before first use. But as said -
things are the way they are precisely to avoid such variables.

> --- a/xen/Rules.mk
> +++ b/xen/Rules.mk
> @@ -54,7 +54,7 @@ CFLAGS += -fomit-frame-pointer
>  endif
>  
>  CFLAGS += -nostdinc -fno-builtin -fno-common
> -CFLAGS += -Werror -Wredundant-decls -Wno-pointer-arith
> +CFLAGS += -Wredundant-decls -Wno-pointer-arith

This I would consider bad even in a PoC. If you make a change
which causes compiler warnings, surely you now violate something
else.

>> But, instead of converting _start to unsigned long via SYMBOL_HIDE, we
>> could convert it to uintptr_t instead, it would be a trivial change on
>> top of the existing unsigned long series. Not sure if it is beneficial.
> 
> The difference would be whether we want to rely on implementation-defined
> behavior or not.

Why not? Simply specify that compilers with implementation defined
behavior not matching our expectations are unsuitable. And btw, I
suppose this is just the tiny tip of the iceberg of our reliance on
implementation defined behavior.

> In this case, whether "unsigned long" is wide enough to
> hold a pointer value or not.

This is a basic assumption of UNIX and its derivatives, afaik.
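
Such an expectation can be written down so that an unsuitable compiler is
rejected at build time. A minimal sketch, assuming C11 _Static_assert is
available (a BUILD_BUG_ON-style macro would serve the same purpose);
via_ulong is a made-up helper:

```c
/* Refuse to build where unsigned long cannot hold a pointer value. */
_Static_assert(sizeof(unsigned long) >= sizeof(void *),
               "unsigned long is too narrow to hold a pointer");

/*
 * With the check above in place, the round trip through unsigned long
 * is expected to give back the original pointer.
 */
static inline void *via_ulong(void *p)
{
    unsigned long v = (unsigned long)p;

    return (void *)v;
}
```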

Jan




^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-14 10:11                     ` Jan Beulich
@ 2019-01-14 15:41                       ` Julien Grall
  2019-01-14 15:52                         ` Jan Beulich
  0 siblings, 1 reply; 102+ messages in thread
From: Julien Grall @ 2019-01-14 15:41 UTC (permalink / raw)
  To: Jan Beulich, Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, xen-devel

Hi Jan,

On 14/01/2019 10:11, Jan Beulich wrote:
>>>> On 11.01.19 at 19:04, <sstabellini@kernel.org> wrote:
>> On Fri, 11 Jan 2019, Jan Beulich wrote:
>>>>>> On 11.01.19 at 03:14, <sstabellini@kernel.org> wrote:
>>>> Hi Juergen, Jan,
>>>>
>>>> I spoke with Julien: we are both convinced that the unsigned long
>>>> solution is best. But Julien also did some research and he thinks that
>>>> Jan's version (returning pointer type) not only does not help with
>>>> MISRA-C, but also doesn't solve the potential GCC problem either. A
>>>> description of the GCC issue is available here:
>>>>
>>>>
>>>> https://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html?m=1
>>>
>>> I've read through it, and besides not agreeing with some of the
>>> author's arguments I wasn't able to spot where it tells me why/how
>>> the suggested approach doesn't solve the problem.
>>>
>>>> (Also keep in mind that Linux uses the unsigned long solution to solve
>>>> the GCC issue, deviating from it doesn't seem wise.)
>>>
>>> Which specific gcc issue (that is not solved by retaining type)?
>>
>> I am hoping Julien and his team will be able to provide the more
>> decisive information next week for us to make a decision, but it looks
>> like the issue is not clear-cut and people on the GCC list disagree on
>> how it should be handled.
>>
>>
>> The C standard says that "Two pointers compare equal if and only if both
>> are null pointers, both are pointers to the same object (including a
>> pointer to an object and a subobject at its beginning) or function, both
>> are pointers to one past the last element of the same array object, or
>> one is a pointer to one past the end of one array object and the other
>> is a pointer to the start of a different array object that happens to
>> immediately follow the first array object in the address space."
>>
>> In short, the compiler is free to return false in a pointer comparison
>> if it believes that the pointers point to different non-consecutive
>> objects.
> 
> And it is this "it believes" which we undermine with the construct:
> As long as the compiler can't prove two pointers point to different
> objects, it can't eliminate the actual comparison.

May I ask where this comes from? A compiler could technically be free to
assume the inverse, i.e. as long as it can't prove two pointers point to
different objects, it can rely on the undefined behavior to optimize the
comparison.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-14 15:41                       ` Julien Grall
@ 2019-01-14 15:52                         ` Jan Beulich
  2019-01-14 16:26                           ` Stewart Hildebrand
  2019-01-14 16:28                           ` Julien Grall
  0 siblings, 2 replies; 102+ messages in thread
From: Jan Beulich @ 2019-01-14 15:52 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, xen-devel

>>> On 14.01.19 at 16:41, <julien.grall@arm.com> wrote:
> Hi Jan,
> 
> On 14/01/2019 10:11, Jan Beulich wrote:
>>>>> On 11.01.19 at 19:04, <sstabellini@kernel.org> wrote:
>>> On Fri, 11 Jan 2019, Jan Beulich wrote:
>>>>>>> On 11.01.19 at 03:14, <sstabellini@kernel.org> wrote:
>>>>> Hi Juergen, Jan,
>>>>>
>>>>> I spoke with Julien: we are both convinced that the unsigned long
>>>>> solution is best. But Julien also did some research and he thinks that
>>>>> Jan's version (returning pointer type) not only does not help with
>>>>> MISRA-C, but also doesn't solve the potential GCC problem either. A
>>>>> description of the GCC issue is available here:
>>>>>
>>>>>
>>>>> https://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html?m=1
>>>>
>>>> I've read through it, and besides not agreeing with some of the
>>>> author's arguments I wasn't able to spot where it tells me why/how
>>>> the suggested approach doesn't solve the problem.
>>>>
>>>>> (Also keep in mind that Linux uses the unsigned long solution to solve
>>>>> the GCC issue, deviating from it doesn't seem wise.)
>>>>
>>>> Which specific gcc issue (that is not solved by retaining type)?
>>>
>>> I am hoping Julien and his team will be able to provide the more
>>> decisive information next week for us to make a decision, but it looks
>>> like the issue is not clear-cut and people on the GCC list disagree on
>>> how it should be handled.
>>>
>>>
>>> The C standard says that "Two pointers compare equal if and only if both
>>> are null pointers, both are pointers to the same object (including a
>>> pointer to an object and a subobject at its beginning) or function, both
>>> are pointers to one past the last element of the same array object, or
>>> one is a pointer to one past the end of one array object and the other
>>> is a pointer to the start of a different array object that happens to
>>> immediately follow the first array object in the address space."
>>>
>>> In short, the compiler is free to return false in a pointer comparison
>>> if it believes that the pointers point to different non-consecutive
>>> object.
>> 
>> And it is this "it believes" which we undermine with the construct:
>> As long as the compiler can't prove two pointers point to different
>> objects, it can't eliminate the actual comparison.
> 
> May I ask where does this come from? A compiler could technically be free to 
> assume the inverse. I.e as long as it can't prove two pointers point to 
> different objects, it can rely on the undefined behavior to optimize it.

No. As long as there's a chance that both pointers point to the same
object, it can't do bad things, because _if_ they do, the result of the
comparison has to be correct (as per the text still quoted above).

Jan




^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-14 15:52                         ` Jan Beulich
@ 2019-01-14 16:26                           ` Stewart Hildebrand
  2019-01-14 16:39                             ` Jan Beulich
  2019-01-14 16:28                           ` Julien Grall
  1 sibling, 1 reply; 102+ messages in thread
From: Stewart Hildebrand @ 2019-01-14 16:26 UTC (permalink / raw)
  To: Jan Beulich, Julien Grall, Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, xen-devel

On Monday, January 14, 2019 10:53 AM, Jan Beulich wrote:
> > Hi Jan,
> >
> > On 14/01/2019 10:11, Jan Beulich wrote:
> >>>>> On 11.01.19 at 19:04, <sstabellini@kernel.org> wrote:
> >>> On Fri, 11 Jan 2019, Jan Beulich wrote:
> >>>>>>> On 11.01.19 at 03:14, <sstabellini@kernel.org> wrote:
> >>>>> Hi Juergen, Jan,
> >>>>>
> >>>>> I spoke with Julien: we are both convinced that the unsigned long
> >>>>> solution is best. But Julien also did some research and he thinks that
> >>>>> Jan's version (returning pointer type) not only does not help with
> >>>>> MISRA-C, but also doesn't solve the potential GCC problem either. A
> >>>>> description of the GCC issue is available here:
> >>>>>
> >>>>>
> >>>>> https://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html?m=1
> >>>>
> >>>> I've read through it, and besides not agreeing with some of the
> >>>> author's arguments I wasn't able to spot where it tells me why/how
> >>>> the suggested approach doesn't solve the problem.
> >>>>
> >>>>> (Also keep in mind that Linux uses the unsigned long solution to solve
> >>>>> the GCC issue, deviating from it doesn't seem wise.)
> >>>>
> >>>> Which specific gcc issue (that is not solved by retaining type)?
> >>>
> >>> I am hoping Julien and his team will be able to provide the more
> >>> decisive information next week for us to make a decision, but it looks
> >>> like the issue is not clear-cut and people on the GCC list disagree on
> >>> how it should be handled.
> >>>
> >>>
> >>> The C standard says that "Two pointers compare equal if and only if both
> >>> are null pointers, both are pointers to the same object (including a
> >>> pointer to an object and a subobject at its beginning) or function, both
> >>> are pointers to one past the last element of the same array object, or
> >>> one is a pointer to one past the end of one array object and the other
> >>> is a pointer to the start of a different array object that happens to
> >>> immediately follow the first array object in the address space."
> >>>
> >>> In short, the compiler is free to return false in a pointer comparison
> >>> if it believes that the pointers point to different non-consecutive
> >>> object.
> >>
> >> And it is this "it believes" which we undermine with the construct:
> >> As long as the compiler can't prove two pointers point to different
> >> objects, it can't eliminate the actual comparison.
> >
> > May I ask where does this come from? A compiler could technically be free to
> > assume the inverse. I.e as long as it can't prove two pointers point to
> > different objects, it can rely on the undefined behavior to optimize it.
> 
> No. As long as there's a chance that both pointers point to the same
> object, it can't do bad things, because _if_ they do, the result of the
> comparison has to be correct (as per the text still quoted above).
> 
> Jan

In the following declaration:
extern char _start[], _end[];
it's still a valid interpretation that _start and _end point to different objects. In fact, I think GCC is already making this assumption, given that GCC 7.3 will emit the following warning if "extern" is removed: "warning: array ‘_end’ assumed to have one element"

Who's to say that GCC someday won't become smart enough to sniff out all the inline assembly and pointer-type casts and still draw the conclusion that they point to separate objects? What if GCC someday will make these decisions based on some sort of link-time optimization, like using an intermediary or iterative symbol/linker map of sorts? It seems that using pointer types is at best a cat-and-mouse chase. GCC has too many unregulated moving pieces - it's a much stronger argument just to adhere to the standard. My vote still goes to uintptr_t.
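
To illustrate the uintptr_t approach argued for above, here is a minimal sketch; it is not taken from the series. It walks a range using integer addresses, so the loop condition is an integer comparison rather than a comparison of pointers into two distinct objects. The names "struct entry", "table" and "sum_range" are hypothetical, "table" only stands in for a linker-delimited section, and converting an integer back to a pointer remains implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record type standing in for the entries of a linker-delimited table. */
struct entry {
    int value;
};

/* Stand-in for the region between two linker symbols (assumption: a real
 * build would take the bounds from the linker script instead). */
static struct entry table[3] = { { 1 }, { 2 }, { 3 } };

/*
 * Sum the entries in [start, end), where both bounds are integer addresses.
 * The termination test compares integers, so the pointer-equality rules
 * quoted earlier in the thread do not apply to it.
 */
static int sum_range(uintptr_t start, uintptr_t end)
{
    int sum = 0;
    uintptr_t addr;

    /* The range must cover a whole number of entries. */
    assert((end - start) % sizeof(struct entry) == 0);

    for ( addr = start; addr != end; addr += sizeof(struct entry) )
        sum += ((const struct entry *)addr)->value;

    return sum;
}
```

A caller would convert the bounds once, e.g. sum_range((uintptr_t)&table[0], (uintptr_t)&table[3]), and use only the integer values from then on.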

Also, the above text only concerns equality comparisons. What about greater/less/neq comparisons, arithmetic, and dereferencing?

Stew

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-14 15:52                         ` Jan Beulich
  2019-01-14 16:26                           ` Stewart Hildebrand
@ 2019-01-14 16:28                           ` Julien Grall
  2019-01-14 16:44                             ` Jan Beulich
  1 sibling, 1 reply; 102+ messages in thread
From: Julien Grall @ 2019-01-14 16:28 UTC (permalink / raw)
  To: Jan Beulich, Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, xen-devel

Hi Jan,

On 14/01/2019 15:52, Jan Beulich wrote:
>>>> On 14.01.19 at 16:41, <julien.grall@arm.com> wrote:
>> Hi Jan,
>>
>> On 14/01/2019 10:11, Jan Beulich wrote:
>>>>>> On 11.01.19 at 19:04, <sstabellini@kernel.org> wrote:
>>>> On Fri, 11 Jan 2019, Jan Beulich wrote:
>>>>>>>> On 11.01.19 at 03:14, <sstabellini@kernel.org> wrote:
>>>>>> Hi Juergen, Jan,
>>>>>>
>>>>>> I spoke with Julien: we are both convinced that the unsigned long
>>>>>> solution is best. But Julien also did some research and he thinks that
>>>>>> Jan's version (returning pointer type) not only does not help with
>>>>>> MISRA-C, but also doesn't solve the potential GCC problem either. A
>>>>>> description of the GCC issue is available here:
>>>>>>
>>>>>>
>>>>>> https://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html?m=1
>>>>>
>>>>> I've read through it, and besides not agreeing with some of the
>>>>> author's arguments I wasn't able to spot where it tells me why/how
>>>>> the suggested approach doesn't solve the problem.
>>>>>
>>>>>> (Also keep in mind that Linux uses the unsigned long solution to solve
>>>>>> the GCC issue, deviating from it doesn't seem wise.)
>>>>>
>>>>> Which specific gcc issue (that is not solved by retaining type)?
>>>>
>>>> I am hoping Julien and his team will be able to provide the more
>>>> decisive information next week for us to make a decision, but it looks
>>>> like the issue is not clear-cut and people on the GCC list disagree on
>>>> how it should be handled.
>>>>
>>>>
>>>> The C standard says that "Two pointers compare equal if and only if both
>>>> are null pointers, both are pointers to the same object (including a
>>>> pointer to an object and a subobject at its beginning) or function, both
>>>> are pointers to one past the last element of the same array object, or
>>>> one is a pointer to one past the end of one array object and the other
>>>> is a pointer to the start of a different array object that happens to
>>>> immediately follow the first array object in the address space."
>>>>
>>>> In short, the compiler is free to return false in a pointer comparison
>>>> if it believes that the pointers point to different non-consecutive
>>>> object.
>>>
>>> And it is this "it believes" which we undermine with the construct:
>>> As long as the compiler can't prove two pointers point to different
>>> objects, it can't eliminate the actual comparison.
>>
>> May I ask where does this come from? A compiler could technically be free to
>> assume the inverse. I.e as long as it can't prove two pointers point to
>> different objects, it can rely on the undefined behavior to optimize it.
> 
> No. As long as there's a chance that both pointers point to the same
> object, it can't do bad things, because _if_ they do, the result of the
> comparison has to be correct (as per the text still quoted above).

In the following example (taken from [1]):

extern struct my_struct __start[];
extern struct my_struct __end[];

void foo(void)
{
     for (struct my_struct *x = __start; x != __end; x++)
         do_something(x);
}

The compiler can't be sure that __start and __end are not equal. Yet it may 
decide they are always different and optimize it to an infinite loop. So surely, 
the compiler can do bad things with even simple code.

I am struggling to understand how using "asm volatile" and still returning a 
pointer would help here. If the compiler managed to infer that __start and __end 
are always different, then there is no reason for this not to happen with the
new construct.

I have been told that -fno-strict-aliasing may help us with pointer arithmetic,
but I haven't found any evidence of that yet.

Cheers,

[1] 
https://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html?m=1


-- 
Julien Grall


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-14 16:26                           ` Stewart Hildebrand
@ 2019-01-14 16:39                             ` Jan Beulich
  0 siblings, 0 replies; 102+ messages in thread
From: Jan Beulich @ 2019-01-14 16:39 UTC (permalink / raw)
  To: Stewart Hildebrand
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, xen-devel

>>> On 14.01.19 at 17:26, <Stewart.Hildebrand@dornerworks.com> wrote:
> On Monday, January 14, 2019 10:53 AM, Jan Beulich wrote:
>> > Hi Jan,
>> >
>> > On 14/01/2019 10:11, Jan Beulich wrote:
>> >>>>> On 11.01.19 at 19:04, <sstabellini@kernel.org> wrote:
>> >>> On Fri, 11 Jan 2019, Jan Beulich wrote:
>> >>>>>>> On 11.01.19 at 03:14, <sstabellini@kernel.org> wrote:
>> >>>>> Hi Juergen, Jan,
>> >>>>>
>> >>>>> I spoke with Julien: we are both convinced that the unsigned long
>> >>>>> solution is best. But Julien also did some research and he thinks that
>> >>>>> Jan's version (returning pointer type) not only does not help with
>> >>>>> MISRA-C, but also doesn't solve the potential GCC problem either. A
>> >>>>> description of the GCC issue is available here:
>> >>>>>
>> >>>>>
>> >>>>> https://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html?m=1
>> >>>>
>> >>>> I've read through it, and besides not agreeing with some of the
>> >>>> author's arguments I wasn't able to spot where it tells me why/how
>> >>>> the suggested approach doesn't solve the problem.
>> >>>>
>> >>>>> (Also keep in mind that Linux uses the unsigned long solution to solve
>> >>>>> the GCC issue, deviating from it doesn't seem wise.)
>> >>>>
>> >>>> Which specific gcc issue (that is not solved by retaining type)?
>> >>>
>> >>> I am hoping Julien and his team will be able to provide the more
>> >>> decisive information next week for us to make a decision, but it looks
>> >>> like the issue is not clear-cut and people on the GCC list disagree on
>> >>> how it should be handled.
>> >>>
>> >>>
>> >>> The C standard says that "Two pointers compare equal if and only if both
>> >>> are null pointers, both are pointers to the same object (including a
>> >>> pointer to an object and a subobject at its beginning) or function, both
>> >>> are pointers to one past the last element of the same array object, or
>> >>> one is a pointer to one past the end of one array object and the other
>> >>> is a pointer to the start of a different array object that happens to
>> >>> immediately follow the first array object in the address space."
>> >>>
>> >>> In short, the compiler is free to return false in a pointer comparison
>> >>> if it believes that the pointers point to different non-consecutive
>> >>> object.
>> >>
>> >> And it is this "it believes" which we undermine with the construct:
>> >> As long as the compiler can't prove two pointers point to different
>> >> objects, it can't eliminate the actual comparison.
>> >
>> > May I ask where does this come from? A compiler could technically be free to
>> > assume the inverse. I.e as long as it can't prove two pointers point to
>> > different objects, it can rely on the undefined behavior to optimize it.
>> 
>> No. As long as there's a chance that both pointers point to the same
>> object, it can't do bad things, because _if_ they do, the result of the
>> comparison has to be correct (as per the text still quoted above).
>> 
>> Jan
> 
> In the following declaration:
> extern char _start[], _end[];
> it's still a valid interpretation that _start and _end point to different 
> objects. In fact, I think is already making this assumption, given that GCC 
> 7.3 will emit the following warning if "extern" is removed: "warning: array 
> ‘_end’ assumed to have one element"

I'm afraid you're mixing up things. Removing the "extern" converts the
construct from a declaration to a (latent, determined once the entire
CU has been parsed) definition. The expansion to one element that
the compiler does is specifically to avoid pointers to compare equal
despite pointing at different objects (if there was no element, the
next sequential object in memory would have the same address, and
hence pointers would compare equal).

> Who's to say that GCC someday won't become smart enough to sniff out all the 
> inline assembly and pointer-type casts and still draw the conclusion that 
> they point to separate objects?

Looking through casts between pointer types is already happening.
Inspecting the actual text of an asm() is, otoh, something that
should never happen, as it would be extremely error prone.

> What if GCC someday will make these decisions 
> based on some sort of link-time optimization, like using an intermediary or 
> iterative symbol/linker map of sorts?

That's why hiding it through an asm() is imo best, because the asm()
will be retained in the internal representation, and hence remain
immune from such optimizations.
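
For readers unfamiliar with the idiom, the following is a rough sketch of hiding a pointer behind an empty asm(), in the spirit of Linux's RELOC_HIDE(). It is not the SYMBOL() definition from this series. HIDE_PTR, buf and span_bytes are hypothetical names, and the macro relies on GNU C extensions (statement expressions, __typeof__, extended asm), so it assumes GCC or a compatible compiler.

```c
#include <stddef.h>

/*
 * Hypothetical helper (GNU C only): copy a pointer into a local and pass it
 * through an empty asm. The asm emits no instructions, but because the local
 * is an in/out operand ("+r") the compiler has to treat the result as a value
 * it knows nothing about, so the link to the original object is lost.
 * The pointer type is retained: &(p)[0] yields an element pointer for both
 * array and pointer arguments.
 */
#define HIDE_PTR(p) ({                              \
    __typeof__(&(p)[0]) hidden_ = (p);              \
    __asm__ ( "" : "+r" (hidden_) );                \
    hidden_;                                        \
})

/* Stand-in buffer; in the thread the operands would be linker symbols. */
static char buf[16];

/* Byte distance between two pointers after both have been hidden. */
static size_t span_bytes(char *start, char *end)
{
    char *s = HIDE_PTR(start);
    char *e = HIDE_PTR(end);

    return (size_t)(e - s);
}
```

Note that the result is still a pointer, so the language rules for comparing or subtracting pointers to different objects continue to apply formally; the asm only removes the compiler's ability to see where the value came from.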

> Also, the above text only concerns equality comparisons. What about 
> greater/less/neq comparisons, arithmetic, and dereferencing?

Different subject.

Jan



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-14 16:28                           ` Julien Grall
@ 2019-01-14 16:44                             ` Jan Beulich
  2019-01-14 17:24                               ` Julien Grall
  0 siblings, 1 reply; 102+ messages in thread
From: Jan Beulich @ 2019-01-14 16:44 UTC (permalink / raw)
  To: Julien Grall
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, xen-devel

>>> On 14.01.19 at 17:28, <julien.grall@arm.com> wrote:
> Hi Jan,
> 
> On 14/01/2019 15:52, Jan Beulich wrote:
>>>>> On 14.01.19 at 16:41, <julien.grall@arm.com> wrote:
>>> Hi Jan,
>>>
>>> On 14/01/2019 10:11, Jan Beulich wrote:
>>>>>>> On 11.01.19 at 19:04, <sstabellini@kernel.org> wrote:
>>>>> On Fri, 11 Jan 2019, Jan Beulich wrote:
>>>>>>>>> On 11.01.19 at 03:14, <sstabellini@kernel.org> wrote:
>>>>>>> Hi Juergen, Jan,
>>>>>>>
>>>>>>> I spoke with Julien: we are both convinced that the unsigned long
>>>>>>> solution is best. But Julien also did some research and he thinks that
>>>>>>> Jan's version (returning pointer type) not only does not help with
>>>>>>> MISRA-C, but also doesn't solve the potential GCC problem either. A
>>>>>>> description of the GCC issue is available here:
>>>>>>>
>>>>>>>
>>>>>>> https://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html?m=1
>>>>>>
>>>>>> I've read through it, and besides not agreeing with some of the
>>>>>> author's arguments I wasn't able to spot where it tells me why/how
>>>>>> the suggested approach doesn't solve the problem.
>>>>>>
>>>>>>> (Also keep in mind that Linux uses the unsigned long solution to solve
>>>>>>> the GCC issue, deviating from it doesn't seem wise.)
>>>>>>
>>>>>> Which specific gcc issue (that is not solved by retaining type)?
>>>>>
>>>>> I am hoping Julien and his team will be able to provide the more
>>>>> decisive information next week for us to make a decision, but it looks
>>>>> like the issue is not clear-cut and people on the GCC list disagree on
>>>>> how it should be handled.
>>>>>
>>>>>
>>>>> The C standard says that "Two pointers compare equal if and only if both
>>>>> are null pointers, both are pointers to the same object (including a
>>>>> pointer to an object and a subobject at its beginning) or function, both
>>>>> are pointers to one past the last element of the same array object, or
>>>>> one is a pointer to one past the end of one array object and the other
>>>>> is a pointer to the start of a different array object that happens to
>>>>> immediately follow the first array object in the address space."
>>>>>
>>>>> In short, the compiler is free to return false in a pointer comparison
>>>>> if it believes that the pointers point to different non-consecutive
>>>>> object.
>>>>
>>>> And it is this "it believes" which we undermine with the construct:
>>>> As long as the compiler can't prove two pointers point to different
>>>> objects, it can't eliminate the actual comparison.
>>>
>>> May I ask where does this come from? A compiler could technically be free to
>>> assume the inverse. I.e as long as it can't prove two pointers point to
>>> different objects, it can rely on the undefined behavior to optimize it.
>> 
>> No. As long as there's a chance that both pointers point to the same
>> object, it can't do bad things, because _if_ they do, the result of the
>> comparison has to be correct (as per the text still quoted above).
> 
> In the following example (taken from [1]):
> 
> extern struct my_struct __start[];
> extern struct my_struct __end[];
> 
> void foo(void)
> {
>      for (struct my_struct *x = __start; x != __end; x++)
>          do_something(x);
> }
> 
> The compiler can't be sure that __start and __end are not equal.

You're inverting what was said before: Of course the compiler
can't be sure the addresses are not equal. But from the mere
language structure it knows that __start[] and __end[] are
two different objects.

> Yet it may 
> decide they are always different and optimize it to an infinite loop. So surely, 
> the compiler can do bad things with even simple code.

Of course.

> I am struggling to understand how using "asm volatile" and still returning a 
> pointer would help here. If the compiler managed to infer that __start and __end 
> are always different, then there are no reason for this to not happen with the 
> new construct.

Of course there is: There's no connection to the original object(s)
anymore. Same with

extern struct my_struct __start[];
extern struct my_struct __end[];

void test(const struct my_struct *);

void foo(int i) {
	test(i ? __start : __end);
}

Wherever test() is defined, the compiler can't make assumptions
anymore (unless of course it gets to see the definition, e.g. via
whole program optimization). But if the function's implementation
lives in a different binary, the compiler just can't make any
assumptions anymore.

Jan



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-14 16:44                             ` Jan Beulich
@ 2019-01-14 17:24                               ` Julien Grall
  2019-01-15  8:04                                 ` Jan Beulich
  0 siblings, 1 reply; 102+ messages in thread
From: Julien Grall @ 2019-01-14 17:24 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, xen-devel

Hi Jan,

On 14/01/2019 16:44, Jan Beulich wrote:
> 
> extern struct my_struct __start[];
> extern struct my_struct __end[];
> 
> void test(const struct my_struct *);
> 
> void foo(int i) {
> 	test(i ? __start : __end);
> }

Your example doesn't contain any potential undefined behavior. So how is this
relevant to our discussion here?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-14 10:26                               ` Jan Beulich
@ 2019-01-14 21:18                                 ` Stefano Stabellini
       [not found]                                   ` <1CACC1FB020000D800417A66@prv1-mh.provo.novell.com>
       [not found]                                 ` <7A8C0A4F020000EEB8D7C7D4@prv1-mh.provo.novell.com>
  1 sibling, 1 reply; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-14 21:18 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Stewart Hildebrand,
	xen-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3772 bytes --]

Hi Jan,

One question below to make a decision on the way forward.

On Mon, 14 Jan 2019, Jan Beulich wrote:
> >>> On 14.01.19 at 04:45, <Stewart.Hildebrand@dornerworks.com> wrote:
> > So let's keep the linker-accessible variable as a type that works for the
> > linker (which really could be anything as long as you use the address, not
> > the value), but name it something else - a name that screams "DON'T USE ME
> > UNLESS YOU KNOW WHAT YOU'RE DOING". And then before the first use, copy
> > that value to "uintptr_t _start;".
> > 
> > The following is a quick proof of concept for aarch64. I changed the type
> > of _start and _end, and added code to copy the linker-assigned value to
> > _start and _end. Upon booting, I see the correct values:
> 
> Global symbols starting with underscores should already be shouting
> enough. But what's worse - the whole idea of using array types is to
> avoid the intermediate variables.
> 
> > --- a/xen/arch/arm/setup.c
> > +++ b/xen/arch/arm/setup.c
> > @@ -726,6 +726,12 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
> >  
> >  size_t __read_mostly dcache_line_bytes;
> >  
> > +typedef char TYPE_DOESNT_MATTER;
> > +extern TYPE_DOESNT_MATTER _start_linker_assigned_dont_use_me,
> > +                          _end_linker_assigned_dont_use_me;
> 
> This and ...
> 
> > @@ -770,10 +776,17 @@ void __init start_xen(unsigned long boot_phys_offset,
> >      printk("Command line: %s\n", cmdline);
> >      cmdline_parse(cmdline);
> >  
> > +    _start = (uintptr_t)&_start_linker_assigned_dont_use_me;
> > +    _end = (uintptr_t)&_end_linker_assigned_dont_use_me;
> 
> ... this violates what the symbol names say. And if you want to
> avoid issues, you'd want to keep out of C files uses of those
> symbols altogether anyway, and you easily can: In any
> assembly file, have
> 
> _start:	.long _start_linker_assigned_dont_use_me
> _end:	.long _end_linker_assigned_dont_use_me
> 
> In particular, they don't need to be runtime initialized, saving
> you from needing to set them before first use. But as said -
> things are the way they are precisely to avoid such variables.
> 
> >> But, instead of converting _start to unsigned long via SYMBOL_HIDE, we
> >> could convert it to uintptr_t instead, it would be a trivial change on
> >> top of the existing unsigned long series. Not sure if it is beneficial.
> > 
> > The difference would be whether we want to rely on implementation-defined
> > behavior or not.
> 
> Why not? Simply specify that compilers with implementation defined
> behavior not matching our expectations are unsuitable. And btw, I
> suppose this is just the tiny tip of the iceberg of our reliance on
> implementation defined behavior.

The reason is that relying on undefined behavior is not reliable, it is
not C compliant, it is not allowed by MISRA-C, and not guaranteed to
work with any compiler. Yes, this instance is only the tip of the
iceberg, we have a long road ahead, but we shouldn't really give up
because it is going to be difficult :-) Stewart's approach would
actually be compliant and help toward reducing reliance on undefined
behavior.

Would you be OK if I rework the series to follow his approach using
intermediate variables? See the attached patch as a reference, it only
"converts" _start and _end as an example. Fortunately, it will be
textually similar to the previous SYMBOL returning unsigned long version
of the series.

If you are OK with it, do you have any suggestions on what you would like
the intermediate variables to be called? I went with _start/start_ and
_end/end_ but I am open to suggestions. Also, which assembly file would
you like the new variables to be added to? I created a new one for the
purpose, named var.S, in the attached example.
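
As a rough sketch of how C code might consume such intermediate variables: in the attached patch start_ and end_ are defined in var.S, whereas here they are filled in at runtime from a stand-in buffer so that the example is self-contained. The names image, init_bounds, image_size and addr_in_image are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the hypervisor image (assumption: the real bounds would come
 * from the linker via the assembly file). */
static char image[64];

/* Integer-typed bounds, mirroring the start_/end_ naming used in the patch. */
static uintptr_t start_, end_;

/* Only needed in this sketch; variables defined in assembly would not
 * require runtime initialization. */
static void init_bounds(void)
{
    start_ = (uintptr_t)image;
    end_ = (uintptr_t)(image + sizeof(image));
    assert(start_ <= end_);
}

/* Size of the image, computed with integer arithmetic only. */
static size_t image_size(void)
{
    return (size_t)(end_ - start_);
}

/* Range check on an address, again without comparing pointers. */
static int addr_in_image(uintptr_t addr)
{
    return addr >= start_ && addr < end_;
}
```

With this shape, callers such as the size computation in alternative.c only ever subtract or compare integers, which is what the patch below does with end_ - start_.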

[-- Attachment #2: Type: TEXT/PLAIN, Size: 10335 bytes --]

diff --git a/xen/arch/arm/alternative.c b/xen/arch/arm/alternative.c
index 52ed7ed..b79536d 100644
--- a/xen/arch/arm/alternative.c
+++ b/xen/arch/arm/alternative.c
@@ -187,8 +187,8 @@ static int __apply_alternatives_multi_stop(void *unused)
     {
         int ret;
         struct alt_region region;
-        mfn_t xen_mfn = virt_to_mfn(_start);
-        paddr_t xen_size = _end - _start;
+        mfn_t xen_mfn = virt_to_mfn(start_);
+        paddr_t xen_size = end_ - start_;
         unsigned int xen_order = get_order_from_bytes(xen_size);
         void *xenmap;
 
@@ -206,7 +206,7 @@ static int __apply_alternatives_multi_stop(void *unused)
         region.begin = __alt_instructions;
         region.end = __alt_instructions_end;
 
-        ret = __apply_alternatives(&region, xenmap - (void *)_start);
+        ret = __apply_alternatives(&region, (uintptr_t)xenmap - start_);
         /* The patching is not expected to fail during boot. */
         BUG_ON(ret != 0);
 
diff --git a/xen/arch/arm/arm32/Makefile b/xen/arch/arm/arm32/Makefile
index 0ac254f..983fb82 100644
--- a/xen/arch/arm/arm32/Makefile
+++ b/xen/arch/arm/arm32/Makefile
@@ -10,4 +10,5 @@ obj-y += proc-v7.o proc-caxx.o
 obj-y += smpboot.o
 obj-y += traps.o
 obj-y += vfp.o
+obj-y += var.o
 
diff --git a/xen/arch/arm/arm32/livepatch.c b/xen/arch/arm/arm32/livepatch.c
index 41378a5..ad9e9ca 100644
--- a/xen/arch/arm/arm32/livepatch.c
+++ b/xen/arch/arm/arm32/livepatch.c
@@ -56,7 +56,7 @@ void arch_livepatch_apply(struct livepatch_func *func)
     else
         insn = 0xe1a00000; /* mov r0, r0 */
 
-    new_ptr = func->old_addr - (void *)_start + vmap_of_xen_text;
+    new_ptr = (uintptr_t)func->old_addr - start_ + (uintptr_t)vmap_of_xen_text;
     len = len / sizeof(uint32_t);
 
     /* PATCH! */
diff --git a/xen/arch/arm/arm32/var.S b/xen/arch/arm/arm32/var.S
new file mode 100644
index 0000000..f8b94a1
--- /dev/null
+++ b/xen/arch/arm/arm32/var.S
@@ -0,0 +1,4 @@
+GLOBAL(start_)
+  .long  _start
+GLOBAL(end_)
+  .long  _end
diff --git a/xen/arch/arm/arm64/Makefile b/xen/arch/arm/arm64/Makefile
index c4f3a28..a5d9558 100644
--- a/xen/arch/arm/arm64/Makefile
+++ b/xen/arch/arm/arm64/Makefile
@@ -13,3 +13,4 @@ obj-y += smpboot.o
 obj-y += traps.o
 obj-y += vfp.o
 obj-y += vsysreg.o
+obj-y += var.o
diff --git a/xen/arch/arm/arm64/livepatch.c b/xen/arch/arm/arm64/livepatch.c
index 2247b92..60616d6 100644
--- a/xen/arch/arm/arm64/livepatch.c
+++ b/xen/arch/arm/arm64/livepatch.c
@@ -43,7 +43,7 @@ void arch_livepatch_apply(struct livepatch_func *func)
     /* Verified in livepatch_verify_distance. */
     ASSERT(insn != AARCH64_BREAK_FAULT);
 
-    new_ptr = func->old_addr - (void *)_start + vmap_of_xen_text;
+    new_ptr = (uintptr_t)func->old_addr - start_ + (uintptr_t)vmap_of_xen_text;
     len = len / sizeof(uint32_t);
 
     /* PATCH! */
diff --git a/xen/arch/arm/arm64/var.S b/xen/arch/arm/arm64/var.S
new file mode 100644
index 0000000..566e06c
--- /dev/null
+++ b/xen/arch/arm/arm64/var.S
@@ -0,0 +1,4 @@
+GLOBAL(start_)
+  .quad  _start
+GLOBAL(end_)
+  .quad  _end
diff --git a/xen/arch/arm/livepatch.c b/xen/arch/arm/livepatch.c
index 279d52c..7ab66d3 100644
--- a/xen/arch/arm/livepatch.c
+++ b/xen/arch/arm/livepatch.c
@@ -26,8 +26,8 @@ int arch_livepatch_quiesce(void)
     if ( vmap_of_xen_text )
         return -EINVAL;
 
-    text_mfn = virt_to_mfn(_start);
-    text_order = get_order_from_bytes(_end - _start);
+    text_mfn = virt_to_mfn(start_);
+    text_order = get_order_from_bytes(end_ - start_);
 
     /*
      * The text section is read-only. So re-map Xen to be able to patch
@@ -78,7 +78,7 @@ void arch_livepatch_revert(const struct livepatch_func *func)
     uint32_t *new_ptr;
     unsigned int len;
 
-    new_ptr = func->old_addr - (void *)_start + vmap_of_xen_text;
+    new_ptr = (uintptr_t)func->old_addr - start_ + (uintptr_t)vmap_of_xen_text;
 
     len = livepatch_insn_len(func);
     memcpy(new_ptr, func->opaque, len);
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 01ae2cc..2a083be 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1073,7 +1073,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int flags)
 }
 
 enum mg { mg_clear, mg_ro, mg_rw, mg_rx };
-static void set_pte_flags_on_range(const char *p, unsigned long l, enum mg mg)
+static void set_pte_flags_on_range(const uintptr_t p, unsigned long l, enum mg mg)
 {
     lpae_t pte;
     int i;
@@ -1084,8 +1084,8 @@ static void set_pte_flags_on_range(const char *p, unsigned long l, enum mg mg)
     ASSERT(!((unsigned long) p & ~PAGE_MASK));
     ASSERT(!(l & ~PAGE_MASK));
 
-    for ( i = (p - _start) / PAGE_SIZE; 
-          i < (p + l - _start) / PAGE_SIZE; 
+    for ( i = (p - start_) / PAGE_SIZE; 
+          i < (p + l - start_) / PAGE_SIZE; 
           i++ )
     {
         pte = xen_xenmap[i];
@@ -1127,7 +1127,7 @@ void free_init_memory(void)
     unsigned int i, nr = len / sizeof(insn);
     uint32_t *p;
 
-    set_pte_flags_on_range(__init_begin, len, mg_rw);
+    set_pte_flags_on_range((uintptr_t)__init_begin, len, mg_rw);
 #ifdef CONFIG_ARM_32
     /* udf instruction i.e (see A8.8.247 in ARM DDI 0406C.c) */
     insn = 0xe7f000f0;
@@ -1138,7 +1138,7 @@ void free_init_memory(void)
     for ( i = 0; i < nr; i++ )
         *(p + i) = insn;
 
-    set_pte_flags_on_range(__init_begin, len, mg_clear);
+    set_pte_flags_on_range((uintptr_t)__init_begin, len, mg_clear);
     init_domheap_pages(pa, pa + len);
     printk("Freed %ldkB init memory.\n", (long)(__init_end-__init_begin)>>10);
 }
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 444857a..9a0df43 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -772,8 +772,8 @@ void __init start_xen(unsigned long boot_phys_offset,
 
     /* Register Xen's load address as a boot module. */
     xen_bootmodule = add_boot_module(BOOTMOD_XEN,
-                             (paddr_t)(uintptr_t)(_start + boot_phys_offset),
-                             (paddr_t)(uintptr_t)(_end - _start + 1), false);
+                             (start_ + boot_phys_offset),
+                             (end_ - start_ + 1), false);
     BUG_ON(!xen_bootmodule);
 
     setup_pagetables(boot_phys_offset);
diff --git a/xen/arch/x86/boot/Makefile b/xen/arch/x86/boot/Makefile
index e103882..8327d2a 100644
--- a/xen/arch/x86/boot/Makefile
+++ b/xen/arch/x86/boot/Makefile
@@ -1,4 +1,5 @@
 obj-bin-y += head.o
+obj-bin-y += var.o
 
 DEFS_H_DEPS = defs.h $(BASEDIR)/include/xen/stdbool.h
 
diff --git a/xen/arch/x86/boot/var.S b/xen/arch/x86/boot/var.S
new file mode 100644
index 0000000..566e06c
--- /dev/null
+++ b/xen/arch/x86/boot/var.S
@@ -0,0 +1,4 @@
+GLOBAL(start_)
+  .quad  _start
+GLOBAL(end_)
+  .quad  _end
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 06eb483..7d6c3ec 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1039,7 +1039,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
          * Is the region size greater than zero and does it begin
          * at or above the end of current Xen image placement?
          */
-        if ( (end > s) && (end - reloc_size + XEN_IMG_OFFSET >= __pa(_end)) )
+        if ( (end > s) && (end - reloc_size + XEN_IMG_OFFSET >= __pa(end_)) )
         {
             l4_pgentry_t *pl4e;
             l3_pgentry_t *pl3e;
@@ -1067,7 +1067,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
              * data until after we have switched to the relocated pagetables!
              */
             barrier();
-            move_memory(e + XEN_IMG_OFFSET, XEN_IMG_OFFSET, _end - _start, 1);
+            move_memory(e + XEN_IMG_OFFSET, XEN_IMG_OFFSET, end_ - start_, 1);
 
             /* Walk initial pagetables, relocating page directory entries. */
             pl4e = __va(__pa(idle_pg_table));
@@ -1382,7 +1382,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     }
 #endif
 
-    xen_virt_end = ((unsigned long)_end + (1UL << L2_PAGETABLE_SHIFT) - 1) &
+    xen_virt_end = (end_ + (1UL << L2_PAGETABLE_SHIFT) - 1) &
                    ~((1UL << L2_PAGETABLE_SHIFT) - 1);
     destroy_xen_mappings(xen_virt_end, XEN_VIRT_START + BOOTSTRAP_MAP_BASE);
 
diff --git a/xen/arch/x86/x86_64/machine_kexec.c b/xen/arch/x86/x86_64/machine_kexec.c
index f4a005c..cf435ac 100644
--- a/xen/arch/x86/x86_64/machine_kexec.c
+++ b/xen/arch/x86/x86_64/machine_kexec.c
@@ -13,8 +13,8 @@
 
 int machine_kexec_get_xen(xen_kexec_range_t *range)
 {
-        range->start = virt_to_maddr(_start);
-        range->size = virt_to_maddr(_end) - (unsigned long)range->start;
+        range->start = virt_to_maddr(start_);
+        range->size = virt_to_maddr(end_) - (unsigned long)range->start;
         return 0;
 }
 
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index eafa26f..e72ffb2 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -151,8 +151,8 @@ extern vaddr_t xenheap_virt_start;
 #endif
 
 #define is_xen_fixed_mfn(mfn)                                   \
-    ((pfn_to_paddr(mfn) >= virt_to_maddr(&_start)) &&       \
-     (pfn_to_paddr(mfn) <= virt_to_maddr(&_end)))
+    ((pfn_to_paddr(mfn) >= virt_to_maddr(start_)) &&        \
+     (pfn_to_paddr(mfn) <= virt_to_maddr(end_)))
 
 #define page_get_owner(_p)    (_p)->v.inuse.domain
 #define page_set_owner(_p,_d) ((_p)->v.inuse.domain = (_d))
diff --git a/xen/include/xen/kernel.h b/xen/include/xen/kernel.h
index 548b64d..b508f65 100644
--- a/xen/include/xen/kernel.h
+++ b/xen/include/xen/kernel.h
@@ -65,10 +65,11 @@
 	1;                                      \
 })
 
-extern char _start[], _end[], start[];
-#define is_kernel(p) ({                         \
-    char *__p = (char *)(unsigned long)(p);     \
-    (__p >= _start) && (__p < _end);            \
+extern char start[];
+extern uintptr_t start_, end_;
+#define is_kernel(p) ({                                    \
+    const uintptr_t p__ = (const uintptr_t)(p);            \
+    (p__ >= start_) && (p__ < end_);                       \
 })
 
 extern char _stext[], _etext[];

[-- Attachment #3: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-14 17:24                               ` Julien Grall
@ 2019-01-15  8:04                                 ` Jan Beulich
  0 siblings, 0 replies; 102+ messages in thread
From: Jan Beulich @ 2019-01-15  8:04 UTC (permalink / raw)
  To: Julien Grall
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, xen-devel

>>> On 14.01.19 at 18:24, <julien.grall@arm.com> wrote:
> On 14/01/2019 16:44, Jan Beulich wrote:
>> 
>> extern struct my_struct __start[];
>> extern struct my_struct __end[];
>> 
>> void test(const struct my_struct *);
>> 
>> void foo(int i) {
>> 	test(i ? __start : __end);
>> }
> 
> Your example doesn't contain any potential undefined behavior. So how is this 
> relevant to our discussion here?

I was merely trying to give an example where the compiler similarly
can't make implications. The potential undefined behavior would be
in test(), depending on what comparisons of the pointers it does.

Jan




^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
       [not found]                                   ` <1CACC1FB020000D800417A66@prv1-mh.provo.novell.com>
@ 2019-01-15  8:21                                     ` Jan Beulich
  2019-01-15 11:51                                       ` Julien Grall
                                                         ` (2 more replies)
       [not found]                                     ` <95DC675902000028AB59E961@prv1-mh.provo.novell.com>
  1 sibling, 3 replies; 102+ messages in thread
From: Jan Beulich @ 2019-01-15  8:21 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Julien Grall, Stewart Hildebrand, xen-devel

>>> On 14.01.19 at 22:18, <sstabellini@kernel.org> wrote:
> Hi Jan,
> 
> One question below to make a decision on the way forward.
> 
> On Mon, 14 Jan 2019, Jan Beulich wrote:
>> >>> On 14.01.19 at 04:45, <Stewart.Hildebrand@dornerworks.com> wrote:
>> > So let's keep the linker-accessible variable as a type that works for the
>> > linker (which really could be anything as long as you use the address, not
>> > the value), but name it something else - a name that screams "DON'T USE ME
>> > UNLESS YOU KNOW WHAT YOU'RE DOING". And then before the first use, copy
>> > that value to "uintptr_t _start;".
>> > 
>> > The following is a quick proof of concept for aarch64. I changed the type
>> > of _start and _end, and added code to copy the linker-assigned value to
>> > _start and _end. Upon booting, I see the correct values:
>> 
>> Global symbols starting with underscores should already be shouting
>> enough. But what's worse - the whole idea of using array types is to
>> avoid the intermediate variables.
>> 
>> > --- a/xen/arch/arm/setup.c
>> > +++ b/xen/arch/arm/setup.c
>> > @@ -726,6 +726,12 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
>> >  
>> >  size_t __read_mostly dcache_line_bytes;
>> >  
>> > +typedef char TYPE_DOESNT_MATTER;
>> > +extern TYPE_DOESNT_MATTER _start_linker_assigned_dont_use_me,
>> > +                          _end_linker_assigned_dont_use_me;
>> 
>> This and ...
>> 
>> > @@ -770,10 +776,17 @@ void __init start_xen(unsigned long boot_phys_offset,
>> >      printk("Command line: %s\n", cmdline);
>> >      cmdline_parse(cmdline);
>> >  
>> > +    _start = (uintptr_t)&_start_linker_assigned_dont_use_me;
>> > +    _end = (uintptr_t)&_end_linker_assigned_dont_use_me;
>> 
>> ... this violates what the symbol names say. And if you want to
>> avoid issues, you'd want to keep out of C files uses of those
>> symbols altogether anyway, and you easily can: In any
>> assembly file, have
>> 
>> _start:	.long _start_linker_assigned_dont_use_me
>> _end:	.long _end_linker_assigned_dont_use_me
>> 
>> In particular, they don't need to be runtime initialized, saving
>> you from needing to set them before first use. But as said -
>> things are the way they are precisely to avoid such variables.
>> 
>> >> But, instead of converting _start to unsigned long via SYMBOL_HIDE, we
>> >> could convert it to uintptr_t instead, it would be a trivial change on
>> >> top of the existing unsigned long series. Not sure if it is beneficial.
>> > 
>> > The difference would be whether we want to rely on implementation-defined
>> > behavior or not.
>> 
>> Why not? Simply specify that compilers with implementation defined
>> behavior not matching our expectations are unsuitable. And btw, I
>> suppose this is just the tiny tip of the iceberg of our reliance on
>> implementation defined behavior.
> 
> The reason is that relying on undefined behavior is not reliable, it is
> not C compliant, it is not allowed by MISRA-C, and not guaranteed to
> work with any compiler.

"undefined behavior" != "implementation defined behavior"

> Yes, this instance is only the tip of the
> iceberg, we have a long road ahead, but we shouldn't really give up
> because it is going to be difficult :-) Stewart's approach would
> actually be compliant and help toward reducing reliance on undefined
> behavior.
> 
> Would you be OK if I rework the series to follow his approach using
> intermediate variables? See the attached patch as a reference, it only
> "converts" _start and _end as an example. Fortunately, it will be
> textually similar to the previous SYMBOL returning unsigned long version
> of the series.

Well, I've given reasons why I dislike that, and why (I think) it was
done without such intermediate variables. Nevertheless, if this is
_the only way_ to achieve compliance, I don't think I could
reasonably NAK it.

The thing that I don't understand though is how the undefined
behavior (if there really is any) goes away: Even if you compare
the contents of the variables instead of the original (perhaps
casted) pointers, in the end you still compare what C would
consider pointers to different objects. It's merely a different
way of hiding that fact from C. Undefined behavior would imo
go away only if those comparisons/subtractions didn't happen
in C anymore. IOW - see my .startof.() / .sizeof.() proposal.
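
The integer-comparison variant under debate can be sketched as follows.
This is only a minimal illustration, assuming hypothetical names
(in_image, image_start, image_end) that are not part of Xen. The
pointer to uintptr_t conversions are implementation-defined, and the
comparisons are then plain unsigned integer comparisons; whether that is
enough to satisfy the standard and MISRA-C is exactly the open question
in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of Xen: range check performed on
 * integers rather than on pointers. Each pointer is converted to
 * uintptr_t first (implementation-defined result), so the relational
 * operators below only ever see unsigned integers.
 */
static bool in_image(const void *p, const void *image_start,
                     const void *image_end)
{
    uintptr_t v = (uintptr_t)p;
    uintptr_t s = (uintptr_t)image_start;
    uintptr_t e = (uintptr_t)image_end;

    /* The caller is expected to pass a sane [start, end) range. */
    assert(s <= e);

    return v >= s && v < e;
}
```

With a 16-byte buffer buf, in_image(&buf[3], buf, buf + 16) is expected
to be true and in_image(buf + 16, buf, buf + 16) false, since the end
bound is exclusive.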

> If you are OK with it, do you have any suggestions on how would you like
> the intermediate variables to be called? I went with _start/start_ and
> _end/end_ but I am open to suggestions. Also to which assembly file you
> would like the new variables being added -- I created a new one for the
> purpose named var.S in the attached example.

First of all we should explore whether the variables could also be
linker generated, in particular to avoid the current symbols to be
global (thus making it impossible to access them from C files in the
first place). Failing that, I don't think it matters much where these
helper symbols live, and hence your choice is probably fine (I'd
prefer though if, just like on Arm, the x86 file didn't live in the
boot/ subdirectory; in the end it might even be possible to have
some of them in xen/common/var.S).

Jan



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-11 21:37                           ` Stefano Stabellini
  2019-01-14  3:45                             ` Stewart Hildebrand
@ 2019-01-15 11:46                             ` Julien Grall
  2019-01-15 12:23                               ` Julien Grall
  1 sibling, 1 reply; 102+ messages in thread
From: Julien Grall @ 2019-01-15 11:46 UTC (permalink / raw)
  To: Stefano Stabellini, Stewart Hildebrand
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Jan Beulich, xen-devel

Hi Stefano,

On 1/11/19 9:37 PM, Stefano Stabellini wrote:
> On Fri, 11 Jan 2019, Stewart Hildebrand wrote:
>> On Friday, January 11, 2019 3:36 PM, Julien Grall wrote:
>>> On Fri, 11 Jan 2019, 12:53 Stewart Hildebrand wrote:
>>>>
>>>> Why don't we change the type of _start so it's not a pointer type?
>>>
>>> Can you suggest a type that would be suitable?
>>>
>>> Cheers,
>>
>> Yes. My opinion is that the "sufficient-width integer type" should be a
>> "uintptr_t" or "intptr_t", since those types by definition are *integer* types
>> wide enough to hold a value converted from a void pointer. While "unsigned
>> long" seems to work for Linux, the definition of that type doesn't provide the
>> same guarantee. Since uintptr_t is an *integer* type by definition (and not a
>> pointer type), my interpretation of the C standard is that
>> subtraction/comparison of uintptr_t types won't be subject to the potential
>> "pointer to object" issues in question.
>>
>> If I had to choose between "uintptr_t" or "intptr_t" I guess I would choose
>> "uintptr_t" since that type is already used in various places in the Xen
>> codebase. And the Linux workaround is also using an unsigned integer type.
> 
> On changing type of _start & friends: we cannot declare _start as
> uintptr_t, the linker won't be able to set the value. It needs to be an
> array type. At that point, it is basically a pointer, it doesn't matter
> if it is a char[] or uintptr_t[]. It won't help.

Are you sure about this? I wrote a quick patch (see below) to switch 
_start/_end to uintptr_t and didn't notice any specific linker issue. I 
borrowed the idea from ATF, which has been using uintptr_t for linker 
symbols.

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 340a1d1548..ab98cabbb7 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1073,10 +1073,11 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int flags)
  }

  enum mg { mg_clear, mg_ro, mg_rw, mg_rx };
-static void set_pte_flags_on_range(const char *p, unsigned long l, enum mg mg)
+static void set_pte_flags_on_range(const char *__p, unsigned long l, enum mg mg)
  {
      lpae_t pte;
      int i;
+    uintptr_t p = (uintptr_t)__p;

      ASSERT(is_kernel(p) && is_kernel(p + l));

diff --git a/xen/include/xen/kernel.h b/xen/include/xen/kernel.h
index 548b64da9f..94bb08fc65 100644
--- a/xen/include/xen/kernel.h
+++ b/xen/include/xen/kernel.h
@@ -65,9 +65,9 @@
         1;                                      \
  })

-extern char _start[], _end[], start[];
+extern uintptr_t _start, _end, start;
  #define is_kernel(p) ({                         \
-    char *__p = (char *)(unsigned long)(p);     \
+    uintptr_t __p = (uintptr_t)(p);             \
      (__p >= _start) && (__p < _end);            \
  })

Cheers,

-- 
Julien Grall


^ permalink raw reply related	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-15  8:21                                     ` Jan Beulich
@ 2019-01-15 11:51                                       ` Julien Grall
       [not found]                                         ` <AB1DA25B020000B95C475325@prv1-mh.provo.novell.com>
  2019-01-15 20:03                                       ` Stewart Hildebrand
  2019-01-15 23:36                                       ` Stefano Stabellini
  2 siblings, 1 reply; 102+ messages in thread
From: Julien Grall @ 2019-01-15 11:51 UTC (permalink / raw)
  To: Jan Beulich, Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Stewart Hildebrand, xen-devel

Hi Jan,

On 1/15/19 8:21 AM, Jan Beulich wrote:
>>>> On 14.01.19 at 22:18, <sstabellini@kernel.org> wrote:
>> Hi Jan,
>>
>> One question below to make a decision on the way forward.
>>
>> On Mon, 14 Jan 2019, Jan Beulich wrote:
>>>>>> On 14.01.19 at 04:45, <Stewart.Hildebrand@dornerworks.com> wrote:
>>>> So let's keep the linker-accessible variable as a type that works for the
>>>> linker (which really could be anything as long as you use the address, not
>>>> the value), but name it something else - a name that screams "DON'T USE ME
>>>> UNLESS YOU KNOW WHAT YOU'RE DOING". And then before the first use, copy
>>>> that value to "uintptr_t _start;".
>>>>
>>>> The following is a quick proof of concept for aarch64. I changed the type
>>>> of _start and _end, and added code to copy the linker-assigned value to
>>>> _start and _end. Upon booting, I see the correct values:
>>>
>>> Global symbols starting with underscores should already be shouting
>>> enough. But what's worse - the whole idea of using array types is to
>>> avoid the intermediate variables.
>>>
>>>> --- a/xen/arch/arm/setup.c
>>>> +++ b/xen/arch/arm/setup.c
>>>> @@ -726,6 +726,12 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
>>>>   
>>>>   size_t __read_mostly dcache_line_bytes;
>>>>   
>>>> +typedef char TYPE_DOESNT_MATTER;
>>>> +extern TYPE_DOESNT_MATTER _start_linker_assigned_dont_use_me,
>>>> +                          _end_linker_assigned_dont_use_me;
>>>
>>> This and ...
>>>
>>>> @@ -770,10 +776,17 @@ void __init start_xen(unsigned long boot_phys_offset,
>>>>       printk("Command line: %s\n", cmdline);
>>>>       cmdline_parse(cmdline);
>>>>   
>>>> +    _start = (uintptr_t)&_start_linker_assigned_dont_use_me;
>>>> +    _end = (uintptr_t)&_end_linker_assigned_dont_use_me;
>>>
>>> ... this violates what the symbol names say. And if you want to
>>> avoid issues, you'd want to keep out of C files uses of those
>>> symbols altogether anyway, and you easily can: In any
>>> assembly file, have
>>>
>>> _start:	.long _start_linker_assigned_dont_use_me
>>> _end:	.long _end_linker_assigned_dont_use_me
>>>
>>> In particular, they don't need to be runtime initialized, saving
>>> you from needing to set them before first use. But as said -
>>> things are the way they are precisely to avoid such variables.
>>>
>>>>> But, instead of converting _start to unsigned long via SYMBOL_HIDE, we
>>>>> could convert it to uintptr_t instead, it would be a trivial change on
>>>>> top of the existing unsigned long series. Not sure if it is beneficial.
>>>>
>>>> The difference would be whether we want to rely on implementation-defined
>>>> behavior or not.
>>>
>>> Why not? Simply specify that compilers with implementation defined
>>> behavior not matching our expectations are unsuitable. And btw, I
>>> suppose this is just the tiny tip of the iceberg of our reliance on
>>> implementation defined behavior.
>>
>> The reason is that relying on undefined behavior is not reliable, it is
>> not C compliant, it is not allowed by MISRA-C, and not guaranteed to
>> work with any compiler.
> 
> "undefined behavior" != "implementation defined behavior"
> 
>> Yes, this instance is only the tip of the
>> iceberg, we have a long road ahead, but we shouldn't really give up
>> because it is going to be difficult :-) Stewart's approach would
>> actually be compliant and help toward reducing reliance on undefined
>> behavior.
>>
>> Would you be OK if I rework the series to follow his approach using
>> intermediate variables? See the attached patch as a reference, it only
>> "converts" _start and _end as an example. Fortunately, it will be
>> textually similar to the previous SYMBOL returning unsigned long version
>> of the series.
> 
> Well, I've given reasons why I dislike that, and why (I think) it was
> done without such intermediate variables. Nevertheless, if this is
> _the only way_ to achieve compliance, I don't think I could
> reasonably NAK it.
>
> The thing that I don't understand though is how the undefined
> behavior (if there really is any) goes away: Even if you compare
> the contents of the variables instead of the original (perhaps
> casted) pointers, in the end you still compare what C would
> consider pointers to different objects. It's merely a different
> way of hiding that fact from C. Undefined behavior would imo
> go away only if those comparisons/subtractions didn't happen
> in C anymore. IOW - see my .startof.() / .sizeof.() proposal.

Do you have a pointer to the series using startof/sizeof?

> 
>> If you are OK with it, do you have any suggestions on how would you like
>> the intermediate variables to be called? I went with _start/start_ and
>> _end/end_ but I am open to suggestions. Also to which assembly file you
>> would like the new variables being added -- I created a new one for the
>> purpose named var.S in the attached example.
> 
> First of all we should explore whether the variables could also be
> linker generated, in particular to avoid the current symbols to be
> global (thus making it impossible to access them from C files in the
> first place). Failing that, I don't think it matters much where these
> helper symbols live, and hence your choice is probably fine (I'd
> prefer though if, just like on Arm, the x86 file didn't live in the
> boot/ subdirectory; in the end it might even be possible to have
> some of them in xen/common/var.S).

From my test [1], I don't think intermediate variables are necessary. 
You could directly define the symbol with uintptr_t.

Cheers,

[1] https://lists.xen.org/archives/html/xen-devel/2019-01/msg01109.html

> 
> Jan
> 

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
       [not found]                                         ` <AB1DA25B020000B95C475325@prv1-mh.provo.novell.com>
@ 2019-01-15 12:04                                           ` Jan Beulich
  2019-01-15 12:23                                             ` Julien Grall
  0 siblings, 1 reply; 102+ messages in thread
From: Jan Beulich @ 2019-01-15 12:04 UTC (permalink / raw)
  To: Julien Grall
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Stewart Hildebrand, xen-devel

>>> On 15.01.19 at 12:51, <julien.grall@arm.com> wrote:
> Hi Jan,
> 
> On 1/15/19 8:21 AM, Jan Beulich wrote:
>>>>> On 14.01.19 at 22:18, <sstabellini@kernel.org> wrote:
>>> Hi Jan,
>>>
>>> One question below to make a decision on the way forward.
>>>
>>> On Mon, 14 Jan 2019, Jan Beulich wrote:
>>>>>>> On 14.01.19 at 04:45, <Stewart.Hildebrand@dornerworks.com> wrote:
>>>>> So let's keep the linker-accessible variable as a type that works for the
>>>>> linker (which really could be anything as long as you use the address, not
>>>>> the value), but name it something else - a name that screams "DON'T USE ME
>>>>> UNLESS YOU KNOW WHAT YOU'RE DOING". And then before the first use, copy
>>>>> that value to "uintptr_t _start;".
>>>>>
>>>>> The following is a quick proof of concept for aarch64. I changed the type
>>>>> of _start and _end, and added code to copy the linker-assigned value to
>>>>> _start and _end. Upon booting, I see the correct values:
>>>>
>>>> Global symbols starting with underscores should already be shouting
>>>> enough. But what's worse - the whole idea of using array types is to
>>>> avoid the intermediate variables.
>>>>
>>>>> --- a/xen/arch/arm/setup.c
>>>>> +++ b/xen/arch/arm/setup.c
>>>>> @@ -726,6 +726,12 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
>>>>>   
>>>>>   size_t __read_mostly dcache_line_bytes;
>>>>>   
>>>>> +typedef char TYPE_DOESNT_MATTER;
>>>>> +extern TYPE_DOESNT_MATTER _start_linker_assigned_dont_use_me,
>>>>> +                          _end_linker_assigned_dont_use_me;
>>>>
>>>> This and ...
>>>>
>>>>> @@ -770,10 +776,17 @@ void __init start_xen(unsigned long boot_phys_offset,
>>>>>       printk("Command line: %s\n", cmdline);
>>>>>       cmdline_parse(cmdline);
>>>>>   
>>>>> +    _start = (uintptr_t)&_start_linker_assigned_dont_use_me;
>>>>> +    _end = (uintptr_t)&_end_linker_assigned_dont_use_me;
>>>>
>>>> ... this violates what the symbol names say. And if you want to
>>>> avoid issues, you'd want to keep out of C files uses of those
>>>> symbols altogether anyway, and you easily can: In any
>>>> assembly file, have
>>>>
>>>> _start:	.long _start_linker_assigned_dont_use_me
>>>> _end:	.long _end_linker_assigned_dont_use_me
>>>>
>>>> In particular, they don't need to be runtime initialized, saving
>>>> you from needing to set them before first use. But as said -
>>>> things are the way they are precisely to avoid such variables.
>>>>
>>>>>> But, instead of converting _start to unsigned long via SYMBOL_HIDE, we
>>>>>> could convert it to uintptr_t instead, it would be a trivial change on
>>>>>> top of the existing unsigned long series. Not sure if it is beneficial.
>>>>>
>>>>> The difference would be whether we want to rely on implementation-defined
>>>>> behavior or not.
>>>>
>>>> Why not? Simply specify that compilers with implementation defined
>>>> behavior not matching our expectations are unsuitable. And btw, I
>>>> suppose this is just the tiny tip of the iceberg of our reliance on
>>>> implementation defined behavior.
>>>
>>> The reason is that relying on undefined behavior is not reliable, it is
>>> not C compliant, it is not allowed by MISRA-C, and not guaranteed to
>>> work with any compiler.
>> 
>> "undefined behavior" != "implementation defined behavior"
>> 
>>> Yes, this instance is only the tip of the
>>> iceberg, we have a long road ahead, but we shouldn't really give up
>>> because it is going to be difficult :-) Stewart's approach would
>>> actually be compliant and help toward reducing reliance on undefined
>>> behavior.
>>>
>>> Would you be OK if I rework the series to follow his approach using
>>> intermediate variables? See the attached patch as a reference, it only
>>> "converts" _start and _end as an example. Fortunately, it will be
>>> textually similar to the previous SYMBOL returning unsigned long version
>>> of the series.
>> 
>> Well, I've given reasons why I dislike that, and why (I think) it was
>> done without such intermediate variables. Nevertheless, if this is
>> _the only way_ to achieve compliance, I don't think I could
>> reasonably NAK it.
>>
>> The thing that I don't understand though is how the undefined
>> behavior (if there really is any) goes away: Even if you compare
>> the contents of the variables instead of the original (perhaps
>> casted) pointers, in the end you still compare what C would
>> consider pointers to different objects. It's merely a different
>> way of hiding that fact from C. Undefined behavior would imo
>> go away only if those comparisons/subtractions didn't happen
>> in C anymore. IOW - see my .startof.() / .sizeof.() proposal.
> 
> Do you have a pointer to the series using startof/sizeof?

https://lists.xenproject.org/archives/html/xen-devel/2016-08/msg02718.html

>>> If you are OK with it, do you have any suggestions on how would you like
>>> the intermediate variables to be called? I went with _start/start_ and
>>> _end/end_ but I am open to suggestions. Also to which assembly file you
>>> would like the new variables being added -- I created a new one for the
>>> purpose named var.S in the attached example.
>> 
>> First of all we should explore whether the variables could also be
>> linker generated, in particular to avoid the current symbols to be
>> global (thus making it impossible to access them from C files in the
>> first place). Failing that, I don't think it matters much where these
>> helper symbols live, and hence your choice is probably fine (I'd
>> prefer though if, just like on Arm, the x86 file didn't live in the
>> boot/ subdirectory; in the end it might even be possible to have
>> some of them in xen/common/var.S).
> 
> From my test [1], I don't think intermediate variables are necessary. 
> You could directly define the symbol with uintptr_t.

If I understand correctly what you mean, then this was proposed
before (by Stewart iirc) and proven to not work. But saying just
"the symbol" here leaves room for ambiguity; this

extern uintptr_t _start, _end, start;

can't possibly work (although it'll build fine) from all I can tell.
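
To make the difference concrete, here is a minimal sketch, assuming a
stand-in byte array (fake_image) in place of the real image and two
hypothetical helpers (as_array, as_object) that are not Xen code. With
an array declaration the name evaluates to the address the linker
assigned; with a plain uintptr_t declaration the name evaluates to
whatever bytes happen to be stored at that address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Stand-in for the first bytes found at the address the linker assigns
 * to _start. The 0xff pattern is arbitrary and chosen only so the
 * loaded value is easy to recognise.
 */
static const unsigned char fake_image[16] = {
    0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
    0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
};

/* Models "extern char _start[];": the name yields the symbol's address. */
static uintptr_t as_array(void)
{
    return (uintptr_t)fake_image;
}

/*
 * Models "extern uintptr_t _start;": the name yields the object stored
 * at the symbol's address, i.e. the leading bytes of the image.
 */
static uintptr_t as_object(void)
{
    uintptr_t v;

    assert(sizeof(v) <= sizeof(fake_image));
    memcpy(&v, fake_image, sizeof(v));
    return v;
}
```

Here as_array() returns the location of fake_image, while as_object()
returns the all-ones pattern stored there, which is unrelated to any
address. A range check written against the second form would compare
addresses with image contents.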

Jan



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-15 12:04                                           ` Jan Beulich
@ 2019-01-15 12:23                                             ` Julien Grall
       [not found]                                               ` <BAE986750200003A5C475325@prv1-mh.provo.novell.com>
  0 siblings, 1 reply; 102+ messages in thread
From: Julien Grall @ 2019-01-15 12:23 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Stewart Hildebrand, xen-devel

Hi,

On 1/15/19 12:04 PM, Jan Beulich wrote:
>>>> On 15.01.19 at 12:51, <julien.grall@arm.com> wrote:
>> Hi Jan,
>>
>> On 1/15/19 8:21 AM, Jan Beulich wrote:
>>>>>> On 14.01.19 at 22:18, <sstabellini@kernel.org> wrote:
>>>> Hi Jan,
>>>>
>>>> One question below to make a decision on the way forward.
>>>>
>>>> On Mon, 14 Jan 2019, Jan Beulich wrote:
>>>>>>>> On 14.01.19 at 04:45, <Stewart.Hildebrand@dornerworks.com> wrote:
>>>>>> So let's keep the linker-accessible variable as a type that works for the
>>>>>> linker (which really could be anything as long as you use the address, not
>>>>>> the value), but name it something else - a name that screams "DON'T USE ME
>>>>>> UNLESS YOU KNOW WHAT YOU'RE DOING". And then before the first use, copy
>>>>>> that value to "uintptr_t _start;".
>>>>>>
>>>>>> The following is a quick proof of concept for aarch64. I changed the type
>>>>>> of _start and _end, and added code to copy the linker-assigned value to
>>>>>> _start and _end. Upon booting, I see the correct values:
>>>>>
>>>>> Global symbols starting with underscores should already be shouting
>>>>> enough. But what's worse - the whole idea of using array types is to
>>>>> avoid the intermediate variables.
>>>>>
>>>>>> --- a/xen/arch/arm/setup.c
>>>>>> +++ b/xen/arch/arm/setup.c
>>>>>> @@ -726,6 +726,12 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
>>>>>>    
>>>>>>    size_t __read_mostly dcache_line_bytes;
>>>>>>    
>>>>>> +typedef char TYPE_DOESNT_MATTER;
>>>>>> +extern TYPE_DOESNT_MATTER _start_linker_assigned_dont_use_me,
>>>>>> +                          _end_linker_assigned_dont_use_me;
>>>>>
>>>>> This and ...
>>>>>
>>>>>> @@ -770,10 +776,17 @@ void __init start_xen(unsigned long boot_phys_offset,
>>>>>>        printk("Command line: %s\n", cmdline);
>>>>>>        cmdline_parse(cmdline);
>>>>>>    
>>>>>> +    _start = (uintptr_t)&_start_linker_assigned_dont_use_me;
>>>>>> +    _end = (uintptr_t)&_end_linker_assigned_dont_use_me;
>>>>>
>>>>> ... this violates what the symbol names say. And if you want to
>>>>> avoid issues, you'd want to keep out of C files uses of those
>>>>> symbols altogether anyway, and you easily can: In any
>>>>> assembly file, have
>>>>>
>>>>> _start:	.long _start_linker_assigned_dont_use_me
>>>>> _end:	.long _end_linker_assigned_dont_use_me
>>>>>
>>>>> In particular, they don't need to be runtime initialized, saving
>>>>> you from needing to set them before first use. But as said -
>>>>> things are the way they are precisely to avoid such variables.
>>>>>
>>>>>>> But, instead of converting _start to unsigned long via SYMBOL_HIDE, we
>>>>>>> could convert it to uintptr_t instead, it would be a trivial change on
>>>>>>> top of the existing unsigned long series. Not sure if it is beneficial.
>>>>>>
>>>>>> The difference would be whether we want to rely on implementation-defined
>>>>>> behavior or not.
>>>>>
>>>>> Why not? Simply specify that compilers with implementation defined
>>>>> behavior not matching our expectations are unsuitable. And btw, I
>>>>> suppose this is just the tiny tip of the iceberg of our reliance on
>>>>> implementation defined behavior.
>>>>
>>>> The reason is that relying on undefined behavior is not reliable, it is
>>>> not C compliant, it is not allowed by MISRA-C, and not guaranteed to
>>>> work with any compiler.
>>>
>>> "undefined behavior" != "implementation defined behavior"
>>>
>>>> Yes, this instance is only the tip of the
>>>> iceberg, we have a long road ahead, but we shouldn't really give up
>>>> because it is going to be difficult :-) Stewart's approach would
>>>> actually be compliant and help toward reducing reliance on undefined
>>>> behavior.
>>>>
>>>> Would you be OK if I rework the series to follow his approach using
>>>> intermediate variables? See the attached patch as a reference, it only
>>>> "converts" _start and _end as an example. Fortunately, it will be
>>>> textually similar to the previous SYMBOL returning unsigned long version
>>>> of the series.
>>>
>>> Well, I've given reasons why I dislike that, and why (I think) it was
>>> done without such intermediate variables. Nevertheless, if this is
>>> _the only way_ to achieve compliance, I don't think I could
>>> reasonably NAK it.
>>>
>>> The thing that I don't understand though is how the undefined
>>> behavior (if there really is any) goes away: Even if you compare
>>> the contents of the variables instead of the original (perhaps
>>> casted) pointers, in the end you still compare what C would
>>> consider pointers to different objects. It's merely a different
>>> way of hiding that fact from C. Undefined behavior would imo
>>> go away only if those comparisons/subtractions didn't happen
>>> in C anymore. IOW - see my .startof.() / .sizeof.() proposal.
>>
>> Do you have a pointer to the series using startof/sizeof?
> 
> https://lists.xenproject.org/archives/html/xen-devel/2016-08/msg02718.html
> 
Thank you! Looking at the thread, Andrew had some concerns about it. Do 
you have any update on them?

> "the symbol" here leaves room for ambiguity; this
> 
> extern uintptr_t _start, _end, start;
> 
> can't possibly work (although it'll build fine) from all I can tell.

Doh, yes. Sorry for the noise.

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-15 11:46                             ` Julien Grall
@ 2019-01-15 12:23                               ` Julien Grall
  0 siblings, 0 replies; 102+ messages in thread
From: Julien Grall @ 2019-01-15 12:23 UTC (permalink / raw)
  To: Stefano Stabellini, Stewart Hildebrand
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Jan Beulich, xen-devel

Hi,

On 1/15/19 11:46 AM, Julien Grall wrote:
> Hi Stefano,
> 
> On 1/11/19 9:37 PM, Stefano Stabellini wrote:
>> On Fri, 11 Jan 2019, Stewart Hildebrand wrote:
>>> On Friday, January 11, 2019 3:36 PM, Julien Grall wrote:
>>>> On Fri, 11 Jan 2019, 12:53 Stewart Hildebrand wrote:
>>>>>
>>>>> Why don't we change the type of _start so it's not a pointer type?
>>>>
>>>> Can you suggest a type that would be suitable?
>>>>
>>>> Cheers,
>>>
>>> Yes. My opinion is that the "sufficient-width integer type" should be a
>>> "uintptr_t" or "intptr_t", since those types by definition are *integer*
>>> types wide enough to hold a value converted from a void pointer. While
>>> "unsigned long" seems to work for Linux, the definition of that type
>>> doesn't provide the same guarantee. Since uintptr_t is an *integer* type
>>> by definition (and not a pointer type), my interpretation of the C
>>> standard is that subtraction/comparison of uintptr_t types won't be
>>> subject to the potential "pointer to object" issues in question.
>>>
>>> If I had to choose between "uintptr_t" or "intptr_t" I guess I would
>>> choose "uintptr_t" since that type is already used in various places in
>>> the Xen codebase. And the Linux workaround is also using an unsigned
>>> integer type.
>>
>> On changing type of _start & friends: we cannot declare _start as
>> uintptr_t, the linker won't be able to set the value. It needs to be an
>> array type. At that point, it is basically a pointer, it doesn't matter
>> if it is a char[] or uintptr_t[]. It won't help.
> 
> Are you sure about this? I wrote a quick patch (see below) to switch 
> _start/_end to uintptr_t and didn't notice any specific linker issue. I 
> borrowed the idea from ATF, which has been using uintptr_t for linker
> symbols.
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 340a1d1548..ab98cabbb7 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -1073,10 +1073,11 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int flags)
>   }
> 
>   enum mg { mg_clear, mg_ro, mg_rw, mg_rx };
> -static void set_pte_flags_on_range(const char *p, unsigned long l, enum mg mg)
> +static void set_pte_flags_on_range(const char *__p, unsigned long l, enum mg mg)
>   {
>       lpae_t pte;
>       int i;
> +    uintptr_t p = (uintptr_t)__p;
> 
>       ASSERT(is_kernel(p) && is_kernel(p + l));
> 
> diff --git a/xen/include/xen/kernel.h b/xen/include/xen/kernel.h
> index 548b64da9f..94bb08fc65 100644
> --- a/xen/include/xen/kernel.h
> +++ b/xen/include/xen/kernel.h
> @@ -65,9 +65,9 @@
>          1;                                      \
>   })
> 
> -extern char _start[], _end[], start[];
> +extern uintptr_t _start, _end, start;
>   #define is_kernel(p) ({                         \
> -    char *__p = (char *)(unsigned long)(p);     \
> +    uintptr_t __p = (uintptr_t)(p);             \
>       (__p >= _start) && (__p < _end);            \
>   })
> 
> Cheers,
> 

Please ignore this patch, I was wrong here. Sorry for the noise.

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
       [not found]                                               ` <BAE986750200003A5C475325@prv1-mh.provo.novell.com>
@ 2019-01-15 12:44                                                 ` Jan Beulich
  0 siblings, 0 replies; 102+ messages in thread
From: Jan Beulich @ 2019-01-15 12:44 UTC (permalink / raw)
  To: Julien Grall
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Stewart Hildebrand, xen-devel

>>> On 15.01.19 at 13:23, <julien.grall@arm.com> wrote:
> Hi,
> 
> On 1/15/19 12:04 PM, Jan Beulich wrote:
>>>>> On 15.01.19 at 12:51, <julien.grall@arm.com> wrote:
>>> Hi Jan,
>>>
>>> On 1/15/19 8:21 AM, Jan Beulich wrote:
>>>>>>> On 14.01.19 at 22:18, <sstabellini@kernel.org> wrote:
>>>>> Hi Jan,
>>>>>
>>>>> One question below to make a decision on the way forward.
>>>>>
>>>>> On Mon, 14 Jan 2019, Jan Beulich wrote:
>>>>>>>>> On 14.01.19 at 04:45, <Stewart.Hildebrand@dornerworks.com> wrote:
>>>>>>> So let's keep the linker-accessible variable as a type that works for the
>>>>>>> linker (which really could be anything as long as you use the address, not
>>>>>>> the value), but name it something else - a name that screams "DON'T USE ME
>>>>>>> UNLESS YOU KNOW WHAT YOU'RE DOING". And then before the first use, copy
>>>>>>> that value to "uintptr_t _start;".
>>>>>>>
>>>>>>> The following is a quick proof of concept for aarch64. I changed the type
>>>>>>> of _start and _end, and added code to copy the linker-assigned value to
>>>>>>> _start and _end. Upon booting, I see the correct values:
>>>>>>
>>>>>> Global symbols starting with underscores should already be shouting
>>>>>> enough. But what's worse - the whole idea of using array types is to
>>>>>> avoid the intermediate variables.
>>>>>>
>>>>>>> --- a/xen/arch/arm/setup.c
>>>>>>> +++ b/xen/arch/arm/setup.c
>>>>>>> @@ -726,6 +726,12 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
>>>>>>>    
>>>>>>>    size_t __read_mostly dcache_line_bytes;
>>>>>>>    
>>>>>>> +typedef char TYPE_DOESNT_MATTER;
>>>>>>> +extern TYPE_DOESNT_MATTER _start_linker_assigned_dont_use_me,
>>>>>>> +                          _end_linker_assigned_dont_use_me;
>>>>>>
>>>>>> This and ...
>>>>>>
>>>>>>> @@ -770,10 +776,17 @@ void __init start_xen(unsigned long boot_phys_offset,
>>>>>>>        printk("Command line: %s\n", cmdline);
>>>>>>>        cmdline_parse(cmdline);
>>>>>>>    
>>>>>>> +    _start = (uintptr_t)&_start_linker_assigned_dont_use_me;
>>>>>>> +    _end = (uintptr_t)&_end_linker_assigned_dont_use_me;
>>>>>>
>>>>>> ... this violates what the symbol names say. And if you want to
>>>>>> avoid issues, you'd want to keep out of C files uses of those
>>>>>> symbols altogether anyway, and you easily can: In any
>>>>>> assembly file, have
>>>>>>
>>>>>> _start:	.long _start_linker_assigned_dont_use_me
>>>>>> _end:	.long _end_linker_assigned_dont_use_me
>>>>>>
>>>>>> In particular, they don't need to be runtime initialized, saving
>>>>>> you from needing to set them before first use. But as said -
>>>>>> things are the way they are precisely to avoid such variables.
>>>>>>
>>>>>>>> But, instead of converting _start to unsigned long via SYMBOL_HIDE, we
>>>>>>>> could convert it to uintptr_t instead, it would be a trivial change on
>>>>>>>> top of the existing unsigned long series. Not sure if it is beneficial.
>>>>>>>
>>>>>>> The difference would be whether we want to rely on implementation-defined
>>>>>>> behavior or not.
>>>>>>
>>>>>> Why not? Simply specify that compilers with implementation defined
>>>>>> behavior not matching our expectations are unsuitable. And btw, I
>>>>>> suppose this is just the tiny tip of the iceberg of our reliance on
>>>>>> implementation defined behavior.
>>>>>
>>>>> The reason is that relying on undefined behavior is not reliable, it is
>>>>> not C compliant, it is not allowed by MISRA-C, and not guaranteed to
>>>>> work with any compiler.
>>>>
>>>> "undefined behavior" != "implementation defined behavior"
>>>>
>>>>> Yes, this instance is only the tip of the
>>>>> iceberg, we have a long road ahead, but we shouldn't really give up
>>>>> because it is going to be difficult :-) Stewart's approach would
>>>>> actually be compliant and help toward reducing reliance on undefined
>>>>> behavior.
>>>>>
>>>>> Would you be OK if I rework the series to follow his approach using
>>>>> intermediate variables? See the attached patch as a reference, it only
>>>>> "converts" _start and _end as an example. Fortunately, it will be
>>>>> textually similar to the previous SYMBOL returning unsigned long version
>>>>> of the series.
>>>>
>>>> Well, I've given reasons why I dislike that, and why (I think) it was
>>>> done without such intermediate variables. Nevertheless, if this is
>>>> _the only way_ to achieve compliance, I don't think I could
>>>> reasonably NAK it.
>>>>
>>>> The thing that I don't understand though is how the undefined
>>>> behavior (if there really is any) goes away: Even if you compare
>>>> the contents of the variables instead of the original (perhaps
>>>> casted) pointers, in the end you still compare what C would
>>>> consider pointers to different objects. It's merely a different
>>>> way of hiding that fact from C. Undefined behavior would imo
>>>> go away only if those comparisons/subtractions didn't happen
>>>> in C anymore. IOW - see my .startof.() / .sizeof.() proposal.
>>>
>>> Do you have a pointer to the series using startof/sizeof?
>> 
>> https://lists.xenproject.org/archives/html/xen-devel/2016-08/msg02718.html 
>> 
> Thank you! Looking at the thread, Andrew had some concerns about it. Do 
> you have any update on them?

I withdrew the patch, as I don't see how to address the clang
concern (unless clang supports these constructs, which I doubt).
Considering the context here, I nevertheless thought it may be
worthwhile to consider reviving it, but I didn't check whether
there were any other open points.

Jan


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-15  8:21                                     ` Jan Beulich
  2019-01-15 11:51                                       ` Julien Grall
@ 2019-01-15 20:03                                       ` Stewart Hildebrand
  2019-01-16  6:01                                         ` Juergen Gross
  2019-01-16 10:19                                         ` Jan Beulich
  2019-01-15 23:36                                       ` Stefano Stabellini
  2 siblings, 2 replies; 102+ messages in thread
From: Stewart Hildebrand @ 2019-01-15 20:03 UTC (permalink / raw)
  To: Jan Beulich, Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Julien Grall, xen-devel

On Tuesday, January 15, 2019 3:21 AM, Jan Beulich wrote:
> >>> On 14.01.19 at 22:18, <sstabellini@kernel.org> wrote:
> > Hi Jan,
> >
> > One question below to make a decision on the way forward.
> >
> > On Mon, 14 Jan 2019, Jan Beulich wrote:
> >> >>> On 14.01.19 at 04:45, <Stewart.Hildebrand@dornerworks.com> wrote:
> >> > The difference would be whether we want to rely on implementation-defined
> >> > behavior or not.
> >>
> >> Why not? Simply specify that compilers with implementation defined
> >> behavior not matching our expectations are unsuitable. And btw, I
> >> suppose this is just the tiny tip of the iceberg of our reliance on
> >> implementation defined behavior.
> >
> > The reason is that relying on undefined behavior is not reliable, it is
> > not C compliant, it is not allowed by MISRA-C, and not guaranteed to
> > work with any compiler.
> 
> "undefined behavior" != "implementation defined behavior"
> 

The C standard gives definitions for unspecified, implementation defined,
and undefined behavior.
To paraphrase:
- Unspecified behavior: the C standard intentionally leaves a choice for
  the implementation to make.
- Implementation defined behavior: unspecified behavior for which each
  implementation must document the choice it makes.
- Undefined behavior: the C standard does not impose any requirements.

Annex J in the C99 standard lists cases of unspecified, implementation
defined, and undefined behavior.

Two relevant examples are:
- The width of unsigned long is implementation-defined (i.e. ULONG_MAX is
  implementation-defined). The example given in Annex E in the C standard
  is "#define ULONG_MAX 4294967295", but that is to be replaced by the
  implementation's choice.
- Performing subtraction on pointers to different objects is undefined
  behavior.
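
As a rough illustration of the two examples above (a sketch only, with
made-up function names that appear nowhere in Xen): the first function asks
the implementation which choice it made for the width of unsigned long
relative to uintptr_t, which is the kind of fact that would need to be
documented. The second shows the one form of pointer subtraction that the
standard does define, namely between pointers into the same array.

```c
#include <limits.h>
#include <stdint.h>

/* Implementation-defined: returns 1 if unsigned long can represent every
 * uintptr_t value on this implementation, 0 otherwise. */
int ulong_holds_uintptr(void)
{
    return ULONG_MAX >= UINTPTR_MAX;
}

/* Defined behavior: both pointers point into the same array object, so
 * the subtraction yields the distance in elements. */
long elements_between(void)
{
    static const int array[8] = { 0 };
    const int *first = &array[2];
    const int *last = &array[7];

    return (long)(last - first);
}
```

Subtracting pointers into two different objects has no such guarantee, and
that is the case the linker symbols fall into as far as C is concerned.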

> > Yes, this instance is only the tip of the
> > iceberg, we have a long road ahead, but we shouldn't really give up
> > because it is going to be difficult :-) Stewart's approach would
> > actually be compliant and help toward reducing reliance on undefined
> > behavior.
> >
> > Would you be OK if I rework the series to follow his approach using
> > intermediate variables? See the attached patch as a reference, it only
> > "converts" _start and _end as an example. Fortunately, it will be
> > textually similar to the previous SYMBOL returning unsigned long version
> > of the series.
> 
> Well, I've given reasons why I dislike that, and why (I think) it was
> done without such intermediate variables. Nevertheless, if this is
> _the only way_ to achieve compliance, I don't think I could
> reasonably NAK it.
> 
> The thing that I don't understand though is how the undefined
> behavior (if there really is any) goes away: Even if you compare
> the contents of the variables instead of the original (perhaps
> casted) pointers, in the end you still compare what C would
> consider pointers to different objects. It's merely a different
> way of hiding that fact from C. Undefined behavior would imo
> go away only if those comparisons/subtractions didn't happen
> in C anymore. IOW - see my .startof.() / .sizeof.() proposal.

No, the C standard provides us a guarantee.

To quote the ISO/IEC 9899 C99 standard regarding the subtract operator:
> For subtraction, one of the following shall hold:
> - both operands have arithmetic type;
> - both operands are pointers to qualified or unqualified versions of
>   compatible object types; or
> - the left operand is a pointer to an object type and the right operand
>   has integer type.
> 
> If both operands have arithmetic type, the usual arithmetic conversions
> are performed on them.
> 
> When an expression that has integer type is added to or subtracted from
> a pointer ... If both the pointer operand and the result point to
> elements of the same array object, or one past the last element of the
> array object, the evaluation shall not produce an overflow; otherwise,
> the behavior is undefined.

Here, "arithmetic type" is either integer type or floating point type (but
NOT pointer type).

There is similar language for the equality comparison operator that
Stefano quoted earlier in the thread.

My interpretation of the standard is:
Subtract/compare operations where one or both operands are pointer types
are potentially subject to the "pointers to different objects" issue, and
the compiler is free to make that determination by any means available.
Subtract/compare operations where both operands are integer types are well
defined in the C standard, and, per the C standard, are NOT subject to the
"pointers to different objects" issue. If the compiler starts to consider
integer types being "pointers to different objects" then the compiler
clearly does not adhere to the C standard. The compiler may look through
*pointer type* casts, but if it started to look through *integer type*
casts, we would have good reason to complain to the GCC mailing list.
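
A minimal sketch of what I mean, assuming the bounds are already available as
integers. struct image_bounds, in_image() and image_size() are hypothetical
names used only for illustration; they are not proposed interfaces. Note that
the pointer-to-integer conversion itself is still implementation-defined, so
the expected results below assume a flat address space.

```c
#include <stdint.h>

/* Hypothetical integer copies of the image bounds. */
struct image_bounds {
    uintptr_t start;
    uintptr_t end;
};

/* Range check on integers only: no pointer comparison takes place. */
int in_image(const struct image_bounds *b, const void *p)
{
    uintptr_t addr = (uintptr_t)p;   /* implementation-defined conversion */

    return addr >= b->start && addr < b->end;
}

/* Size computed by integer subtraction, not pointer subtraction. */
uintptr_t image_size(const struct image_bounds *b)
{
    return b->end - b->start;
}

/* Self-check using an ordinary buffer as a stand-in for the image. */
int in_image_demo(void)
{
    static char image[64];
    struct image_bounds b;

    b.start = (uintptr_t)image;
    b.end = (uintptr_t)(image + sizeof(image));

    return in_image(&b, image) &&
           in_image(&b, image + 63) &&
           !in_image(&b, image + 64) &&
           image_size(&b) == sizeof(image);
}
```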

You could achieve both operands being integer types either by changing the
type of _start (using intermediate variables) or by casting (using
SYMBOL_HIDE with an integer return type). I would argue that changing the
type of _start would be less prone to human error, since developers
wouldn't have to remember to explicitly wrap each reference to _start and
friends in a macro. That's probably not an issue for the patches submitted
to xen-devel that are subject to informed review, but in potential
downstream/forks of Xen, it would be easy for somebody to miss the
requirement of having to use SYMBOL_HIDE every time the variable is
referenced. Somebody wrote the code this way in the first place, and the
potentially undefined behavior has been in upstream for years without any
remediation.
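
For comparison, the casting approach could look roughly like the sketch
below. This is NOT the macro from the series; SYMBOL_HIDE_SKETCH is a
hypothetical name, written in the spirit of Linux's RELOC_HIDE, and it relies
on GNU C extensions (statement expressions and inline asm), so it assumes GCC
or Clang. fake_section is a stand-in for a linker-delimited region.

```c
#include <stdint.h>

/* Hypothetical sketch: pass the address through an empty asm so the
 * compiler no longer tracks which object the value came from, and hand
 * back an integer rather than a pointer. */
#define SYMBOL_HIDE_SKETCH(sym) __extension__ ({                 \
    uintptr_t hidden_;                                            \
    __asm__ ("" : "=r" (hidden_) : "0" ((uintptr_t)(sym)));       \
    hidden_;                                                      \
})

/* Stand-in for a region whose bounds would come from the linker. */
static char fake_section[32];

/* Every reference has to be wrapped by hand, which is the part that is
 * easy to forget. */
uintptr_t hidden_span(void)
{
    uintptr_t start = SYMBOL_HIDE_SKETCH(fake_section);
    uintptr_t end = SYMBOL_HIDE_SKETCH(fake_section + sizeof(fake_section));

    return end - start;
}
```

Nothing stops a new caller from writing the bare symbol name instead of the
wrapped one, and the compiler will not complain when that happens.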

With the SYMBOL_HIDE approach, we are probably violating a few MISRA rules
with all the tricks going on inside SYMBOL_HIDE, so we'd have to write up
deviations with justification for that and track each occurrence. With the
approach of changing the type of _start, I believe there will be fewer
MISRA rule violations.

Just to reiterate, MISRA C says: don't subtract/compare *pointer types*
pointing to different objects, otherwise it's "undefined behavior" except
in one irrelevant corner case (I'm paraphrasing since the actual text is
copyrighted). If the operands are both integer types (not pointer types),
we don't risk violating the MISRA rules pertaining to pointer types. For
completeness, the corner case is pointer A pointing to one element past
the end of object A, and pointer B pointing to the beginning of object B.

> 
> > If you are OK with it, do you have any suggestions on how would you like
> > the intermediate variables to be called? I went with _start/start_ and
> > _end/end_ but I am open to suggestions. Also to which assembly file you
> > would like the new variables being added -- I created a new one for the
> > purpose named var.S in the attached example.
> 
> First of all we should explore whether the variables could also be
> linker generated, in particular to avoid the current symbols to be
> global (thus making it impossible to access them from C files in the
> first place).

Interesting idea. That certainly would be ideal if the linker will allow
it.

> Failing that, I don't think it matters much where these
> helper symbols live, and hence your choice is probably fine (I'd
> prefer though if, just like on Arm, the x86 file didn't live in the
> boot/ subdirectory; in the end it might even be possible to have
> some of them in xen/common/var.S).
> 
> Jan

(if you're tired of reading my walls of text, you can stop here)

Lastly, please let me take a moment to reiterate why MISRA C exists and
how safety certification relates to the Xen project. Here is a quick
definition of two distinct concepts:
Safety: preservation of human life and avoiding harmful behavior.
Security: locking up your valuables.

As embedded devices gain more connectivity and functionality, it is
becoming more economical to consolidate functions (both safety and
non-safety critical) on to a single hardware platform, hence the need for
a hypervisor. One of the draws of potentially using Xen in a safety
critical system is that it already has an extremely large user base that
cares a lot about security. Although safety and security are two distinct
concepts, there is still a lot of overlap. A coding error that allows a
hacker to access a private database could just as well cause unintended
acceleration in a drive-by-wire system. In the safety critical world, it
is not enough to say that something works, we also have to do our due
diligence to ensure that it won't exhibit unintended behavior and that it
keeps working reliably. The rigor and assurance involved in a safety
critical process is a pretty powerful benefit that I think carries over to
those who care about security.

From a MISRA perspective, we have to document and ensure developers
understand implementation defined behavior (MISRA C:2012 Directive 1.1).
We also can't use any undefined behavior (Rule 1.3). Where it is
unavoidable to violate a rule, we have to write up deviations for MISRA
rules that are violated, and justify/maintain each violation of a MISRA
rule. It's much more maintainable and justifiable from a MISRA perspective
to not violate the rule in the first place. The longer our list of
violations becomes, the larger the burden imposed on the community that
cares about safety certification. This is not something that would be
necessary for the entire codebase, only a safety certified subset. For
MISRA, we have no choice but to pick apart the entire safety certified
subset of the iceberg.

Let's assume, for example, that we would have to document why "inspecting
the text of an asm() is something that should never happen", thus
guaranteeing that the compiler won't make the assumption that the value
passed through that inline assembly could be considered a "pointer to a
different object". Documenting this could range anywhere from simply
referencing compiler documentation to inspecting compiler source code and
potentially imposing certain restrictions on what optimization flags we're
allowed to use, and then likely pinning the compiler version.

Once the safety critical subset has been fully defined, I wouldn't rule
out the possibility that somebody will try to use a compiler built for
safety critical applications. GCC has too many unregulated moving pieces
to really be suitable for higher levels of safety assurance (unless you
painstakingly perform manual object code verification - which we obviously
try to avoid if we can due to the effort involved. And even with the right
compiler, sometimes this still has to be done for certain software on
commercial airliners - but Xen has quite a way to go before that would
reasonably happen).

At the end of the day, I have a harder time justifying more and more
implementation defined behavior and rule violations when there is a
proposed solution that avoids violating certain MISRA rules in the first
place. I live in a world where *should never* isn't something I would
trust my life to.

Stew

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-15  8:21                                     ` Jan Beulich
  2019-01-15 11:51                                       ` Julien Grall
  2019-01-15 20:03                                       ` Stewart Hildebrand
@ 2019-01-15 23:36                                       ` Stefano Stabellini
  2019-01-16  8:47                                         ` Juergen Gross
       [not found]                                         ` <2EA6D6FD0200001F00417A66@prv1-mh.provo.novell.com>
  2 siblings, 2 replies; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-15 23:36 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Stewart Hildebrand,
	xen-devel

On Tue, 15 Jan 2019, Jan Beulich wrote:
> > Yes, this instance is only the tip of the
> > iceberg, we have a long road ahead, but we shouldn't really give up
> > because it is going to be difficult :-) Stewart's approach would
> > actually be compliant and help toward reducing reliance on undefined
> > behavior.
> > 
> > Would you be OK if I rework the series to follow his approach using
> > intermediate variables? See the attached patch as a reference, it only
> > "converts" _start and _end as an example. Fortunately, it will be
> > textually similar to the previous SYMBOL returning unsigned long version
> > of the series.
> 
> Well, I've given reasons why I dislike that, and why (I think) it was
> done without such intermediate variables. Nevertheless, if this is
> _the only way_ to achieve compliance, I don't think I could
> reasonably NAK it.
> 
> The thing that I don't understand though is how the undefined
> behavior (if there really is any) goes away: Even if you compare
> the contents of the variables instead of the original (perhaps
> casted) pointers, in the end you still compare what C would
> consider pointers to different objects. It's merely a different
> way of hiding that fact from C.

I saw that Stewart wrote a long and detailed reply, but this is my short
take on this. I don't think so: with this approach there are no dubious
pointers in C land at all[1]. It is perfectly fine to have addresses as
integers in C, compare and subtract addresses as integers, then cast
one of them to a pointer and access a structure through the pointer.
_start becomes only defined and used outside of C. I think both C and
MISRA C compliance would be satisfied.

([1]: There is a catch with the way we use the pointers in alternative.c,
on both x86 and arm, but it is easy to fix in a follow-up series.
Everything else is taken care of.)
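
A rough sketch of the pattern, under the assumption that the region bounds
arrive as integers. struct entry, fake_table and sum_ids() are made up for
illustration and do not correspond to anything in the series. The
integer-to-pointer conversion is implementation-defined, so this assumes the
usual flat address space.

```c
#include <stdint.h>

/* Hypothetical record type stored back to back in a region. */
struct entry {
    int id;
};

/* Stand-in for a table whose bounds would normally come from outside C. */
static const struct entry fake_table[4] = { {1}, {2}, {3}, {4} };

/* Walk [start, end) using integer comparison and addition only; an
 * address is cast to a pointer just before the access. */
int sum_ids(uintptr_t start, uintptr_t end)
{
    int sum = 0;
    uintptr_t addr;

    for (addr = start; addr < end; addr += sizeof(struct entry)) {
        const struct entry *e = (const struct entry *)addr;

        sum += e->id;
    }

    return sum;
}

/* Self-check: the four ids above add up to 10. */
int sum_ids_demo(void)
{
    uintptr_t start = (uintptr_t)&fake_table[0];

    return sum_ids(start, start + sizeof(fake_table));
}
```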


> Undefined behavior would imo
> go away only if those comparisons/subtractions didn't happen
> in C anymore. IOW - see my .startof.() / .sizeof.() proposal.
>
> > If you are OK with it, do you have any suggestions on how would you like
> > the intermediate variables to be called? I went with _start/start_ and
> > _end/end_ but I am open to suggestions. Also to which assembly file you
> > would like the new variables being added -- I created a new one for the
> > purpose named var.S in the attached example.
> 
> First of all we should explore whether the variables could also be
> linker generated, in particular to avoid the current symbols to be
> global (thus making it impossible to access them from C files in the
> first place).

That would be fantastic. I looked around, I found interesting things
like PROVIDE, but I don't think what you describe is possible. The
linker scripts only define symbols, they cannot set or define variables.


> Failing that, I don't think it matters much where these
> helper symbols live, and hence your choice is probably fine (I'd
> prefer though if, just like on Arm, the x86 file didn't live in the
> boot/ subdirectory; in the end it might even be possible to have
> some of them in xen/common/var.S).

OK, I'll move the x86 var.S to xen/arch/x86/x86_64. I cannot share var.S
because arm32 is using long instead of quad.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-15 20:03                                       ` Stewart Hildebrand
@ 2019-01-16  6:01                                         ` Juergen Gross
  2019-01-16 10:19                                         ` Jan Beulich
  1 sibling, 0 replies; 102+ messages in thread
From: Juergen Gross @ 2019-01-16  6:01 UTC (permalink / raw)
  To: Stewart Hildebrand, Jan Beulich, Stefano Stabellini
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Julien Grall,
	Julien Grall, xen-devel

On 15/01/2019 21:03, Stewart Hildebrand wrote:
> On Tuesday, January 15, 2019 3:21 AM, Jan Beulich wrote:
>> First of all we should explore whether the variables could also be
>> linker generated, in particular to avoid the current symbols to be
>> global (thus making it impossible to access them from C files in the
>> first place).
> 
> Interesting idea. That certainly would be ideal if the linker will allow
> it.

For each variable needed, have a small assembler source with:

.globl var
var: .long .

and then put it at the correct places in the linker script.


Juergen


* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-15 23:36                                       ` Stefano Stabellini
@ 2019-01-16  8:47                                         ` Juergen Gross
       [not found]                                         ` <2EA6D6FD0200001F00417A66@prv1-mh.provo.novell.com>
  1 sibling, 0 replies; 102+ messages in thread
From: Juergen Gross @ 2019-01-16  8:47 UTC (permalink / raw)
  To: Stefano Stabellini, Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Julien Grall,
	Julien Grall, Stewart Hildebrand, xen-devel

On 16/01/2019 00:36, Stefano Stabellini wrote:
> On Tue, 15 Jan 2019, Jan Beulich wrote:
>>> Yes, this instance is only the tip of the
>>> iceberg, we have a long road ahead, but we shouldn't really give up
>>> because it is going to be difficult :-) Stewart's approach would
>>> actually be compliant and help toward reducing reliance on undefined
>>> behavior.
>>>
>>> Would you be OK if I rework the series to follow his approach using
>>> intermediate variables? See the attached patch as a reference, it only
>>> "converts" _start and _end as an example. Fortunately, it will be
>>> textually similar to the previous SYMBOL returning unsigned long version
>>> of the series.
>>
>> Well, I've given reasons why I dislike that, and why (I think) it was
>> done without such intermediate variables. Nevertheless, if this is
>> _the only way_ to achieve compliance, I don't think I could
>> reasonably NAK it.
>>
>> The thing that I don't understand though is how the undefined
>> behavior (if there really is any) goes away: Even if you compare
>> the contents of the variables instead of the original (perhaps
>> casted) pointers, in the end you still compare what C would
>> consider pointers to different objects. It's merely a different
>> way of hiding that fact from C.
> 
> I saw that Stewart wrote a long and detailed reply, but this is my short
> take on this. I don't think so: with this approach there are no dubious
> pointers in C land at all[1]. It is perfectly fine to have addresses as
> integers in C, compare and subtracts addresses as integers, then casting
> one of them to a pointer and accessing a structure with the pointer.
> _start becomes only defined and used outside of C. I think both C and
> MISRAC compliance would be satisfied.
> 
> ([1]: There a catch with the way we use the pointers in alternative.c, both
> x86 and arm, but is easy to fix in a follow-up series. Everything else
> is taken care of.)
> 
> 
>> Undefined behavior would imo
>> go away only if those comparisons/subtractions didn't happen
>> in C anymore. IOW - see my .startof.() / .sizeof.() proposal.
>>
>>> If you are OK with it, do you have any suggestions on how would you like
>>> the intermediate variables to be called? I went with _start/start_ and
>>> _end/end_ but I am open to suggestions. Also to which assembly file you
>>> would like the new variables being added -- I created a new one for the
>>> purpose named var.S in the attached example.
>>
>> First of all we should explore whether the variables could also be
>> linker generated, in particular to avoid the current symbols to be
>> global (thus making it impossible to access them from C files in the
>> first place).
> 
> That would be fantastic. I looked around, I found interesting things
> like PROVIDE, but I don't think what you describe is possible. The
> linker scripts only define symbols, they cannot set or define variables.
> 
> 
>> Failing that, I don't think it matters much where these
>> helper symbols live, and hence your choice is probably fine (I'd
>> prefer though if, just like on Arm, the x86 file didn't live in the
>> boot/ subdirectory; in the end it might even be possible to have
>> some of them in xen/common/var.S).
> 
> OK, I'll move the x86 var.S to xen/arch/x86/x86_64. I cannot share var.S
> because arm32 is using long instead of quad.

How about an architecture-specific define ASM_UINTPTR (.quad or .long) for
that purpose?
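
A rough sketch of the selection logic, shown from the C side so it can be
compiled on its own. The name ASM_UINTPTR is the one suggested above; keying
it off UINTPTR_MAX and the helper asm_uintptr_directive() are assumptions made
purely for illustration (a real tree would more likely select it per
architecture, and in a .S file the macro would expand to the bare directive
rather than a string):

```c
#include <stdint.h>

/*
 * Hypothetical ASM_UINTPTR: the assembler directive that emits one
 * pointer-sized value. Selected from UINTPTR_MAX here for illustration.
 */
#if UINTPTR_MAX > 0xffffffffu
# define ASM_UINTPTR ".quad"   /* 8-byte value, e.g. x86_64 or arm64 */
#else
# define ASM_UINTPTR ".long"   /* 4-byte value, e.g. arm32 */
#endif

/* Illustrative helper so the selected directive can be inspected from C. */
const char *asm_uintptr_directive(void)
{
    return ASM_UINTPTR;
}
```

With something along these lines a single var.S could use the macro in place
of a hard-coded .long or .quad.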


Juergen


* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-15 20:03                                       ` Stewart Hildebrand
  2019-01-16  6:01                                         ` Juergen Gross
@ 2019-01-16 10:19                                         ` Jan Beulich
  2019-01-17  0:37                                           ` Stefano Stabellini
       [not found]                                           ` <C8F95655020000CAB8D7C7D4@prv1-mh.provo.novell.com>
  1 sibling, 2 replies; 102+ messages in thread
From: Jan Beulich @ 2019-01-16 10:19 UTC (permalink / raw)
  To: Stewart Hildebrand
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, xen-devel

>>> On 15.01.19 at 21:03, <Stewart.Hildebrand@dornerworks.com> wrote:
> On Tuesday, January 15, 2019 3:21 AM, Jan Beulich wrote:
>> The thing that I don't understand though is how the undefined
>> behavior (if there really is any) goes away: Even if you compare
>> the contents of the variables instead of the original (perhaps
>> casted) pointers, in the end you still compare what C would
>> consider pointers to different objects. It's merely a different
>> way of hiding that fact from C. Undefined behavior would imo
>> go away only if those comparisons/subtractions didn't happen
>> in C anymore. IOW - see my .startof.() / .sizeof.() proposal.
> 
> No, the C standard provides us a guarantee.
> 
> To quote the ISO/IEC 9899 C99 standard regarding the subtract operator:
>> For subtraction, one of the following shall hold:
>> - both operands have arithmetic type;
>> - both operands are pointers to qualified or unqualified versions of
>>   compatible object types; or
>> - the left operand is a pointer to an object type and the right operand
>>   has integer type.
>> 
>> If both operands have arithmetic type, the usual arithmetic conversions
>> are performed on them.
>> 
>> When an expression that has integer type is added to or subtracted from
>> a pointer ... If both the pointer operand and the result point to
>> elements of the same array object, or one past the last element of the
>> array object, the evaluation shall not produce an overflow; otherwise,
>> the behavior is undefined.
> 
> Here, "arithmetic type" is either integer type or floating point type (but
> NOT pointer type).
> 
> There is similar language for the equality comparison operator that
> Stefano quoted earlier in the thread.
> 
> My interpretation of the standard is:
> Subtract/compare operations where one or both operands are pointer types
> are potentially subject to the "pointers to different objects" issue, and
> the compiler is free to make that determination by any means available.
> Subtract/compare operations where both operands are integer types are well
> defined in the C standard, and, per the C standard, are NOT subject to the
> "pointers to different objects" issue. If the compiler starts to consider
> integer types being "pointers to different objects" then the compiler
> clearly does not adhere to the C standard. The compiler may look through
> *pointer type* casts, but if it started to look through *integer type*
> casts, we would have good reason to complain to the GCC mailing list.

All fine. Yet wasn't it you who suggested that a future, very smart
compiler could "look through" casts and even inline assembly? Of
course subtraction and comparison of arithmetic types is well
defined. The question is whether this also holds for pointers
cast to such types. Let's not forget that in the abstract case,
casting a pointer to an integral type may be lossy, and subtraction
of two such cast values may not yield what you'd expect.

The best way to demonstrate this is the historic large and huge
memory models on 16-bit x86. Pointers consist of a
segment/selector and an offset into it. When the former is a
segment (real or vm86 modes), conversion can be done such that
the difference is "meaningful" in our sense. When it's a selector,
otoh, I can't think of a conversion that would allow meaningful
comparison / subtraction. Even worse, two entirely distinct
pointers (different selectors referring to descriptors with
different base addresses) may point at the same object.
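
To make the real-mode half of this concrete, here is a small sketch. It
assumes the classic real-mode rule (linear address = segment * 16 + offset);
the function names are made up for illustration. It shows that converting to
a linear address gives a meaningful integer, that a naive bit-copy of the
pointer representation does not, and that two distinct segment:offset pairs
can name the same byte:

```c
#include <stdint.h>

/* Real-mode linear address of seg:off (ignoring any wrap at 1 MiB). */
uint32_t real_mode_linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* Naive "cast": pack segment and offset side by side into one integer. */
uint32_t packed_far_pointer(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 16) | off;
}
```

For instance, 0x1000:0x0010 and 0x1001:0x0000 both give linear address
0x10010, while their packed representations differ, so subtracting the
packed values says nothing about the distance between the objects.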

Luckily we don't have to consider such obscure environments
(and hence we can make certain assumptions), but the C standard
has to.

In any event - since intermediate variables merely hide the
casting from the compiler, but they don't remove the casts, the
solution involving casts is better imo, for incurring less overhead.

Since casts, as discussed before, are not meaningfully more
helpful than hiding the origin object from the compiler, retaining
pointer types is (to me) preferable to casting to
integer types, not least because of the general risk involved
with type-changing casts (for many years I've been
objecting to unnecessary casts in all of the reviews I've done
for this very reason).

> Just to reiterate, MISRA C says: don't subtract/compare *pointer types*
> pointing to different objects, otherwise it's "undefined behavior" except
> in one irrelevant corner case (I'm paraphrasing since the actual text is
> copyrighted). If the operands are both integer types (not pointer types),
> we don't risk violating the MISRA rules pertaining to pointer types.

I continue to have two problems with this: For one this doesn't
talk about pointers cast to integers. And then the term
"different object" is fuzzy as soon as we're talking about things
coming from outside of C land. And taking into consideration
language extensions (are such inside or outside of C land?) like
weak aliases, things become even more fuzzy. gcc looks to be
prepared for such - just look at the generated code for

extern int ei1, ei2;
int i1 = 1, i2 = 2;
extern int ai1[], ai2[];
int __attribute__((weak)) wi1 = 1;
int __attribute__((weak)) wi2 = 2;

int test1(void) {
	return &ei1 == &ei2;
}

int test2(void) {
	return ai1 == ai2;
}

int test3(void) {
	return &i1 == &i2;
}

int test4(void) {
	return &wi1 == &wi2;
}

And there are further issues of fuzziness - take for example the
folding of literals. Are two distinct instances of identical (string)
literals one object, or two different ones? I've searched the spec,
but couldn't spot any statement. Yet the "if and only if" in the
wording of the equality operator descriptions requires this to be
well defined.

Jan



* Re: [PATCH v6 1/4] xen: introduce SYMBOL
       [not found]                                         ` <2EA6D6FD0200001F00417A66@prv1-mh.provo.novell.com>
@ 2019-01-16 10:25                                           ` Jan Beulich
  2019-01-17  0:41                                             ` Stefano Stabellini
  0 siblings, 1 reply; 102+ messages in thread
From: Jan Beulich @ 2019-01-16 10:25 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Julien Grall, Stewart Hildebrand, xen-devel

>>> On 16.01.19 at 00:36, <sstabellini@kernel.org> wrote:
> On Tue, 15 Jan 2019, Jan Beulich wrote:
>> First of all we should explore whether the variables could also be
>> linker generated, in particular to avoid the current symbols to be
>> global (thus making it impossible to access them from C files in the
>> first place).
> 
> That would be fantastic. I looked around, I found interesting things
> like PROVIDE, but I don't think what you describe is possible. The
> linker scripts only define symbols, they cannot set or define variables.

Yeah, it didn't seem very likely. Then again I think the next best
approach would still be to use .startof. / .sizeof., just not the
way my original patch did, but in your var.S file(s). The
fundamental goal still being to avoid exposure of the symbols
we don't want to be used in C altogether. (All of this provided
we need to go this intermediate variable route in the first place,
which I continue to be unconvinced of, despite you having
posted a corresponding v8 of your series.)

>> Failing that, I don't think it matters much where these
>> helper symbols live, and hence your choice is probably fine (I'd
>> prefer though if, just like on Arm, the x86 file didn't live in the
>> boot/ subdirectory; in the end it might even be possible to have
>> some of them in xen/common/var.S).
> 
> OK, I'll move the x86 var.S to xen/arch/x86/x86_64. I cannot share var.S
> because arm32 is using long instead of quad.

Excuse me, but no. This is extremely easy to abstract away - see
x86 Linux's _ASM_PTR.

Jan




* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-16 10:19                                         ` Jan Beulich
@ 2019-01-17  0:37                                           ` Stefano Stabellini
       [not found]                                             ` <B4D3ABC30200003B88BF86FB@prv1-mh.provo.novell.com>
       [not found]                                           ` <C8F95655020000CAB8D7C7D4@prv1-mh.provo.novell.com>
  1 sibling, 1 reply; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-17  0:37 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Stewart Hildebrand,
	xen-devel

On Wed, 16 Jan 2019, Jan Beulich wrote:
> >>> On 15.01.19 at 21:03, <Stewart.Hildebrand@dornerworks.com> wrote:
> > On Tuesday, January 15, 2019 3:21 AM, Jan Beulich wrote:
> >> The thing that I don't understand though is how the undefined
> >> behavior (if there really is any) goes away: Even if you compare
> >> the contents of the variables instead of the original (perhaps
> >> casted) pointers, in the end you still compare what C would
> >> consider pointers to different objects. It's merely a different
> >> way of hiding that fact from C. Undefined behavior would imo
> >> go away only if those comparisons/subtractions didn't happen
> >> in C anymore. IOW - see my .startof.() / .sizeof.() proposal.
> > 
> > No, the C standard provides us a guarantee.
> > 
> > To quote the ISO/IEC 9899 C99 standard regarding the subtract operator:
> >> For subtraction, one of the following shall hold:
> >> - both operands have arithmetic type;
> >> - both operands are pointers to qualified or unqualified versions of
> >>   compatible object types; or
> >> - the left operand is a pointer to an object type and the right operand
> >>   has integer type.
> >> 
> >> If both operands have arithmetic type, the usual arithmetic conversions
> >> are performed on them.
> >> 
> >> When an expression that has integer type is added to or subtracted from
> >> a pointer ... If both the pointer operand and the result point to
> >> elements of the same array object, or one past the last element of the
> >> array object, the evaluation shall not produce an overflow; otherwise,
> >> the behavior is undefined.
> > 
> > Here, "arithmetic type" is either integer type or floating point type (but
> > NOT pointer type).
> > 
> > There is similar language for the equality comparison operator that
> > Stefano quoted earlier in the thread.
> > 
> > My interpretation of the standard is:
> > Subtract/compare operations where one or both operands are pointer types
> > are potentially subject to the "pointers to different objects" issue, and
> > the compiler is free to make that determination by any means available.
> > Subtract/compare operations where both operands are integer types are well
> > defined in the C standard, and, per the C standard, are NOT subject to the
> > "pointers to different objects" issue. If the compiler starts to consider
> > integer types being "pointers to different objects" then the compiler
> > clearly does not adhere to the C standard. The compiler may look through
> > *pointer type* casts, but if it started to look through *integer type*
> > casts, we would have good reason to complain to the GCC mailing list.
> 
> All fine. Yet wasn't it you who suggested that a future, very smart
> compiler could "look through" casts and even inline assembly? 

Yes, that is because we were doing:

  pointers -- (asm) --> pointers
  pointers -- (asm) --> unsigned long

Either way there were pointers to start with, plus some asm obfuscation.
Stewart's approach is very different.  More on this below.


> Of
> course subtraction and comparison of arithmetic types is well
> defined. The question is whether this also holds for pointers
> casted to such types. Let's not forget that in the abstract case,
> casting a pointer to an integral type may be lossy, and subtraction
> of two such casted values may not represent what you'd expect
> it to be.
> 
> The best way to demonstrate this are the historic large and huge
> memory models on 16-bit x86. Pointers are comprised of a
> segment/selector and an offset there. When the former is a
> segment (real or vm86 modes), conversion can be done such that
> the difference is "meaningful" in our sense. When it's a selector,
> otoh, I can't think of a conversion that would allow meaningful
> comparison / subtraction. Even worse, two entirely distinct
> pointers (different selectors referring to descriptors with
> different base addresses) may point at the same object.
> 
> Luckily we don't have to consider such obscure environments
> (and hence we can make certain implications), but the C standard
> has to.
> 
> In any event - since intermediate variables merely hide the
> casting from the compiler, but they don't remove the casts, the
> solution involving casts is better imo, for incurring less overhead.

This is where I completely disagree. The intermediate variables are not
hiding casts from the compiler. There were never any pointers in this
case.  The linker creates "symbols", not pointers, completely invisible
from C land. Assembly uses these symbols to initialize variables. We
expose these assembly variables as integers to C land. LD scripts and
assembly have their own terminology and rules: neither "_start" nor
"start" is a pointer at any point in time. The operation done in var.S
is not a cast. The C spec is happy, the compiler is happy, MISRA-C is
happy. And we get to avoid the ugly SYMBOL macro that Linux uses. It is
really a win-win.


* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-16 10:25                                           ` Jan Beulich
@ 2019-01-17  0:41                                             ` Stefano Stabellini
       [not found]                                               ` <4EA2F2F90200004D00417A66@prv1-mh.provo.novell.com>
  0 siblings, 1 reply; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-17  0:41 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Stewart Hildebrand,
	xen-devel

On Wed, 16 Jan 2019, Jan Beulich wrote:
> >>> On 16.01.19 at 00:36, <sstabellini@kernel.org> wrote:
> > On Tue, 15 Jan 2019, Jan Beulich wrote:
> >> First of all we should explore whether the variables could also be
> >> linker generated, in particular to avoid the current symbols to be
> >> global (thus making it impossible to access them from C files in the
> >> first place).
> > 
> > That would be fantastic. I looked around, I found interesting things
> > like PROVIDE, but I don't think what you describe is possible. The
> > linker scripts only define symbols, they cannot set or define variables.
> 
> Yeah, it didn't seem very likely. Then again I think the next best
> approach would still be to use .startof. / .sizeof., just not the
> way my original patch did, but in your var.S file(s). The
> fundamental goal still being to avoid exposure of the symbols
> we don't want to be used in C altogether. (All of this provided
> we need to go this intermediate variable route in the first place,
> which I continue to be unconvinced of, despite you having
> posted a respective v8 of your series.)
> 
> >> Failing that, I don't think it matters much where these
> >> helper symbols live, and hence your choice is probably fine (I'd
> >> prefer though if, just like on Arm, the x86 file didn't live in the
> >> boot/ subdirectory; in the end it might even be possible to have
> >> some of them in xen/common/var.S).
> > 
> > OK, I'll move the x86 var.S to xen/arch/x86/x86_64. I cannot share var.S
> > because arm32 is using long instead of quad.
> 
> Excuse me, but no. This is extremely easy to abstract away - see
> x86 Linux'es _ASM_PTR.

I am happy to make this change and also work on your suggestion above
about using .startof. / .sizeof. in var.S, if we agree on this approach.


* Re: [PATCH v6 1/4] xen: introduce SYMBOL
       [not found]                                               ` <529ED2F90200004D00417A66@prv1-mh.provo.novell.com>
@ 2019-01-17 11:45                                                 ` Jan Beulich
  2019-01-18  1:24                                                   ` Stefano Stabellini
  0 siblings, 1 reply; 102+ messages in thread
From: Jan Beulich @ 2019-01-17 11:45 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Julien Grall, Stewart Hildebrand, xen-devel

>>> On 17.01.19 at 01:37, <sstabellini@kernel.org> wrote:
> On Wed, 16 Jan 2019, Jan Beulich wrote:
>> In any event - since intermediate variables merely hide the
>> casting from the compiler, but they don't remove the casts, the
>> solution involving casts is better imo, for incurring less overhead.
> 
> This is where I completely disagree. The intermediate variables are not
> hiding casts from the compiler. There were never any pointers in this
> case.  The linker creates "symbols", not pointers, completely invisible
> from C land. Assembly uses these symbols to initialize variables. We
> expose these assembly variables as integer to C lands. LD scripts and
> assembly have their own terminology and rules: neither "_start" nor
> "start" are pointers at any point in time. The operations done in var.S
> is not a cast. The C spec is happy, the compiler is happy, MISRA-C is
> happy. And we get to avoid the ugly SYMBOL macro that Linux uses. It is
> really a win-win.

Well, that's a position one can take. But we have to settle on another
aspect then first: Is what is not done in C subject to C's rules? I
thought you were of the opinion that what comes from linker scripts
is. In which case what comes from assembly files ought to be, too.
(FAOD my implication is: If the answer is yes, both approaches
violate C's rules. If the answer is no, no change is needed at all.)

Jan




* Re: [PATCH v6 1/4] xen: introduce SYMBOL
       [not found]                                               ` <4EA2F2F90200004D00417A66@prv1-mh.provo.novell.com>
@ 2019-01-17 11:46                                                 ` Jan Beulich
  0 siblings, 0 replies; 102+ messages in thread
From: Jan Beulich @ 2019-01-17 11:46 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Julien Grall, Stewart Hildebrand, xen-devel

>>> On 17.01.19 at 01:41, <sstabellini@kernel.org> wrote:
> I am happy to make this change and also work on your suggestion above
> about using .startof. / .sizeof. in var.S, if we agree on this approach.

But sadly we look to be quite far from agreeing on anything yet.

Jan




* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-17 11:45                                                 ` Jan Beulich
@ 2019-01-18  1:24                                                   ` Stefano Stabellini
       [not found]                                                     ` <76A2DEED0200005600417A66@prv1-mh.provo.novell.com>
  0 siblings, 1 reply; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-18  1:24 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Stewart Hildebrand,
	xen-devel

On Thu, 17 Jan 2019, Jan Beulich wrote:
> >>> On 17.01.19 at 01:37, <sstabellini@kernel.org> wrote:
> > On Wed, 16 Jan 2019, Jan Beulich wrote:
> >> In any event - since intermediate variables merely hide the
> >> casting from the compiler, but they don't remove the casts, the
> >> solution involving casts is better imo, for incurring less overhead.
> > 
> > This is where I completely disagree. The intermediate variables are not
> > hiding casts from the compiler. There were never any pointers in this
> > case.  The linker creates "symbols", not pointers, completely invisible
> > from C land. Assembly uses these symbols to initialize variables. We
> > expose these assembly variables as integer to C lands. LD scripts and
> > assembly have their own terminology and rules: neither "_start" nor
> > "start" are pointers at any point in time. The operations done in var.S
> > is not a cast. The C spec is happy, the compiler is happy, MISRA-C is
> > happy. And we get to avoid the ugly SYMBOL macro that Linux uses. It is
> > really a win-win.
> 
> Well, that's a position one can take. But we have to settle on another
> aspect then first: Does what is not done in C underly C's rules? I
> thought you were of the opinion that what comes from linker scripts
> does. In which case what comes from assembly files ought to, too.
> (FAOD my implication is: If the answer is yes, both approaches
> violate C's rules. If the answer is no, no change is needed at all.)

Great question, that is the core of the issue. Also, let me say up front
that I agree with the comments you made on the patches (I dislike "start_"
too), and I can address them if we agree to continue down this path.

But no, I do not think that what is done outside of C-land should follow
C rules. However, I do not agree with your conclusion that in that case
there is no difference between the approaches. Let's get more into the
details.


1) SYMBOL_HIDE returning pointer type

Let's take _start and _end as an example. _start is born as a linker
symbol, and it becomes a C pointer when we do:

  extern char _start[], _end[];

Now it is a pointer (actually I should say an array, but let's pretend
they are the same thing for this discussion).

When we do:

  SYMBOL_HIDE(_end) - SYMBOL_HIDE(_start)

We are still subtracting pointers: the pointers returned by SYMBOL_HIDE.
We cannot prove that they are pointers to the same object or to subsequent
objects in memory, so it is undefined behavior, which is not allowed.
This solution allows us to highlight the problematic call sites and it
helps with compiler issues, but I am not convinced it helps with C
compliance. Better than nothing, but the worst of the lot.


2) SYMBOL_HIDE returning unsigned long

Similarly to the previous case, _start and _end are born as linker
symbols and become pointers when we do:

  extern char _start[], _end[];

Then the operation:

  SYMBOL_HIDE(_end) - SYMBOL_HIDE(_start)

SYMBOL_HIDE returns unsigned long, so the pointers disappear. We are
subtracting unsigned long values, which should solve the C compliance
issue. Because the pointer to unsigned long conversion is done in
assembly, C compliance does not have a say on the nature of the unsigned
long returned by SYMBOL_HIDE.

However, given that _start and _end are still defined as pointers in
C-land and given that SYMBOL_HIDE, although assembly, is basically a
fancy cast, I concede that the solution is not ideal. I still think it is
acceptable, but inferior to the next solution.


3) var.S + start_ as unsigned long

With this approach, _start is born as a linker symbol. It is never
exported to C, so from C's point of view, it doesn't exist. There is
another variable named "start_" defined in assembly and initialized to
_start. Now we go into C land with:

  extern uintptr_t start_, end_;

start_ and end_ are uintptr_t from the beginning from C's point of view.
They have never been pointers or in any way connected to _start. They
are "clean".

When we do:

  end_ - start_

it is a subtraction between uintptr_t, which is allowed. When we do:

    for ( call = (const initcall_t *)initcall_start_;
          (uintptr_t)call < presmp_initcall_end_;

The comparison is still between uintptr_t types, and the value of "call"
still comes from an unsigned long initially. There is never a comparison
between dubious pointers. (Integer to pointer conversions and pointer
to integer conversions are allowed by MISRA with some limitations, but I
am double-checking.) Even:

   (uintptr_t)random_pointer < presmp_initcall_end_

would be acceptable because presmp_initcall_end_ is an integer and has
always been an integer from C's point of view.


However, there are still a couple of issues not correctly solved by v8
of the series. For starters:

        apply_alternatives((struct alt_instr *)alt_instructions_,
                           (struct alt_instr *)alt_instructions_end_);

I can see how the pointer comparisons in apply_alternatives could be
considered wrong given the way the pointers are initialized:

    for ( a = base = start; a < end; a++ )
    {

start and end come from alt_instructions_ and alt_instructions_end_. It
doesn't matter that alt_instructions_ and alt_instructions_end_ are
"special", they could be perfectly normal integers and we would still
have the same problem: we cannot prove that "start" and "end" point to
the same object or subsequent objects in memory.

The way to fix it is by changing the parameters of apply_alternatives to
integer types, making comparisons between integers, and only using
pointers to access the data.
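
A rough sketch of that shape, under stated assumptions: struct alt_entry is
a hypothetical two-field record (Xen's real struct alt_instr differs), and
count_differing_entries() is a made-up name standing in for the loop inside
apply_alternatives. The bounds are integers, the loop condition compares
integers, and a pointer is formed only to read one entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record layout; Xen's real struct alt_instr differs. */
struct alt_entry {
    uint32_t orig_offset;
    uint32_t repl_offset;
};

/*
 * The region bounds arrive as integers. The loop condition compares
 * integers, and a pointer is formed only to access one entry.
 */
size_t count_differing_entries(uintptr_t start, uintptr_t end)
{
    size_t differing = 0;
    uintptr_t addr;

    assert(start <= end);

    for ( addr = start; addr < end; addr += sizeof(struct alt_entry) )
    {
        const struct alt_entry *a = (const struct alt_entry *)addr;

        if ( a->orig_offset != a->repl_offset )
            differing++;
    }

    return differing;
}
```

The integer-to-pointer cast inside the loop is still there, so this only
moves the question to whether that conversion is acceptable.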

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
       [not found]                                                     ` <76A2DEED0200005600417A66@prv1-mh.provo.novell.com>
@ 2019-01-18  9:54                                                       ` Jan Beulich
  2019-01-18 10:48                                                         ` Julien Grall
  2019-01-18 23:05                                                         ` Stefano Stabellini
  0 siblings, 2 replies; 102+ messages in thread
From: Jan Beulich @ 2019-01-18  9:54 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Julien Grall, Stewart Hildebrand, xen-devel

>>> On 18.01.19 at 02:24, <sstabellini@kernel.org> wrote:
> On Thu, 17 Jan 2019, Jan Beulich wrote:
>> >>> On 17.01.19 at 01:37, <sstabellini@kernel.org> wrote:
>> > On Wed, 16 Jan 2019, Jan Beulich wrote:
>> >> In any event - since intermediate variables merely hide the
>> >> casting from the compiler, but they don't remove the casts, the
>> >> solution involving casts is better imo, for incurring less overhead.
>> > 
>> > This is where I completely disagree. The intermediate variables are not
>> > hiding casts from the compiler. There were never any pointers in this
>> > case.  The linker creates "symbols", not pointers, completely invisible
>> > from C land. Assembly uses these symbols to initialize variables. We
>> > expose these assembly variables as integers to C land. LD scripts and
>> > assembly have their own terminology and rules: neither "_start" nor
>> > "start" are pointers at any point in time. The operation done in var.S
>> > is not a cast. The C spec is happy, the compiler is happy, MISRA-C is
>> > happy. And we get to avoid the ugly SYMBOL macro that Linux uses. It is
>> > really a win-win.
>> 
>> Well, that's a position one can take. But we have to settle on another
>> aspect then first: Does what is not done in C fall under C's rules? I
>> thought you were of the opinion that what comes from linker scripts
>> does. In which case what comes from assembly files ought to, too.
>> (FAOD my implication is: If the answer is yes, both approaches
>> violate C's rules. If the answer is no, no change is needed at all.)
> 
> Great question, that is the core of the issue. Also, let me premise that
> I agree on the comments you made on the patches (I dislike "start_"
> too), and I can address them if we agree to continue down this path.
> 
> But no, I do not think that what is done outside of C-land should follow
> C rules. But I do not agree with your conclusion that in that case there
> is no difference between the approaches. Let's get more into the
> details.
> 
> 
> 1) SYMBOL_HIDE returning pointer type
> 
> Let's take _start and _end as an example. _start is born as a linker
> symbol, and it becomes a C pointer when we do:
> 
>   extern char _start[], _end[]
> 
> Now it is a pointer (actually I should say an array, but let's pretend
> they are the same thing for this discussion).
> 
> When we do:
> 
>   SYMBOL_HIDE(_end) - SYMBOL_HIDE(_start)
> 
> We are still subtracting pointers: the pointers returned by SYMBOL_HIDE.
> We cannot prove that they are pointers to the same object or subsequent
> objects in memory, so it is undefined behavior, which is not allowed.

Stop. No. We very much can prove they are - _end points at
one past the last element of _start[]. It is the compiler which
can't prove the opposite, and hence it can't leverage
undefined behavior for optimization purposes.

> 3) var.S + start_ as unsigned long
> 
> With this approach, _start is born as a linker symbol. It is never
> exported to C, so from C point of view, it doesn't exist. There is
> another variable named "start_" defined in assembly and initialized to
> _start. Now we go into C land with:
> 
>   extern uintptr_t start_, end_
> 
> start_ and end_ are uintptr_t from the beginning from C point of view.
> They have never been pointers or in any way connected to _start. They
> are "clean".
> 
> When we do:
> 
>   end_ - start_
> 
> it is a subtraction between uintptr_t, which is allowed. When we do:
> 
>     for ( call = (const initcall_t *)initcall_start_;
>           (uintptr_t)call < presmp_initcall_end_;
> 
> The comparison is still between uintptr_t types, and the value of "call"
> still comes from an unsigned long initially. There is never a comparison
> between dubious pointers. (Integer to pointer conversions and pointer
> to integer conversions are allowed by MISRA with some limitations, but I
> am double-checking.) Even:
> 
>    (uintptr_t)random_pointer < presmp_initcall_end_
> 
> would be acceptable because presmp_initcall_end_ is an integer and has
> always been an integer from C point of view.

Well, as said - this is one of the possible positions to take. Personally
I see no difference between the helper symbols defined in
assembly sources, or in C sources the object files for which are never
made part of potential whole program optimization. Using C files for
this is still in conflict with the supposed undefined behavior, but I
think you agree that C and assembly files could be set up such that
the resulting binary data is identical. In which case it is bogus to call
one satisfactory, but not the other.

> However, there are still a couple of issues not correctly solved by v8
> of the series. For starters:
> 
>         apply_alternatives((struct alt_instr *)alt_instructions_,
>                            (struct alt_instr *)alt_instructions_end_);
> 
> I can see how the pointer comparisons in apply_alternatives could be
> considered wrong given the way the pointers are initialized:
> 
>     for ( a = base = start; a < end; a++ )
>     {
> 
> start and end come from alt_instructions_ and alt_instructions_end_. It
> doesn't matter that alt_instructions_ and alt_instructions_end_ are
> "special", they could be perfectly normal integers and we would still
> have the same problem: we cannot prove that "start" and "end" point to
> the same object or subsequent objects in memory.
> 
> The way to fix it is by changing the parameters of apply_alternatives to
> integer types, making comparisons between integers, and only using
> pointers to access the data.

You know my position on casts from integer to pointer types, especially
ones taking a type out of thin air. This applies to your addition to the
apply_alternatives() construct as well as the alternative of adding such
in order to access memory. The quote from the standard that I gave
makes such casts not provably (by the compiler) defined behavior as
well, so it all boils down to the same distinction as pointed out above in
the first part of my reply here: _We_ can prove it, but the compiler
can't. Hence we're still depending on whose proof is necessary to
eliminate MISRA's undefined behavior concerns.

Jan



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-18  9:54                                                       ` Jan Beulich
@ 2019-01-18 10:48                                                         ` Julien Grall
       [not found]                                                           ` <9F511FC70200005E5C475325@prv1-mh.provo.novell.com>
  2019-01-18 23:05                                                         ` Stefano Stabellini
  1 sibling, 1 reply; 102+ messages in thread
From: Julien Grall @ 2019-01-18 10:48 UTC (permalink / raw)
  To: Jan Beulich, Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Stewart Hildebrand, xen-devel

Hi Jan,

On 18/01/2019 09:54, Jan Beulich wrote:
>>>> On 18.01.19 at 02:24, <sstabellini@kernel.org> wrote:
>> On Thu, 17 Jan 2019, Jan Beulich wrote:
>>>>>> On 17.01.19 at 01:37, <sstabellini@kernel.org> wrote:
>>>> On Wed, 16 Jan 2019, Jan Beulich wrote:
> Stop. No. We very much can prove they are - _end points at
> one past the last element of _start[]. It is the compiler which
> can't prove the opposite, and hence it can't leverage
> undefined behavior for optimization purposes.

You keep saying the compiler can't leverage it for optimization purposes; however,
there are confirmations that GCC may actually leverage it (e.g. [1]). You
actually need to trick the compiler to avoid the optimization (e.g. RELOC_HIDE).

So obviously, this is not only a MISRA "problem" as you state here and below.

I believe Stefano, Stewart and I provided plenty of documentation and threads to
support our positions. Can you point us to documentation or a thread showing that
the compiler will not try to leverage that case?
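
For readers who have not seen it, the kind of trick meant by RELOC_HIDE can
be sketched roughly as follows. This is modelled on Linux's RELOC_HIDE and
relies on GNU extensions (statement expressions, __typeof__, inline asm), so
it is a sketch for GCC-compatible compilers only. HIDE_PTR and hidden_span
are names made up for this example, and the sketch assumes uintptr_t has the
same size as a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Pass the pointer value through an empty asm statement. The compiler
 * must assume the asm may have produced any value, so it can no longer
 * track which object the resulting pointer refers to.
 */
#define HIDE_PTR(ptr)                                   \
({                                                      \
    uintptr_t hidden_;                                  \
    __asm__ ("" : "=r" (hidden_) : "0" (ptr));          \
    (__typeof__(ptr))hidden_;                           \
})

/* Distance in bytes between two hidden pointers. */
ptrdiff_t hidden_span(char *start, char *end)
{
    assert(start != NULL && end != NULL);

    return HIDE_PTR(end) - HIDE_PTR(start);
}
```

Note that the result is still a pointer, so the subtraction remains a pointer
subtraction as far as the C standard and MISRA are concerned; the asm only
prevents the compiler from reasoning about provenance.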

Cheers,

[1] https://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html?m=1

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
       [not found]                                                           ` <9F511FC70200005E5C475325@prv1-mh.provo.novell.com>
@ 2019-01-18 11:09                                                             ` Jan Beulich
  2019-01-18 15:22                                                               ` Julien Grall
  2019-01-21  9:34                                                             ` Jan Beulich
  1 sibling, 1 reply; 102+ messages in thread
From: Jan Beulich @ 2019-01-18 11:09 UTC (permalink / raw)
  To: Julien Grall
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Stewart Hildebrand, xen-devel

>>> On 18.01.19 at 11:48, <julien.grall@arm.com> wrote:
> On 18/01/2019 09:54, Jan Beulich wrote:
>>>>> On 18.01.19 at 02:24, <sstabellini@kernel.org> wrote:
>>> On Thu, 17 Jan 2019, Jan Beulich wrote:
>>>>>>> On 17.01.19 at 01:37, <sstabellini@kernel.org> wrote:
>>>>> On Wed, 16 Jan 2019, Jan Beulich wrote:
>> Stop. No. We very much can prove they are - _end points at
>> one past the last element of _start[]. It is the compiler which
>> can't prove the opposite, and hence it can't leverage
>> undefined behavior for optimization purposes.
> 
> You keep saying the compiler can't leverage it for optimization purpose, however 
> there are confirmations that GCC may actually leverage it (e.g [1]). You 
> actually need to trick the compiler to avoid the optimization (e.g RELOC_HIDE).

Correct - that's the case I'm referring to when saying it can't leverage
undefined behavior optimizations anymore. Without the hiding of
course it can.

Jan




^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-18 11:09                                                             ` Jan Beulich
@ 2019-01-18 15:22                                                               ` Julien Grall
       [not found]                                                                 ` <3A8206D8020000035C475325@prv1-mh.provo.novell.com>
  0 siblings, 1 reply; 102+ messages in thread
From: Julien Grall @ 2019-01-18 15:22 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Stewart Hildebrand, xen-devel

Hi Jan,

On 18/01/2019 11:09, Jan Beulich wrote:
>>>> On 18.01.19 at 11:48, <julien.grall@arm.com> wrote:
>> On 18/01/2019 09:54, Jan Beulich wrote:
>>>>>> On 18.01.19 at 02:24, <sstabellini@kernel.org> wrote:
>>>> On Thu, 17 Jan 2019, Jan Beulich wrote:
>>>>>>>> On 17.01.19 at 01:37, <sstabellini@kernel.org> wrote:
>>>>>> On Wed, 16 Jan 2019, Jan Beulich wrote:
>>> Stop. No. We very much can prove they are - _end points at
>>> one past the last element of _start[]. It is the compiler which
>>> can't prove the opposite, and hence it can't leverage
>>> undefined behavior for optimization purposes.
>>
>> You keep saying the compiler can't leverage it for optimization purpose, however
>> there are confirmations that GCC may actually leverage it (e.g [1]). You
>> actually need to trick the compiler to avoid the optimization (e.g RELOC_HIDE).
> 
> Correct - that's the case I'm referring to when saying it can't leverage
> undefined behavior optimizations anymore. Without the hiding of
> course it can.

But this trick is GCC-specific, right? So we would need to have one trick for
each compiler we support. Note that the solution originally suggested by Stefano
has the same issue (i.e. returning unsigned long).

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-18  9:54                                                       ` Jan Beulich
  2019-01-18 10:48                                                         ` Julien Grall
@ 2019-01-18 23:05                                                         ` Stefano Stabellini
  2019-01-21  5:24                                                           ` Stewart Hildebrand
       [not found]                                                           ` <5A96F2FD0200008D00417A66@prv1-mh.provo.novell.com>
  1 sibling, 2 replies; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-18 23:05 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Stewart Hildebrand,
	xen-devel

On Fri, 18 Jan 2019, Jan Beulich wrote:
> >>> On 18.01.19 at 02:24, <sstabellini@kernel.org> wrote:
> > On Thu, 17 Jan 2019, Jan Beulich wrote:
> >> >>> On 17.01.19 at 01:37, <sstabellini@kernel.org> wrote:
> >> > On Wed, 16 Jan 2019, Jan Beulich wrote:
> >> >> In any event - since intermediate variables merely hide the
> >> >> casting from the compiler, but they don't remove the casts, the
> >> >> solution involving casts is better imo, for incurring less overhead.
> >> > 
> >> > This is where I completely disagree. The intermediate variables are not
> >> > hiding casts from the compiler. There were never any pointers in this
> >> > case.  The linker creates "symbols", not pointers, completely invisible
> >> > from C land. Assembly uses these symbols to initialize variables. We
> >> > expose these assembly variables as integers to C land. LD scripts and
> >> > assembly have their own terminology and rules: neither "_start" nor
> >> > "start" are pointers at any point in time. The operation done in var.S
> >> > is not a cast. The C spec is happy, the compiler is happy, MISRA-C is
> >> > happy. And we get to avoid the ugly SYMBOL macro that Linux uses. It is
> >> > really a win-win.
> >> 
> >> Well, that's a position one can take. But we have to settle on another
> >> aspect then first: Does what is not done in C fall under C's rules? I
> >> thought you were of the opinion that what comes from linker scripts
> >> does. In which case what comes from assembly files ought to, too.
> >> (FAOD my implication is: If the answer is yes, both approaches
> >> violate C's rules. If the answer is no, no change is needed at all.)
> > 
> > Great question, that is the core of the issue. Also, let me premise that
> > I agree on the comments you made on the patches (I dislike "start_"
> > too), and I can address them if we agree to continue down this path.
> > 
> > But no, I do not think that what is done outside of C-land should follow
> > C rules. But I do not agree with your conclusion that in that case there
> > is no difference between the approaches. Let's get more into the
> > details.
> > 
> > 
> > 1) SYMBOL_HIDE returning pointer type
> > 
> > Let's take _start and _end as an example. _start is born as a linker
> > symbol, and it becomes a C pointer when we do:
> > 
> >   extern char _start[], _end[]
> > 
> > Now it is a pointer (actually I should say an array, but let's pretend
> > they are the same thing for this discussion).
> > 
> > When we do:
> > 
> >   SYMBOL_HIDE(_end) - SYMBOL_HIDE(_start)
> > 
> > We are still subtracting pointers: the pointers returned by SYMBOL_HIDE.
> > We cannot prove that they are pointers to the same object or subsequent
> > objects in memory, so it is undefined behavior, which is not allowed.
> 
> Stop. No. We very much can prove they are - _end points at
> one past the last element of _start[]. It is the compiler which
> can't prove the opposite, and hence it can't leverage
> undefined behavior for optimization purposes.

This is an interesting comment. However, even for normal pointers it is
unreliable to count on one pointing one past the last element of the
other. This was well explained in the GCC thread linked earlier in this
thread. The vision of at least one of the GCC maintainers is that the
compiler is free to place things in memory where it wishes, so as a
programmer you cannot count on pointers pointing one past the last
element of the other. Ever. In this case, where _start and _end are
defined outside of C-land, I think it is even more true, and it remains
undefined.

Moreover, I went back to MISRAC (finally I have a copy) and rule 18.2
says: "subtraction between pointers shall only be applied to pointers
that address elements of the same array". So, all the evidence we have
seems to say that we cannot rely on _end pointing one past the last
element of _start in this matter.


> > 3) var.S + start_ as unsigned long
> > 
> > With this approach, _start is born as a linker symbol. It is never
> > exported to C, so from C point of view, it doesn't exist. There is
> > another variable named "start_" defined in assembly and initialized to
> > _start. Now we go into C land with:
> > 
> >   extern uintptr_t start_, end_
> > 
> > start_ and end_ are uintptr_t from the beginning from C point of view.
> > They have never been pointers or in any way connected to _start. They
> > are "clean".
> > 
> > When we do:
> > 
> >   end_ - start_
> > 
> > it is a subtraction between uintptr_t, which is allowed. When we do:
> > 
> >     for ( call = (const initcall_t *)initcall_start_;
> >           (uintptr_t)call < presmp_initcall_end_;
> > 
> > The comparison is still between uintptr_t types, and the value of "call"
> > still comes from an unsigned long initially. There is never a comparison
> > between dubious pointers. (Integer to pointer conversions and pointer
> > to integer conversions are allowed by MISRA with some limitations, but I
> > am double-checking.) Even:
> > 
> >    (uintptr_t)random_pointer < presmp_initcall_end_
> > 
> > would be acceptable because presmp_initcall_end_ is an integer and has
> > always been an integer from C point of view.
> 
> Well, as said - this is one of the possible positions to take. Personally
> I see no difference between the helper symbols defined in
> assembly sources, or in C sources the object files for which are never
> made part of potential whole program optimization. 

I don't think this is the case for MISRAC. C rules apply to C. Other
rules apply to assembly and linker scripts. This is something that
should be easy to check, and I hope that Stewart should be able to
confirm.


> Using C files for this is still in conflict with the supposed
> undefined behavior, but I think you agree that C and assembly files
> could be set up such that the resulting binary data is identical. In
> which case it is bogus to call one satisfactory, but not the other.

I see what you are saying, but it doesn't work that way from a spec
compliance point of view.


> > However, there are still a couple of issues not correctly solved by v8
> > of the series. For starters:
> > 
> >         apply_alternatives((struct alt_instr *)alt_instructions_,
> >                            (struct alt_instr *)alt_instructions_end_);
> > 
> > I can see how the pointer comparisons in apply_alternatives could be
> > considered wrong given the way the pointers are initialized:
> > 
> >     for ( a = base = start; a < end; a++ )
> >     {
> > 
> > start and end come from alt_instructions_ and alt_instructions_end_. It
> > doesn't matter that alt_instructions_ and alt_instructions_end_ are
> > "special", they could be perfectly normal integers and we would still
> > have the same problem: we cannot prove that "start" and "end" point to
> > the same object or subsequent objects in memory.
> > 
> > The way to fix it is by changing the parameters of apply_alternatives to
> > integer types, making comparisons between integers, and only using
> > pointers to access the data.
> 
> You know my position on casts from integer to pointer types, especially
> ones taking a type out of thin air. This applies to your addition to the
> apply_alternatives() construct as well as the alternative of adding such
> in order to access memory. The quote from the standard that I gave
> makes such casts not provably (by the compiler) defined behavior as
> well, so it all boils down to the same distinction as pointed out above in
> the first part of my reply here: _We_ can prove it, but the compiler
> can't. Hence we're still depending on whose proof is necessary to
> eliminate MISRA's undefined behavior concerns.

Comparisons between pointers to different objects are undefined by the C
spec, and not allowed by MISRAC.

Casting pointers to integers and casting integers to pointers is
implementation-defined, which is not the same thing as undefined.

Specifically, casting integers to pointers and pointers to integers is
allowed by MISRAC with the caveat that we should avoid misaligned
pointers (char* are always allowed), and that a compatible pointer type
is used when accessing the object (char* is always compatible). Stewart
will send a longer explanation over the weekend.
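
A small sketch of the kind of access this permits, assuming the address
arrives as a uintptr_t: the integer is converted to unsigned char * (which
has no alignment requirement and may alias any object), and the wider value
is assembled with memcpy rather than by dereferencing a possibly misaligned
uint32_t *. read_u32_at() is a hypothetical helper for illustration, not
something from the series.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Read a 32-bit value from an address held as an integer.
 * The only pointer formed is a character pointer, so neither the
 * alignment nor the compatible-type caveat is triggered.
 */
uint32_t read_u32_at(uintptr_t addr)
{
    const unsigned char *bytes;
    uint32_t value;

    assert(addr != 0);

    bytes = (const unsigned char *)addr;
    memcpy(&value, bytes, sizeof(value));

    return value;
}
```

The value is returned in the host byte order, exactly as it is stored in
memory.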

I don't make up the rules, I am only trying to follow them :-)


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-18 23:05                                                         ` Stefano Stabellini
@ 2019-01-21  5:24                                                           ` Stewart Hildebrand
       [not found]                                                           ` <5A96F2FD0200008D00417A66@prv1-mh.provo.novell.com>
  1 sibling, 0 replies; 102+ messages in thread
From: Stewart Hildebrand @ 2019-01-21  5:24 UTC (permalink / raw)
  To: Stefano Stabellini, Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Julien Grall, xen-devel

On Friday, January 18, 2019 6:05 PM, Stefano Stabellini wrote:
> On Fri, 18 Jan 2019, Jan Beulich wrote:
> > >>> On 18.01.19 at 02:24, <sstabellini@kernel.org> wrote:
> > > On Thu, 17 Jan 2019, Jan Beulich wrote:
> > >> >>> On 17.01.19 at 01:37, <sstabellini@kernel.org> wrote:
> > >> > On Wed, 16 Jan 2019, Jan Beulich wrote:
> > >> >> In any event - since intermediate variables merely hide the
> > >> >> casting from the compiler, but they don't remove the casts, the
> > >> >> solution involving casts is better imo, for incurring less overhead.
> > >> >
> > >> > This is where I completely disagree. The intermediate variables are not
> > >> > hiding casts from the compiler. There were never any pointers in this
> > >> > case.  The linker creates "symbols", not pointers, completely invisible
> > >> > from C land. Assembly uses these symbols to initialize variables. We
> > >> > expose these assembly variables as integers to C land. LD scripts and
> > >> > assembly have their own terminology and rules: neither "_start" nor
> > >> > "start" are pointers at any point in time. The operation done in var.S
> > >> > is not a cast. The C spec is happy, the compiler is happy, MISRA-C is
> > >> > happy. And we get to avoid the ugly SYMBOL macro that Linux uses. It is
> > >> > really a win-win.
> > >>
> > >> Well, that's a position one can take. But we have to settle on another
> > >> aspect then first: Does what is not done in C fall under C's rules? I
> > >> thought you were of the opinion that what comes from linker scripts
> > >> does. In which case what comes from assembly files ought to, too.
> > >> (FAOD my implication is: If the answer is yes, both approaches
> > >> violate C's rules. If the answer is no, no change is needed at all.)
> > >
> > > Great question, that is the core of the issue. Also, let me premise that
> > > I agree on the comments you made on the patches (I dislike "start_"
> > > too), and I can address them if we agree to continue down this path.
> > >
> > > But no, I do not think that what is done outside of C-land should follow
> > > C rules. But I do not agree with your conclusion that in that case there
> > > is no difference between the approaches. Let's get more into the
> > > details.
> > >
> > >
> > > 1) SYMBOL_HIDE returning pointer type
> > >
> > > Let's take _start and _end as an example. _start is born as a linker
> > > symbol, and it becomes a C pointer when we do:
> > >
> > >   extern char _start[], _end[]
> > >
> > > Now it is a pointer (actually I should say an array, but let's pretend
> > > they are the same thing for this discussion).
> > >
> > > When we do:
> > >
> > >   SYMBOL_HIDE(_end) - SYMBOL_HIDE(_start)
> > >
> > > We are still subtracting pointers: the pointers returned by SYMBOL_HIDE.
> > > We cannot prove that they are pointers to the same object or subsequent
> > > objects in memory, so it is undefined behavior, which is not allowed.
> >
> > Stop. No. We very much can prove they are - _end points at
> > one past the last element of _start[]. It is the compiler which
> > can't prove the opposite, and hence it can't leverage
> > undefined behavior for optimization purposes.
> 
> This is an interesting comment. However, even for normal pointers it is
> unreliable to count on one pointing one past the last element of the
> other. This was well explained in the GCC thread linked earlier in this
> thread. The vision of at least one of the GCC maintainers is that the
> compiler is free to place things in memory where it wishes, so as a
> programmer you cannot count on pointers pointing one past the last
> element of the other. Ever. In this case, where _start and _end are
> defined outside of C-land, I think it is even more true, and it remains
> undefined.
> 
> Moreover, I went back to MISRAC (finally I have a copy) and rule 18.2
> says: "subtraction between pointers shall only be applied to pointers
> that address elements of the same array". So, all the evidence we have
> seems to say that we cannot rely on _end pointing one past the last
> element of _start in this matter.
> 
> 
> > > 3) var.S + start_ as unsigned long
> > >
> > > With this approach, _start is born as a linker symbol. It is never
> > > exported to C, so from C point of view, it doesn't exist. There is
> > > another variable named "start_" defined in assembly and initialized to
> > > _start. Now we go into C land with:
> > >
> > >   extern uintptr_t start_, end_
> > >
> > > start_ and end_ are uintptr_t from the beginning from C point of view.
> > > They have never been pointers or in any way connected to _start. They
> > > are "clean".
> > >
> > > When we do:
> > >
> > >   end_ - start_
> > >
> > > it is a subtraction between uintptr_t, which is allowed. When we do:
> > >
> > >     for ( call = (const initcall_t *)initcall_start_;
> > >           (uintptr_t)call < presmp_initcall_end_;
> > >
> > > The comparison is still between uintptr_t types, and the value of "call"
> > > still comes from an unsigned long initially. There is never a comparison
> > > between dubious pointers. (Integer to pointer conversions and pointer
> > > to integer conversions are allowed by MISRA with some limitations, but I
> > > am double-checking.) Even:
> > >
> > >    (uintptr_t)random_pointer < presmp_initcall_end_
> > >
> > > would be acceptable because presmp_initcall_end_ is an integer and has
> > > always been an integer from C point of view.
> >
> > Well, as said - this is one of the possible positions to take. Personally
> > I see no difference between the helper symbols defined in
> > assembly sources, or in C sources the object files for which are never
> > made part of potential whole program optimization.
> 
> I don't think this is the case for MISRAC. C rules apply to C. Other
> rules apply to assembly and linker scripts. This is something that
> should be easy to check, and I hope that Stewart should be able to
> confirm.

Would it help to provide a guarantee that during processing of one
compilation unit, the compiler doesn't have visibility into other
compilation units or object files?

With GCC, we have the luxury of being able to specify no link time
optimization and no whole program optimization. This could involve
-fno-lto, -fno-whole-program, or both.

We should also specify to invoke the compiler separately for each .c file
(i.e. don't do "gcc -c foo.c bar.c", rather they should be separate steps
"gcc -c foo.c" and "gcc -c bar.c").

Can we agree that this would give us a guarantee C land is separate from
assembly and linker lands?

I have not investigated clang, but we should make sure we can provide this
guarantee for clang as well.

With those guarantees in place, can we agree that what happens in an
assembly source file is not subject to the potentially undefined behavior
for pointers to different objects described in C99 sections 6.5.6 and 6.5.8,
or to the "if and only if" clause in 6.5.9? (I'm not talking about inline
assembly.)

If we wanted to achieve a more warm and fuzzy feeling about this, we could
investigate potentially invoking "as" or "gcc -x assembler" for translating
an assembly source to object code (disclaimer: I didn't check to see if
this is already being done or not).

C99 footnote 56 gives us a hint about the intent as it relates to
execution environments, which reminded me: we might also consider
specifying certain requirements for the execution environment. For
example: "pointers shall be meaningfully able to be represented in integer
types" and/or "compare/subtract operations on pointers shall yield
meaningful results" or something like that (these probably could use some
word-smithing). I do believe we're already making certain assumptions
about memory addressing and execution environment - we should spell out
the assumptions clearly and specify that our choice of compilers, allowed
compiler options, and architectures don't violate the assumptions.
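
One way such assumptions could be spelled out in the code itself is sketched
below. This is only an illustration: it uses C11 static_assert rather than
anything Xen-specific, and roundtrip_ok() and flat_distance_ok() are made-up
names. The runtime helpers illustrate the properties on sample inputs; they
do not prove them for the whole address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption 1: every object pointer can be represented in uintptr_t. */
static_assert(sizeof(uintptr_t) >= sizeof(void *),
              "uintptr_t must be able to hold a pointer");

/* Assumption 2: converting to an integer and back yields the same pointer. */
int roundtrip_ok(const void *p)
{
    return (const void *)(uintptr_t)p == p;
}

/*
 * Assumption 3: the address space is flat, so the integer distance between
 * two converted pointers equals their distance in bytes. base + n must stay
 * within, or one past the end of, the same array.
 */
int flat_distance_ok(const char *base, size_t n)
{
    return (uintptr_t)(base + n) - (uintptr_t)base == n;
}
```

Checks like these document the assumption at the point where the code
depends on it, next to whatever the written requirements say.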


> > Using C files for this is still in conflict with the supposed
> > undefined behavior, but I think you agree that C and assembly files
> > could be set up such that the resulting binary data is identical. In
> > which case it is bogus to call one satisfactory, but not the other.
> 
> I see what you are saying, but it doesn't work that way from a spec
> compliance point of view.
> 
> 
> > > However, there are still a couple of issues not correctly solved by v8
> > > of the series. For starters:
> > >
> > >         apply_alternatives((struct alt_instr *)alt_instructions_,
> > >                            (struct alt_instr *)alt_instructions_end_);
> > >
> > > I can see how the pointer comparisons in apply_alternatives could be
> > > considered wrong given the way the pointers are initialized:
> > >
> > >     for ( a = base = start; a < end; a++ )
> > >     {
> > >
> > > start and end come from alt_instructions_ and alt_instructions_end_. It
> > > doesn't matter that alt_instructions_ and alt_instructions_end_ are
> > > "special", they could be perfectly normal integers and we would still
> > > have the same problem: we cannot prove that "start" and "end" point to
> > > the same object or subsequent objects in memory.
> > >
> > > The way to fix it is by changing the parameters of apply_alternatives to
> > > integer types, comparing integers, and only using
> > > pointers to access the data.
> >
> > You know my position on casts from integer to pointer types, especially
> > ones taking a type out of thin air. This applies to your addition to the
> > apply_alternatives() construct as well as the alternative of adding such
> > in order to access memory. The quote from the standard that I gave
> > makes such casts not provably (by the compiler) defined behavior as
> > well, so it all boils down to the same distinction as pointed out above in
> > the first part of my reply here: _We_ can prove it, but the compiler
> > can't. Hence we're still depending on whose proof is necessary to
> > eliminate MISRA's undefined behavior concerns.
> 
> Comparison between pointers to different objects is undefined by the C
> spec, and not allowed by MISRAC.
> 
> Casting pointers to integers and casting integers to pointers is
> implementation-defined, which is not the same thing as undefined.
> 
> Specifically, casting integers to pointers and pointers to integers is
> allowed by MISRAC with the caveat that we should avoid misaligned
> pointers (char* are always allowed), and that a compatible pointer type
> is used when accessing the object (char* is always compatible). Stewart
> will send a longer explanation over the weekend.
> 
> I don't make up the rules, I am only trying to follow them :-)

I'll get to that in a bit, but first, it's time for another radical new
idea. Let's call it approach number 4.

The undefined behavior and "if and only if" clause (C99 6.5.6/8/9) only
pertain to the subtract/compare operators. So, if we don't use the
subtract/compare operators in C land, we won't be subject to the undefined
behavior. Let's move the pointer subtract/compare operations to assembly.
Not inline assembly, but to a separate assembly source file.

We would write subroutines in assembly (callable from C) for each
subtract/compare operation required. For example:
char * subtract_ptr_ptr(char *, char *);
char * subtract_ptr_int(char *, uintptr_t);
int test_equal(char *, char *);

That could even open up the door for common operations like "_end - _start":
size_t get_program_size(void);

If we can prove to the compiler that we're subtracting/comparing pointers
to the same object, or one element past the last, then we're still OK to
use the subtract/compare operators. Otherwise, call these functions.

This approach relies on being able to provide some or all of the
guarantees discussed above.
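
To make the calling convention concrete, here is a minimal sketch of how
C code might use such helpers. The names ptr_diff_bytes(), ptr_before()
and count_bytes() are hypothetical, and the two helper bodies below are C
stand-ins (going through uintptr_t) only so that the example builds and
runs; in the approach described above they would be implemented in a
separate assembly source file instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interface; the real bodies would live in a .S file. */
ptrdiff_t ptr_diff_bytes(const void *a, const void *b);
int ptr_before(const void *a, const void *b);

/* C stand-ins for the assembly routines, for illustration only. */
ptrdiff_t ptr_diff_bytes(const void *a, const void *b)
{
    return (ptrdiff_t)((uintptr_t)a - (uintptr_t)b);
}

int ptr_before(const void *a, const void *b)
{
    return (uintptr_t)a < (uintptr_t)b;
}

/* Caller side: walk a region delimited by two addresses without ever
 * applying '<' or '-' to the pointers in C. */
size_t count_bytes(const char *start, const char *end)
{
    size_t n = 0;
    const char *p;

    for ( p = start; ptr_before(p, end); p++ )
        n++;

    return n;
}
```

A loop such as the one in apply_alternatives() would then call
ptr_before(a, end) in place of "a < end".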

Do you think this will prevent GCC from doing its code-breaking
optimization in question and help with MISRA C?


Lastly, back to the casts question (though it may be irrelevant if we
choose the approach I just outlined): the C standard guarantees that you
can reliably convert a void pointer to uintptr_t and back (C99 section
7.18.1.4). This is fully defined by the C standard: no unspecified,
implementation-defined, or undefined behavior about that. It does not make
the same guarantee for other pointer types. Rather, conversion between
pointer types (other than void*) and integers is implementation defined
(C99 section 6.3.2.3 paragraphs 5 and 6). Further, converting any pointer
type (except function pointer types) to a "void *" and back is not lossy
(C99 section 6.3.2.3 paragraph 1).

So, let's say you have a "char *" that you want to convert to uintptr_t,
you'd first have to convert to "void *".

char * im_a_char_ptr;
uintptr_t im_a_uintptr_t;
/* ... initialization ... */
im_a_uintptr_t = (uintptr_t)(void*)im_a_char_ptr;
im_a_char_ptr = (char*)(void*)im_a_uintptr_t;

It may not be pretty, and I fully sympathize with your resistance toward
unnecessary casts, but we have a fully C99 standard compliant way to
convert between uintptr_t and pointer types and back without loss, and
without relying on unspecified, implementation-defined, or undefined
behavior.

MISRA C advises that you shouldn't do such casting, but recognizes that it
is necessary in some cases, so it gives guidelines for the case when an
integer type is converted to a pointer type:
1. Take care to avoid misaligned pointers ("char *" will always be
   aligned, assuming certain properties of the execution environment)
2. Ensure that a compatible pointer type is used when accessing the object
   ("char *" is always guaranteed to be compatible)

I refer you to C99 section 6.5 paragraph 7, and MISRA C:2012 Rules 11.3,
11.4, 11.5 for further details.

Stew
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
       [not found]                                                           ` <9F511FC70200005E5C475325@prv1-mh.provo.novell.com>
  2019-01-18 11:09                                                             ` Jan Beulich
@ 2019-01-21  9:34                                                             ` Jan Beulich
  2019-01-21 10:22                                                               ` Julien Grall
  1 sibling, 1 reply; 102+ messages in thread
From: Jan Beulich @ 2019-01-21  9:34 UTC (permalink / raw)
  To: Julien Grall
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Stewart Hildebrand, xen-devel

>>> On 18.01.19 at 11:48, <julien.grall@arm.com> wrote:
> On 18/01/2019 09:54, Jan Beulich wrote:
>>>>> On 18.01.19 at 02:24, <sstabellini@kernel.org> wrote:
>>> On Thu, 17 Jan 2019, Jan Beulich wrote:
>>>>>>> On 17.01.19 at 01:37, <sstabellini@kernel.org> wrote:
>>>>> On Wed, 16 Jan 2019, Jan Beulich wrote:
>> Stop. No. We very much can prove they are - _end points at
>> one past the last element of _start[]. It is the compiler which
>> can't prove the opposite, and hence it can't leverage
>> undefined behavior for optimization purposes.
> 
> You keep saying the compiler can't leverage it for optimization purposes,
> however there are confirmations that GCC may actually leverage it (e.g.
> [1]). You actually need to trick the compiler to avoid the optimization
> (e.g. RELOC_HIDE).
> 
> So obviously, this is not only a MISRA "problem" as you state here and 
> below.
> 
> I believe Stefano, Stewart and I provided plenty of documentation/thread to 
> support our positions. Can you provide us documentation/thread showing the 
> compiler will not try to leverage that case?
> 
> Cheers,
> 
> [1] 
> https://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html?m=1

Btw., the __start[] / __end[] example given there does not match
up with what I see. Only symbols defined in the same CU as where
the comparison sits get "optimized" this way. Externs as well as
weak symbols defined locally don't get dealt with like this. And how
could they? Nothing tells the compiler that two distinct symbols
refer to two distinct objects. It is easy to create objects with
multiple names, not only in assembly but also in C (using the "alias"
attribute).
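
For illustration, a minimal sketch of one object with two names, using
the GCC/clang "alias" attribute on an ELF target. The identifiers counter,
counter_alias and names_share_storage() are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* One object ... */
int counter = 42;

/* ... and a second name for the same storage (GCC/clang extension). */
extern int counter_alias __attribute__((alias("counter")));

/* Returns 1 if a store through one name is visible through the other
 * and both names yield the same address. */
int names_share_storage(void)
{
    counter_alias = 7;

    return counter == 7 &&
           (uintptr_t)&counter == (uintptr_t)&counter_alias;
}
```

From the point of view of another CU that only sees two extern
declarations, nothing distinguishes this from two separate objects.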

Jan




^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
       [not found]                                                                 ` <3A8206D8020000035C475325@prv1-mh.provo.novell.com>
@ 2019-01-21  9:39                                                                   ` Jan Beulich
  0 siblings, 0 replies; 102+ messages in thread
From: Jan Beulich @ 2019-01-21  9:39 UTC (permalink / raw)
  To: Julien Grall
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Stewart Hildebrand, xen-devel

>>> On 18.01.19 at 16:22, <julien.grall@arm.com> wrote:
> On 18/01/2019 11:09, Jan Beulich wrote:
>>>>> On 18.01.19 at 11:48, <julien.grall@arm.com> wrote:
>>> On 18/01/2019 09:54, Jan Beulich wrote:
>>>>>>> On 18.01.19 at 02:24, <sstabellini@kernel.org> wrote:
>>>>> On Thu, 17 Jan 2019, Jan Beulich wrote:
>>>>>>>>> On 17.01.19 at 01:37, <sstabellini@kernel.org> wrote:
>>>>>>> On Wed, 16 Jan 2019, Jan Beulich wrote:
>>>> Stop. No. We very much can prove they are - _end points at
>>>> one past the last element of _start[]. It is the compiler which
>>>> can't prove the opposite, and hence it can't leverage
>>>> undefined behavior for optimization purposes.
>>>
>>> You keep saying the compiler can't leverage it for optimization purpose, however
>>> there are confirmations that GCC may actually leverage it (e.g [1]). You
>>> actually need to trick the compiler to avoid the optimization (e.g RELOC_HIDE).
>> 
>> Correct - that's the case I'm referring to when saying it can't leverage
>> undefined behavior optimizations anymore. Without the hiding of
>> course it can.
> 
> But this trick is GCC specific, right? So we would need to have one trick for 
> each compiler we support.

I don't think so; I can't see it to be legitimate for a compiler to derive
anything from what's inside an asm(). It may not be spelled out that
way, but it is my understanding that all knowledge the compiler is
allowed to derive from an asm() is encoded in the in/out/clobber etc
operands of the asm(); the first operand - the string literal - is
supposed to be opaque as far as the asm()'s operation goes.
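
A minimal sketch of the kind of hiding I have in mind, in the spirit of
Linux's RELOC_HIDE but not a copy of it. The helper names hide_ptr() and
hidden_distance() are hypothetical, and the code relies on the GCC/clang
extended asm syntax.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pass the value through an empty asm(). The "+r" operand tells the
 * compiler the asm may have modified v, so it cannot carry knowledge
 * about the original pointer across this point. */
static inline void *hide_ptr(void *p)
{
    uintptr_t v = (uintptr_t)p;

    __asm__ ( "" : "+r" (v) );

    return (void *)v;
}

/* Distance between two addresses whose origin the compiler cannot see. */
ptrdiff_t hidden_distance(char *start, char *end)
{
    return (char *)hide_ptr(end) - (char *)hide_ptr(start);
}
```

The asm body is empty, so no instruction is emitted; the only thing the
compiler learns is what the operand constraints say.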

Jan




^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
       [not found]                                                           ` <5A96F2FD0200008D00417A66@prv1-mh.provo.novell.com>
@ 2019-01-21  9:50                                                             ` Jan Beulich
  2019-01-21 23:41                                                               ` Stefano Stabellini
       [not found]                                                             ` <58377FAD0200004688BF86FB@prv1-mh.provo.novell.com>
  1 sibling, 1 reply; 102+ messages in thread
From: Jan Beulich @ 2019-01-21  9:50 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Julien Grall, Stewart Hildebrand, xen-devel

>>> On 19.01.19 at 00:05, <sstabellini@kernel.org> wrote:
> On Fri, 18 Jan 2019, Jan Beulich wrote:
>> >>> On 18.01.19 at 02:24, <sstabellini@kernel.org> wrote:
>> > On Thu, 17 Jan 2019, Jan Beulich wrote:
>> >> >>> On 17.01.19 at 01:37, <sstabellini@kernel.org> wrote:
>> >> > On Wed, 16 Jan 2019, Jan Beulich wrote:
>> >> >> In any event - since intermediate variables merely hide the
>> >> >> casting from the compiler, but they don't remove the casts, the
>> >> >> solution involving casts is better imo, for incurring less overhead.
>> >> > 
>> >> > This is where I completely disagree. The intermediate variables are not
>> >> > hiding casts from the compiler. There were never any pointers in this
>> >> > case.  The linker creates "symbols", not pointers, completely invisible
>> >> > from C land. Assembly uses these symbols to initialize variables. We
>> >> > expose these assembly variables as integers to C land. LD scripts and
>> >> > assembly have their own terminology and rules: neither "_start" nor
>> >> > "start" are pointers at any point in time. The operation done in var.S
>> >> > is not a cast. The C spec is happy, the compiler is happy, MISRA-C is
>> >> > happy. And we get to avoid the ugly SYMBOL macro that Linux uses. It is
>> >> > really a win-win.
>> >> 
>> >> Well, that's a position one can take. But we have to settle on another
>> >> aspect then first: Is what is not done in C subject to C's rules? I
>> >> thought you were of the opinion that what comes from linker scripts
>> >> does. In which case what comes from assembly files ought to, too.
>> >> (FAOD my implication is: If the answer is yes, both approaches
>> >> violate C's rules. If the answer is no, no change is needed at all.)
>> > 
>> > Great question, that is the core of the issue. Also, let me premise that
>> > I agree on the comments you made on the patches (I dislike "start_"
>> > too), and I can address them if we agree to continue down this path.
>> > 
>> > But no, I do not think that what is done outside of C-land should follow
>> > C rules. But I do not agree with your conclusion that in that case there
>> > is no difference between the approaches. Let's get more into the
>> > details.
>> > 
>> > 
>> > 1) SYMBOL_HIDE returning pointer type
>> > 
>> > Let's take _start and _end as an example. _start is born as a linker
>> > symbol, and it becomes a C pointer when we do:
>> > 
>> >   extern char _start[], _end[]
>> > 
>> > Now it is a pointer (actually I should say an array, but let's pretend
>> > they are the same thing for this discussion).
>> > 
>> > When we do:
>> > 
>> >   SYMBOL_HIDE(_end) - SYMBOL_HIDE(_start)
>> > 
>> > We are still subtracting pointers: the pointers returned by SYMBOL_HIDE.
>> > We cannot prove that they are pointers to the same object or subsequent
>> > objects in memory, so it is undefined behavior, which is not allowed.
>> 
>> Stop. No. We very much can prove they are - _end points at
>> one past the last element of _start[]. It is the compiler which
>> can't prove the opposite, and hence it can't leverage
>> undefined behavior for optimization purposes.
> 
> This is an interesting comment. However, even for normal pointers it is
> unreliable to count on one pointing one past the last element of the
> other. This was well explained in the GCC thread linked earlier in this
> thread. The vision of at least one of the GCC maintainers is that the
> compiler is free to place things in memory where it wishes, so as a
> programmer you cannot count on pointers pointing one past the last
> element of the other. Ever. In this case, where _start and _end are
> defined outside of C-land, I think it is even more true, and it remains
> undefined.

You mix up two things: One is the chance of two objects being
adjacent to one another. We don't care about this. The other is
a pointer truly pointing one past the last element of an array, as
will naturally result with e.g.

    for ( ptr = arr; ptr < arr + ARRAY_SIZE(arr); ++ptr )

It is this second case which all the cases we care about here fall
into. As per my other mail, just like the same object can have multiple
names, symbols may also refer to places other than the start of
an object; the fact that plain C can't produce such symbols is not
relevant as long as there's no requirement that C code may
interface only with other C code.
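
Spelled out as a complete function, a minimal sketch of that second case
might look like this. The array table[] and the function sum_table() are
made up for the example; ARRAY_SIZE is defined locally to keep it
self-contained.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static const int table[] = { 1, 2, 3, 4 };

int sum_table(void)
{
    /* One past the last element: may be formed and compared against,
     * but must not be dereferenced (C99 6.5.6 and 6.5.8). */
    const int *end = table + ARRAY_SIZE(table);
    const int *ptr;
    int sum = 0;

    for ( ptr = table; ptr < end; ++ptr )
        sum += *ptr;

    return sum;
}
```

The _start[] / _end case has the same shape, with _end playing the role
of "end" here.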

> Moreover, I went back to MISRAC (finally I have a copy) and rule 18.2
> says: "subtraction between pointers shall only be applied to pointers
> that address elements of the same array". So, all the evidence we have
> seems to say that we cannot rely on _end pointing one past the last
> element of _start in this matter.

With the C standard's wording in mind, this surely is to include
the "one past the last element" case, in which case all is fine. _end
does not point at or into a different object, it points at the end of
_start[].

>> > 3) var.S + start_ as unsigned long
>> > 
>> > With this approach, _start is born as a linker symbol. It is never
>> > exported to C, so from C point of view, it doesn't exist. There is
>> > another variable named "start_" defined in assembly and initialized to
>> > _start. Now we go into C land with:
>> > 
>> >   extern uintptr_t start_, end_
>> > 
>> > start_ and end_ are uintptr_t from the beginning from C point of view.
>> > They have never been pointers or in any way connected to _start. They
>> > are "clean".
>> > 
>> > When we do:
>> > 
>> >   end_ - start_
>> > 
>> > it is a subtraction between uintptr_t, which is allowed. When we do:
>> > 
>> >     for ( call = (const initcall_t *)initcall_start_;
>> >           (uintptr_t)call < presmp_initcall_end_;
>> > 
>> > The comparison is still between uintptr_t types, and the value of "call"
>> > still comes from an unsigned long initially. There is never a comparison
>> > between dubious pointers. (Integer to pointer conversions and pointer
>> > to integer conversions are allowed by MISRA with some limitations, but I
>> > am double-checking.) Even:
>> > 
>> >    (uintptr_t)random_pointer < presmp_initcall_end_
>> > 
>> > would be acceptable because presmp_initcall_end_ is an integer and has
>> > always been an integer from C point of view.
>> 
>> Well, as said - this is one of the possible positions to take. Personally
>> I see no difference between the helper symbols defined in
>> assembly sources, or in C sources the object files for which are never
>> made part of potential whole program optimization. 
> 
> I don't think this is the case for MISRAC. C rules apply to C. Other
> rules apply to assembly and linker scripts. This is something that
> should be easy to check, and I hope that Stewart should be able to
> confirm.

As per above - the interesting aspect is what rules apply to the
case of C interfacing with another language.

>> Using C files for this is still in conflict with the supposed
>> undefined behavior, but I think you agree that C and assembly files
>> could be set up such that the resulting binary data is identical. In
>> which case it is bogus to call one satisfactory, but not the other.
> 
> I see what you are saying, but it doesn't work that way from a spec
> compliance point of view.
> 
> 
>> > However, there are still a couple of issues not correctly solved by v8
>> > of the series. For starters: 
>> > 
>> >         apply_alternatives((struct alt_instr *)alt_instructions_,
>> >                            (struct alt_instr *)alt_instructions_end_);
>> > 
>> > I can see how the pointers comparisons in apply_alternatives could be
>> > considered wrong given the way the pointers are initialized:
>> > 
>> >     for ( a = base = start; a < end; a++ )
>> >     {
>> > 
>> > start and end come from alt_instructions_ and alt_instructions_end_. It
>> > doesn't matter that alt_instructions_ and alt_instructions_end_ are
>> > "special", they could be perfectly normal integers and we would still
>> > have the same problem: we cannot prove that "start" and "end" point to
>> > the same object or subsequent objects in memory.
>> > 
>> > The way to fix it is by changing the parameters of apply_alternatives to
>> > integer types, comparing integers, and only using
>> > pointers to access the data.
>> 
>> You know my position on casts from integer to pointer types, especially
>> ones taking a type out of thin air. This applies to your addition to the
>> apply_alternatives() construct as well as the alternative of adding such
>> in order to access memory. The quote from the standard that I gave
>> makes such casts not provably (by the compiler) defined behavior as
>> well, so it all boils down to the same distinction as pointed out above in
>> the first part of my reply here: _We_ can prove it, but the compiler
>> can't. Hence we're still depending on whose proof is necessary to
>> eliminate MISRA's undefined behavior concerns.
> 
> Comparison between pointers to different objects is undefined by the C
> spec, and not allowed by MISRAC.
> 
> Casting pointers to integers and casting integers to pointers is
> implementation-defined, which is not the same thing as undefined.

Of course it is not, but the result possibly not even being a valid
pointer can't make it much better than "undefined".

> I don't make up the rules, I am only trying to follow them :-)

Sure. But we shouldn't uglify our code just to follow insane
(exaggeration intended) rules.

Jan



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
       [not found]                                                             ` <58377FAD0200004688BF86FB@prv1-mh.provo.novell.com>
@ 2019-01-21 10:06                                                               ` Jan Beulich
  2019-02-06 15:41                                                                 ` Ian Jackson
  0 siblings, 1 reply; 102+ messages in thread
From: Jan Beulich @ 2019-01-21 10:06 UTC (permalink / raw)
  To: Stewart Hildebrand
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, xen-devel

>>> On 21.01.19 at 06:24, <Stewart.Hildebrand@dornerworks.com> wrote:
> On Friday, January 18, 2019 6:05 PM, Stefano Stabellini wrote:
>> I don't think this is the case for MISRAC. C rules apply to C. Other
>> rules apply to assembly and linker scripts. This is something that
>> should be easy to check, and I hope that Stewart should be able to
>> confirm.
> 
> Would it help to provide a guarantee that during processing of one
> compilation unit, the compiler doesn't have visibility into other
> compilation units or object files?
> 
> With GCC, we have the luxury of being able to specify no link time
> optimization and no whole program optimization. This could also involve
> one of -fno-lto, -fno-whole-program, or both.
> 
> We should also specify to invoke the compiler separately for each .c file
> (i.e. don't do "gcc -c foo.c bar.c", rather they should be separate steps
> "gcc -c foo.c" and "gcc -c bar.c").

I don't see how use of whole program optimization matters here:
In order to do so, the compiler leverages information it has stored
in the object files originating from .c ones. Such information is
necessarily missing from object files resulting from assembly sources
or the symbols originating from linker scripts.

> Can we agree that this would give us a guarantee C land is separate from
> assembly and linker lands?
> 
> I have not investigated clang, but we should make sure we can provide this
> guarantee for clang as well.
> 
> With those guarantees in place, can we agree that what happens in an
> assembly source file is not subject to the potentially undefined behavior
> for pointers to different objects described in C99 sections 6.5.6 and
> 6.5.8, and the "if and only if" clause in 6.5.9? (I'm not talking about
> inline assembly.)

I simply don't know. Interfacing with other languages is, I'm
afraid, beyond the scope of the C spec.

>> Comparison between pointers to different objects is undefined by the C
>> spec, and not allowed by MISRAC.
>> 
>> Casting pointers to integers and casting integers to pointers is
>> implementation-defined, which is not the same thing as undefined.
>> 
>> Specifically, casting integers to pointers and pointers to integers is
>> allowed by MISRAC with the caveat that we should avoid misaligned
>> pointers (char* are always allowed), and that a compatible pointer type
>> is used when accessing the object (char* is always compatible). Stewart
>> will send a longer explanation over the weekend.
>> 
>> I don't make up the rules, I am only trying to follow them :-)
> 
> I'll get to that in a bit, but first, it's time for another radical new
> idea. Let's call it approach number 4.
> 
> The undefined behavior and "if and only if" clause (C99 6.5.6/8/9) only
> pertain to the subtract/compare operators. So, if we don't use the
> subtract/compare operators in C land, we won't be subject to the undefined
> behavior. Let's move the pointer subtract/compare operations to assembly.
> Not inline assembly, but to a separate assembly source file.
> 
> We would write subroutines in assembly (callable from C) for each
> subtract/compare operation required. For example:
> char * subtract_ptr_ptr(char *, char *);
> char * subtract_ptr_int(char *, uintptr_t);
> int test_equal(char *, char *);
> 
> That could even open up the door for common operations like "_end - _start":
> size_t get_program_size(void);
> 
> If we can prove to the compiler that we're subtracting/comparing pointers
> to the same object, or one element past the last, then we're still OK to
> use the subtract/compare operators. Otherwise, call these functions.
> 
> This approach relies on being able to provide some or all of the
> guarantees discussed above.
> 
> Do you think this will prevent GCC from doing its code-breaking
> optimization in question and help with MISRA C?

As per my earlier reply, I've yet to see proof of a "code-breaking
optimization" that actually matches our case(s). As to MISRA-C -
maybe; I simply can't tell. What I can tell, though, is that in terms of
code uglification this new approach is not really better than what
was proposed before. Anyway - before thinking about the least
bad option of how to change our code, I'd like to be convinced
that we need to make changes in the first place.

> Lastly, back to the casts question (though it may be irrelevant if we
> choose the approach I just outlined): the C standard guarantees that you
> can reliably convert a void pointer to uintptr_t and back (C99 section
> 7.18.1.4). This is fully defined by the C standard: no unspecified,
> implementation-defined, or undefined behavior about that. It does not make
> the same guarantee for other pointer types. Rather, conversion between
> pointer types (other than void*) and integers is implementation defined
> (C99 section 6.3.2.3 paragraphs 5 and 6). Further, converting any pointer
> type (except function pointer types) to a "void *" and back is not lossy
> (C99 section 6.3.2.3 paragraph 1).

Ah yes, I see. Two caveats: It's the library part of the spec,
and hence not directly applicable (as we simply have no library in the
hypervisor). And the two types are optional. But yes, I agree it helps
clarify the overall intent.

> So, let's say you have a "char *" that you want to convert to uintptr_t,
> you'd first have to convert to "void *".
> 
> char * im_a_char_ptr;
> uintptr_t im_a_uintptr_t;
> /* ... initialization ... */
> im_a_uintptr_t = (uintptr_t)(void*)im_a_char_ptr;
> im_a_char_ptr = (char*)(void*)im_a_uintptr_t;
> 
> It may not be pretty, and I fully sympathize with your resistance toward
> unnecessary casts, but we have a fully C99 standard compliant way to
> convert between uintptr_t and pointer types and back without loss, and
> without relying on unspecified, implementation-defined, or undefined
> behavior.

Except that, as was mentioned before, it remains unclear whether the
compiler may legitimately "look through" such casts and apply gained
knowledge to its "undefined behavior optimization".

> MISRA C advises that you shouldn't do such casting, but recognizes that it
> is necessary in some cases, so it gives guidelines for the case when an
> integer type is converted to a pointer type:
> 1. Take care to avoid misaligned pointers ("char *" will always be
>    aligned, assuming certain properties of the execution environment)
> 2. Ensure that a compatible pointer type is used when accessing the object
>    ("char *" is always guaranteed to be compatible)

And what would proof of, in particular, point 2 look like for a random
piece of code?

Jan



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-21  9:34                                                             ` Jan Beulich
@ 2019-01-21 10:22                                                               ` Julien Grall
       [not found]                                                                 ` <E16AB350020000435C475325@prv1-mh.provo.novell.com>
  0 siblings, 1 reply; 102+ messages in thread
From: Julien Grall @ 2019-01-21 10:22 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Stewart Hildebrand, xen-devel

Hi Jan,

On 21/01/2019 09:34, Jan Beulich wrote:
>>>> On 18.01.19 at 11:48, <julien.grall@arm.com> wrote:
>> On 18/01/2019 09:54, Jan Beulich wrote:
>>>>>> On 18.01.19 at 02:24, <sstabellini@kernel.org> wrote:
>>>> On Thu, 17 Jan 2019, Jan Beulich wrote:
>>>>>>>> On 17.01.19 at 01:37, <sstabellini@kernel.org> wrote:
>>>>>> On Wed, 16 Jan 2019, Jan Beulich wrote:
>>> Stop. No. We very much can prove they are - _end points at
>>> one past the last element of _start[]. It is the compiler which
>>> can't prove the opposite, and hence it can't leverage
>>> undefined behavior for optimization purposes.
>>
>> You keep saying the compiler can't leverage it for optimization purpose,
>> however
>> there are confirmations that GCC may actually leverage it (e.g [1]). You
>> actually need to trick the compiler to avoid the optimization (e.g
>> RELOC_HIDE).
>>
>> So obviously, this is not only a MISRA "problem" as you state here and
>> below.
>>
>> I believe Stefano, Stewart and I provided plenty of documentation/thread to
>> support our positions. Can you provide us documentation/thread showing the
>> compiler will not try to leverage that case?
>>
>> Cheers,
>>
>> [1]
>> https://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html?m=1
> 
> Btw., the __start[] / __end[] example given there does not match
> up with what I see.
That is what you see in a specific version of GCC. It does not mean this
behavior holds across all released versions and future ones.

> Only symbols defined in the same CU as where
> the comparison sits get "optimized" this way. Externs as well as
> weak symbols defined locally don't get dealt with like this. And how
> could they? Nothing tells the compiler that two distinct symbols
> refer to two distinct objects. It is easy to create objects with
> multiple names, not only in assembly but also in C (using the "alias"
> attribute).

Similarly, nothing tells the compiler that they are not two distinct symbols. 
You haven't yet provided evidence a compiler cannot use that for optimization.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
       [not found]                                                                 ` <E16AB350020000435C475325@prv1-mh.provo.novell.com>
@ 2019-01-21 10:31                                                                   ` Jan Beulich
  2019-01-21 23:15                                                                     ` Stefano Stabellini
  0 siblings, 1 reply; 102+ messages in thread
From: Jan Beulich @ 2019-01-21 10:31 UTC (permalink / raw)
  To: Julien Grall
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Stewart Hildebrand, xen-devel

>>> On 21.01.19 at 11:22, <julien.grall@arm.com> wrote:
> Hi Jan,
> 
> On 21/01/2019 09:34, Jan Beulich wrote:
>>>>> On 18.01.19 at 11:48, <julien.grall@arm.com> wrote:
>>> On 18/01/2019 09:54, Jan Beulich wrote:
>>>>>>> On 18.01.19 at 02:24, <sstabellini@kernel.org> wrote:
>>>>> On Thu, 17 Jan 2019, Jan Beulich wrote:
>>>>>>>>> On 17.01.19 at 01:37, <sstabellini@kernel.org> wrote:
>>>>>>> On Wed, 16 Jan 2019, Jan Beulich wrote:
>>>> Stop. No. We very much can prove they are - _end points at
>>>> one past the last element of _start[]. It is the compiler which
>>>> can't prove the opposite, and hence it can't leverage
>>>> undefined behavior for optimization purposes.
>>>
>>> You keep saying the compiler can't leverage it for optimization purpose,
>>> however
>>> there are confirmations that GCC may actually leverage it (e.g [1]). You
>>> actually need to trick the compiler to avoid the optimization (e.g
>>> RELOC_HIDE).
>>>
>>> So obviously, this is not only a MISRA "problem" as you state here and
>>> below.
>>>
>>> I believe Stefano, Stewart and I provided plenty of documentation/thread to
>>> support our positions. Can you provide us documentation/thread showing the
>>> compiler will not try to leverage that case?
>>>
>>> Cheers,
>>>
>>> [1]
>>> 
> https://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html?m=1
>> 
>> Btw., the __start[] / __end[] example given there does not match
>> up with what I see.
> What you see in a specific version of GCC. This does not mean this behavior is 
> valid across all the released versions and future one.

Are you suggesting that for the purpose of certification we need to
deal with compiler bugs? Imo such a compiler should simply be
excluded from use for building Xen.

>> Only symbols defined in the same CU as where
>> the comparison sits get "optimized" this way. Externs as well as
>> weak symbols defined locally don't get dealt with like this. And how
>> could they? Nothing tells the compiler that two distinct symbols
>> refer to two distinct objects. It is easy to create objects with
>> multiple names, not only in assembly but also in C (using the "alias"
>> attribute).
> 
> Similarly, nothing tells the compiler that they are not two distinct symbols. 
> You haven't yet provided evidence a compiler cannot use that for optimization.

The compiler can leverage for optimization only what it can prove
(to be undefined behavior or symbols referring to distinct objects
or ...). A compiler may never use guesses for optimization. That
is, in the case here it is not us who need to tell the compiler that
two different symbols may refer to the same object; rather, it is the
compiler which needs to prove that two symbols cannot possibly
refer to the same object. This is possible for automatic and static
objects. This is also possible for some non-static objects defined
in the CU under compilation. But this is not possible in the general
case.
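
To illustrate the point about one object having several names, here is a
minimal sketch. It is not code from Xen; the names table_start and
table_alias are made up for the example, and the "alias" attribute is a GNU
extension that assumes GCC or Clang on an ELF target.

```c
#include <stdint.h>

/* One object with two names. Both identifiers are hypothetical. */
int table_start[4] = { 1, 2, 3, 4 };
extern int table_alias[4] __attribute__((alias("table_start")));

/*
 * Returns 1 when both names resolve to the same address. The volatile
 * locals make the comparison operate on the stored address values at
 * run time rather than being folded at compile time.
 */
static int names_share_address(void)
{
    int *volatile a = table_start;
    int *volatile b = table_alias;

    return (uintptr_t)a == (uintptr_t)b;
}
```

From the declarations alone, a translation unit that only sees two extern
arrays has no way of telling whether they are set up like this elsewhere.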

Jan




* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-21 10:31                                                                   ` Jan Beulich
@ 2019-01-21 23:15                                                                     ` Stefano Stabellini
       [not found]                                                                       ` <5EA2B4FA0200008000417A66@prv1-mh.provo.novell.com>
  0 siblings, 1 reply; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-21 23:15 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Stewart Hildebrand,
	xen-devel

On Mon, 21 Jan 2019, Jan Beulich wrote:
> >>> On 21.01.19 at 11:22, <julien.grall@arm.com> wrote:
> > Hi Jan,
> > 
> > On 21/01/2019 09:34, Jan Beulich wrote:
> >>>>> On 18.01.19 at 11:48, <julien.grall@arm.com> wrote:
> >>> On 18/01/2019 09:54, Jan Beulich wrote:
> >>>>>>> On 18.01.19 at 02:24, <sstabellini@kernel.org> wrote:
> >>>>> On Thu, 17 Jan 2019, Jan Beulich wrote:
> >>>>>>>>> On 17.01.19 at 01:37, <sstabellini@kernel.org> wrote:
> >>>>>>> On Wed, 16 Jan 2019, Jan Beulich wrote:
> >>>> Stop. No. We very much can prove they are - _end points at
> >>>> one past the last element of _start[]. It is the compiler which
> >>>> can't prove the opposite, and hence it can't leverage
> >>>> undefined behavior for optimization purposes.
> >>>
> >>> You keep saying the compiler can't leverage it for optimization purpose,
> >>> however
> >>> there are confirmations that GCC may actually leverage it (e.g [1]). You
> >>> actually need to trick the compiler to avoid the optimization (e.g
> >>> RELOC_HIDE).
> >>>
> >>> So obviously, this is not only a MISRA "problem" as you state here and
> >>> below.
> >>>
> >>> I believe Stefano, Stewart and I provided plenty of documentation/thread to
> >>> support our positions. Can you provide us documentation/thread showing the
> >>> compiler will not try to leverage that case?
> >>>
> >>> Cheers,
> >>>
> >>> [1]
> >>> 
> > https://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html?m=1
> >> 
> >> Btw., the __start[] / __end[] example given there does not match
> >> up with what I see.
> > What you see in a specific version of GCC. This does not mean this behavior is 
> > valid across all the released versions and future one.
> 
> Are you suggesting that for the purpose of certification we need to
> deal with compiler bugs? Imo such a compiler should simply be
> excluded for use to build Xen.
> 
> >> Only symbols defined in the same CU as where
> >> the comparison sits get "optimized" this way. Externs as well as
> >> weak symbols defined locally don't get dealt with like this. And how
> >> could they? Nothing tells the compiler that two distinct symbols
> >> refer to two distinct objects. It is easy to create objects with
> >> multiple names, not only in assembly but also in C (using the "alias"
> >> attribute).
> > 
> > Similarly, nothing tells the compiler that they are not two distinct symbols. 
> > You haven't yet provided evidence a compiler cannot use that for optimization.
> 
> The compiler can leverage for optimization only what it can prove
> (to be undefined behavior or symbols referring to distinct objects
> or ...). A compiler may never use guesses for optimization. That
> is in the case here it is not us who need to tell the compiler that
> two different symbols may refer to the same object, but it is the
> compiler which needs to prove that two symbols cannot possibly
> refer to the same object. This is possible for automatic and static
> objects. This is also possible for some non-static objects defined
> in the CU under compilation. But this is not possible in the general
> case.

Clearly from the GCC thread not everybody agrees with you:

 Just because two pointers print the same and have the same bit-pattern 
 doesn't mean they need to compare equal

 So the only way within the C standard you could deduce that two objects 
 follow each other in memory is that the address of one compares equal to 
 one past the address of the other - but that does not mean they follow 
 each other in memory for any other comparison.

  > Are you saying it's possible that y immediately follows x in the
  > address space when that line of output is printed, and that y *doesn't*
  > immediately follow x in the address space when "inconsistent behavior:"
  > is printed?
  
  Yes.
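
For reference, the kind of comparison that discussion is about can be
sketched as below. This is an illustration only, with made-up names; whether
y ends up directly after x, and whether the comparison is evaluated at
compile time or at run time, is up to the toolchain, so no particular result
should be assumed.

```c
/* Two unrelated objects; their relative placement is not under our control. */
static int x, y;

/*
 * Compares "one past x" with "address of y". The result is 0 or 1, but
 * which one depends on object layout and on what the compiler decides
 * it can fold.
 */
static int one_past_x_equals_y(void)
{
    int *p = &x + 1;
    int *q = &y;

    return p == q;
}
```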


* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-21  9:50                                                             ` Jan Beulich
@ 2019-01-21 23:41                                                               ` Stefano Stabellini
  2019-01-22  6:08                                                                 ` Juergen Gross
       [not found]                                                                 ` <42A2C4FA0200009000417A66@prv1-mh.provo.novell.com>
  0 siblings, 2 replies; 102+ messages in thread
From: Stefano Stabellini @ 2019-01-21 23:41 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Stewart Hildebrand,
	xen-devel

On Mon, 21 Jan 2019, Jan Beulich wrote:
> >>> On 19.01.19 at 00:05, <sstabellini@kernel.org> wrote:
> > On Fri, 18 Jan 2019, Jan Beulich wrote:
> >> >>> On 18.01.19 at 02:24, <sstabellini@kernel.org> wrote:
> >> > On Thu, 17 Jan 2019, Jan Beulich wrote:
> >> >> >>> On 17.01.19 at 01:37, <sstabellini@kernel.org> wrote:
> >> >> > On Wed, 16 Jan 2019, Jan Beulich wrote:
> >> >> >> In any event - since intermediate variables merely hide the
> >> >> >> casting from the compiler, but they don't remove the casts, the
> >> >> >> solution involving casts is better imo, for incurring less overhead.
> >> >> > 
> >> >> > This is where I completely disagree. The intermediate variables are not
> >> >> > hiding casts from the compiler. There were never any pointers in this
> >> >> > case.  The linker creates "symbols", not pointers, completely invisible
> >> >> > from C land. Assembly uses these symbols to initialize variables. We
> >> >> > expose these assembly variables as integer to C lands. LD scripts and
> >> >> > assembly have their own terminology and rules: neither "_start" nor
> >> >> > "start" are pointers at any point in time. The operations done in var.S
> >> >> > is not a cast. The C spec is happy, the compiler is happy, MISRA-C is
> >> >> > happy. And we get to avoid the ugly SYMBOL macro that Linux uses. It is
> >> >> > really a win-win.
> >> >> 
> >> >> Well, that's a position one can take. But we have to settle on another
> >> >> aspect then first: Does what is not done in C fall under C's rules? I
> >> >> thought you were of the opinion that what comes from linker scripts
> >> >> does. In which case what comes from assembly files ought to, too.
> >> >> (FAOD my implication is: If the answer is yes, both approaches
> >> >> violate C's rules. If the answer is no, no change is needed at all.)
> >> > 
> >> > Great question, that is the core of the issue. Also, let me premise that
> >> > I agree on the comments you made on the patches (I dislike "start_"
> >> > too), and I can address them if we agree to continue down this path.
> >> > 
> >> > But no, I do not think that what is done outside of C-land should follow
> >> > C rules. But I do not agree with your conclusion that in that case there
> >> > is no difference between the approaches. Let's get more into the
> >> > details.
> >> > 
> >> > 
> >> > 1) SYMBOL_HIDE returning pointer type
> >> > 
> >> > Let's take _start and _end as an example. _start is born as a linker
> >> > symbol, and it becomes a C pointer when we do:
> >> > 
> >> >   extern char _start[], _end[]
> >> > 
> >> > Now it is a pointer (actually I should say an array, but let's pretend
> >> > they are the same thing for this discussion).
> >> > 
> >> > When we do:
> >> > 
> >> >   SYMBOL_HIDE(_end) - SYMBOL_HIDE(_start)
> >> > 
> >> > We are still subtracting pointers: the pointers returned by SYMBOL_HIDE.
> >> > We cannot prove that they are pointers to the same object or subsequent
> >> > objects in memory, so it is undefined behavior, which is not allowed.
> >> 
> >> Stop. No. We very much can prove they are - _end points at
> >> one past the last element of _start[]. It is the compiler which
> >> can't prove the opposite, and hence it can't leverage
> >> undefined behavior for optimization purposes.
> > 
> > This is an interesting comment. However, even for normal pointers it is
> > unreliable to count on one pointing one past the last element of the
> > other. This was well explained in the GCC thread linked earlier in this
> > thread. The vision of at least one of the GCC maintainers is that the
> > compiler is free to place things in memory where it wishes, so as a
> > programmer you cannot count on pointers pointing one past the last
> > element of the other. Ever. In this case, where _start and _end are
> > defined outside of C-land, I think it is even more true, and it remains
> > undefined.
> 
> You mix up two things: One is the chance of two objects being
> adjacent to one another. We don't care about this. The other is
> a pointer truly pointing one past the last element of an array (as
> will naturally result with e.g.
> 
>     for ( ptr = arr; ptr < arr + ARRAY_SIZE(arr); ++ptr )
> 
> It is this second case which all the cases we care about here fall
> into. As per my other mail, just like the same object can have multiple
> names, symbols may also refer to places other than the start of
> an object; the fact that plain C can't produce such symbols is not
> relevant as long as there's no requirement that C code may
> interface only with other C code.

The chance of two objects being adjacent to one another is relevant
because the compiler could rightfully decide that the programmer can
never rely on pointers pointing one past the last element of the other,
even if they truly point one past the last element of an array.

Otherwise, Linux would have never needed to introduce RELOC_HIDE in the
first place.
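
As a rough sketch of the idea behind such a macro (this is not the Linux
definition; HIDE_PTR, region and region_size are hypothetical names, and the
code assumes GNU C statement expressions and inline asm):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * The empty asm passes the address through a register in a way the
 * optimizer cannot see through, so it can no longer reason about which
 * object the resulting pointer refers to.
 */
#define HIDE_PTR(ptr)                                           \
    ({ uintptr_t hidden_;                                       \
       __asm__ ("" : "=r" (hidden_) : "0" ((uintptr_t)(ptr)));  \
       (__typeof__((ptr) + 0))hidden_; })

static char region[16];

/* Distance between the hidden start and the hidden one-past-the-end. */
static ptrdiff_t region_size(void)
{
    char *start = HIDE_PTR(region);
    char *end = HIDE_PTR(region + sizeof(region));

    return end - start;
}
```

Note that the result is still a pointer, which is why this corresponds to
approach 1 rather than to the integer-based approaches.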


> > Moreover, I went back to MISRAC (finally I have a copy) and rule 18.2
> > says: "subtraction between pointers shall only be applied to pointers
> > that address elements of the same array". So, all the evidence we have
> > seems to say that we cannot rely on _end pointing one past the last
> > element of _start in this matter.
> 
> With the C standard's wording in mind, this surely is to include
> the "one past the last element" case, in which case all is fine. _end
> does not point at or into a different object, it points at the end of
> _start[].
>
> >> > 3) var.S + start_ as unsigned long
> >> > 
> >> > With this approach, _start is born as a linker symbol. It is never
> >> > exported to C, so from C point of view, it doesn't exist. There is
> >> > another variable named "start_" defined in assembly and initialized to
> >> > _start. Now we go into C land with:
> >> > 
> >> >   extern uintptr_t start_, end_
> >> > 
> >> > start_ and end_ are uintptr_t from the beginning from C point of view.
> >> > They have never been pointers or in any way connected to _start. They
> >> > are "clean".
> >> > 
> >> > When we do:
> >> > 
> >> >   end_ - start_
> >> > 
> >> > it is a subtraction between uintptr_t, which is allowed. When we do:
> >> > 
> >> >     for ( call = (const initcall_t *)initcall_start_;
> >> >           (uintptr_t)call < presmp_initcall_end_;
> >> > 
> >> > The comparison is still between uintptr_t types, and the value of "call"
> >> > still comes from an unsigned long initially. There is never a comparison
> >> > between dubious pointers. (Integer to pointer conversions and pointer
> >> > to integer conversions are allowed by MISRA with some limitations, but I
> >> > am double-checking.) Even:
> >> > 
> >> >    (uintptr_t)random_pointer < presmp_initcall_end_
> >> > 
> >> > would be acceptable because presmp_initcall_end_ is an integer and has
> >> > always been an integer from C point of view.
> >> 
> >> Well, as said - this is one of the possible positions to take. Personally
> >> I see no difference between the helper symbols defined in
> >> assembly sources, or in C sources the object files for which are never
> >> made part of potential whole program optimization. 
> > 
> > I don't think this is the case for MISRAC. C rules apply to C. Other
> > rules apply to assembly and linker scripts. This is something that
> > should be easy to check, and I hope that Stewart should be able to
> > confirm.
> 
> As per above - the interesting aspect is what rules apply to the
> case of C interfacing with another language.
> 
> >> Using C files for this is still in conflict with the supposed
> >> undefined behavior, but I think you agree that C and assembly files
> >> could be set up such that the resulting binary data is identical. In
> >> which case it is bogus to call one satisfactory, but not the other.
> > 
> > I see what you are saying, but it doesn't work that way from a spec
> > compliance point of view.
> > 
> > 
> >> > However, there are still a couple of issues not correctly solved by v8
> >> > of the series. For starters: 
> >> > 
> >> >         apply_alternatives((struct alt_instr *)alt_instructions_,
> >> >                            (struct alt_instr *)alt_instructions_end_);
> >> > 
> >> > I can see how the pointers comparisons in apply_alternatives could be
> >> > considered wrong given the way the pointers are initialized:
> >> > 
> >> >     for ( a = base = start; a < end; a++ )
> >> >     {
> >> > 
> >> > start and end come from alt_instructions_ and alt_instructions_end_. It
> >> > doesn't matter that alt_instructions_ and alt_instructions_end_ are
> >> > "special", they could be perfectly normal integers and we would still
> >> > have the same problem: we cannot prove that "start" and "end" point to
> >> > the same object or subsequent objects in memory.
> >> > 
> >> > The way to fix it is by changing the parameters of apply_alternatives to
> >> > integer types, making comparison between integers, and only using
> >> > pointers to access the data.
> >> 
> >> You know my position on casts from integer to pointer types, especially
> >> ones taking a type out of thin air. This applies to your addition to the
> >> apply_alternatives() construct as well as the alternative of adding such
> >> in order to access memory. The quote from the standard that I gave
> >> makes such casts not provably (by the compiler) defined behavior as
> >> well, so it all boils down to the same distinction as pointed out above in
> >> the first part of my reply here: _We_ can prove it, but the compiler
> >> can't. Hence we're still depending on whose proof is necessary to
> >> eliminate MISRA's undefined behavior concerns.
> > 
> > Comparisons between pointers to different objects is undefined by the C
> > spec, and not allowed by MISRAC.
> > 
> > Casting pointers to integers and casting integers to pointers is
> > implementation-defined, which is not the same thing as undefined.
> 
> Of course it is not, but the result possibly not even being a valid
> pointer can't make it much better than "undefined".
>
> > I don't make up the rules, I am only trying to follow them :-)
> 
> Sure. But we shouldn't uglify our code just to follow insane
> (exaggeration intended) rules.

We haven't managed to reach consensus on this topic. Your view might be
correct, but it is not necessarily supported by compiler behavior,
which depends on the opinion of compiler engineers on the topic, or by
MISRAC compliance, which depends on the opinion of MISRAC specialists on
the topic. If we take your suggested approach we end up with the code
most likely to break in case the compiler engineers or the MISRAC
experts disagree with you. In this case, being right doesn't necessarily
lead to code that is less likely to break.

Regardless, if that is the decision of the Xen community as a whole,
I'll follow it. My preference remains with approach 3. (var.S), followed
by approach 2. (SYMBOL_HIDE returns uintptr_t), but I am willing to
refresh my series to do approach 1. (SYMBOL_HIDE returns pointer type)
if that is the only way forward.
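
To make the integer-based approaches (2 and 3) concrete, here is a minimal
sketch of a loop that keeps its bounds as uintptr_t and only forms a pointer
to access the data. It is not taken from the series; struct entry, entries,
sum_entries and sum_all are made-up names, and it relies on the
implementation-defined pointer/integer conversions behaving as they do on
the platforms we care about.

```c
#include <stdint.h>

struct entry {
    int value;
};

static const struct entry entries[3] = { { 1 }, { 2 }, { 3 } };

/* Bounds are plain integers; a pointer is only formed to read an entry. */
static int sum_entries(uintptr_t start, uintptr_t end)
{
    int sum = 0;

    for ( uintptr_t addr = start; addr < end; addr += sizeof(struct entry) )
        sum += ((const struct entry *)addr)->value;

    return sum;
}

/* Stand-in for the caller that would pass the linker-provided bounds. */
static int sum_all(void)
{
    return sum_entries((uintptr_t)&entries[0], (uintptr_t)&entries[3]);
}
```

The comparison in the loop condition is between integers, so no relational
operator is ever applied to two pointers.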

Let us come to a conclusion so that we can move on.

Jan, Julien, Juergen, anybody else interested, let me know what you want
me to do.


* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-21 23:41                                                               ` Stefano Stabellini
@ 2019-01-22  6:08                                                                 ` Juergen Gross
       [not found]                                                                 ` <42A2C4FA0200009000417A66@prv1-mh.provo.novell.com>
  1 sibling, 0 replies; 102+ messages in thread
From: Juergen Gross @ 2019-01-22  6:08 UTC (permalink / raw)
  To: Stefano Stabellini, Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Julien Grall,
	Julien Grall, Stewart Hildebrand, xen-devel

On 22/01/2019 00:41, Stefano Stabellini wrote:
> On Mon, 21 Jan 2019, Jan Beulich wrote:
>>>>> On 19.01.19 at 00:05, <sstabellini@kernel.org> wrote:
>>> On Fri, 18 Jan 2019, Jan Beulich wrote:
>>>>>>> On 18.01.19 at 02:24, <sstabellini@kernel.org> wrote:
>>>>> On Thu, 17 Jan 2019, Jan Beulich wrote:
>>>>>>>>> On 17.01.19 at 01:37, <sstabellini@kernel.org> wrote:
>>>>>>> On Wed, 16 Jan 2019, Jan Beulich wrote:
>>>>>>>> In any event - since intermediate variables merely hide the
>>>>>>>> casting from the compiler, but they don't remove the casts, the
>>>>>>>> solution involving casts is better imo, for incurring less overhead.
>>>>>>>
>>>>>>> This is where I completely disagree. The intermediate variables are not
>>>>>>> hiding casts from the compiler. There were never any pointers in this
>>>>>>> case.  The linker creates "symbols", not pointers, completely invisible
>>>>>>> from C land. Assembly uses these symbols to initialize variables. We
>>>>>>> expose these assembly variables as integer to C lands. LD scripts and
>>>>>>> assembly have their own terminology and rules: neither "_start" nor
>>>>>>> "start" are pointers at any point in time. The operations done in var.S
>>>>>>> is not a cast. The C spec is happy, the compiler is happy, MISRA-C is
>>>>>>> happy. And we get to avoid the ugly SYMBOL macro that Linux uses. It is
>>>>>>> really a win-win.
>>>>>>
>>>>>> Well, that's a position one can take. But we have to settle on another
>>>>>> aspect then first: Does what is not done in C fall under C's rules? I
>>>>>> thought you were of the opinion that what comes from linker scripts
>>>>>> does. In which case what comes from assembly files ought to, too.
>>>>>> (FAOD my implication is: If the answer is yes, both approaches
>>>>>> violate C's rules. If the answer is no, no change is needed at all.)
>>>>>
>>>>> Great question, that is the core of the issue. Also, let me premise that
>>>>> I agree on the comments you made on the patches (I dislike "start_"
>>>>> too), and I can address them if we agree to continue down this path.
>>>>>
>>>>> But no, I do not think that what is done outside of C-land should follow
>>>>> C rules. But I do not agree with your conclusion that in that case there
>>>>> is no difference between the approaches. Let's get more into the
>>>>> details.
>>>>>
>>>>>
>>>>> 1) SYMBOL_HIDE returning pointer type
>>>>>
>>>>> Let's take _start and _end as an example. _start is born as a linker
>>>>> symbol, and it becomes a C pointer when we do:
>>>>>
>>>>>   extern char _start[], _end[]
>>>>>
>>>>> Now it is a pointer (actually I should say an array, but let's pretend
>>>>> they are the same thing for this discussion).
>>>>>
>>>>> When we do:
>>>>>
>>>>>   SYMBOL_HIDE(_end) - SYMBOL_HIDE(_start)
>>>>>
>>>>> We are still subtracting pointers: the pointers returned by SYMBOL_HIDE.
>>>>> We cannot prove that they are pointers to the same object or subsequent
>>>>> objects in memory, so it is undefined behavior, which is not allowed.
>>>>
>>>> Stop. No. We very much can prove they are - _end points at
>>>> one past the last element of _start[]. It is the compiler which
>>>> can't prove the opposite, and hence it can't leverage
>>>> undefined behavior for optimization purposes.
>>>
>>> This is an interesting comment. However, even for normal pointers it is
>>> unreliable to count on one pointing one past the last element of the
>>> other. This was well explained in the GCC thread linked earlier in this
>>> thread. The vision of at least one of the GCC maintainers is that the
>>> compiler is free to place things in memory where it wishes, so as a
>>> programmer you cannot count on pointers pointing one past the last
>>> element of the other. Ever. In this case, where _start and _end are
>>> defined outside of C-land, I think it is even more true, and it remains
>>> undefined.
>>
>> You mix up two things: One is the chance of two objects being
>> adjacent to one another. We don't care about this. The other is
>> a pointer truly pointing one past the last element of an array (as
>> will naturally result with e.g.
>>
>>     for ( ptr = arr; ptr < arr + ARRAY_SIZE(arr); ++ptr )
>>
>> It is this second case which all the cases we care about here fall
>> into. As per my other mail, just like the same object can have multiple
>> names, symbols may also refer to places other than the start of
>> an object; the fact that plain C can't produce such symbols is not
>> relevant as long as there's no requirement that C code may
>> interface only with other C code.
> 
> The chance of two objects being adjacent to one another is relevant
> because the compiler could rightfully decide that the programmer can
> never rely on pointers pointing one past the last element of the other,
> even if they truly point one past the last element of an array.
> 
> Otherwise, Linux would have never needed to introduce RELOC_HIDE in the
> first place.
> 
> 
>>> Moreover, I went back to MISRAC (finally I have a copy) and rule 18.2
>>> says: "subtraction between pointers shall only be applied to pointers
>>> that address elements of the same array". So, all the evidence we have
>>> seems to say that we cannot rely on _end pointing one past the last
>>> element of _start in this matter.
>>
>> With the C standard's wording in mind, this surely is to include
>> the "one past the last element" case, in which case all is fine. _end
>> does not point at or into a different object, it points at the end of
>> _start[].
>>
>>>>> 3) var.S + start_ as unsigned long
>>>>>
>>>>> With this approach, _start is born as a linker symbol. It is never
>>>>> exported to C, so from C point of view, it doesn't exist. There is
>>>>> another variable named "start_" defined in assembly and initialized to
>>>>> _start. Now we go into C land with:
>>>>>
>>>>>   extern uintptr_t start_, end_
>>>>>
>>>>> start_ and end_ are uintptr_t from the beginning from C point of view.
>>>>> They have never been pointers or in any way connected to _start. They
>>>>> are "clean".
>>>>>
>>>>> When we do:
>>>>>
>>>>>   end_ - start_
>>>>>
>>>>> it is a subtraction between uintptr_t, which is allowed. When we do:
>>>>>
>>>>>     for ( call = (const initcall_t *)initcall_start_;
>>>>>           (uintptr_t)call < presmp_initcall_end_;
>>>>>
>>>>> The comparison is still between uintptr_t types, and the value of "call"
>>>>> still comes from an unsigned long initially. There is never a comparison
>>>>> between dubious pointers. (Integer to pointer conversions and pointer
>>>>> to integer conversions are allowed by MISRA with some limitations, but I
>>>>> am double-checking.) Even:
>>>>>
>>>>>    (uintptr_t)random_pointer < presmp_initcall_end_
>>>>>
>>>>> would be acceptable because presmp_initcall_end_ is an integer and has
>>>>> always been an integer from C point of view.
>>>>
>>>> Well, as said - this is one of the possible positions to take. Personally
>>>> I see no difference between the helper symbols defined in
>>>> assembly sources, or in C sources the object files for which are never
>>>> made part of potential whole program optimization. 
>>>
>>> I don't think this is the case for MISRAC. C rules apply to C. Other
>>> rules apply to assembly and linker scripts. This is something that
>>> should be easy to check, and I hope that Stewart should be able to
>>> confirm.
>>
>> As per above - the interesting aspect is what rules apply to the
>> case of C interfacing with another language.
>>
>>>> Using C files for this is still in conflict with the supposed
>>>> undefined behavior, but I think you agree that C and assembly files
>>>> could be set up such that the resulting binary data is identical. In
>>>> which case it is bogus to call one satisfactory, but not the other.
>>>
>>> I see what you are saying, but it doesn't work that way from a spec
>>> compliance point of view.
>>>
>>>
>>>>> However, there are still a couple of issues not correctly solved by v8
>>>>> of the series. For starters: 
>>>>>
>>>>>         apply_alternatives((struct alt_instr *)alt_instructions_,
>>>>>                            (struct alt_instr *)alt_instructions_end_);
>>>>>
>>>>> I can see how the pointers comparisons in apply_alternatives could be
>>>>> considered wrong given the way the pointers are initialized:
>>>>>
>>>>>     for ( a = base = start; a < end; a++ )
>>>>>     {
>>>>>
>>>>> start and end come from alt_instructions_ and alt_instructions_end_. It
>>>>> doesn't matter that alt_instructions_ and alt_instructions_end_ are
>>>>> "special", they could be perfectly normal integers and we would still
>>>>> have the same problem: we cannot prove that "start" and "end" point to
>>>>> the same object or subsequent objects in memory.
>>>>>
>>>>> The way to fix it is by changing the parameters of apply_alternatives to
>>>>> integer types, making comparison between integers, and only using
>>>>> pointers to access the data.
>>>>
>>>> You know my position on casts from integer to pointer types, especially
>>>> ones taking a type out of thin air. This applies to your addition to the
>>>> apply_alternatives() construct as well as the alternative of adding such
>>>> in order to access memory. The quote from the standard that I gave
>>>> makes such casts not provably (by the compiler) defined behavior as
>>>> well, so it all boils down to the same distinction as pointed out above in
>>>> the first part of my reply here: _We_ can prove it, but the compiler
>>>> can't. Hence we're still depending on whose proof is necessary to
>>>> eliminate MISRA's undefined behavior concerns.
>>>
>>> Comparisons between pointers to different objects is undefined by the C
>>> spec, and not allowed by MISRAC.
>>>
>>> Casting pointers to integers and casting integers to pointers is
>>> implementation-defined, which is not the same thing as undefined.
>>
>> Of course it is not, but the result possibly not even being a valid
>> pointer can't make it much better than "undefined".
>>
>>> I don't make up the rules, I am only trying to follow them :-)
>>
>> Sure. But we shouldn't uglify our code just to follow insane
>> (exaggeration intended) rules.
> 
> We haven't managed to reach consensus on this topic. Your view might be
> correct, but it is not necessarily supported by compilers' behavior,
> which depends on the opinion of compilers engineers on the topic, and
> MISRAC compliance, which depends on the opinion of MISRAC specialists on
> the topic. If we take your suggested approach we end up with the code
> most likely to break in case the compilers engineers or the MISRAC
> experts disagree with you. In this case, being right doesn't necessarily
> lead to the code less likely to break.
> 
> Regardless, if that is the decision of the Xen community as a whole,
> I'll follow it. My preference remains with approach 3. (var.S), followed
> by approach 2. (SYMBOL_HIDE returns uintptr_t), but I am willing to
> refresh my series to do approach 1. (SYMBOL_HIDE returns pointer type)
> if that is the only way forward.
> 
> Let us come to a conclusion so that we can move on.
> 
> Jan, Julien, Juergen, anybody else interested, let me know what you want
> me to do.
> 

I am "only" the release manager, so I can opt for the series to go into
4.12 in case the committers are ready to give it a go. The decision for
4.12 depends on when consensus is reached. Right now I'd give it my
Rab, but in case some more weeks are needed I might not want to take the
risk.


Juergen


* Re: [PATCH v6 1/4] xen: introduce SYMBOL
       [not found]                                                                       ` <5EA2B4FA0200008000417A66@prv1-mh.provo.novell.com>
@ 2019-01-22  9:06                                                                         ` Jan Beulich
  0 siblings, 0 replies; 102+ messages in thread
From: Jan Beulich @ 2019-01-22  9:06 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Julien Grall, Stewart Hildebrand, xen-devel

>>> On 22.01.19 at 00:15, <sstabellini@kernel.org> wrote:
> On Mon, 21 Jan 2019, Jan Beulich wrote:
>> >>> On 21.01.19 at 11:22, <julien.grall@arm.com> wrote:
>> > Hi Jan,
>> > 
>> > On 21/01/2019 09:34, Jan Beulich wrote:
>> >>>>> On 18.01.19 at 11:48, <julien.grall@arm.com> wrote:
>> >>> On 18/01/2019 09:54, Jan Beulich wrote:
>> >>>>>>> On 18.01.19 at 02:24, <sstabellini@kernel.org> wrote:
>> >>>>> On Thu, 17 Jan 2019, Jan Beulich wrote:
>> >>>>>>>>> On 17.01.19 at 01:37, <sstabellini@kernel.org> wrote:
>> >>>>>>> On Wed, 16 Jan 2019, Jan Beulich wrote:
>> >>>> Stop. No. We very much can prove they are - _end points at
>> >>>> one past the last element of _start[]. It is the compiler which
>> >>>> can't prove the opposite, and hence it can't leverage
>> >>>> undefined behavior for optimization purposes.
>> >>>
>> >>> You keep saying the compiler can't leverage it for optimization purpose,
>> >>> however there are confirmations that GCC may actually leverage it (e.g [1]).
>> >>> You actually need to trick the compiler to avoid the optimization (e.g
>> >>> RELOC_HIDE).
>> >>>
>> >>> So obviously, this is not only a MISRA "problem" as you state here and
>> >>> below.
>> >>>
>> >>> I believe Stefano, Stewart and I provided plenty of documentation/thread to
>> >>> support our positions. Can you provide us documentation/thread showing the
>> >>> compiler will not try to leverage that case?
>> >>>
>> >>> Cheers,
>> >>>
>> >>> [1]
>> >>> https://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html?m=1
>> >> 
>> >> Btw., the __start[] / __end[] example given there does not match
>> >> up with what I see.
>> > What you see in a specific version of GCC. This does not mean this behavior is
>> > valid across all the released versions and future ones.
>> 
>> Are you suggesting that for the purpose of certification we need to
>> deal with compiler bugs? Imo such a compiler should simply be
>> excluded for use to build Xen.
>> 
>> >> Only symbols defined in the same CU as where
>> >> the comparison sits get "optimized" this way. Externs as well as
>> >> weak symbols defined locally don't get dealt with like this. And how
>> >> could they? Nothing tells the compiler that two distinct symbols
>> >> refer to two distinct objects. It is easy to create objects with
>> >> multiple names, not only in assembly but also in C (using the "alias"
>> >> attribute).
>> > 
>> > Similarly, nothing tells the compiler that they are not two distinct symbols.
>> > You haven't yet provided evidence a compiler cannot use that for optimization.
>> 
>> The compiler can leverage for optimization only what it can prove
>> (to be undefined behavior or symbols referring to distinct objects
>> or ...). A compiler may never use guesses for optimization. That
>> is in the case here it is not us who need to tell the compiler that
>> two different symbols may refer to the same object, but it is the
>> compiler which needs to prove that two symbols cannot possibly
>> refer to the same object. This is possible for automatic and static
>> objects. This is also possible for some non-static objects defined
>> in the CU under compilation. But this is not possible in the general
>> case.
> 
> Clearly from the GCC thread not everybody agrees with you:
> 
>  Just because two pointers print the same and have the same bit-pattern 
>  doesn't mean they need to compare equal
> 
>  So the only way within the C standard you could deduce that two objects 
>  follow each other in memory is that the address of one compares equal to 
>  one past the address of the other - but that does not mean they follow 
>  each other in memory for any other comparison.

I think continuing to hit on this aspect is just adding confusion: We
_do not_ leverage the "end" labels to happen to point at the end of
the previous object. That's a pattern the subsequent objects' "start"
label may happen to match. The "end" labels, otoh, don't point at
the start of any object, they point at what we point them at - the
end of the preceding object. Once again I'd like to emphasize the
difference between "object" and "symbol"; as said before I have
not been able to find anything in the spec saying that there's a
requirement that symbols can only point at the start of objects, or
that there can only be a single symbol pointing at a given object.
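
As a minimal sketch of the "one object, several symbols" point, assuming GCC
or Clang on an ELF target (the names table, table_alias and same_object are
made up for illustration and do not exist in Xen):

```c
#include <stdbool.h>
#include <stdint.h>

/* One object with a definition in this translation unit ... */
int table[4] = { 1, 2, 3, 4 };

/*
 * ... and a second, hypothetical symbol that names the very same storage.
 * The "alias" attribute is a GNU extension; the target must be defined in
 * the same translation unit.
 */
extern int table_alias[4] __attribute__((alias("table")));

/* Compare the addresses behind the two symbols as integers. */
bool same_object(void)
{
    return (uintptr_t)table == (uintptr_t)table_alias;
}
```

A store through table_alias is visible through table, so the compiler cannot
treat the two names as distinct objects merely because they are spelled
differently.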

Jan



* Re: [PATCH v6 1/4] xen: introduce SYMBOL
       [not found]                                                                 ` <42A2C4FA0200009000417A66@prv1-mh.provo.novell.com>
@ 2019-01-22  9:16                                                                   ` Jan Beulich
  2019-02-01 18:52                                                                     ` George Dunlap
  0 siblings, 1 reply; 102+ messages in thread
From: Jan Beulich @ 2019-01-22  9:16 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Julien Grall, Stewart Hildebrand, xen-devel

>>> On 22.01.19 at 00:41, <sstabellini@kernel.org> wrote:
> We haven't managed to reach consensus on this topic. Your view might be
> correct, but it is not necessarily supported by compilers' behavior,
> which depends on the opinion of compilers engineers on the topic, and
> MISRAC compliance, which depends on the opinion of MISRAC specialists on
> the topic. If we take your suggested approach we end up with the code
> most likely to break in case the compilers engineers or the MISRAC
> experts disagree with you. In this case, being right doesn't necessarily
> lead to the code less likely to break.
> 
> Regardless, if that is the decision of the Xen community as a whole,
> I'll follow it. My preference remains with approach 3. (var.S), followed
> by approach 2. (SYMBOL_HIDE returns uintptr_t), but I am willing to
> refresh my series to do approach 1. (SYMBOL_HIDE returns pointer type)
> if that is the only way forward.
> 
> Let us come to a conclusion so that we can move on.

How can we come to a conclusion when things remain unclear? I see
only two ways forward - either we settle the dispute (which I'm
afraid would require involvement of someone accepted by all of us
as a "C language lawyer", which would include judgment about the
MISRA-C implications), or you request a vote, by which my
objection to _any_ change here without proper justification can be
outvoted. Only at that point can we then decide whether any of
the proposed "solutions" (in quotes because I remain unconvinced
there's a problem to solve here other than working around compiler
bugs) is/are necessary _and_ fulfilling the purpose, and if multiple
remain, which of them we like best / is the least bad one.

Jan




* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-22  9:16                                                                   ` Jan Beulich
@ 2019-02-01 18:52                                                                     ` George Dunlap
  2019-02-01 20:53                                                                       ` Stefano Stabellini
  0 siblings, 1 reply; 102+ messages in thread
From: George Dunlap @ 2019-02-01 18:52 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Stewart Hildebrand,
	xen-devel

On Tue, Jan 22, 2019 at 9:17 AM Jan Beulich <JBeulich@suse.com> wrote:
>
> >>> On 22.01.19 at 00:41, <sstabellini@kernel.org> wrote:
> > We haven't managed to reach consensus on this topic. Your view might be
> > correct, but it is not necessarily supported by compilers' behavior,
> > which depends on the opinion of compilers engineers on the topic, and
> > MISRAC compliance, which depends on the opinion of MISRAC specialists on
> > the topic. If we take your suggested approach we end up with the code
> > most likely to break in case the compilers engineers or the MISRAC
> > experts disagree with you. In this case, being right doesn't necessarily
> > lead to the code less likely to break.
> >
> > Regardless, if that is the decision of the Xen community as a whole,
> > I'll follow it. My preference remains with approach 3. (var.S), followed
> > by approach 2. (SYMBOL_HIDE returns uintptr_t), but I am willing to
> > refresh my series to do approach 1. (SYMBOL_HIDE returns pointer type)
> > if that is the only way forward.
> >
> > Let us come to a conclusion so that we can move on.
>
> How can we come to a conclusion when things remain unclear? I see
> only two ways forward - either we settle the dispute (which I'm
> afraid would require involvement of someone accepted by all of us
> as a "C language lawyer", which would include judgment about the
> MISRA-C implications),

Well, no, I don't think a "C language lawyer" would help here.

You keep making the case for C spec compliance as though we're dealing
with zealous but ultimately rational people, who will a) almost never
violate the C spec, and b) actually fix their compiler if their
language does violate the spec.

But it's clear from reading those threads that this is not the case.
Over and over people presented clear arguments, from the spec and the
spec committee, showing that what gcc was doing was both wrong and
impractical; and the compiler people kept coming up with more and more
tortuous interpretations to justify the compiler's behavior.  The
whole thing with supposing that the C standard anticipated a
compacting garbage collector was the cherry on the cake.

We're not living in a rational world where if we just follow the rules
we'll be safe.  We have a dictat from the high council in the form of
a C spec which is divorced from actual usage and utility, and we have
a load of insane fanatics trying to impose their interpretation of
doctrinal purity on the world, and ready to burn any heretics writing
code that doesn't match the view they happen to hold that day.  "The
compiler can't optimize this because it can't prove they're different
objects" is a fine principle, but it's pretty clear that they're
willing to continue optimizing things even if you *can* prove they
fall inside the rules of pointer comparison.

In such a situation, *there is no perfectly safe option*.  No matter
what position you take, the fanatics may end up deciding that you're a
heretic and need to be burned at the stake.  Might they decide that
they know that extern pointers point to different objects, and
therefore can't be compared? Maybe!  Might they decide they can pierce
the veil of asm to determine the source of unsigned longs they're
comparing? Possibly!  Could they decide that a uintptr_t received from
the heathen lands of assembly is anathema, and therefore casting it to
a pointer is undefined behavior?  They certainly could!

And even if you do convince them their interpretation is wrong and
they fix their compiler, the damage is still done: there are still,
out in the wild, vulnerable binaries built with buggy compilers and
buggy compilers that produce vulnerable binaries, until they all die
of old age.

*Any* interpretation we choose may be declared at some point by the
compiler folks to be heresy.  But, there are less safe options and more
safe options.  Our goal with regard to the C Standard cannot,
unfortunately, be "follow the rules".  Our goal must be to *minimize
the risk* of being caught in the next wave of the compiler
optimization purges.

MISRA is quirky and often impractical, but ultimately their goal with
rules like this is to try to protect you from the fanatics who write
compilers (insofar as that is possible).  So if we do our best to be
as safe as possible from the compiler fanatics, we have a pretty good
chance of being considered MISRA compliant as well.

It seems to me that anything that involves directly comparing pointers
is simply more likely to become the target of optimization (and thus
more dangerous) than comparing unsigned long and uintptr_t.  And
although I'm not terribly familiar with the "intptr" types, it seems
to me that casting from uintptr_t is less likely to ever be considered
deviant behavior than casting from unsigned long.

As such, I think casting the return value of asm to a pointer is far
too dangerous.  Using extern pointers seems quite dangerous to me as
well.  So it seems to me that using asm to generate an unsigned long
would be absolute minimum behavior.  Using uintptr_t values, and in
particular importing them from assembly, might give us yet another
level of safety (in case unsigned long -> pointer casts ever become a
target).
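
A rough sketch of what "using asm to generate" an integer could look like,
assuming GCC-style extended asm.  hide_addr and addr_in_range are
hypothetical helpers modelled loosely on the RELOC_HIDE idea; they are not
the SYMBOL()/SYMBOL_HIDE() macros from the series:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: pass an address through an empty asm statement so
 * the optimizer cannot tell which object the resulting integer came from.
 */
static inline uintptr_t hide_addr(const void *p)
{
    uintptr_t v;

    /* "0" ties the input to the same register as output operand 0. */
    __asm__ ("" : "=r" (v) : "0" ((uintptr_t)p));
    return v;
}

/* Range check done purely on integers, never on the pointers themselves. */
static inline bool addr_in_range(const void *p, const void *start,
                                 const void *end)
{
    uintptr_t a = hide_addr(p);

    return a >= hide_addr(start) && a < hide_addr(end);
}
```

The comparison operators only ever see uintptr_t values here, which is the
property argued for above; whether that is enough for any given compiler
remains a judgment call.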

Are these guaranteed to avoid "UB hazard" issues in the future?  Of
course not; nothing can.  But they seem to me to be a lot less risky
than asm -> ptr or extern pointers.

> Only at that point can we then decide whether any of
> the proposed "solutions" (in quotes because I remain unconvinced
> there's a problem to solve here other than working around compiler
> bugs) is/are necessary _and_ fulfilling the purpose, and if multiple
> remain, which of them we like best / is the least bad one.

Improvements this series seeks to make, as I understand it, include
(in order of urgency):

1. Fixing one concrete instance of "UB hazard"
2. Minimize risk of further "UB hazard" in this bit of functionality
3. Retain the effort Stefano has put in identifying all the places
where such UB hazards need to be addressed.
4. Move towards MISRA-C compliance.

As far as I can tell, primary objections you've leveled at the options
which try to address 2-4 are:

a. "UB hazard" still not zero
b. MISRA compliance not 100%
c. Ugly
d. Inefficient

(Obviously some proposals have had more technical discussion.)

Anything I missed?

 -George


* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-02-01 18:52                                                                     ` George Dunlap
@ 2019-02-01 20:53                                                                       ` Stefano Stabellini
  0 siblings, 0 replies; 102+ messages in thread
From: Stefano Stabellini @ 2019-02-01 20:53 UTC (permalink / raw)
  To: George Dunlap
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Jan Beulich,
	Stewart Hildebrand, xen-devel

On Fri, 1 Feb 2019, George Dunlap wrote:
> On Tue, Jan 22, 2019 at 9:17 AM Jan Beulich <JBeulich@suse.com> wrote:
> >
> > >>> On 22.01.19 at 00:41, <sstabellini@kernel.org> wrote:
> > > We haven't managed to reach consensus on this topic. Your view might be
> > > correct, but it is not necessarily supported by compilers' behavior,
> > > which depends on the opinion of compilers engineers on the topic, and
> > > MISRAC compliance, which depends on the opinion of MISRAC specialists on
> > > the topic. If we take your suggested approach we end up with the code
> > > most likely to break in case the compilers engineers or the MISRAC
> > > experts disagree with you. In this case, being right doesn't necessarily
> > > lead to the code less likely to break.
> > >
> > > Regardless, if that is the decision of the Xen community as a whole,
> > > I'll follow it. My preference remains with approach 3. (var.S), followed
> > > by approach 2. (SYMBOL_HIDE returns uintptr_t), but I am willing to
> > > refresh my series to do approach 1. (SYMBOL_HIDE returns pointer type)
> > > if that is the only way forward.
> > >
> > > Let us come to a conclusion so that we can move on.
> >
> > How can we come to a conclusion when things remain unclear? I see
> > only two ways forward - either we settle the dispute (which I'm
> > afraid would require involvement of someone accepted by all of us
> > as a "C language lawyer", which would include judgment about the
> > MISRA-C implications),
> 
> Well, no, I don't think a "C language lawyer" would help here.
> 
> You keep making the case for C spec compliance as though we're dealing
> with zealous but ultimately rational people, who will a) almost never
> violate the C spec, and b) actually fix their compiler if their
> language does violate the spec.
> 
> But it's clear from reading those threads that this is not the case.
> Over and over people presented clear arguments, from the spec and the
> spec committee, showing that what gcc was doing was both wrong and
> impractical; and the compiler people kept coming up with more and more
> tortuous interpretations to justify the compiler's behavior.  The
> whole thing with supposing that the C standard anticipated a
> compacting garbage collector was the cherry on the cake.
> 
> We're not living in a rational world where if we just follow the rules
> we'll be safe.  We have a dictat from the high council in the form of
> a C spec which is divorced from actual usage and utility, and we have
> a load of insane fanatics trying to impose their interpretation of
> doctrinal purity on the world, and ready to burn any heretics writing
> code that doesn't match the view they happen to hold that day.  "The
> compiler can't optimize this because it can't prove they're different
> objects" is a fine principle, but it's pretty clear that they're
> willing to continue optimizing things even if you *can* prove they
> fall inside the rules of pointer comparison.

That made me laugh very hard :-D  in a "sad but true" kind of way.


> In such a situation, *there is no perfectly safe option*.  No matter
> what position you take, the fanatics may end up deciding that you're a
> heretic and need to be burned at the stake.  Might they decide that
> they know that extern pointers point to different objects, and
> therefore can't be compared? Maybe!  Might they decide they can pierce
> the veil of asm to determine the source of unsigned longs they're
> comparing? Possibly!  Could they decide that a uintptr_t received from
> the heathen lands of assembly is anathema, and therefore casting it to
> a pointer is undefined behavior?  They certainly could!
> 
> And even if you do convince them their interpretation is wrong and
> they fix their compiler, the damage is still done: there are still,
> out in the wild, vulnerable binaries built with buggy compilers and
> buggy compilers that produce vulnerable binaries, until they all die
> of old age.
> 
> *Any* interpretation we choose may be declared at some point by the
> compiler folks to be heresy.  But, there are less safe options and more
> safe options.  Our goal with regard to the C Standard cannot,
> unfortunately, be "follow the rules".  Our goal must be to *minimize
> the risk* of being caught in the next wave of the compiler
> optimization purges.
> 
> MISRA is quirky and often impractical, but ultimately their goal with
> rules like this is to try to protect you from the fanatics who write
> compilers (insofar as that is possible).  So if we do our best to be
> as safe as possible from the compiler fanatics, we have a pretty good
> chance of being considered MISRA compliant as well.
> 
> It seems to me that anything that involves directly comparing pointers
> is simply more likely to become the target of optimization (and thus
> more dangerous) than comparing unsigned long and uintptr_t.  And
> although I'm not terribly familiar with the "intptr" types, it seems
> to me that casting from uintptr_t is less likely to ever be considered
> deviant behavior than casting from unsigned long.
> 
> As such, I think casting the return value of asm to a pointer is far
> too dangerous.  Using extern pointers seems quite dangerous to me as
> well.  So it seems to me that using asm to generate an unsigned long
> would be absolute minimum behavior.  Using uintptr_t values, and in
> particular importing them from assembly, might give us yet another
> level of safety (in case unsigned long -> pointer casts ever become a
> target).
> 
> Are these guaranteed to avoid "UB hazard" issues in the future?  Of
> course not; nothing can.  But they seem to me to be a lot less risky
> than asm -> ptr or extern pointers.

This is a great well-written writeup George. Maybe worthy of a blog
post, once we settle this issue :-)


> > Only at that point can we then decide whether any of
> > the proposed "solutions" (in quotes because I remain unconvinced
> > there's a problem to solve here other than working around compiler
> > bugs) is/are necessary _and_ fulfilling the purpose, and if multiple
> > remain, which of them we like best / is the least bad one.
> 
> Improvements this series seeks to make, as I understand it, include
> (in order of urgency):
> 
> 1. Fixing one concrete instance of "UB hazard"
> 2. Minimize risk of further "UB hazard" in this bit of functionality
> 3. Retain the effort Stefano has put in identifying all the places
> where such UB hazards need to be addressed.
> 4. Move towards MISRA-C compliance.

This is exactly right.


> As far as I can tell, primary objections you've leveled at the options
> which try to address 2-4 are:
> 
> a. "UB hazard" still not zero
> b. MISRA compliance not 100%
> c. Ugly
> d. Inefficient
> 
> (Obviously some proposals have had more technical discussion.)
> 
> Anything I missed?

I would like to add here the reply I got from the MISRAC experts, that
matches your view above.

Predictably, they dislike both SYMBOL_HIDE workarounds, because they are
just compiler workarounds rather than compliance and/or code
improvements.

Instead, they suggest an approach very similar to the var.S approach,
but simpler, without the assembly redirection. Their suggestion is to
declare uintptr_t variables in C corresponding to the linker symbols and
initialize them _once_ to the linker symbol values:

  /* linker symbols */
  extern char _start[], _end[];
  /* corresponding uintptr_t variables in C */
  uintptr_t start, end;

  /* initialization of the uintptr_t variables */
  start = (uintptr_t) _start;
  end = (uintptr_t) _end;
  
  /* example usage */
  size = (end - start);
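
A self-contained sketch of this suggestion, assuming a static array as a
stand-in for the range a linker script would bound with _start/_end.  The
names region, region_start, region_end, init_bounds, region_size and
in_region are hypothetical and only serve the illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the memory that _start/_end would delimit. */
static char region[64];

/* The uintptr_t variables corresponding to the two bounds. */
static uintptr_t region_start, region_end;

/* Convert the addresses to integers exactly once, early at boot. */
void init_bounds(void)
{
    region_start = (uintptr_t)region;
    region_end = (uintptr_t)(region + sizeof(region));
}

/* All later arithmetic uses the integers, not the symbols. */
size_t region_size(void)
{
    return (size_t)(region_end - region_start);
}

/* Membership test on integers only; returns 1 if p lies in the region. */
int in_region(const void *p)
{
    uintptr_t a = (uintptr_t)p;

    return a >= region_start && a < region_end;
}
```

After init_bounds() has run, no other code needs to touch the symbols
directly, so pointer subtraction and pointer comparison across objects
never appear in the C source.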


Thus, I think it is best to follow-up on the var.S approach. Whether we
declare the variables in assembly in var.S or in a C file like
suggested, is a minor detail.

But before I proceed in reworking the series once more, I would like to
get an agreement on the way forward. I don't think Jan's solution is
good enough, but I am willing to follow through with it if that's the
decision. But I would really love to avoid sending yet another series
update whose fundamental approach gets rejected again.


* Re: [PATCH v6 1/4] xen: introduce SYMBOL
       [not found]                                     ` <95DC675902000028AB59E961@prv1-mh.provo.novell.com>
@ 2019-02-04  9:37                                       ` Jan Beulich
  2019-02-04 19:08                                         ` Stefano Stabellini
  2019-02-05 14:56                                         ` George Dunlap
  0 siblings, 2 replies; 102+ messages in thread
From: Jan Beulich @ 2019-02-04  9:37 UTC (permalink / raw)
  To: George Dunlap
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Stewart Hildebrand,
	xen-devel

>>> On 01.02.19 at 19:52, <dunlapg@umich.edu> wrote:

I'm not going to reply in detail to all of what you wrote about fanatics,
but I would like to say that I think compiler people are less of that than
you appear to imply, at least the ones I know. In particular, they can
be convinced of there being bugs by pointing out what aspect of the
standard their implementation violates. (Of course there are also
going to be areas where interpretations of the standard vary too
much to come to an agreement.)

> Improvements this series seeks to make, as I understand it, include
> (in order of urgency):
> 
> 1. Fixing one concrete instance of "UB hazard"

Right, and we want to work around compiler bugs here.

> 2. Minimize risk of further "UB hazard" in this bit of functionality
> 3. Retain the effort Stefano has put in identifying all the places
> where such UB hazards need to be addressed.
> 4. Move towards MISRA-C compliance.
> 
> As far as I can tell, primary objections you've leveled at the options
> which try to address 2-4 are:
> 
> a. "UB hazard" still not zero
> b. MISRA compliance not 100%
> c. Ugly
> d. Inefficient
> 
> (Obviously some proposals have had more technical discussion.)
> 
> Anything I missed?

I don't think so, especially since various aspects can fall under "ugly"
and/or "inefficient". What I'm not sure I see is what you mean to
express with all you wrote in terms of finding a way out of the
current situation (besides requesting a vote): Improving on a. and
b. is not a good excuse to extend c., at least not unequivocally.
Whether d. actually matters is a separate aspect, partly because it
may mean different things (it could e.g. be taken as another
wording for a. and b.).

And btw - I can't judge on b. anyway, as I still don't know what
exactly MISRA compliance is to mean, with the rules to adhere to
suitably justified.

Jan




* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-02-04  9:37                                       ` Jan Beulich
@ 2019-02-04 19:08                                         ` Stefano Stabellini
  2019-02-05  6:02                                           ` Juergen Gross
       [not found]                                           ` <2E9DDEFD0200007B00417A66@prv1-mh.provo.novell.com>
  2019-02-05 14:56                                         ` George Dunlap
  1 sibling, 2 replies; 102+ messages in thread
From: Stefano Stabellini @ 2019-02-04 19:08 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	lars.kurth.xen, Andrew Cooper, George Dunlap, Julien Grall,
	Julien Grall, Stewart Hildebrand, xen-devel

On Mon, 4 Feb 2019, Jan Beulich wrote:
> >>> On 01.02.19 at 19:52, <dunlapg@umich.edu> wrote:
> 
> I'm not going to reply in detail to all of what you wrote about fanatics,
> but I would like to say that I think compiler people are less of that than
> you appear to imply, at least the ones I know. In particular, they can
> be convinced of there being bugs by pointing out what aspect of the
> standard their implementation violates. (Of course there are also
> going to be areas where interpretations of the standard vary too
> much to come to an agreement.)
> 
> > Improvements this series seeks to make, as I understand it, include
> > (in order of urgency):
> > 
> > 1. Fixing one concrete instance of "UB hazard"
> 
> Right, and we want to work around compiler bugs here.
> 
> > 2. Minimize risk of further "UB hazard" in this bit of functionality
> > 3. Retain the effort Stefano has put in identifying all the places
> > where such UB hazards need to be addressed.
> > 4. Move towards MISRA-C compliance.
> > 
> > As far as I can tell, primary objections you've leveled at the options
> > which try to address 2-4 are:
> > 
> > a. "UB hazard" still not zero
> > b. MISRA compliance not 100%
> > c. Ugly
> > d. Inefficient
> > 
> > (Obviously some proposals have had more technical discussion.)
> > 
> > Anything I missed?
> 
> I don't think so, especially since various aspects can fall under "ugly"
> and/or "inefficient". What I'm not sure I see is what you mean to
> express with all you wrote in terms of finding a way out of the
> current situation (besides requesting a vote): Improving on a. and
> b. is not a good excuse to extend c., at least not unequivocally.
> Whether d. actually matters is a separate aspect, partly because it
> may mean different things (it could e.g. be taken as another
> wording for a. and b.).

I would be OK with a vote (or Juergen making a decision for us), but
this issue is not so fundamentally critical that I want to move forward
with it at the cost of making one or more maintainers unhappy. Ideally,
I would like to find an option that is acceptable for everybody.
Unfortunately, it doesn't look like it's possible.


> And btw - I can't judge on b. anyway, as I still don't know what
> exactly MISRA compliance is to mean, with the rules to adhere to
> suitably justified.

I can't pretend to know exactly what MISRAC compliance means for this
specific issue, but we do have the recommended way forward by the
compliance experts, which also matches the rough understanding of most
of the engineers involved in this discussion. Picking the option
suggested by the MISRAC people could be a decent way to settle this
debate?


* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-02-04 19:08                                         ` Stefano Stabellini
@ 2019-02-05  6:02                                           ` Juergen Gross
       [not found]                                           ` <2E9DDEFD0200007B00417A66@prv1-mh.provo.novell.com>
  1 sibling, 0 replies; 102+ messages in thread
From: Juergen Gross @ 2019-02-05  6:02 UTC (permalink / raw)
  To: Stefano Stabellini, Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, lars.kurth.xen, Andrew Cooper,
	George Dunlap, Julien Grall, Julien Grall, Stewart Hildebrand,
	xen-devel

On 04/02/2019 20:08, Stefano Stabellini wrote:
> On Mon, 4 Feb 2019, Jan Beulich wrote:
>>>>> On 01.02.19 at 19:52, <dunlapg@umich.edu> wrote:
>>
>> I'm not going to reply in detail to all of what you wrote about fanatics,
>> but I would like to say that I think compiler people are less of that than
>> you appear to imply, at least the ones I know. In particular, they can
>> be convinced of there being bugs by pointing out what aspect of the
>> standard their implementation violates. (Of course there are also
>> going to be areas where interpretations of the standard vary too
>> much to come to an agreement.)
>>
>>> Improvements this series seeks to make, as I understand it, include
>>> (in order of urgency):
>>>
>>> 1. Fixing one concrete instance of "UB hazard"
>>
>> Right, and we want to work around compiler bugs here.
>>
>>> 2. Minimize risk of further "UB hazard" in this bit of functionality
>>> 3. Retain the effort Stefano has put in identifying all the places
>>> where such UB hazards need to be addressed.
>>> 4. Move towards MISRA-C compliance.
>>>
>>> As far as I can tell, primary objections you've leveled at the options
>>> which try to address 2-4 are:
>>>
>>> a. "UB hazard" still not zero
>>> b. MISRA compliance not 100%
>>> c. Ugly
>>> d. Inefficient
>>>
>>> (Obviously some proposals have had more technical discussion.)
>>>
>>> Anything I missed?
>>
>> I don't think so, especially since various aspects can fall under "ugly"
>> and/or "inefficient". What I'm not sure I see is what you mean to
>> express with all you wrote in terms of finding a way out of the
>> current situation (besides requesting a vote): Improving on a. and
>> b. is not a good excuse to extend c., at least not unequivocally.
>> Whether d. actually matters is a separate aspect, partly because it
>> may mean different things (it could e.g. be taken as another
>> wording for a. and b.).
> 
> I would be OK with a vote (or Juergen making a decision for us), but
> this issue is not so fundamentally critical that I want to move forward
> with it at the cost of making one or more maintainers unhappy. Ideally,
> I would like to find an option that is acceptable for everybody.
> Unfortunately, it doesn't look like it's possible.

I can make a decision whether the series is fine for 4.12, but for being
ready to be committed I can only have an opinion or make a suggestion.

In my opinion we should try to move forward. Fighting opinions of
compiler developers won't help as George pointed out in a slightly
sarcastic way. ;-)

While a completely future proof solution would be nice I don't think
this is achievable now. And we should be aware that a solution being
better than what we have today should be preferred over a perfect
solution which doesn't work due to compiler issues.

>> And btw - I can't judge on b. anyway, as I still don't know what
>> exactly MISRA compliance is to mean, with the rules to adhere to
>> suitably justified.
> 
> I can't pretend to know exactly what MISRAC compliance means for this
> specific issue, but we do have the recommended way forward by the
> compliance experts, which also matches the rough understanding of most
> of the engineers involved in this discussion. Picking the option
> suggested by the MISRAC people, could be a decent way to settle this
> debate?

This would be my suggestion, too.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
       [not found]                                           ` <2E9DDEFD0200007B00417A66@prv1-mh.provo.novell.com>
@ 2019-02-05  7:53                                             ` Jan Beulich
  0 siblings, 0 replies; 102+ messages in thread
From: Jan Beulich @ 2019-02-05  7:53 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Lars Kurth,
	Andrew Cooper, George Dunlap, Julien Grall, Julien Grall,
	Stewart Hildebrand, xen-devel

>>> On 04.02.19 at 20:08, <sstabellini@kernel.org> wrote:
> On Mon, 4 Feb 2019, Jan Beulich wrote:
>> And btw - I can't judge on b. anyway, as I still don't know what
>> exactly MISRA compliance is to mean, with the rules to adhere to
>> suitably justified.
> 
> I can't pretend to know exactly what MISRAC compliance means for this
> specific issue, but we do have the recommended way forward by the
> compliance experts, which also matches the rough understanding of most
> of the engineers involved in this discussion. Picking the option
> suggested by the MISRAC people, could be a decent way to settle this
> debate?

It could be, if their reasoning was acceptable, which to me in particular
means (a) backed by references to the C spec and (b) not resorting to
using compiler bugs as an excuse for (pseudo) rules. But I'm afraid by
saying this I'm not really opening up any new avenues...

Jan




^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-02-04  9:37                                       ` Jan Beulich
  2019-02-04 19:08                                         ` Stefano Stabellini
@ 2019-02-05 14:56                                         ` George Dunlap
       [not found]                                           ` <E730A9F90200001DAB59E961@prv1-mh.provo.novell.com>
  1 sibling, 1 reply; 102+ messages in thread
From: George Dunlap @ 2019-02-05 14:56 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Stewart Hildebrand,
	xen-devel

On Mon, Feb 4, 2019 at 9:37 AM Jan Beulich <JBeulich@suse.com> wrote:
>
> >>> On 01.02.19 at 19:52, <dunlapg@umich.edu> wrote:
>
> I'm not going to reply in detail to all of what you wrote about fanatics,
> but I would like to say that I think compiler people are less of that than
> you appear to imply, at least the ones I know. In particular, they can
> be convinced of there being bugs by pointing out what aspect of the
> standard their implementation violates. (Of course there are also
> going to be areas where interpretations of the standard vary too
> much to come to an agreement.)

Right, so I did realize after sending the mail that it was pretty
harsh, and that a compiler person who read it might be angered or hurt
by it.  I'm not sure I would have changed the bulk of it, but I may
have added some caveats to help reframe it.

I spent a chunk of the day Friday reading this thread and all the
references, and was frankly outraged at the kinds of arguments made in
those threads in defense of gcc's behavior.  I'm sure that most
compiler people are nice and friendly in person, and also that the
majority of them are sensible and open to reason.  But that doesn't
change the fact that every couple of years, the OS community has
exactly this sort of interaction -- where the OS is trying to do
something that it absolutely must do, and the compiler people are
telling them that the C spec doesn't allow it, and there's a big long
discussion back and forth, where the conclusion ends up being that the
OS people have to make some crazy ugly work-around.

I spoke hyperbolically to try to make a point, but I stand by the
principles I was advocating: With regard to undefined behavior, we
cannot assume that we'll be safe by following the rules.  Our goal
should be to minimize the risk of tripping over UB at all.

[snip]
> What I'm not sure I see is what you mean to
> express with all you wrote in terms of finding a way out of the
> current situation (besides requesting a vote)

If you're just tired of this discussion and want it to be done, then
of course we can just take a vote.

But ideally I think votes are best when everyone sees the landscape of
the decision clearly, and agrees on exactly what it is they disagree
about. Furthermore, it seems to me from reading this discussion that
it's more than just a few specific examples that I disagree with you
about, but about a number of principles; as such, investing time
trying to come to a common understanding should pay dividends in the
form of reduced friction in the future.

Before I expressed an opinion, I wanted to make sure that I hadn't
misunderstood you or missed a big aspect of the discussion.

> > Improvements this series seeks to make, as I understand it, include
> > (in order of urgency):
> >
> > 1. Fixing one concrete instance of "UB hazard"
>
> Right, and we want to work around compiler bugs here.
>
> > 2. Minimize risk of further "UB hazard" in this bit of functionality
> > 3. Retain the effort Stefano has put in identifying all the places
> > where such UB hazards need to be addressed.
> > 4. Move towards MISRA-C compliance.
> >
> > As far as I can tell, primary objections you've leveled at the options
> > which try to address 2-4 are:
> >
> > a. "UB hazard" still not zero
> > b. MISRA compliance not 100%
> > c. Ugly
> > d. Inefficient
> >
> > (Obviously some proposals have had more technical discussion.)
> >
> > Anything I missed?
>
> I don't think so, especially since various aspects can fall under "ugly"
> and/or "inefficient".

Well for instance, other objections that you might have made that I
don't include in those include:

* Incorrect (i.e., known ways in which the patch will break functionality)
* Misleading / confusing (i.e., someone modifying it is likely to
introduce regressions)
* Fragile (i.e., likely to break due to small or unrelated changes).

[snip]
>: Improving on a. and
> b. is not a good excuse to extend c., at least not unequivocally.
> Whether d. actually matters is a separate aspect, partly because it
> may mean different things (it could e.g. be taken as another
> wording for a. and b.).

I take it you mean 2 and 4 (reduced UB hazard and increased chance of
MISRA-C compliance) are not a good excuse for c (ugly).

The "ugliness" here involves, variously:
* Passing a variable through an asm "barrier"
* Casting pointers to other types, sometimes multiple times at once

Most of them are a handful of lines hidden behind a macro in a header file.
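
For concreteness, the asm "barrier" construct mentioned above might look
roughly like the sketch below.  HIDE_POINTER, region and in_region are
hypothetical names chosen for illustration and are not taken from the
series; the macro relies on GCC/Clang extensions (statement expressions,
__typeof__, extended asm).  Nothing in the compiler documentation promises
that this hides pointer provenance, so treat it as an illustration of the
idea rather than a guarantee.

```c
/*
 * Hypothetical helper: the empty asm statement claims to read and
 * rewrite the pointer value, so in practice the optimizer stops
 * tracking which object the result was derived from.
 */
#define HIDE_POINTER(p) ({                  \
    __typeof__(p) hidden_ = (p);            \
    __asm__ ("" : "+r" (hidden_));          \
    hidden_; })

/* Stand-in for a linker-delimited region; real code would use extern symbols. */
static const int region[4] = { 1, 2, 3, 4 };

/* Bounds check done on pointers that went through the asm "barrier". */
static int in_region(const int *p)
{
    const int *start = HIDE_POINTER(&region[0]);
    const int *end = HIDE_POINTER(&region[0] + 4);

    return p >= start && p < end;
}
```

The point of hiding this behind one macro is that call sites stay
readable, and the mechanism can be swapped later without touching them.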

To me, on a scale of 1 to 10, I'd give them an ugliness factor 2 or 3.

On the other hand, I find 2-4 compelling:

* I consider your suggested approach, of using simple
pointer-to-pointer casting, or casting to a pointer after asm and
comparing the resulting pointers, to have a reasonably high chance of
tripping over UB behavior at some point in the future.  Regardless of
the outcome of that -- whether we change our work-around again or
whether the compiler authors change the compilers -- both we and our
users and customers will have had a lot of hassle to deal with.
Avoiding that hassle is worth the slight ugliness introduced by the
other solutions.

* Stefano has done a fair amount of work identifying the places that
need to be changed.  We know that we're likely to need to make *some
sort* of change like this for MISRA compliance at some point.
Throwing away work that then will need to be duplicated is both a
waste of time, and of developer motivation.  Even if we didn't think
it would improve UB behavior *or* get us closer to MISRA C compliance,
retaining the work he's done would be worth accepting a patch creating
such a macro.

* The patch takes the code base one step closer to being MISRA C
compliant, by setting up infrastructure likely needed by whatever it
needs.  Even before we had the recommendation from MISRA C, I would
consider preparing for that eventuality to be worth the minor ugliness
introduced.

And so, to me, the uintptr_t casting proposal seems like an obvious "accept".

Do you disagree with any of my assessments above?  Did I miss anything
that should be factored in?

 -George


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
       [not found]                                           ` <E730A9F90200001DAB59E961@prv1-mh.provo.novell.com>
@ 2019-02-06 11:59                                             ` Jan Beulich
  0 siblings, 0 replies; 102+ messages in thread
From: Jan Beulich @ 2019-02-06 11:59 UTC (permalink / raw)
  To: George Dunlap
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Stewart Hildebrand,
	xen-devel

>>> On 05.02.19 at 15:56, <dunlapg@umich.edu> wrote:
> On Mon, Feb 4, 2019 at 9:37 AM Jan Beulich <JBeulich@suse.com> wrote:
>> >>> On 01.02.19 at 19:52, <dunlapg@umich.edu> wrote:
>> What I'm not sure I see is what you mean to
>> express with all you wrote in terms of finding a way out of the
>> current situation (besides requesting a vote)
> 
> If you're just tired of this discussion and want it to be done, then
> of course we can just take a vote.
> 
> But ideally I think votes are best when everyone sees the landscape of
> the decision clearly, and agrees on exactly what it is they disagree
> about. Furthermore, it seems to me from reading this discussion that
> it's more than just a few specific examples that I disagree with you
> about, but about a number of principles; as such, investing time
> trying to come to a common understanding should pay dividends in the
> form of reduced friction in the future.

Which I appreciate and agree with. I've been mentioning the option
of a vote solely to show a way to make forward progress without
us reaching agreement.

> Before I expressed an opinion, I wanted to make sure that I hadn't
> misunderstood you or missed a big aspect of the discussion.
> 
>> > Improvements this series seeks to make, as I understand it, include
>> > (in order of urgency):
>> >
>> > 1. Fixing one concrete instance of "UB hazard"
>>
>> Right, and we want to work around compiler bugs here.
>>
>> > 2. Minimize risk of further "UB hazard" in this bit of functionality
>> > 3. Retain the effort Stefano has put in identifying all the places
>> > where such UB hazards need to be addressed.
>> > 4. Move towards MISRA-C compliance.
>> >
>> > As far as I can tell, primary objections you've leveled at the options
>> > which try to address 2-4 are:
>> >
>> > a. "UB hazard" still not zero
>> > b. MISRA compliance not 100%
>> > c. Ugly
>> > d. Inefficient
>> >
>> > (Obviously some proposals have had more technical discussion.)
>> >
>> > Anything I missed?
>>
>> I don't think so, especially since various aspects can fall under "ugly"
>> and/or "inefficient".
> 
> Well for instance, other objections that you might have made that I
> don't include in those include:
> 
> * Incorrect (i.e., known ways in which the patch will break functionality)
> * Misleading / confusing (i.e., someone modifying it is likely to
> introduce regressions)
> * Fragile (i.e., likely to break due to small or unrelated changes).

There are none of the first category that I'm aware of, but to me
the latter two categories at least overlap with "inefficient", and
to me especially the var.S approach falls in the fragile and maybe
also in the misleading/confusing category.

> [snip]
>>: Improving on a. and
>> b. is not a good excuse to extend c., at least not unequivocally.
>> Whether d. actually matters is a separate aspect, partly because it
>> may mean different things (it could e.g. be taken as another
>> wording for a. and b.).
> 
> I take it you mean 2 and 4 (reduced UB hazard and increased chance of
> MISRA-C compliance) are not a good excuse for c (ugly).
> 
> The "ugliness" here involves, variously:
> * Passing a variable through an asm "barrier"
> * Casting pointers to other types, sometimes multiple times at once
> 
> Most of them are a handful of lines hidden behind a macro in a header file.
> 
> To me, on a scale of 1 to 10, I'd give them an ugliness factor 2 or 3.
> 
> On the other hand, I find 2-4 compelling:
> 
> * I consider your suggested approach, of using simple
> pointer-to-pointer casting, or casting to a pointer after asm and
> comparing the resulting pointers, to have a reasonably high chance of
> tripping over UB behavior at some point in the future.  Regardless of
> the outcome of that -- whether we change our work-around again or
> whether the compiler authors change the compilers -- both we and our
> users and customers will have had a lot of hassle to deal with.
> Avoiding that hassle is worth the slight ugliness introduced by the
> other solutions.
> 
> * Stefano has done a fair amount of work identifying the places that
> need to be changed.  We know that we're likely to need to make *some
> sort* of change like this for MISRA compliance at some point.
> Throwing away work that then will need to be duplicated is both a
> waste of time, and of developer motivation.  Even if we didn't think
> it would improve UB behavior *or* get us closer to MISRA C compliance,
> retaining the work he's done would be worth accepting a patch creating
> such a macro.
> 
> * The patch takes the code base one step closer to being MISRA C
> compliant, by setting up infrastructure likely needed by whatever it
> needs.  Even before we had the recommendation from MISRA C, I would
> consider preparing for that eventuality to be worth the minor ugliness
> introduced.
> 
> And so, to me, the uintptr_t casting proposal seems like an obvious 
> "accept".
> 
> Do you disagree with any of my assessments above?  Did I miss anything
> that should be factored in?

Well, I think the picture you've given isn't complete. For one, I'm
certainly willing to accept a certain level of ugliness (or even
fragility, which I consider even more problematic) if the proposed
adjustments indeed _guarantee_ an improvement. But so far I've
not seen any proof of this (and your explanations of why you
find "2-4 compelling" also don't seem to add any). Of course one
might make changes just in the hope of an improvement, but then
my personal tolerance to it (potentially) having undesirable side
effects goes down.

Furthermore, as with any other change, I think it is a fair
expectation that it be made clear what improvements will result.
As by this point I remain unconvinced that any change is needed
at all (other than to work around compiler bugs), it is - I think -
clear that I'm not seeing the supposed improvements as actual
ones.

So to evaluate the proposals in these terms:

- the var.S approach simply is ugliest among all of them (which
  I admit is a subjective assessment of mine), but gives the
  highest level of "hope" of being compliant
- the cast to integer as well as the var.S approach are more
  fragile than the one retaining (or to be precise, re-establishing)
  original types
- the cast-retaining-types approach is the least fragile one,
  but also the one delivering the least level of "hope"

On the balance, i.e. weighing upsides and downsides, I would
probably come to a zero for all of them, which is the same as
simply not changing anything. Which is why I continue to think
that, again besides dealing with known (but not "predicted" or
whatever you might call it) compiler bugs, leaving the code as
is will be the best choice at this point in time.

Jan



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-01-21 10:06                                                               ` Jan Beulich
@ 2019-02-06 15:41                                                                 ` Ian Jackson
  0 siblings, 0 replies; 102+ messages in thread
From: Ian Jackson @ 2019-02-06 15:41 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Stewart Hildebrand,
	xen-devel

Jan Beulich writes ("Re: [Xen-devel] [PATCH v6 1/4] xen: introduce SYMBOL"):
> As per my earlier reply, I've yet to see proof of a "code-breaking
> optimization" that actually matches our case(s).

I have personally experienced a program being miscompiled because of
the mistaken belief by the compiler that _start and _end were
different objects.  I haven't read back the whole history of
this but this is definitely a real bug.

I agree with George that this is a compiler bug.  However, this bug is
not going to be fixed because the compiler community's behaviour is as
unreasonable as George paints :-(.

Our only option is some kind of bodge.

I don't believe in the asm fragment bodge because we don't have a
promise anywhere from the compiler authors that the asm hides pointer
provenance.  I am not aware of any C technique which can be reliably
used to obscure pointer provenance and prevent this kind of
misoptimisation.  Linux is skating on thin ice here.

That leaves:

(i) define indirection variables eg end_ in an assembly language file.
(ii) convert to uintptr_t before comparing

(i) is IMO wholly safe but it is a bit ugly and slightly less
performant.

I think (ii) is fairly safe.  I doubt that we will find (ii) broken.
In particular, because there is little motivation for compiler authors
to try to `optimise it'.  The difficulty with it is providing an automatic
way of detecting when we accidentally fail to cast.  I suggest the
following:

 extern const struct wombat _wombats_start[];
 extern const struct abstract_symbol _wombats_end[];

and providing a macro which compares any pointer with a struct
abstract_symbol* by converting the latter to a uintptr_t.  Direct
comparisons of _wombats_start with _wombats_end will result in a
compilation error due to type mismatch.
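
A minimal sketch of how this could look, under stated assumptions:
PTR_BEFORE_SYMBOL, wombats_start, wombats_end and count_wombats are
hypothetical names, and because there is no linker script here the "end"
marker is modelled as a pointer variable rather than an extern array
symbol.  struct abstract_symbol is left incomplete so that an end marker
can be neither dereferenced nor indexed.

```c
#include <stddef.h>
#include <stdint.h>

struct wombat { int id; };

/* Opaque marker type: declared but never defined. */
struct abstract_symbol;

/* Hypothetical comparison macro: both sides are converted to uintptr_t. */
#define PTR_BEFORE_SYMBOL(p, sym) ((uintptr_t)(p) < (uintptr_t)(sym))

/* Stand-ins for the linker-provided start and end symbols. */
static const struct wombat wombats_start[3] = { { 1 }, { 2 }, { 3 } };
static const struct abstract_symbol *const wombats_end =
    (const struct abstract_symbol *)(wombats_start + 3);

/* Walk the table; the only start/end comparison goes through the macro. */
static size_t count_wombats(void)
{
    size_t n = 0;
    const struct wombat *w;

    for ( w = wombats_start; PTR_BEFORE_SYMBOL(w, wombats_end); w++ )
        n++;

    return n;
}
```

Writing "w < wombats_end" directly compares distinct pointer types, which
compilers diagnose; whether that is a hard error or a warning depends on
the compiler and flags (it becomes an error under -Werror).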

Ian.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
       [not found]                                 ` <7A8C0A4F020000EEB8D7C7D4@prv1-mh.provo.novell.com>
@ 2019-02-06 16:21                                   ` Jan Beulich
  2019-02-06 16:37                                     ` Ian Jackson
  0 siblings, 1 reply; 102+ messages in thread
From: Jan Beulich @ 2019-02-06 16:21 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Stewart Hildebrand,
	xen-devel

>>> On 06.02.19 at 16:41, <ian.jackson@citrix.com> wrote:
> Jan Beulich writes ("Re: [Xen-devel] [PATCH v6 1/4] xen: introduce SYMBOL"):
>> As per my earlier reply, I've yet to see proof of a "code-breaking
>> optimization" that actually matches our case(s).
> 
> I have personally experienced a program being miscompiled because of
> the mistaken belief by the compiler that _start and _end were
> different objects.  I haven't read back the whole history of
> this but this is definitely a real bug.
> 
> I agree with George that this is a compiler bug.  However, this bug is
> not going to be fixed because the compiler community's behaviour is as
> unreasonable as George paints :-(.
> 
> Our only option is some kind of bodge.
> 
> I don't believe in the asm fragment bodge because we don't have a
> promise anywhere from the compiler authors that the asm hides pointer
> provenance.  I am not aware of any C technique which can be reliably
> used to obscure pointer provenance and prevent this kind of
> misoptimisation.  Linux is skating on thin ice here.
> 
> That leaves:
> 
> (i) define indirection variables eg end_ in an assembly language file.
> (ii) convert to uintptr_t before comparing
> 
> (i) is IMO wholly safe but it is a bit ugly and slightly less
> performant.
> 
> I think (ii) is fairly safe.  I doubt that we will find (ii) broken.
> In particular, because there is little motivation for compiler authors
> to try to `optimise it'.

If you want to be "prepared" for them taking apart asm()-s, how
would you expect them to never "look through" casts?

>  The difficulty with it is providing an automatic
> way of detecting when we accidentally fail to cast.  I suggest the
> following:
> 
>  extern const struct wombat _wombats_start[];
>  extern const struct abstract_symbol _wombats_end[];
> 
> and providing a macro which compares any pointer with a struct
> abstract_symbol* by converting the latter to a uintptr_t.  Direct
> comparisons of _wombats_start with _wombats_end will result in a
> compilation error due to type mismatch.

Hmm, that's certainly an interesting approach, and requires
care only when introducing a new pair of symbols of this kind.
But of course the macro you suggest to carry out the
comparison will have the same weakness as any open coded
cast to a suitable integer type. But there are benefits:
- it marks problem sites clearly (one of Stefano's goals),
- it isolates future changes to how exactly the comparisons
  are to be done to the comparison macro(s)
- it doesn't undermine type safety of the main (start-of-
  whatever) symbols (one of my goals),
- it allows the end-of-whatever symbols to also be handed to
  functions in a type-safe manner (we could even have
  per-instance structure flavors, such that unrelated "end"
  symbols can't be mixed up; for example the type could
  simply be a structure wrapping a field of the original base
  type).

Jan




^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-02-06 16:21                                   ` Jan Beulich
@ 2019-02-06 16:37                                     ` Ian Jackson
       [not found]                                       ` <08D440470200001BB8D7C7D4@prv1-mh.provo.novell.com>
  0 siblings, 1 reply; 102+ messages in thread
From: Ian Jackson @ 2019-02-06 16:37 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Stewart Hildebrand,
	xen-devel

Jan Beulich writes ("Re: [Xen-devel] [PATCH v6 1/4] xen: introduce SYMBOL"):
> On 06.02.19 at 16:41, <ian.jackson@citrix.com> wrote:
> > (i) define indirection variables eg end_ in an assembly language file.
> > (ii) convert to uintptr_t before comparing
> > 
> > (i) is IMO wholly safe but it is a bit ugly and slightly less
> > performant.
> > 
> > I think (ii) is fairly safe.  I doubt that we will find (ii) broken.
> > In particular, because there is little motivation for compiler authors
> > to try to `optimise it'.
> 
> If you want to be "prepared" for them taking apart asm()-s, how
> would you expect them to never "look through" casts?

I'm not sure what you mean by `look through casts'.  I expect the
compiler to `look through casts' for both value and provenance
purposes.

But comparing uintptr_t's is never UB, no matter their value or
provenance.  So there is simply unarguably no UB if the comparison is
done with uintptr_t's.  The conversions themselves are
implementation-defined (ID), not UB.

Whether comparing pointer values is UB depends on their value and
provenance, and in general the compiler is able to look through most
transformations (including casts) to determine the ultimate provenance
- ie the provenance rules are not defeated by casts.

So that's de jure.

De facto, there is little incentive for a compiler to misoptimise
uintptr_t comparisons, and much incentive for it to get `better' at
disentangling pointer provenance for the purpose of (mis)optimising
pointer comparisons.
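
To illustrate the de jure distinction, here is a small sketch; addr_less
is a hypothetical helper name.  A relational comparison of pointers into
two unrelated objects is undefined (C11 6.5.8 paragraph 5), whereas the
same comparison after converting both to uintptr_t is an ordinary integer
comparison whose result merely depends on the implementation-defined
pointer-to-integer mapping.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Compare two addresses as integers.  No pointer comparison takes
 * place, so the provenance of p and q does not matter for definedness.
 */
static bool addr_less(const void *p, const void *q)
{
    return (uintptr_t)p < (uintptr_t)q;
}
```

On the usual flat-address-space targets, two distinct objects convert to
distinct integers, so exactly one of addr_less(&a, &b) and
addr_less(&b, &a) is true.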

> >  extern const struct wombat _wombats_start[];
> >  extern const struct abstract_symbol _wombats_end[];
...
> Hmm, that's certainly an interesting approach, and requires
> care only when introducing a new pair of symbols of this kind.
> But of course the macro you suggest to carry out the
> comparison will have the same weakness as any open coded
> cast to a suitable integer type. But there are benefits:

libxl already relies on casting to uintptr_t as a way of avoiding the
rules restricting pointer comparisons.  See these comments:
  libxl_event.c     l.476  libxl__watch_slot_contents
  libxl_internal.h  l.322  typedef struct libxl__ev_watch_slot

> - it marks problem sites clearly (one of Stefano's goals),
> - it isolates future changes to how exactly the comparisons
>   are to be done to the comparison macro(s)
> - it doesn't undermine type safety of the main (start-of-
>   whatever) symbols (one of my goals),
> - it allows the end-of-whatever symbols to also be handed to
>   functions in a type-safe manner

Yes.  However...

>   (we could even have per-instance structure flavors, such that
>   unrelated "end" symbols can't be mixed up; for example the type
>   could simply be a structure wrapping a field of the original base
>   type).

I think this would be difficult to achieve without writing a forbidden
pointer comparison in the macro.  It could perhaps be achieved with
typeof() if that is available in hypervisor code.

Ian.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
       [not found]                                       ` <08D440470200001BB8D7C7D4@prv1-mh.provo.novell.com>
@ 2019-02-06 16:47                                         ` Jan Beulich
  2019-02-06 16:52                                           ` Ian Jackson
  0 siblings, 1 reply; 102+ messages in thread
From: Jan Beulich @ 2019-02-06 16:47 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Stewart Hildebrand,
	xen-devel

>>> On 06.02.19 at 17:37, <ian.jackson@citrix.com> wrote:
> Jan Beulich writes ("Re: [Xen-devel] [PATCH v6 1/4] xen: introduce SYMBOL"):
>> - it marks problem sites clearly (one of Stefano's goals),
>> - it isolates future changes to how exactly the comparisons
>>   are to be done to the comparison macro(s)
>> - it doesn't undermine type safety of the main (start-of-
>>   whatever) symbols (one of my goals),
>> - it allows the end-of-whatever symbols to also be handed to
>>   functions in a type-safe manner
> 
> Yes.  However...
> 
>>   (we could even have per-instance structure flavors, such that
>>   unrelated "end" symbols can't be mixed up; for example the type
>>   could simply be a structure wrapping a field of the original base
>>   type).
> 
> I think this would be difficult to achieve without writing a forbidden
> pointer comparison in the macro.  It could perhaps be achieved with
> typeof() if that is available in hypervisor code.

I'm afraid I don't understand - you want to cast to an integer
type anyway for doing the comparison.

As to typeof() - this being a compiler construct, it is available
whenever the compiler supports it. We certainly use it
elsewhere in hypervisor code.

Jan




^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-02-06 16:47                                         ` Jan Beulich
@ 2019-02-06 16:52                                           ` Ian Jackson
  2019-02-06 23:39                                             ` Stefano Stabellini
  0 siblings, 1 reply; 102+ messages in thread
From: Ian Jackson @ 2019-02-06 16:52 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, xen-devel,
	Stewart Hildebrand, Ian Jackson

Jan Beulich writes ("Re: [Xen-devel] [PATCH v6 1/4] xen: introduce SYMBOL"):
> On 06.02.19 at 17:37, <ian.jackson@citrix.com> wrote:
> > Jan Beulich writes ("Re: [Xen-devel] [PATCH v6 1/4] xen: introduce SYMBOL"):
> >> - it allows the end-of-whatever symbols to also be handed to
> >>   functions in a type-safe manner
> > 
> > Yes.  However...
> > 
> >>   (we could even have per-instance structure flavors, such that
> >>   unrelated "end" symbols can't be mixed up; for example the type
> >>   could simply be a structure wrapping a field of the original base
> >>   type).
> > 
> > I think this would be difficult to achieve without writing a forbidden
> > pointer comparison in the macro.  It could perhaps be achieved with
> > typeof() if that is available in hypervisor code.
> 
> I'm afraid I don't understand - you want to cast to an integer
> type anyway for doing the comparison.

The usual approach to making a macro perform an explicit typecheck
(ie, to have the macro check that the types of its arguments are
right) is to have the macro expansion contain a `spurious' comparison
whose result is ignored but which will yield a compile-time type
mismatch if the argument types were wrong.  But this is only legal if
the provenance of the compared pointers is legal for a pointer
comparison.  The bad effects of evaluating an UB expression are not
limited by within-program causality.

> As to typeof() - this being a compiler construct, it is available
> whenever the compiler supports it. We certainly use it
> elsewhere in hypervisor code.

I think then that
   (typeof(X))0 == (typeof(Y))0
is the correct formulation of the type check - because it is legal no
matter the provenance of X and Y.
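
As a minimal sketch of how that check could sit inside a macro:
SAME_TYPE_DIFF and the demo_* names below are hypothetical and not
part of the series, and the arguments are assumed to be pointer
expressions rather than array names (a cast to an array type is not
valid C).  The comparison is only between two null pointers of the
argument types, so no pointer derived from X or Y is ever compared.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: byte difference a - b, computed on integers,
 * preceded by a type check that never compares a and b themselves.
 */
#define SAME_TYPE_DIFF(a, b) ({                                   \
    (void)((__typeof__(a))0 == (__typeof__(b))0);                 \
    (ptrdiff_t)((uintptr_t)(a) - (uintptr_t)(b));                 \
})

/* Stand-in object; real callers would pass linker symbols. */
static int demo_buf[8];

static ptrdiff_t demo_same_type(void)
{
    int *hi = &demo_buf[3];
    int *lo = &demo_buf[1];

    /* Both arguments are int *, so the type check is silent. */
    return SAME_TYPE_DIFF(hi, lo);
}
```

Passing, say, an int * and a char * should make GCC diagnose a
comparison of distinct pointer types, which becomes a build failure
when warnings are treated as errors.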

Ian.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-02-06 16:52                                           ` Ian Jackson
@ 2019-02-06 23:39                                             ` Stefano Stabellini
  2019-02-07 11:48                                               ` Ian Jackson
  0 siblings, 1 reply; 102+ messages in thread
From: Stefano Stabellini @ 2019-02-06 23:39 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Jan Beulich,
	Stewart Hildebrand, xen-devel

On Wed, 6 Feb 2019, Ian Jackson wrote:
> Jan Beulich writes ("Re: [Xen-devel] [PATCH v6 1/4] xen: introduce SYMBOL"):
> > On 06.02.19 at 17:37, <ian.jackson@citrix.com> wrote:
> > > Jan Beulich writes ("Re: [Xen-devel] [PATCH v6 1/4] xen: introduce SYMBOL"):
> > >> - it allows the end-of-whatever symbols to also be handed to
> > >>   functions in a type-safe manner
> > > 
> > > Yes.  However...

I am OK with this approach. Maybe not the best IMO, but good enough. It
should also satisfy the MISRAC guys, as they wrote "ideally cast to
uintptr_t only once": here we wouldn't be casting only once, but at
least we would do it inside a single well-defined macro.

I did manage to convert v4 of the series to this approach before writing
this answer -- everything looks plausible and compiles OK. Also, see
below.


> > >>   (we could even have per-instance structure flavors, such that
> > >>   unrelated "end" symbols can't be mixed up; for example the type
> > >>   could simply be a structure wrapping a field of the original base
> > >>   type).
> > > 
> > > I think this would be difficult to achieve without writing a forbidden
> > > pointer comparison in the macro.  It could perhaps be achieved with
> > > typeof() if that is available in hypervisor code.
> > 
> > I'm afraid I don't understand - you want to cast to an integer
> > type anyway for doing the comparison.
> 
> The usual approach to making a macro perform an explicit typecheck
> (ie, to have the macro check that the types of its arguments are
> right) is to have the macro expansion contain a `spurious' comparison
> whose result is ignored but which will yield a compile-time type
> mismatch if the argument types were wrong.  But this is only legal if
> the provenance of the compared pointers is legal for a pointer
> comparison.  The bad effects of evaluating an UB expression are not
> limited by within-program causality.
> 
> > As to typeof() - this being a compiler construct, it is available
> > whenever the compiler supports it. We certainly use it
> > elsewhere in hypervisor code.
> 
> I think then that
>    (typeof(X))0 == (typeof(Y))0
> is the correct formulation of the type check - because it is legal no
> matter the provenance of X and Y.

Thank you, Ian. I think I understand what you are saying. I'll probably
leave this out for the next iteration though, but I am happy to add it
afterwards.

I was thinking of going with two MACROs:

+/*
+ * Performs x - y, returns the original pointer type. To be used when
+ * either x or y or both are linker symbols.
+ */
+#define SYMBOLS_SUBTRACT(x, y) ({                                             \
+    __typeof__(*(y)) *ptr_;                                                   \
+    ptr_ = (typeof(ptr_)) (((uintptr_t)(x) - (uintptr_t)(y)) / sizeof(*(y))); \
+    ptr_;                                                                     \
+})
+
+/*
+ * Performs x - y, returns uintptr_t. To be used when either x or y or
+ * both are linker symbols.
+ */
+#define SYMBOLS_COMPARE(x, y) ({                                              \
+    uintptr_t ptr_;                                                           \
+    ptr_ = ((uintptr_t)(x) - (uintptr_t)(y)) / sizeof(*(y));                  \
+    ptr_;                                                                     \
+})

Examples:

+    new_ptr = SYMBOLS_SUBTRACT(func->old_addr, _start) + vmap_of_xen_text;

and:

+    for ( alt = region->begin;
+          SYMBOLS_COMPARE(alt, region->end) < 0;
+          alt++ )

We could also define a third macro such as:

  #define SYMBOLS_SUBTRACT_INT(x, y)  SYMBOLS_COMPARE((x), (y))

because we have many places where we need the result of SYMBOLS_SUBTRACT
converted to an integer type. For instance:

  paddr_t xen_size = (uintptr_t)SYMBOLS_SUBTRACT(_end, _start);


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-02-06 23:39                                             ` Stefano Stabellini
@ 2019-02-07 11:48                                               ` Ian Jackson
  2019-02-07 18:18                                                 ` Stefano Stabellini
  0 siblings, 1 reply; 102+ messages in thread
From: Ian Jackson @ 2019-02-07 11:48 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Julien Grall, Jan Beulich, Stewart Hildebrand,
	xen-devel

Stefano Stabellini writes ("Re: [Xen-devel] [PATCH v6 1/4] xen: introduce SYMBOL"):
> I am OK with this approach. Maybe not the best IMO, but good enough. It
> should also satisfy the MISRAC guys, as they wrote "ideally cast to
> uintptr_t only once": here we wouldn't be casting only once, but at
> least we would do it inside a single well-defined macro.

Right.  I think it meets the goals of MISRA-C, probably better than
most other approaches.

FAOD, I think you should expect people to declare the linker symbols
either as I suggested:

     extern const struct wombat _wombats_start[];
     extern const struct abstract_symbol _wombats_end[];

(or along the lines of Jan's suggestion, but frankly I think that is
going to be too hard to sort out now.)

> +/*
> + * Performs x - y, returns the original pointer type. To be used when
> + * either x or y or both are linker symbols.
> + */
> +#define SYMBOLS_SUBTRACT(x, y) ({                                             \
> +    __typeof__(*(y)) *ptr_;                                                   \
> +    ptr_ = (typeof(ptr_)) (((uintptr_t)(x) - (uintptr_t)(y)) / sizeof(*(y))); \
> +    ptr_;                                                                     \
> +})

This is type-incoherent.  The difference between two pointers is a
scalar, not another pointer.  Also "the original pointer type" is
ambiguous.  It should refer explicitly to y.  IMO this function should
contain a typecheck which assures that x is of the right type.

How about something like this:

  /*
   * Calculate (end - start), where start and end are linker symbols,
   * giving a ptrdiff_t.  The size is in units of start's referent.
   * end must be a `struct abstract_symbol*'.
   */
  #define SYMBOLS_ARRAY_LEN(start,end) ({                                      \
     (void)((end) == (struct abstract_symbol*)0);                              \
     (ptrdiff_t)(((uintptr_t)(end) - (uintptr_t)(start)) / sizeof(*(start)));  \
  })

  /*
   * Given two pointers A,B of arbitrary types, gives the difference
   * A-B in bytes.  Can be used for comparisons:
   *   If A<B, gives a negative number
   *   if A==B, gives zero
   *   If A>B, gives a positive number
   * Legal even if the pointers are to different objects.
   */
  #define POINTER_CMP(a,b) ({                                                  \
     (void)((a) == (void*)0);                                                  \
     (void)((b) == (void*)0);                                                  \
     (ptrdiff_t)((uintptr_t)(a) - (uintptr_t)(b));                             \
  })

The application of these to your two examples is complex because your
examples seem wrong to me.

> +/*
> + * Performs x - y, returns uintptr_t. To be used when either x or y or

This is wrong.  Comparisons should give a signed output.
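
To illustrate the point about signedness, here is a small sketch.
The function and array names are made up for this example, and the
conversion from uintptr_t to ptrdiff_t is assumed to wrap the way it
does with GCC (the C standard leaves it implementation-defined).

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in object so that the two addresses are ordered. */
static char demo_bytes[4];

/* Unsigned difference: wraps to a huge value when x is below y. */
static int unsigned_diff_is_positive(const void *x, const void *y)
{
    uintptr_t d = (uintptr_t)x - (uintptr_t)y;

    return d > 0;
}

/* Signed difference: negative when x is below y (GCC behaviour assumed). */
static int signed_diff_is_negative(const void *x, const void *y)
{
    ptrdiff_t d = (ptrdiff_t)((uintptr_t)x - (uintptr_t)y);

    return d < 0;
}
```

With x below y the unsigned result still looks "greater than zero",
so a loop condition of the form "difference < 0" could never be true
with an unsigned return type.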

> + * both are linker symbols.

In neither of your examples below are the things in question linker
symbols so your examples violate your own preconditions...


> Examples:
> 
> +    new_ptr = SYMBOLS_SUBTRACT(func->old_addr, _start) + vmap_of_xen_text;

This is punning wildly between pointers and integers.  I infer that
old_addr is a pointer of some kind and vmap_of_xen_text is an integer.
I also infer that sizeof(*old_addr) is 1 because otherwise you
multiply vmap_of_xen_text by the size which is clearly entirely wrong.
Ie this code is just entirely wrong.

This is presumably some kind of relocation.  I don't think it makes
much sense to macro this.  Instead, it is better to make
vmap_of_xen_text a pointer and do this:

  +    /* Relocation.  We need to calculate the offset of the address
  +     * from _start, and apply that to our own map, to find where we
  +     * have this mapped.  Doing these kind of games directly with
  +     * pointers is contrary to the C rules for what pointers may be
  +     * compared and computed.  So we do the offset calculation with
  +     * integers, which is always legal.  The subsequent addition of
  +     * the offset to the vmap_of_xen_text pointer is legal because
  +     * the computed pointer is indeed a valid part of the object
  +     * referred to by vmap_of_xen_text - namely, the byte array
  +     * of our mapping of the Xen text. */
  +    new_ptr = ((uintptr_t)func->old_addr - (uintptr_t)_start) + vmap_of_xen_text;

Note that, unfortunately, any deviation from completely normal pointer
handling *must* be accompanied by this kind of a proof, to explain why
it is OK.
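
A self-contained sketch of that relocation pattern follows.  The
names demo_text, demo_vmap and relocate are hypothetical stand-ins
(two ordinary byte arrays play the roles of the loaded image and of
the second mapping); only the shape of the calculation is taken from
the snippet above.

```c
#include <stdint.h>

/* Stand-ins for the Xen text and for our own mapping of it. */
static unsigned char demo_text[16];
static unsigned char demo_vmap[16];

/*
 * Compute the offset of old_addr from start using integers only, then
 * apply it to the byte pointer vmap_base.  The caller must guarantee
 * that the offset lies within the object vmap_base points to.
 */
static unsigned char *relocate(const void *old_addr, const void *start,
                               unsigned char *vmap_base)
{
    uintptr_t offset = (uintptr_t)old_addr - (uintptr_t)start;

    return vmap_base + offset;
}
```

Keeping vmap_base a byte pointer means the offset is applied in
bytes, which avoids the scaling problem described above.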

> and:
> 
> +    for ( alt = region->begin;
> +          SYMBOLS_COMPARE(alt, region->end) < 0;
> +          alt++ )

region->begin and ->end aren't linker symbols, are they ?  So the
wrong assumption by the compiler (which is at the root of this thread)
that different linker symbols are necessarily different objects
(resulting from the need to declare them in C as if they were) does
not arise.  I think you mean maybe something like _region_start and
_region_end.  So with my proposed macro:

> We could also define a third macro such as:
>   #define SYMBOLS_SUBTRACT_INT(x, y)  SYMBOLS_COMPARE((x), (y))
> because we have many places where we need the result of SYMBOLS_SUBTRACT
> converted to an integer type. For instance:
>   paddr_t xen_size = (uintptr_t)SYMBOLS_SUBTRACT(_end, _start);

This need arises because the difference between two pointers is indeed
an integer and not another pointer.

Ian.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
       [not found]                                               ` <DACE7A5F020000B1B8D7C7D4@prv1-mh.provo.novell.com>
@ 2019-02-07 14:51                                                 ` Jan Beulich
  0 siblings, 0 replies; 102+ messages in thread
From: Jan Beulich @ 2019-02-07 14:51 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Stewart Hildebrand,
	xen-devel

>>> On 07.02.19 at 12:48, <ian.jackson@citrix.com> wrote:
> Stefano Stabellini writes ("Re: [Xen-devel] [PATCH v6 1/4] xen: introduce 
> SYMBOL"):
>> I am OK with this approach. Maybe not the best IMO, but good enough. It
>> should also satisfy the MISRAC guys, as they wrote "ideally cast to
>> uintptr_t only once": here we wouldn't be casting only once, but at
>> least we would do it inside a single well-defined macro.
> 
> Right.  I think it meets the goals of MISRA-C, probably better than
> most other approaches.
> 
> FAOD, I think you should expect people to declare the linker symbols
> either as I suggested:
> 
>      extern const struct wombat _wombats_start[];
>      extern const struct abstract_symbol _wombats_end[];
> 
> (or along the lines of Jan's suggestion, but frankly I think that is
> going to be too hard to sort out now.)

Hmm, not overly difficult (and a macro just because there are no
templates in C):

#define WHATEVER(type, name, pfx) \
\
struct abstract_ ## name { \
    type _; \
}; \
\
extern const type pfx ## _start[]; \
extern const struct abstract_ ## name pfx ## _end[]; \
\
static inline _Bool name ## _lt(type const s1[], \
                                const struct abstract_ ## name s2[]) \
{ \
    return (unsigned long)s1 < (unsigned long)s2; \
} \
\
static inline long name ## _diff(type const s1[], \
                                 const struct abstract_ ## name s2[]) \
{ \
    return ((unsigned long)s2 - (unsigned long)s1) / sizeof(*s1); \
}

WHATEVER(unsigned char, uchar, )
WHATEVER(const struct scheduler *, scheduler, schedulers)

/* Example usage */

unsigned long image_size(void)
{
    return uchar_diff(_start, _end);
}

void iterate_schedulers(void (*func)(const struct scheduler *))
{
    const struct scheduler *const *s;

    for ( s = schedulers_start; scheduler_lt(s, schedulers_end); ++s )
        func(*s);
}

I've intentionally used unsigned long and _Bool such that the
example would compile completely standalone. This isn't meant to
be that way in whatever would go into the hypervisor of course.

As you can see type safety gets achieved without any
comparisons at all, hence not even raising the UB-ness question.
As you can further see passing around pointers to the
abstract_* structures is working quite okay, which is what I
was after when talking about the type safety aspect.
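
For readers who want to try the idea without a linker script, here is
a reduced sketch.  All demo_* names are hypothetical, and the end
marker is manufactured with a cast from an ordinary array rather than
supplied by the linker, which is purely an assumption made so that
the example builds standalone.

```c
#include <stdint.h>

/* Ordinary array standing in for a linker-delimited section. */
static const int demo_items[3] = { 10, 20, 30 };

/* Distinct wrapper type for the end marker; it is never dereferenced. */
struct abstract_demo {
    int _;
};

static long demo_diff(const int s1[], const struct abstract_demo s2[])
{
    return (long)(((uintptr_t)s2 - (uintptr_t)s1) / sizeof(*s1));
}

static int demo_lt(const int s1[], const struct abstract_demo s2[])
{
    return (uintptr_t)s1 < (uintptr_t)s2;
}

/* In the real case the linker script would provide this symbol. */
static const struct abstract_demo *demo_end(void)
{
    return (const struct abstract_demo *)(const void *)(demo_items + 3);
}

static int demo_sum(void)
{
    const int *p;
    int sum = 0;

    for ( p = demo_items; demo_lt(p, demo_end()); ++p )
        sum += *p;

    return sum;
}
```

Swapping the arguments of demo_lt() or demo_diff(), or passing
demo_items where the end marker is expected, should draw an
incompatible pointer type diagnostic from the compiler.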

Whether the extern array declarations would be part of the
macro is to be determined. On one hand doing so tightly couples
start and end symbols, i.e. they can't be used in non-matching
pairs. Otoh this then requires to declare and use different
"name"s when multiple start/end pairs share their base types.
An option might be to have two macros, one without the decls
and the other having the decls and invoking the first.

And I think you can guess that I've used WHATEVER as a name
because I couldn't really figure a good one (and I also wasn't
overly happy with various names I had seen so far).

Jan




^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-02-07 11:48                                               ` Ian Jackson
@ 2019-02-07 18:18                                                 ` Stefano Stabellini
  2019-02-12 11:31                                                   ` Ian Jackson
  0 siblings, 1 reply; 102+ messages in thread
From: Stefano Stabellini @ 2019-02-07 18:18 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Jan Beulich,
	Stewart Hildebrand, xen-devel

On Thu, 7 Feb 2019, Ian Jackson wrote:
> Stefano Stabellini writes ("Re: [Xen-devel] [PATCH v6 1/4] xen: introduce SYMBOL"):
> > I am OK with this approach. Maybe not the best IMO, but good enough. It
> > should also satisfy the MISRAC guys, as they wrote "ideally cast to
> > uintptr_t only once": here we wouldn't be casting only once, but at
> > least we would do it inside a single well-defined macro.
> 
> Right.  I think it meets the goals of MISRA-C, probably better than
> most other approaches.
> 
> FAOD, I think you should expect people to declare the linker symbols
> either as I suggested:
> 
>      extern const struct wombat _wombats_start[];
>      extern const struct abstract_symbol _wombats_end[];
> 
> (or along the lines of Jan's suggestion, but frankly I think that is
> going to be too hard to sort out now.)

Yes, they are already declared this way, I would prefer to avoid
changing the declaration as part of this series.


> > +/*
> > + * Performs x - y, returns the original pointer type. To be used when
> > + * either x or y or both are linker symbols.
> > + */
> > +#define SYMBOLS_SUBTRACT(x, y) ({                                             \
> > +    __typeof__(*(y)) *ptr_;                                                   \
> > +    ptr_ = (typeof(ptr_)) (((uintptr_t)(x) - (uintptr_t)(y)) / sizeof(*(y))); \
> > +    ptr_;                                                                     \
> > +})
> 
> This is type-incoherent.  The difference between two pointers is a
> scalar, not another pointer.

I am glad you highlighted this. The vast majority of changes in this
series are subtractions or comparisons.  So, if subtractions (and also
comparisons as you wrote below) need to return a scalar, then we might
as well return uintptr_t or ptrdiff_t from the two macros. It makes a
lot of sense to me.


> Also "the original pointer type" is
> ambiguous.  It should refer explicitly to y.  IMO this function should
> contain a typecheck which assures that x is of the right type.
> 
> How about something like this:
> 
>   /*
>    * Calculate (end - start), where start and end are linker symbols,
>    * giving a ptrdiff_t.  The size is in units of start's referent.
>    * end must be a `struct abstract_symbol*'.
>    */
>   #define SYMBOLS_ARRAY_LEN(start,end) ({                                      \
>      (void)((end) == (struct abstract_symbol*)0);                              \
>      (ptrdiff_t)(((uintptr_t)(end) - (uintptr_t)(start)) / sizeof(*(start)));  \
>   })

Sounds good, but the issue is that we might have to use this macro with:

- start is a linker symbol and end is a normal pointer
- start is a normal pointer and end is a linker symbol
- both are linker symbols

If so, do we need three slightly different variations of this macro?


>   /*
>    * Given two pointers A,B of arbitrary types, gives the difference
>    * A-B in bytes.  Can be used for comparisons:
>    *   If A<B, gives a negative number
>    *   if A==B, gives zero
>    *   If A>B, gives a positive number
>    * Legal even if the pointers are to different objects.
>    */
>   #define POINTER_CMP(a,b) ({                                                  \
>      (void)((a) == (void*)0);                                                  \
>      (void)((b) == (void*)0);                                                  \
>      (ptrdiff_t)((uintptr_t)(a) - (uintptr_t)(b));                             \
>   })
> 
> The application of these to your two examples is complex because your
> examples seem wrong to me.

Yeah, I realize it wasn't really possible to understand my examples
unless one was very familiar with past versions of the series. I'll add
more context below.


> > +/*
> > + * Performs x - y, returns uintptr_t. To be used when either x or y or
> 
> This is wrong.  Comparisons should give a signed output.
> 
> > + * both are linker symbols.
> 
> In neither of your examples below are the things in question linker
> symbols so your examples violate your own preconditions...
> 
> 
> > Examples:
> > 
> > +    new_ptr = SYMBOLS_SUBTRACT(func->old_addr, _start) + vmap_of_xen_text;
> 
> This is punning wildly between pointers and integers.  I infer that
> old_addr is a pointer of some kind and vmap_of_xen_text is an integer.
> I also infer that sizeof(*old_addr) is 1 because otherwise you
> multiply vmap_of_xen_text by the size which is clearly entirely wrong.
> Ie this code is just entirely wrong.
> 
> This is presumably some kind of relocation.  I don't think it makes
> much sense to macro this.  Instead, it is better to make
> vmap_of_xen_text a pointer and do this:
> 
>   +    /* Relocation.  We need to calculate the offset of the address
>   +     * from _start, and apply that to our own map, to find where we
>   +     * have this mapped.  Doing these kind of games directly with
>   +     * pointers is contrary to the C rules for what pointers may be
>   +     * compared and computed.  So we do the offset calculation with
>   +     * integers, which is always legal.  The subsequent addition of
>   +     * the offset to the vmap_of_xen_text pointer is legal because
>   +     * the computed pointer is indeed a valid part of the object
>   +     * referred to by vmap_of_xen_text - namely, the byte array
>   +     * of our mapping of the Xen text. */
>   +    new_ptr = ((uintptr_t)func->old_addr - (uintptr_t)_start) + vmap_of_xen_text;
> 
> Note that, unfortunately, any deviation from completely normal pointer
> handling *must* be accompanied by this kind of a proof, to explain why
> it is OK.

OK. Most of the call sites only do things like (_end - _start) or (p >
_end). I wanted to bring up cases that are not trivial.

We have a couple of cases where we are "punning wildly between pointers
and integers", for instance:

xen/arch/arm/arm64/livepatch.c:arch_livepatch_apply
xen/arch/arm/setup.c:start_xen  line 772
xen/arch/x86/setup.c:__start_xen  line 1382

I think it is OK to manually cast to (uintptr_t) in those cases as you
suggest.


> > and:
> > 
> > +    for ( alt = region->begin;
> > +          SYMBOLS_COMPARE(alt, region->end) < 0;
> > +          alt++ )
> 
> region->begin and ->end aren't linker symbols, are they ?

I made this example because this is a common pattern that we have in the
hypervisor. A better example using your suggested macro would be:

+    for ( call = __initcall_start;
+          POINTER_CMP(call, __presmp_initcall_end) < 0;
+          call++ )
 
Where __initcall_start and __presmp_initcall_end are linker symbols.
(Above region->begin and region->end are initialized to two linker
symbols.)
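
To make the loop shape concrete, here is a standalone sketch.  The
initcall table and the demo_* functions are invented for the example
(an ordinary array stands in for the linker section), and POINTER_CMP
is written out with the a - b ordering that the "negative if A<B"
description implies; treat it as an assumption about the final macro,
not as its agreed form.

```c
#include <stddef.h>
#include <stdint.h>

typedef int (*initcall_t)(void);

/* Signed byte difference a - b, computed on integers. */
#define POINTER_CMP(a, b) ({                                      \
    (void)((a) == (void *)0);                                     \
    (void)((b) == (void *)0);                                     \
    (ptrdiff_t)((uintptr_t)(a) - (uintptr_t)(b));                 \
})

static int demo_one(void)   { return 1; }
static int demo_two(void)   { return 2; }
static int demo_three(void) { return 3; }

/* Stand-in for the region between the start and end symbols. */
static const initcall_t demo_initcalls[] = { demo_one, demo_two, demo_three };

static int run_demo_initcalls(void)
{
    const initcall_t *call;
    const initcall_t *end = demo_initcalls + 3;
    int sum = 0;

    for ( call = demo_initcalls; POINTER_CMP(call, end) < 0; call++ )
        sum += (*call)();

    return sum;
}
```

In this toy both pointers belong to the same array, so a plain
"call < end" would also be legal; the macro only matters once the two
bounds are separately declared linker symbols.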


> So the
> wrong assumption by the compiler (which is at the root of this thread)
> that different linker symbols are necessarily different objects
> (resulting from the need to declare them in C as if they were) does
> not arise.  I think you mean maybe something like _region_start and
> _region_end.  So with my proposed macro:
> 
> > We could also define a third macro such as:
> >   #define SYMBOLS_SUBTRACT_INT(x, y)  SYMBOLS_COMPARE((x), (y))
> > because we have many places where we need the result of SYMBOLS_SUBTRACT
> > converted to an integer type. For instance:
> >   paddr_t xen_size = (uintptr_t)SYMBOLS_SUBTRACT(_end, _start);
> 
> This need arises because the difference between two pointers is indeed
> an integer and not another pointer.

Yes, I get it.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-02-07 18:18                                                 ` Stefano Stabellini
@ 2019-02-12 11:31                                                   ` Ian Jackson
  2019-02-13  0:09                                                     ` Stefano Stabellini
  0 siblings, 1 reply; 102+ messages in thread
From: Ian Jackson @ 2019-02-12 11:31 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Julien Grall, Julien Grall, Jan Beulich, Stewart Hildebrand,
	xen-devel

Stefano Stabellini writes ("Re: [Xen-devel] [PATCH v6 1/4] xen: introduce SYMBOL"):
> On Thu, 7 Feb 2019, Ian Jackson wrote:
> > FAOD, I think you should expect people to declare the linker symbols
> > either as I suggested:
> > 
> >      extern const struct wombat _wombats_start[];
> >      extern const struct abstract_symbol _wombats_end[];
> > 
> > (or along the lines of Jan's suggestion, but frankly I think that is
> > going to be too hard to sort out now.)
> 
> Yes, they are already declared this way, I would prefer to avoid
> changing the declaration as part of this series.

I'm not sure why you didn't CC me on your revised version(s) ?

Ian.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v6 1/4] xen: introduce SYMBOL
  2019-02-12 11:31                                                   ` Ian Jackson
@ 2019-02-13  0:09                                                     ` Stefano Stabellini
  0 siblings, 0 replies; 102+ messages in thread
From: Stefano Stabellini @ 2019-02-13  0:09 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Juergen Gross, Stefano Stabellini, Stefano Stabellini, Wei Liu,
	Andrew Cooper, Julien Grall, Julien Grall, Jan Beulich,
	Stewart Hildebrand, xen-devel

On Tue, 12 Feb 2019, Ian Jackson wrote:
> Stefano Stabellini writes ("Re: [Xen-devel] [PATCH v6 1/4] xen: introduce SYMBOL"):
> > On Thu, 7 Feb 2019, Ian Jackson wrote:
> > > FAOD, I think you should expect people to declare the linker symbols
> > > either as I suggested:
> > > 
> > >      extern const struct wombat _wombats_start[];
> > >      extern const struct abstract_symbol _wombats_end[];
> > > 
> > > (or along the lines of Jan's suggestion, but frankly I think that is
> > > going to be too hard to sort out now.)
> > 
> > Yes, they are already declared this way, I would prefer to avoid
> > changing the declaration as part of this series.
> 
> I'm not sure why you didn't CC me on your revised version(s) ?

I didn't know if you wanted to be involved in all the details, so erred
on the safe side and CC'ed you only on patch #0.


^ permalink raw reply	[flat|nested] 102+ messages in thread

end of thread, other threads:[~2019-02-13  0:09 UTC | newest]

Thread overview: 102+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-09 23:41 [PATCH v6 0/4] misc safety certification fixes Stefano Stabellini
2019-01-09 23:42 ` [PATCH v6 1/4] xen: introduce SYMBOL Stefano Stabellini
2019-01-10  2:40   ` Julien Grall
2019-01-10  8:24     ` Jan Beulich
2019-01-10 17:29       ` Stefano Stabellini
2019-01-10 18:46         ` Stewart Hildebrand
2019-01-10 19:03           ` Stefano Stabellini
2019-01-11 10:35           ` Jan Beulich
2019-01-11 17:01             ` Stefano Stabellini
2019-01-10 19:24         ` Julien Grall
2019-01-10 21:36           ` Stefano Stabellini
2019-01-10 23:31             ` Julien Grall
2019-01-11  2:14               ` Stefano Stabellini
2019-01-11  6:52                 ` Juergen Gross
2019-01-11 16:52                   ` Stefano Stabellini
2019-01-11 10:48                 ` Jan Beulich
2019-01-11 18:04                   ` Stefano Stabellini
2019-01-11 18:53                     ` Stewart Hildebrand
2019-01-11 20:35                       ` Julien Grall
2019-01-11 20:46                         ` Stewart Hildebrand
2019-01-11 21:37                           ` Stefano Stabellini
2019-01-14  3:45                             ` Stewart Hildebrand
2019-01-14 10:26                               ` Jan Beulich
2019-01-14 21:18                                 ` Stefano Stabellini
     [not found]                                   ` <1CACC1FB020000D800417A66@prv1-mh.provo.novell.com>
2019-01-15  8:21                                     ` Jan Beulich
2019-01-15 11:51                                       ` Julien Grall
     [not found]                                         ` <AB1DA25B020000B95C475325@prv1-mh.provo.novell.com>
2019-01-15 12:04                                           ` Jan Beulich
2019-01-15 12:23                                             ` Julien Grall
     [not found]                                               ` <BAE986750200003A5C475325@prv1-mh.provo.novell.com>
2019-01-15 12:44                                                 ` Jan Beulich
2019-01-15 20:03                                       ` Stewart Hildebrand
2019-01-16  6:01                                         ` Juergen Gross
2019-01-16 10:19                                         ` Jan Beulich
2019-01-17  0:37                                           ` Stefano Stabellini
     [not found]                                             ` <B4D3ABC30200003B88BF86FB@prv1-mh.provo.novell.com>
     [not found]                                               ` <529ED2F90200004D00417A66@prv1-mh.provo.novell.com>
2019-01-17 11:45                                                 ` Jan Beulich
2019-01-18  1:24                                                   ` Stefano Stabellini
     [not found]                                                     ` <76A2DEED0200005600417A66@prv1-mh.provo.novell.com>
2019-01-18  9:54                                                       ` Jan Beulich
2019-01-18 10:48                                                         ` Julien Grall
     [not found]                                                           ` <9F511FC70200005E5C475325@prv1-mh.provo.novell.com>
2019-01-18 11:09                                                             ` Jan Beulich
2019-01-18 15:22                                                               ` Julien Grall
     [not found]                                                                 ` <3A8206D8020000035C475325@prv1-mh.provo.novell.com>
2019-01-21  9:39                                                                   ` Jan Beulich
2019-01-21  9:34                                                             ` Jan Beulich
2019-01-21 10:22                                                               ` Julien Grall
     [not found]                                                                 ` <E16AB350020000435C475325@prv1-mh.provo.novell.com>
2019-01-21 10:31                                                                   ` Jan Beulich
2019-01-21 23:15                                                                     ` Stefano Stabellini
     [not found]                                                                       ` <5EA2B4FA0200008000417A66@prv1-mh.provo.novell.com>
2019-01-22  9:06                                                                         ` Jan Beulich
2019-01-18 23:05                                                         ` Stefano Stabellini
2019-01-21  5:24                                                           ` Stewart Hildebrand
     [not found]                                                           ` <5A96F2FD0200008D00417A66@prv1-mh.provo.novell.com>
2019-01-21  9:50                                                             ` Jan Beulich
2019-01-21 23:41                                                               ` Stefano Stabellini
2019-01-22  6:08                                                                 ` Juergen Gross
     [not found]                                                                 ` <42A2C4FA0200009000417A66@prv1-mh.provo.novell.com>
2019-01-22  9:16                                                                   ` Jan Beulich
2019-02-01 18:52                                                                     ` George Dunlap
2019-02-01 20:53                                                                       ` Stefano Stabellini
     [not found]                                                             ` <58377FAD0200004688BF86FB@prv1-mh.provo.novell.com>
2019-01-21 10:06                                                               ` Jan Beulich
2019-02-06 15:41                                                                 ` Ian Jackson
     [not found]                                           ` <C8F95655020000CAB8D7C7D4@prv1-mh.provo.novell.com>
     [not found]                                             ` <5867EFE6020000DB00417A66@prv1-mh.provo.novell.com>
     [not found]                                               ` <DACE7A5F020000B1B8D7C7D4@prv1-mh.provo.novell.com>
2019-02-07 14:51                                                 ` Jan Beulich
2019-01-15 23:36                                       ` Stefano Stabellini
2019-01-16  8:47                                         ` Juergen Gross
     [not found]                                         ` <2EA6D6FD0200001F00417A66@prv1-mh.provo.novell.com>
2019-01-16 10:25                                           ` Jan Beulich
2019-01-17  0:41                                             ` Stefano Stabellini
     [not found]                                               ` <4EA2F2F90200004D00417A66@prv1-mh.provo.novell.com>
2019-01-17 11:46                                                 ` Jan Beulich
     [not found]                                     ` <95DC675902000028AB59E961@prv1-mh.provo.novell.com>
2019-02-04  9:37                                       ` Jan Beulich
2019-02-04 19:08                                         ` Stefano Stabellini
2019-02-05  6:02                                           ` Juergen Gross
     [not found]                                           ` <2E9DDEFD0200007B00417A66@prv1-mh.provo.novell.com>
2019-02-05  7:53                                             ` Jan Beulich
2019-02-05 14:56                                         ` George Dunlap
     [not found]                                           ` <E730A9F90200001DAB59E961@prv1-mh.provo.novell.com>
2019-02-06 11:59                                             ` Jan Beulich
     [not found]                                 ` <7A8C0A4F020000EEB8D7C7D4@prv1-mh.provo.novell.com>
2019-02-06 16:21                                   ` Jan Beulich
2019-02-06 16:37                                     ` Ian Jackson
     [not found]                                       ` <08D440470200001BB8D7C7D4@prv1-mh.provo.novell.com>
2019-02-06 16:47                                         ` Jan Beulich
2019-02-06 16:52                                           ` Ian Jackson
2019-02-06 23:39                                             ` Stefano Stabellini
2019-02-07 11:48                                               ` Ian Jackson
2019-02-07 18:18                                                 ` Stefano Stabellini
2019-02-12 11:31                                                   ` Ian Jackson
2019-02-13  0:09                                                     ` Stefano Stabellini
2019-01-15 11:46                             ` Julien Grall
2019-01-15 12:23                               ` Julien Grall
2019-01-14 10:11                     ` Jan Beulich
2019-01-14 15:41                       ` Julien Grall
2019-01-14 15:52                         ` Jan Beulich
2019-01-14 16:26                           ` Stewart Hildebrand
2019-01-14 16:39                             ` Jan Beulich
2019-01-14 16:28                           ` Julien Grall
2019-01-14 16:44                             ` Jan Beulich
2019-01-14 17:24                               ` Julien Grall
2019-01-15  8:04                                 ` Jan Beulich
2019-01-10 17:22     ` Stefano Stabellini
2019-01-10  8:34   ` Jan Beulich
2019-01-10 18:09     ` Stefano Stabellini
2019-01-09 23:42 ` [PATCH v6 2/4] xen/arm: use SYMBOL when required Stefano Stabellini
2019-01-10  8:41   ` Jan Beulich
2019-01-10 17:44     ` Stefano Stabellini
2019-01-11 10:52       ` Jan Beulich
2019-01-11 16:58         ` Stefano Stabellini
2019-01-14  9:23           ` Jan Beulich
2019-01-09 23:42 ` [PATCH v6 3/4] xen/x86: " Stefano Stabellini
2019-01-10  8:43   ` Jan Beulich
2019-01-10 17:45     ` Stefano Stabellini
2019-01-09 23:42 ` [PATCH v6 4/4] xen/common: " Stefano Stabellini
2019-01-10  8:49   ` Jan Beulich
2019-01-10 17:48     ` Stefano Stabellini
