xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/11] assorted replacement of x[mz]alloc_bytes()
@ 2021-04-08 12:13 Jan Beulich
  2021-04-08 12:16 ` [PATCH 01/11] x86/HVM: avoid effectively open-coding xzalloc_flex_struct() Jan Beulich
                   ` (11 more replies)
  0 siblings, 12 replies; 25+ messages in thread
From: Jan Beulich @ 2021-04-08 12:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu, Roger Pau Monné

In the long run I think we want to do away with these type-unsafe
interfaces, the more that they also request (typically) excess
alignment. This series of entirely independent patches is
eliminating the instances where it's relatively clear that they're
not just "blob" allocations.

01: x86/HVM: avoid effectively open-coding xzalloc_flex_struct()
02: x86/vPMU: avoid effectively open-coding xzalloc_flex_struct()
03: x86/MCE: avoid effectively open-coding xmalloc_array()
04: x86/HVM: avoid effectively open-coding xmalloc_array()
05: x86/oprofile: avoid effectively open-coding xmalloc_array()
06: x86/IRQ: avoid over-alignment in alloc_pirq_struct()
07: EFI/runtime: avoid effectively open-coding xmalloc_array()
08: hypfs: avoid effectively open-coding xzalloc_array()
09: kexec: avoid effectively open-coding xzalloc_flex_struct()
10: video/lfb: avoid effectively open-coding xzalloc_array()
11: Arm/optee: don't open-code xzalloc_flex_struct()

Jan


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 01/11] x86/HVM: avoid effectively open-coding xzalloc_flex_struct()
  2021-04-08 12:13 [PATCH 00/11] assorted replacement of x[mz]alloc_bytes() Jan Beulich
@ 2021-04-08 12:16 ` Jan Beulich
  2021-04-08 12:20   ` Andrew Cooper
  2021-04-08 12:17 ` [PATCH 02/11] x86/vPMU: " Jan Beulich
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 25+ messages in thread
From: Jan Beulich @ 2021-04-08 12:16 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, George Dunlap, Wei Liu, Roger Pau Monné

Drop hvm_irq_size(), which exists for just this purpose.

There is a difference in generated code: xzalloc_bytes() forces
SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
actually don't want it.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -642,7 +642,8 @@ int hvm_domain_initialise(struct domain
     d->arch.hvm.params = xzalloc_array(uint64_t, HVM_NR_PARAMS);
     d->arch.hvm.io_handler = xzalloc_array(struct hvm_io_handler,
                                            NR_IO_HANDLERS);
-    d->arch.hvm.irq = xzalloc_bytes(hvm_irq_size(nr_gsis));
+    d->arch.hvm.irq = xzalloc_flex_struct(struct hvm_irq,
+                                          gsi_assert_count, nr_gsis);
 
     rc = -ENOMEM;
     if ( !d->arch.hvm.pl_time || !d->arch.hvm.irq ||
--- a/xen/include/asm-x86/hvm/irq.h
+++ b/xen/include/asm-x86/hvm/irq.h
@@ -98,7 +98,6 @@ struct hvm_irq {
 #define hvm_pci_intx_link(dev, intx) \
     (((dev) + (intx)) & 3)
 #define hvm_domain_irq(d) ((d)->arch.hvm.irq)
-#define hvm_irq_size(cnt) offsetof(struct hvm_irq, gsi_assert_count[cnt])
 
 #define hvm_isa_irq_to_gsi(isa_irq) ((isa_irq) ? : 2)
 



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 02/11] x86/vPMU: avoid effectively open-coding xzalloc_flex_struct()
  2021-04-08 12:13 [PATCH 00/11] assorted replacement of x[mz]alloc_bytes() Jan Beulich
  2021-04-08 12:16 ` [PATCH 01/11] x86/HVM: avoid effectively open-coding xzalloc_flex_struct() Jan Beulich
@ 2021-04-08 12:17 ` Jan Beulich
  2021-04-08 12:25   ` Andrew Cooper
  2021-04-08 12:17 ` [PATCH 03/11] x86/MCE: avoid effectively open-coding xmalloc_array() Jan Beulich
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 25+ messages in thread
From: Jan Beulich @ 2021-04-08 12:17 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, George Dunlap, Wei Liu, Roger Pau Monné,
	Boris Ostrovsky

There is a difference in generated code: xzalloc_bytes() forces
SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
actually don't want it.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/cpu/vpmu_intel.c
+++ b/xen/arch/x86/cpu/vpmu_intel.c
@@ -461,10 +461,10 @@ static int core2_vpmu_alloc_resource(str
             goto out_err;
     }
 
-    core2_vpmu_cxt = xzalloc_bytes(sizeof(*core2_vpmu_cxt) +
-                                   sizeof(uint64_t) * fixed_pmc_cnt +
-                                   sizeof(struct xen_pmu_cntr_pair) *
-                                   arch_pmc_cnt);
+    core2_vpmu_cxt = xzalloc_flex_struct(struct xen_pmu_intel_ctxt, regs,
+                                         fixed_pmc_cnt + arch_pmc_cnt *
+                                         (sizeof(struct xen_pmu_cntr_pair) /
+                                          sizeof(*core2_vpmu_cxt->regs)));
     p = xzalloc(uint64_t);
     if ( !core2_vpmu_cxt || !p )
         goto out_err;



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 03/11] x86/MCE: avoid effectively open-coding xmalloc_array()
  2021-04-08 12:13 [PATCH 00/11] assorted replacement of x[mz]alloc_bytes() Jan Beulich
  2021-04-08 12:16 ` [PATCH 01/11] x86/HVM: avoid effectively open-coding xzalloc_flex_struct() Jan Beulich
  2021-04-08 12:17 ` [PATCH 02/11] x86/vPMU: " Jan Beulich
@ 2021-04-08 12:17 ` Jan Beulich
  2021-04-08 12:18 ` [PATCH 04/11] x86/HVM: " Jan Beulich
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Jan Beulich @ 2021-04-08 12:17 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, George Dunlap, Wei Liu, Roger Pau Monné

There is a difference in generated code: xmalloc_bytes() forces
SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
actually don't want it.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/cpu/mcheck/mctelem.c
+++ b/xen/arch/x86/cpu/mcheck/mctelem.c
@@ -345,7 +345,7 @@ void __init mctelem_init(unsigned int da
 
 	if ((mctctl.mctc_elems = xmalloc_array(struct mctelem_ent,
 	    MC_NENT)) == NULL ||
-	    (datarr = xmalloc_bytes(MC_NENT * datasz)) == NULL) {
+	    (datarr = xmalloc_array(char, MC_NENT * datasz)) == NULL) {
 		xfree(mctctl.mctc_elems);
 		printk("Allocations for MCA telemetry failed\n");
 		return;



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 04/11] x86/HVM: avoid effectively open-coding xmalloc_array()
  2021-04-08 12:13 [PATCH 00/11] assorted replacement of x[mz]alloc_bytes() Jan Beulich
                   ` (2 preceding siblings ...)
  2021-04-08 12:17 ` [PATCH 03/11] x86/MCE: avoid effectively open-coding xmalloc_array() Jan Beulich
@ 2021-04-08 12:18 ` Jan Beulich
  2021-04-08 12:19 ` [PATCH 05/11] x86/oprofile: " Jan Beulich
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Jan Beulich @ 2021-04-08 12:18 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, George Dunlap, Wei Liu, Roger Pau Monné,
	Paul Durrant

There is a difference in generated code: xmalloc_bytes() forces
SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
actually don't want it.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1924,7 +1924,7 @@ static int hvmemul_rep_movs(
         dgpa -= bytes - bytes_per_rep;
 
     /* Allocate temporary buffer. Fall back to slow emulation if this fails. */
-    buf = xmalloc_bytes(bytes);
+    buf = xmalloc_array(char, bytes);
     if ( buf == NULL )
         return X86EMUL_UNHANDLEABLE;
 
@@ -2037,7 +2037,7 @@ static int hvmemul_rep_stos(
         for ( ; ; )
         {
             bytes = *reps * bytes_per_rep;
-            buf = xmalloc_bytes(bytes);
+            buf = xmalloc_array(char, bytes);
             if ( buf || *reps <= 1 )
                 break;
             *reps >>= 1;



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 05/11] x86/oprofile: avoid effectively open-coding xmalloc_array()
  2021-04-08 12:13 [PATCH 00/11] assorted replacement of x[mz]alloc_bytes() Jan Beulich
                   ` (3 preceding siblings ...)
  2021-04-08 12:18 ` [PATCH 04/11] x86/HVM: " Jan Beulich
@ 2021-04-08 12:19 ` Jan Beulich
  2021-04-08 12:20 ` [PATCH 06/11] x86/IRQ: avoid over-alignment in alloc_pirq_struct() Jan Beulich
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Jan Beulich @ 2021-04-08 12:19 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, George Dunlap, Wei Liu, Roger Pau Monné

There is a difference in generated code: xmalloc_bytes() forces
SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
actually don't want it.

While at it also use XFREE() instead of open-coding it and change loop
induction variable types.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/oprofile/nmi_int.c
+++ b/xen/arch/x86/oprofile/nmi_int.c
@@ -142,30 +142,29 @@ static void nmi_save_registers(void * du
 
 static void free_msrs(void)
 {
-	int i;
+	unsigned int i;
+
 	for (i = 0; i < nr_cpu_ids; ++i) {
-		xfree(cpu_msrs[i].counters);
-		cpu_msrs[i].counters = NULL;
-		xfree(cpu_msrs[i].controls);
-		cpu_msrs[i].controls = NULL;
+		XFREE(cpu_msrs[i].counters);
+		XFREE(cpu_msrs[i].controls);
 	}
 }
 
 
 static int allocate_msrs(void)
 {
+	unsigned int i;
 	int success = 1;
-	size_t controls_size = sizeof(struct op_msr) * model->num_controls;
-	size_t counters_size = sizeof(struct op_msr) * model->num_counters;
 
-	int i;
 	for_each_online_cpu (i) {
-		cpu_msrs[i].counters = xmalloc_bytes(counters_size);
+		cpu_msrs[i].counters = xmalloc_array(struct op_msr,
+						     model->num_counters);
 		if (!cpu_msrs[i].counters) {
 			success = 0;
 			break;
 		}
-		cpu_msrs[i].controls = xmalloc_bytes(controls_size);
+		cpu_msrs[i].controls = xmalloc_array(struct op_msr,
+						     model->num_controls);
 		if (!cpu_msrs[i].controls) {
 			success = 0;
 			break;



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 01/11] x86/HVM: avoid effectively open-coding xzalloc_flex_struct()
  2021-04-08 12:16 ` [PATCH 01/11] x86/HVM: avoid effectively open-coding xzalloc_flex_struct() Jan Beulich
@ 2021-04-08 12:20   ` Andrew Cooper
  0 siblings, 0 replies; 25+ messages in thread
From: Andrew Cooper @ 2021-04-08 12:20 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: George Dunlap, Wei Liu, Roger Pau Monné

On 08/04/2021 13:16, Jan Beulich wrote:
> Drop hvm_irq_size(), which exists for just this purpose.
>
> There is a difference in generated code: xzalloc_bytes() forces
> SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
> actually don't want it.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 06/11] x86/IRQ: avoid over-alignment in alloc_pirq_struct()
  2021-04-08 12:13 [PATCH 00/11] assorted replacement of x[mz]alloc_bytes() Jan Beulich
                   ` (4 preceding siblings ...)
  2021-04-08 12:19 ` [PATCH 05/11] x86/oprofile: " Jan Beulich
@ 2021-04-08 12:20 ` Jan Beulich
  2021-04-08 12:20 ` [PATCH 07/11] EFI/runtime: avoid effectively open-coding xmalloc_array() Jan Beulich
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Jan Beulich @ 2021-04-08 12:20 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, George Dunlap, Wei Liu, Roger Pau Monné

In particular in the PV case xzalloc_bytes() forcing SMP_CACHE_BYTES
alignment is counterproductive, as the allocation size there is only 40
bytes.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1313,9 +1313,12 @@ void cleanup_domain_irq_mapping(struct d
 
 struct pirq *alloc_pirq_struct(struct domain *d)
 {
-    size_t sz = is_hvm_domain(d) ? sizeof(struct pirq) :
-                                   offsetof(struct pirq, arch.hvm);
-    struct pirq *pirq = xzalloc_bytes(sz);
+    union pirq_pv {
+        char space[offsetof(struct pirq, arch.hvm)];
+        void *align;
+    };
+    struct pirq *pirq = is_hvm_domain(d) ? xzalloc(struct pirq)
+                                         : (void *)xzalloc(union pirq_pv);
 
     if ( pirq )
     {



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 07/11] EFI/runtime: avoid effectively open-coding xmalloc_array()
  2021-04-08 12:13 [PATCH 00/11] assorted replacement of x[mz]alloc_bytes() Jan Beulich
                   ` (5 preceding siblings ...)
  2021-04-08 12:20 ` [PATCH 06/11] x86/IRQ: avoid over-alignment in alloc_pirq_struct() Jan Beulich
@ 2021-04-08 12:20 ` Jan Beulich
  2021-04-08 12:21 ` [PATCH 08/11] hypfs: avoid effectively open-coding xzalloc_array() Jan Beulich
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Jan Beulich @ 2021-04-08 12:20 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu, Roger Pau Monné

There is a difference in generated code: xmalloc_bytes() forces
SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
actually don't want it.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/common/efi/runtime.c
+++ b/xen/common/efi/runtime.c
@@ -486,7 +486,7 @@ int efi_runtime_call(struct xenpf_efi_ru
         size = op->u.get_variable.size;
         if ( size )
         {
-            data = xmalloc_bytes(size);
+            data = xmalloc_array(unsigned char, size);
             if ( !data )
             {
                 xfree(name);
@@ -536,7 +536,7 @@ int efi_runtime_call(struct xenpf_efi_ru
             return -EIO;
         }
 
-        data = xmalloc_bytes(op->u.set_variable.size);
+        data = xmalloc_array(unsigned char, op->u.set_variable.size);
         if ( !data )
             rc = -ENOMEM;
         else if ( copy_from_guest(data, op->u.set_variable.data,
@@ -571,7 +571,7 @@ int efi_runtime_call(struct xenpf_efi_ru
             return -EINVAL;
 
         size = op->u.get_next_variable_name.size;
-        name.raw = xzalloc_bytes(size);
+        name.raw = xzalloc_array(unsigned char, size);
         if ( !name.raw )
             return -ENOMEM;
         if ( copy_from_guest(name.raw, op->u.get_next_variable_name.name,



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 08/11] hypfs: avoid effectively open-coding xzalloc_array()
  2021-04-08 12:13 [PATCH 00/11] assorted replacement of x[mz]alloc_bytes() Jan Beulich
                   ` (6 preceding siblings ...)
  2021-04-08 12:20 ` [PATCH 07/11] EFI/runtime: avoid effectively open-coding xmalloc_array() Jan Beulich
@ 2021-04-08 12:21 ` Jan Beulich
  2021-04-08 14:28   ` Juergen Gross
  2021-04-08 12:21 ` [PATCH 09/11] kexec: avoid effectively open-coding xzalloc_flex_struct() Jan Beulich
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 25+ messages in thread
From: Jan Beulich @ 2021-04-08 12:21 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu, Roger Pau Monné,
	Juergen Gross

There is a difference in generated code: xzalloc_bytes() forces
SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
actually don't want it.

To avoid the need to add a cast, do away with the only forward-declared
struct hypfs_dyndata.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/common/hypfs.c
+++ b/xen/common/hypfs.c
@@ -72,7 +72,7 @@ enum hypfs_lock_state {
     hypfs_write_locked
 };
 static DEFINE_PER_CPU(enum hypfs_lock_state, hypfs_locked);
-static DEFINE_PER_CPU(struct hypfs_dyndata *, hypfs_dyndata);
+static DEFINE_PER_CPU(void *, hypfs_dyndata);
 
 static DEFINE_PER_CPU(const struct hypfs_entry *, hypfs_last_node_entered);
 
@@ -160,19 +160,19 @@ static void node_exit_all(void)
 void *hypfs_alloc_dyndata(unsigned long size)
 {
     unsigned int cpu = smp_processor_id();
-    struct hypfs_dyndata **dyndata = &per_cpu(hypfs_dyndata, cpu);
+    void **dyndata = &per_cpu(hypfs_dyndata, cpu);
 
     ASSERT(per_cpu(hypfs_locked, cpu) != hypfs_unlocked);
     ASSERT(*dyndata == NULL);
 
-    *dyndata = xzalloc_bytes(size);
+    *dyndata = xzalloc_array(unsigned char, size);
 
     return *dyndata;
 }
 
 void *hypfs_get_dyndata(void)
 {
-    struct hypfs_dyndata *dyndata = this_cpu(hypfs_dyndata);
+    void *dyndata = this_cpu(hypfs_dyndata);
 
     ASSERT(dyndata);
 
@@ -181,7 +181,7 @@ void *hypfs_get_dyndata(void)
 
 void hypfs_free_dyndata(void)
 {
-    struct hypfs_dyndata **dyndata = &this_cpu(hypfs_dyndata);
+    void **dyndata = &this_cpu(hypfs_dyndata);
 
     XFREE(*dyndata);
 }



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 09/11] kexec: avoid effectively open-coding xzalloc_flex_struct()
  2021-04-08 12:13 [PATCH 00/11] assorted replacement of x[mz]alloc_bytes() Jan Beulich
                   ` (7 preceding siblings ...)
  2021-04-08 12:21 ` [PATCH 08/11] hypfs: avoid effectively open-coding xzalloc_array() Jan Beulich
@ 2021-04-08 12:21 ` Jan Beulich
  2021-04-09 12:54   ` Andrew Cooper
  2021-04-08 12:22 ` [PATCH 10/11] video/lfb: avoid effectively open-coding xzalloc_array() Jan Beulich
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 25+ messages in thread
From: Jan Beulich @ 2021-04-08 12:21 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, George Dunlap

There is a difference in generated code: xzalloc_bytes() forces
SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
actually don't want it.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/common/kexec.c
+++ b/xen/common/kexec.c
@@ -463,7 +463,10 @@ static void * alloc_from_crash_heap(cons
 /* Allocate a crash note buffer for a newly onlined cpu. */
 static int kexec_init_cpu_notes(const unsigned long cpu)
 {
-    Elf_Note * note = NULL;
+    struct elf_notes {
+        Elf_Note first;
+        unsigned char more[];
+    } *notes = NULL;
     int ret = 0;
     int nr_bytes = 0;
 
@@ -477,7 +480,8 @@ static int kexec_init_cpu_notes(const un
 
     /* If we dont care about the position of allocation, malloc. */
     if ( low_crashinfo_mode == LOW_CRASHINFO_NONE )
-        note = xzalloc_bytes(nr_bytes);
+        notes = xzalloc_flex_struct(struct elf_notes, more,
+                                    nr_bytes - sizeof(notes->first));
 
     /* Protect the write into crash_notes[] with a spinlock, as this function
      * is on a hotplug path and a hypercall path. */
@@ -490,26 +494,28 @@ static int kexec_init_cpu_notes(const un
         spin_unlock(&crash_notes_lock);
         /* Always return ok, because whether we successfully allocated or not,
          * another CPU has successfully allocated. */
-        xfree(note);
+        xfree(notes);
     }
     else
     {
         /* If we care about memory possition, alloc from the crash heap,
          * also protected by the crash_notes_lock. */
         if ( low_crashinfo_mode > LOW_CRASHINFO_NONE )
-            note = alloc_from_crash_heap(nr_bytes);
+            notes = alloc_from_crash_heap(nr_bytes);
 
-        crash_notes[cpu].start = note;
+        crash_notes[cpu].start = &notes->first;
         crash_notes[cpu].size = nr_bytes;
         spin_unlock(&crash_notes_lock);
 
         /* If the allocation failed, and another CPU did not beat us, give
          * up with ENOMEM. */
-        if ( ! note )
+        if ( ! notes )
             ret = -ENOMEM;
         /* else all is good so lets set up the notes. */
         else
         {
+            Elf_Note *note = &notes->first;
+
             /* Set up CORE note. */
             setup_note(note, "CORE", NT_PRSTATUS, sizeof(ELF_Prstatus));
             note = ELFNOTE_NEXT(note);



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 10/11] video/lfb: avoid effectively open-coding xzalloc_array()
  2021-04-08 12:13 [PATCH 00/11] assorted replacement of x[mz]alloc_bytes() Jan Beulich
                   ` (8 preceding siblings ...)
  2021-04-08 12:21 ` [PATCH 09/11] kexec: avoid effectively open-coding xzalloc_flex_struct() Jan Beulich
@ 2021-04-08 12:22 ` Jan Beulich
  2021-04-08 12:23 ` [PATCH 11/11] Arm/optee: don't open-code xzalloc_flex_struct() Jan Beulich
  2021-04-08 12:57 ` [PATCH 00/11] assorted replacement of x[mz]alloc_bytes() Andrew Cooper
  11 siblings, 0 replies; 25+ messages in thread
From: Jan Beulich @ 2021-04-08 12:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu, Roger Pau Monné

There is a difference in generated code: xzalloc_bytes() forces
SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
actually don't want it.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/drivers/video/lfb.c
+++ b/xen/drivers/video/lfb.c
@@ -147,8 +147,9 @@ int __init lfb_init(struct lfb_prop *lfb
 {
     lfb.lfbp = *lfbp;
 
-    lfb.lbuf = xmalloc_bytes(lfb.lfbp.bytes_per_line);
-    lfb.text_buf = xzalloc_bytes(lfb.lfbp.text_columns * lfb.lfbp.text_rows);
+    lfb.lbuf = xmalloc_array(unsigned char, lfb.lfbp.bytes_per_line);
+    lfb.text_buf = xzalloc_array(unsigned char,
+                                 lfb.lfbp.text_columns * lfb.lfbp.text_rows);
     lfb.line_len = xzalloc_array(unsigned int, lfb.lfbp.text_columns);
 
     if ( !lfb.lbuf || !lfb.text_buf || !lfb.line_len )



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 11/11] Arm/optee: don't open-code xzalloc_flex_struct()
  2021-04-08 12:13 [PATCH 00/11] assorted replacement of x[mz]alloc_bytes() Jan Beulich
                   ` (9 preceding siblings ...)
  2021-04-08 12:22 ` [PATCH 10/11] video/lfb: avoid effectively open-coding xzalloc_array() Jan Beulich
@ 2021-04-08 12:23 ` Jan Beulich
  2021-04-13 18:19   ` Julien Grall
  2021-04-08 12:57 ` [PATCH 00/11] assorted replacement of x[mz]alloc_bytes() Andrew Cooper
  11 siblings, 1 reply; 25+ messages in thread
From: Jan Beulich @ 2021-04-08 12:23 UTC (permalink / raw)
  To: xen-devel
  Cc: George Dunlap, Julien Grall, Stefano Stabellini, Volodymyr Babchuk

There is a difference in generated code: xzalloc_bytes() forces
SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
actually don't want it.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/arm/tee/optee.c
+++ b/xen/arch/arm/tee/optee.c
@@ -529,8 +529,7 @@ static struct optee_shm_buf *allocate_op
     while ( unlikely(old != atomic_cmpxchg(&ctx->optee_shm_buf_pages,
                                            old, new)) );
 
-    optee_shm_buf = xzalloc_bytes(sizeof(struct optee_shm_buf) +
-                                  pages_cnt * sizeof(struct page *));
+    optee_shm_buf = xzalloc_flex_struct(struct optee_shm_buf, pages, pages_cnt);
     if ( !optee_shm_buf )
     {
         err_code = -ENOMEM;



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 02/11] x86/vPMU: avoid effectively open-coding xzalloc_flex_struct()
  2021-04-08 12:17 ` [PATCH 02/11] x86/vPMU: " Jan Beulich
@ 2021-04-08 12:25   ` Andrew Cooper
  2021-04-08 12:29     ` Jan Beulich
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew Cooper @ 2021-04-08 12:25 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: George Dunlap, Wei Liu, Roger Pau Monné, Boris Ostrovsky

On 08/04/2021 13:17, Jan Beulich wrote:
> There is a difference in generated code: xzalloc_bytes() forces
> SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
> actually don't want it.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

>
> --- a/xen/arch/x86/cpu/vpmu_intel.c
> +++ b/xen/arch/x86/cpu/vpmu_intel.c
> @@ -461,10 +461,10 @@ static int core2_vpmu_alloc_resource(str
>              goto out_err;
>      }
>  
> -    core2_vpmu_cxt = xzalloc_bytes(sizeof(*core2_vpmu_cxt) +
> -                                   sizeof(uint64_t) * fixed_pmc_cnt +
> -                                   sizeof(struct xen_pmu_cntr_pair) *
> -                                   arch_pmc_cnt);
> +    core2_vpmu_cxt = xzalloc_flex_struct(struct xen_pmu_intel_ctxt, regs,
> +                                         fixed_pmc_cnt + arch_pmc_cnt *
> +                                         (sizeof(struct xen_pmu_cntr_pair) /
> +                                          sizeof(*core2_vpmu_cxt->regs)));
>      p = xzalloc(uint64_t);

However, this is very wtf, and clearly wants reworking.  I'll see if I
can find some time, unless anyone else beats me to it.

~Andrew

>      if ( !core2_vpmu_cxt || !p )
>          goto out_err;
>



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 02/11] x86/vPMU: avoid effectively open-coding xzalloc_flex_struct()
  2021-04-08 12:25   ` Andrew Cooper
@ 2021-04-08 12:29     ` Jan Beulich
  0 siblings, 0 replies; 25+ messages in thread
From: Jan Beulich @ 2021-04-08 12:29 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: George Dunlap, Wei Liu, Roger Pau Monné, Boris Ostrovsky, xen-devel

On 08.04.2021 14:25, Andrew Cooper wrote:
> On 08/04/2021 13:17, Jan Beulich wrote:
>> There is a difference in generated code: xzalloc_bytes() forces
>> SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
>> actually don't want it.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

Thanks.

>> --- a/xen/arch/x86/cpu/vpmu_intel.c
>> +++ b/xen/arch/x86/cpu/vpmu_intel.c
>> @@ -461,10 +461,10 @@ static int core2_vpmu_alloc_resource(str
>>              goto out_err;
>>      }
>>  
>> -    core2_vpmu_cxt = xzalloc_bytes(sizeof(*core2_vpmu_cxt) +
>> -                                   sizeof(uint64_t) * fixed_pmc_cnt +
>> -                                   sizeof(struct xen_pmu_cntr_pair) *
>> -                                   arch_pmc_cnt);
>> +    core2_vpmu_cxt = xzalloc_flex_struct(struct xen_pmu_intel_ctxt, regs,
>> +                                         fixed_pmc_cnt + arch_pmc_cnt *
>> +                                         (sizeof(struct xen_pmu_cntr_pair) /
>> +                                          sizeof(*core2_vpmu_cxt->regs)));
>>      p = xzalloc(uint64_t);
> 
> However, this is very wtf, and clearly wants reworking.  I'll see if I
> can find some time, unless anyone else beats me to it.

I thought so too while seeing this, but it's quite a bit more of a
rework than I felt I'd like to tackle right away.

Jan


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 00/11] assorted replacement of x[mz]alloc_bytes()
  2021-04-08 12:13 [PATCH 00/11] assorted replacement of x[mz]alloc_bytes() Jan Beulich
                   ` (10 preceding siblings ...)
  2021-04-08 12:23 ` [PATCH 11/11] Arm/optee: don't open-code xzalloc_flex_struct() Jan Beulich
@ 2021-04-08 12:57 ` Andrew Cooper
  2021-04-08 14:12   ` Jan Beulich
  11 siblings, 1 reply; 25+ messages in thread
From: Andrew Cooper @ 2021-04-08 12:57 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: George Dunlap, Ian Jackson, Julien Grall, Stefano Stabellini,
	Wei Liu, Roger Pau Monné

On 08/04/2021 13:13, Jan Beulich wrote:
> In the long run I think we want to do away with these type-unsafe
> interfaces, the more that they also request (typically) excess
> alignment. This series of entirely independent patches is
> eliminating the instances where it's relatively clear that they're
> not just "blob" allocations.
>
>
> 03: x86/MCE: avoid effectively open-coding xmalloc_array()
> 04: x86/HVM: avoid effectively open-coding xmalloc_array()
> 05: x86/oprofile: avoid effectively open-coding xmalloc_array()
> 06: x86/IRQ: avoid over-alignment in alloc_pirq_struct()
> 07: EFI/runtime: avoid effectively open-coding xmalloc_array()
> 08: hypfs: avoid effectively open-coding xzalloc_array()
> 10: video/lfb: avoid effectively open-coding xzalloc_array()

The flex conversions are fine, but I am unconvinced by argument for
interchanging array() and bytes().

The cacheline size is 64 bytes, and the minimum allocation size is 16,
plus a bhdr overhead of 32 bytes, so you're already at most of a
cacheline for a nominally-zero sized allocation.

There can however be a severe penalty from cacheline sharing, which is
why the bytes() form does have a minimum alignment.  There is one
xmalloc heap shared across the entire system, so you've got no idea what
might be sharing the same cache line for sub-line allocations.

We should not support sub-line allocations IMO.  The savings is a
handful of bytes at best, and some horrible performance cliffs to
avoid.  People running virtualisation are not going to be ram
constrained to the order of a few bytes.

For small allocations which don't require specific alignment, then
putting bhdr and the allocation in the same line is fine (if we don't do
this already), but we shouldn't be in the position of having two bhdr's
in the same cache line, even if there are plenty of single-byte
allocations in the theoretical worst case.

~Andrew



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 00/11] assorted replacement of x[mz]alloc_bytes()
  2021-04-08 12:57 ` [PATCH 00/11] assorted replacement of x[mz]alloc_bytes() Andrew Cooper
@ 2021-04-08 14:12   ` Jan Beulich
  0 siblings, 0 replies; 25+ messages in thread
From: Jan Beulich @ 2021-04-08 14:12 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: George Dunlap, Ian Jackson, Julien Grall, Stefano Stabellini,
	Wei Liu, Roger Pau Monné,
	xen-devel

On 08.04.2021 14:57, Andrew Cooper wrote:
> On 08/04/2021 13:13, Jan Beulich wrote:
>> In the long run I think we want to do away with these type-unsafe
>> interfaces, the more that they also request (typically) excess
>> alignment. This series of entirely independent patches is
>> eliminating the instances where it's relatively clear that they're
>> not just "blob" allocations.
>>
>>
>> 03: x86/MCE: avoid effectively open-coding xmalloc_array()
>> 04: x86/HVM: avoid effectively open-coding xmalloc_array()
>> 05: x86/oprofile: avoid effectively open-coding xmalloc_array()
>> 06: x86/IRQ: avoid over-alignment in alloc_pirq_struct()
>> 07: EFI/runtime: avoid effectively open-coding xmalloc_array()
>> 08: hypfs: avoid effectively open-coding xzalloc_array()
>> 10: video/lfb: avoid effectively open-coding xzalloc_array()
> 
> The flex conversions are fine, but I am unconvinced by argument for
> interchanging array() and bytes().
> 
> The cacheline size is 64 bytes, and the minimum allocation size is 16,
> plus a bhdr overhead of 32 bytes, so you're already at most of a
> cacheline for a nominally-zero sized allocation.

But you're aware that the alignment x[mz]alloc_bytes() forces is
128 bytes? Plus, while sizeof(struct bhdr) is indeed 32, the
overhead on allocated blocks is

#define BHDR_OVERHEAD   (sizeof(struct bhdr) - MIN_BLOCK_SIZE)

i.e. 16 (i.e. the other half of the 32 is already the minimum
block size of 16 that you also mention). IOW a cacheline sized
block would yield 48 bytes of usable space. Specifically a
meaningful change in the PV case from what patch 06 does, where
we only need 40 bytes.

> There can however be a severe penalty from cacheline sharing, which is
> why the bytes() form does have a minimum alignment.  There is one
> xmalloc heap shared across the entire system, so you've got no idea what
> might be sharing the same cache line for sub-line allocations.

This would call for distinguishing short-lived allocations (and
ones to be used mainly from a single CPU) from long-lived ones
having system wide use. I.e. a request to gain further allocation
function flavors, when already the introduction of the one new
xv[mz]alloc() has caused long-winded discussions with (so far) no
real outcome.

> We should not support sub-line allocations IMO.  The savings is a
> handful of bytes at best, and some horrible performance cliffs to
> avoid.  People running virtualisation are not going to be ram
> constrained to the order of a few bytes.
> 
> For small allocations which don't require specific alignment, then
> putting bhdr and the allocation in the same line is fine (if we don't do
> this already), but we shouldn't be in the position of having two bhdr's
> in the same cache line, even if there are plenty of single-byte
> allocations in the theoretical worst case.

That's a request to tweak allocator internals then, not an argument
against the conversions done here.

Jan


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 08/11] hypfs: avoid effectively open-coding xzalloc_array()
  2021-04-08 12:21 ` [PATCH 08/11] hypfs: avoid effectively open-coding xzalloc_array() Jan Beulich
@ 2021-04-08 14:28   ` Juergen Gross
  0 siblings, 0 replies; 25+ messages in thread
From: Juergen Gross @ 2021-04-08 14:28 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu, Roger Pau Monné


[-- Attachment #1.1.1: Type: text/plain, Size: 428 bytes --]

On 08.04.21 14:21, Jan Beulich wrote:
> There is a difference in generated code: xzalloc_bytes() forces
> SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
> actually don't want it.
> 
> To avoid the need to add a cast, do away with the only forward-declared
> struct hypfs_dyndata.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Juergen Gross <jgross@suse.com>


Juergen

[-- Attachment #1.1.2: OpenPGP_0xB0DE9DD628BF132F.asc --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 09/11] kexec: avoid effectively open-coding xzalloc_flex_struct()
  2021-04-08 12:21 ` [PATCH 09/11] kexec: avoid effectively open-coding xzalloc_flex_struct() Jan Beulich
@ 2021-04-09 12:54   ` Andrew Cooper
  2021-04-09 13:23     ` Jan Beulich
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew Cooper @ 2021-04-09 12:54 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: George Dunlap

On 08/04/2021 13:21, Jan Beulich wrote:
> There is a difference in generated code: xzalloc_bytes() forces
> SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
> actually don't want it.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

This honestly looks like a backwards step.  In particular, ...

>
> --- a/xen/common/kexec.c
> +++ b/xen/common/kexec.c
> @@ -463,7 +463,10 @@ static void * alloc_from_crash_heap(cons
>  /* Allocate a crash note buffer for a newly onlined cpu. */
>  static int kexec_init_cpu_notes(const unsigned long cpu)
>  {
> -    Elf_Note * note = NULL;
> +    struct elf_notes {
> +        Elf_Note first;
> +        unsigned char more[];
> +    } *notes = NULL;
>      int ret = 0;
>      int nr_bytes = 0;
>  
> @@ -477,7 +480,8 @@ static int kexec_init_cpu_notes(const un
>  
>      /* If we dont care about the position of allocation, malloc. */
>      if ( low_crashinfo_mode == LOW_CRASHINFO_NONE )
> -        note = xzalloc_bytes(nr_bytes);
> +        notes = xzalloc_flex_struct(struct elf_notes, more,
> +                                    nr_bytes - sizeof(notes->first));

... this is far more error prone than the code you replaced, seeing as
there is now a chance that it can underflow.

~Andrew



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 09/11] kexec: avoid effectively open-coding xzalloc_flex_struct()
  2021-04-09 12:54   ` Andrew Cooper
@ 2021-04-09 13:23     ` Jan Beulich
  0 siblings, 0 replies; 25+ messages in thread
From: Jan Beulich @ 2021-04-09 13:23 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: George Dunlap, xen-devel

On 09.04.2021 14:54, Andrew Cooper wrote:
> On 08/04/2021 13:21, Jan Beulich wrote:
>> There is a difference in generated code: xzalloc_bytes() forces
>> SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
>> actually don't want it.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> This honestly looks like a backwards step.  In particular, ...
> 
>>
>> --- a/xen/common/kexec.c
>> +++ b/xen/common/kexec.c
>> @@ -463,7 +463,10 @@ static void * alloc_from_crash_heap(cons
>>  /* Allocate a crash note buffer for a newly onlined cpu. */
>>  static int kexec_init_cpu_notes(const unsigned long cpu)
>>  {
>> -    Elf_Note * note = NULL;
>> +    struct elf_notes {
>> +        Elf_Note first;
>> +        unsigned char more[];
>> +    } *notes = NULL;
>>      int ret = 0;
>>      int nr_bytes = 0;
>>  
>> @@ -477,7 +480,8 @@ static int kexec_init_cpu_notes(const un
>>  
>>      /* If we dont care about the position of allocation, malloc. */
>>      if ( low_crashinfo_mode == LOW_CRASHINFO_NONE )
>> -        note = xzalloc_bytes(nr_bytes);
>> +        notes = xzalloc_flex_struct(struct elf_notes, more,
>> +                                    nr_bytes - sizeof(notes->first));
> 
> ... this is far more error prone than the code you replaced, seeing as
> there is now a chance that it can underflow.

You're kidding, aren't you? sizeof_cpu_notes() can't possibly return
a value which would allow underflow here. There would be bigger
issues than the underflow if any such happened, in particular with
any theoretical underflow here simply resulting in the allocation to
fail.

The original code is type-unsafe and requests more alignment than
necessary. Besides objecting to my present way of addressing this,
I would find it rather helpful if you could mention some kind of
alternative you'd consider acceptable. As said in the cover letter,
I think mid-term we ought to do away with xmalloc_bytes(). Hence
some alternative would be needed here anyway.

Jan


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 11/11] Arm/optee: don't open-code xzalloc_flex_struct()
  2021-04-08 12:23 ` [PATCH 11/11] Arm/optee: don't open-code xzalloc_flex_struct() Jan Beulich
@ 2021-04-13 18:19   ` Julien Grall
  2021-04-14  7:03     ` Jan Beulich
  0 siblings, 1 reply; 25+ messages in thread
From: Julien Grall @ 2021-04-13 18:19 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: George Dunlap, Stefano Stabellini, Volodymyr Babchuk

Hi Jan,

On 08/04/2021 13:23, Jan Beulich wrote:
> There is a difference in generated code: xzalloc_bytes() forces
> SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
> actually don't want it.

So I think moving to xmalloc_flex_struct() is a pretty good move. But I 
am actually a bit confused with the argument used.

Could you provide some details why you think forcing the array to be 
aligned to the maximum cache line supported (128 bytes on Arm) is wrong?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 11/11] Arm/optee: don't open-code xzalloc_flex_struct()
  2021-04-13 18:19   ` Julien Grall
@ 2021-04-14  7:03     ` Jan Beulich
  2021-04-15 10:26       ` Julien Grall
  0 siblings, 1 reply; 25+ messages in thread
From: Jan Beulich @ 2021-04-14  7:03 UTC (permalink / raw)
  To: Julien Grall
  Cc: George Dunlap, Stefano Stabellini, Volodymyr Babchuk, xen-devel

On 13.04.2021 20:19, Julien Grall wrote:
> On 08/04/2021 13:23, Jan Beulich wrote:
>> There is a difference in generated code: xzalloc_bytes() forces
>> SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
>> actually don't want it.
> 
> So I think moving to xmalloc_flex_struct() is a pretty good move. But I 
> am actually a bit confused with the argument used.
> 
> Could you provide some details why you think forcing the array to be 
> aligned to the maximum cache line supported (128 bytes on Arm) is wrong?

It is not "wrong" in that sense, but if this is intended it shouldn't
be arranged via use of xmalloc_bytes(). As also pointed out in a
similar discussion on another sub-thread, imo xmalloc_bytes(), being
type-unsafe, should go away altogether mid-term. If individual callers
have specific alignment requirements (which ought to be the exception),
they should explicitly request the needed alignment. If architectures
would prefer all allocations to have certain minimum alignment (e.g.
to avoid cacheline sharing, which was Andrew's argument) or other
"arrangement" (alignment by itself may not be that interesting due to
the bhdr placed ahead of the allocation), it should be the allocator
itself that provides for this, not individual callers.

Jan


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 11/11] Arm/optee: don't open-code xzalloc_flex_struct()
  2021-04-14  7:03     ` Jan Beulich
@ 2021-04-15 10:26       ` Julien Grall
  2021-04-15 11:02         ` Jan Beulich
  0 siblings, 1 reply; 25+ messages in thread
From: Julien Grall @ 2021-04-15 10:26 UTC (permalink / raw)
  To: Jan Beulich
  Cc: George Dunlap, Stefano Stabellini, Volodymyr Babchuk, xen-devel

Hi Jan,

On 14/04/2021 08:03, Jan Beulich wrote:
> On 13.04.2021 20:19, Julien Grall wrote:
>> On 08/04/2021 13:23, Jan Beulich wrote:
>>> There is a difference in generated code: xzalloc_bytes() forces
>>> SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
>>> actually don't want it.
>>
>> So I think moving to xmalloc_flex_struct() is a pretty good move. But I
>> am actually a bit confused with the argument used.
>>
>> Could you provide some details why you think forcing the array to be
>> aligned to the maximum cache line supported (128 bytes on Arm) is wrong?
> 
> It is not "wrong" in that sense, but if this is intended it shouldn't
> be arranged via use of xmalloc_bytes().

This is not very clear from the commit message (or even the cover 
letter). How about:

"
The current use xzalloc_bytes() in optee is nearly an open-coded version 
of xzalloc_flex_struct() which was introduced after the driver was merged.

The main difference is xzalloc_bytes() will also force the allocation to 
be SMP_CACHE_BYTES aligned and therefore avoid sharing the cache line.

While sharing the cache line can have an impact on the performance, this 
is also true for most of the other users of  x*alloc_flex_struct(). So 
if we want to prevent sharing a cache line, it should be part of 
x*alloc_flex_struct().

In this case, we don't need stricter alignment than what the allocator 
does. So the call to xzalloc_bytes() is now replaced with a call 
xzalloc_flex_Struct().
"

Ideally, we want the same sort of the commit message in the other patches.

> As also pointed out in a
> similar discussion on another sub-thread, imo xmalloc_bytes(), being
> type-unsafe, should go away altogether mid-term.

And I will support dropping xmalloc_bytes().

> If individual callers
> have specific alignment requirements (which ought to be the exception),
> they should explicitly request the needed alignment. If architectures
> would prefer all allocations to have certain minimum alignment (e.g.
> to avoid cacheline sharing, which was Andrew's argument) or other
> "arrangement" (alignment by itself may not be that interesting due to
> the bhdr placed ahead of the allocation), it should be the allocator
> itself that provides for this, not individual callers.

And I agree that the allocator should do the alignment if this benefits 
every allocation.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 11/11] Arm/optee: don't open-code xzalloc_flex_struct()
  2021-04-15 10:26       ` Julien Grall
@ 2021-04-15 11:02         ` Jan Beulich
  2021-04-15 11:31           ` Julien Grall
  0 siblings, 1 reply; 25+ messages in thread
From: Jan Beulich @ 2021-04-15 11:02 UTC (permalink / raw)
  To: Julien Grall
  Cc: George Dunlap, Stefano Stabellini, Volodymyr Babchuk, xen-devel

On 15.04.2021 12:26, Julien Grall wrote:
> Hi Jan,
> 
> On 14/04/2021 08:03, Jan Beulich wrote:
>> On 13.04.2021 20:19, Julien Grall wrote:
>>> On 08/04/2021 13:23, Jan Beulich wrote:
>>>> There is a difference in generated code: xzalloc_bytes() forces
>>>> SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
>>>> actually don't want it.
>>>
>>> So I think moving to xmalloc_flex_struct() is a pretty good move. But I
>>> am actually a bit confused with the argument used.
>>>
>>> Could you provide some details why you think forcing the array to be
>>> aligned to the maximum cache line supported (128 bytes on Arm) is wrong?
>>
>> It is not "wrong" in that sense, but if this is intended it shouldn't
>> be arranged via use of xmalloc_bytes().
> 
> This is not very clear from the commit message (or even the cover 
> letter). How about:
> 
> "
> The current use xzalloc_bytes() in optee is nearly an open-coded version 
> of xzalloc_flex_struct() which was introduced after the driver was merged.
> 
> The main difference is xzalloc_bytes() will also force the allocation to 
> be SMP_CACHE_BYTES aligned and therefore avoid sharing the cache line.
> 
> While sharing the cache line can have an impact on the performance, this 
> is also true for most of the other users of  x*alloc_flex_struct(). So 
> if we want to prevent sharing a cache line, it should be part of 
> x*alloc_flex_struct().
> 
> In this case, we don't need stricter alignment than what the allocator 
> does. So the call to xzalloc_bytes() is now replaced with a call 
> xzalloc_flex_Struct().
> "

Well, okay, I've inserted a slightly edited version of this into
the patch. But ...

> Ideally, we want the same sort of the commit message in the other patches.

... I disagree here: In particular because of the recurring
pattern, I dislike repeated overly verbose descriptions. I could
perhaps see them as being not seen as overly verbose when looking
at every commit on its own (like would happen when someone tries
to do some archaeology later on), but in the context of such a
series this is potentially harmful: If almost a dozen patches
have almost the same sufficiently verbose description, possible
differences may not be paid attention to.

Plus, granted possibly controversial as well, I'm afraid I'm not
happy to (repeatedly) state obvious facts. The abuse (in almost
all cases) of x[mz]alloc_bytes() is, afaict, in no way attributed
to the resulting higher alignment, but rather in the desire to
get away easily without needing to think about using a type-safe
flavor. This said I will admit that prior to the introduction of
x[mz]alloc_flex_struct() I can see how alternatives could quickly
have got quite ugly. And for the few (if any) users which
actually care about this higher alignment, it should have been
noted at the time of introducing the use that the specific need
for certain alignment shouldn't be implied by using this function,
but rather be made explicit. This is even more so as it's not
written down anywhere that x[mz]alloc_bytes() guarantees a certain
alignment (i.e. if the plan wasn't anyway to phase out its use, it
should have been permissible to change the alignment it requests
from the underlying implementation without first auditing all
users).

Jan


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 11/11] Arm/optee: don't open-code xzalloc_flex_struct()
  2021-04-15 11:02         ` Jan Beulich
@ 2021-04-15 11:31           ` Julien Grall
  0 siblings, 0 replies; 25+ messages in thread
From: Julien Grall @ 2021-04-15 11:31 UTC (permalink / raw)
  To: Jan Beulich
  Cc: George Dunlap, Stefano Stabellini, Volodymyr Babchuk, xen-devel



On 15/04/2021 12:02, Jan Beulich wrote:
> On 15.04.2021 12:26, Julien Grall wrote:
>> Hi Jan,
>>
>> On 14/04/2021 08:03, Jan Beulich wrote:
>>> On 13.04.2021 20:19, Julien Grall wrote:
>>>> On 08/04/2021 13:23, Jan Beulich wrote:
>>>>> There is a difference in generated code: xzalloc_bytes() forces
>>>>> SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
>>>>> actually don't want it.
>>>>
>>>> So I think moving to xmalloc_flex_struct() is a pretty good move. But I
>>>> am actually a bit confused with the argument used.
>>>>
>>>> Could you provide some details why you think forcing the array to be
>>>> aligned to the maximum cache line supported (128 bytes on Arm) is wrong?
>>>
>>> It is not "wrong" in that sense, but if this is intended it shouldn't
>>> be arranged via use of xmalloc_bytes().
>>
>> This is not very clear from the commit message (or even the cover
>> letter). How about:
>>
>> "
>> The current use xzalloc_bytes() in optee is nearly an open-coded version
>> of xzalloc_flex_struct() which was introduced after the driver was merged.
>>
>> The main difference is xzalloc_bytes() will also force the allocation to
>> be SMP_CACHE_BYTES aligned and therefore avoid sharing the cache line.
>>
>> While sharing the cache line can have an impact on the performance, this
>> is also true for most of the other users of  x*alloc_flex_struct(). So
>> if we want to prevent sharing a cache line, it should be part of
>> x*alloc_flex_struct().
>>
>> In this case, we don't need stricter alignment than what the allocator
>> does. So the call to xzalloc_bytes() is now replaced with a call
>> xzalloc_flex_Struct().
>> "
> 
> Well, okay, I've inserted a slightly edited version of this into
> the patch. But ...
> 
>> Ideally, we want the same sort of the commit message in the other patches.
> 
> ... I disagree here: In particular because of the recurring
> pattern, I dislike repeated overly verbose descriptions. I could
> perhaps see them as being not seen as overly verbose when looking
> at every commit on its own (like would happen when someone tries
> to do some archaeology later on), but in the context of such a
> series this is potentially harmful: If almost a dozen patches
> have almost the same sufficiently verbose description, possible
> differences may not be paid attention to.

There are two part to take into account:

1) Reviewer
2) Future reader

For reviewer, I agree that the information is redundant. Although, it is 
not like the cover letter (in this case) contain more information about 
the reasoning... ;)

However, such information is not redundant for a future reader. Patches 
in a series may be applied separately, in a different order, title 
mangled...

> 
> Plus, granted possibly controversial as well, I'm afraid I'm not
> happy to (repeatedly) state obvious facts.
I should probably remind you that what's obvious for you may not be for 
the other.

In particular, writing "I think we only don't need this here, but 
actually don't want it" is very unhelpful because it can be interpreted 
differently (my original answer is an example).

It doesn't cost much to write clear commit message. But it will cost a 
lot for the future reader (or even reviewer) to figure out the original 
intention.

> The abuse (in almost
> all cases) of x[mz]alloc_bytes() is, afaict, in no way attributed
> to the resulting higher alignment, but rather in the desire to
> get away easily without needing to think about using a type-safe
> flavor. This said I will admit that prior to the introduction of
> x[mz]alloc_flex_struct() I can see how alternatives could quickly
> have got quite ugly. And for the few (if any) users which
> actually care about this higher alignment, it should have been
> noted at the time of introducing the use that the specific need
> for certain alignment shouldn't be implied by using this function,
> but rather be made explicit. This is even more so as it's not
> written down anywhere that x[mz]alloc_bytes() guarantees a certain
> alignment (i.e. if the plan wasn't anyway to phase out its use, it
> should have been permissible to change the alignment it requests
> from the underlying implementation without first auditing all
> users).
I don't think it is really helpful to argue on whether the developper 
did it by lazyness/lack of helpers or else.

As I wrote, dropping xmalloc_bytes() is a good move. But we should 
documenting our reasoning rather than hoping that everyone know your 
"obvious facts".

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2021-04-15 11:31 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-08 12:13 [PATCH 00/11] assorted replacement of x[mz]alloc_bytes() Jan Beulich
2021-04-08 12:16 ` [PATCH 01/11] x86/HVM: avoid effectively open-coding xzalloc_flex_struct() Jan Beulich
2021-04-08 12:20   ` Andrew Cooper
2021-04-08 12:17 ` [PATCH 02/11] x86/vPMU: " Jan Beulich
2021-04-08 12:25   ` Andrew Cooper
2021-04-08 12:29     ` Jan Beulich
2021-04-08 12:17 ` [PATCH 03/11] x86/MCE: avoid effectively open-coding xmalloc_array() Jan Beulich
2021-04-08 12:18 ` [PATCH 04/11] x86/HVM: " Jan Beulich
2021-04-08 12:19 ` [PATCH 05/11] x86/oprofile: " Jan Beulich
2021-04-08 12:20 ` [PATCH 06/11] x86/IRQ: avoid over-alignment in alloc_pirq_struct() Jan Beulich
2021-04-08 12:20 ` [PATCH 07/11] EFI/runtime: avoid effectively open-coding xmalloc_array() Jan Beulich
2021-04-08 12:21 ` [PATCH 08/11] hypfs: avoid effectively open-coding xzalloc_array() Jan Beulich
2021-04-08 14:28   ` Juergen Gross
2021-04-08 12:21 ` [PATCH 09/11] kexec: avoid effectively open-coding xzalloc_flex_struct() Jan Beulich
2021-04-09 12:54   ` Andrew Cooper
2021-04-09 13:23     ` Jan Beulich
2021-04-08 12:22 ` [PATCH 10/11] video/lfb: avoid effectively open-coding xzalloc_array() Jan Beulich
2021-04-08 12:23 ` [PATCH 11/11] Arm/optee: don't open-code xzalloc_flex_struct() Jan Beulich
2021-04-13 18:19   ` Julien Grall
2021-04-14  7:03     ` Jan Beulich
2021-04-15 10:26       ` Julien Grall
2021-04-15 11:02         ` Jan Beulich
2021-04-15 11:31           ` Julien Grall
2021-04-08 12:57 ` [PATCH 00/11] assorted replacement of x[mz]alloc_bytes() Andrew Cooper
2021-04-08 14:12   ` Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).