On Mon, 2019-09-23 at 11:14 -0700, John Hubbard wrote:
> On 9/23/19 10:25 AM, Leonardo Bras wrote:
> [...]
> That part is all fine, but there are no run-time memory barriers in the
> atomic_inc() and atomic_dec() additions, which means that this is not
> safe, because memory operations on CPU 1 can be reordered. It's safe
> as shown *if* there are memory barriers to keep the order as shown:
>
> CPU 0                           CPU 1
> ------                          --------------
>                                 atomic_inc(val) (no run-time memory barrier!)
> pmd_clear(pte)
> if (val)
>     run_on_all_cpus(): IPI
>                                 local_irq_disable() (also not a mem barrier)
>
>                                 READ(pte)
>                                 if (pte)
>                                     walk page tables
>
>                                 local_irq_enable() (still not a barrier)
>                                 atomic_dec(val)
>
> free(pte)
>
> thanks,

This is serialize:

void serialize_against_pte_lookup(struct mm_struct *mm)
{
	smp_mb();
	if (running_lockless_pgtbl_walk(mm))
		smp_call_function_many(mm_cpumask(mm), do_nothing, NULL, 1);
}

That would mean:

CPU 0                           CPU 1
------                          --------------
                                atomic_inc(val)
pmd_clear(pte)
smp_mb()
if (val)
    run_on_all_cpus(): IPI
                                local_irq_disable()

                                READ(pte)
                                if (pte)
                                    walk page tables

                                local_irq_enable() (still not a barrier)
                                atomic_dec(val)

free(pte)

By https://www.kernel.org/doc/Documentation/memory-barriers.txt :

'If you need all the CPUs to see a given store at the same time, use
smp_mb().'

Is it not enough?
Do you suggest adding 'smp_mb()' after atomic_{inc,dec}?