On Thu, 28 Feb 2019, Christopher Lameter wrote:

> On Thu, 28 Feb 2019, Paul Walmsley wrote:
>
> > On Fri, 22 Feb 2019, Christopher Lameter wrote:
> >
> > > On Fri, 22 Feb 2019, Björn Töpel wrote:
> > >
> > > > > The problem is that the register needs to be referenced by the amo
> > > > > instruction. Can amoadd increment an address relative to a scratch
> > > > > register?
> > > > >
> > > >
> > > > It cannot. So, we'll end up with (at least) a preempt disabled region...
> > >
> > > If we need to do that then we already have so much overhead due to that
> > > processing that it may not be worthwhile. lr/sc overhead is minor compared
> > > to switching preempt on and off I think.
> >
> > Is it strictly necessary to disable and re-enable preemption?
>
> It is necessary if CONFIG_PREEMPT is set because then the execution can be
> interrupted at any time and rescheduled on another hardware thread.
>
> > On a UMA system, it might be preferable to risk the cost of the cache line
> > bounce than to flip preemption off and on - assuming there's no other
> > impact.
>
> It's not the cost of a cache line bounce. The counter data will be
> corrupted since the RMW operation is not properly serialized. The counter
> data will be read from one processor, then the execution context may
> change and the writeback may occur either to the new execution context
> per cpu area (where we overwrite the old value) or to the old execution context
> per cpu area (where the counter may already have been incremented by other
> code that ran in that context)

This is the AMO-based sequence that Palmer wrote in his E-mail:

        li t0, 1
        la t1, counter
        amoadd.w zero, t0, 0(t1)

The RMW takes place in the amoadd, and executes atomically from the point
of view of all cores in the cache coherency domain.

Or am I misunderstanding you?

- Paul
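
P.S. To make the two cases concrete, here is a minimal, single-threaded C11
sketch that replays both interleavings deterministically. It is only an
illustration under stated assumptions, not kernel code: counter[],
NR_CPUS, this_cpu_slot(), split_rmw_total() and single_rmw_total() are
made-up names, the "preemption" is just a comment marking where the task
would be descheduled, and atomic_fetch_add() stands in for amoadd.w.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS 2  /* hypothetical: two hardware threads */

/* One counter slot per CPU, standing in for a per-cpu area. */
static atomic_int counter[NR_CPUS];

static atomic_int *this_cpu_slot(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return &counter[cpu];
}

/*
 * Split load/add/store. The task loads on CPU 0, is preempted, other
 * code on CPU 0 increments the counter, and the task then stores its
 * stale result. Returns the final value of CPU 0's counter.
 */
int split_rmw_total(void)
{
    atomic_store(this_cpu_slot(0), 0);
    int stale = atomic_load(this_cpu_slot(0));   /* task: load on CPU 0 */
    /* -- preemption point: task is descheduled here -- */
    atomic_fetch_add(this_cpu_slot(0), 1);       /* other code on CPU 0 */
    atomic_store(this_cpu_slot(0), stale + 1);   /* task: stale write-back */
    return atomic_load(this_cpu_slot(0));        /* 1: one increment lost */
}

/*
 * Single atomic RMW, as with amoadd.w. The task computes the address on
 * CPU 0 and is preempted before the add; the add still lands intact
 * on CPU 0's slot.
 */
int single_rmw_total(void)
{
    atomic_store(this_cpu_slot(0), 0);
    atomic_int *addr = this_cpu_slot(0);         /* task: address taken on CPU 0 */
    /* -- preemption point: task may now run on CPU 1 -- */
    atomic_fetch_add(this_cpu_slot(0), 1);       /* other code on CPU 0 */
    atomic_fetch_add(addr, 1);                   /* task: atomic add to CPU 0's slot */
    return atomic_load(this_cpu_slot(0));        /* 2: nothing lost */
}
```

In the second case the increment may be credited to the slot of the CPU
the task has just left, so the per-cpu attribution can be off while the
sum across CPUs stays correct. Whether that is acceptable here is the
question I am asking above.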