On 10 Apr 2017, at 4:48, Mel Gorman wrote:

> A user reported a bug against a distribution kernel while running
> a proprietary workload described as "memory intensive that is not
> swapping" that is expected to apply to mainline kernels. The workload
> reads, writes and modifies ranges of memory and checks the contents. They
> reported that within a few hours a bad PMD would be reported, followed
> by a memory corruption where expected data was all zeros. A partial report
> of the bad PMD looked like
>
> [ 5195.338482] ../mm/pgtable-generic.c:33: bad pmd ffff8888157ba008(000002e0396009e2)
> [ 5195.341184] ------------[ cut here ]------------
> [ 5195.356880] kernel BUG at ../mm/pgtable-generic.c:35!
> ....
> [ 5195.410033] Call Trace:
> [ 5195.410471] [] change_protection_range+0x7dd/0x930
> [ 5195.410716] [] change_prot_numa+0x18/0x30
> [ 5195.410918] [] task_numa_work+0x1fe/0x310
> [ 5195.411200] [] task_work_run+0x72/0x90
> [ 5195.411246] [] exit_to_usermode_loop+0x91/0xc2
> [ 5195.411494] [] prepare_exit_to_usermode+0x31/0x40
> [ 5195.411739] [] retint_user+0x8/0x10
>
> Decoding revealed that the PMD was a valid prot_numa PMD and the bad PMD
> was a false detection. The bug does not trigger if automatic NUMA balancing
> or transparent huge pages is disabled.
>
> The bug is due to a race in change_pmd_range between a pmd_trans_huge and
> a pmd_none_or_clear_bad check done without any locks held. During the
> pmd_trans_huge check, a parallel protection update under lock can have
> cleared the PMD and filled it with a prot_numa entry between the transhuge
> check and the pmd_none_or_clear_bad check.
>
> While this could be fixed with heavy locking, it's only necessary to
> make a copy of the PMD on the stack during change_pmd_range and avoid
> races. A new helper is created for this as the check is quite subtle and the
> existing similar helper is not suitable. This passed 154 hours of testing
> (the bug usually triggers between 20 minutes and 24 hours) without detecting
> bad PMDs or corruption. A basic test of an autonuma-intensive workload showed
> no significant change in behaviour.
>
> Signed-off-by: Mel Gorman
> Cc: stable@vger.kernel.org

Does this patch fix the same problem fixed by Kirill's patch here?
https://lkml.org/lkml/2017/3/2/347

--
Best Regards
Yan Zi
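
For reference, a minimal sketch (in kernel C) of the kind of helper the quoted
description talks about: read the PMD once into a local copy with
pmd_read_atomic() and run both the none/bad and trans-huge checks against that
copy, so a parallel prot_numa update cannot change the entry between the two
checks. The helper name and exact checks here are illustrative, not the actual
patch:

static inline int pmd_none_or_clear_bad_unless_trans_huge(pmd_t *pmd)
{
	/* Single lockless snapshot of *pmd; every check below uses this copy */
	pmd_t pmdval = pmd_read_atomic(pmd);

#ifdef CONFIG_TRANSPARENT_HUGEPAGE
	/* See pmd_none_or_trans_huge_or_clear_bad() for the barrier rationale */
	barrier();
#endif
	if (pmd_none(pmdval))
		return 1;
	if (pmd_trans_huge(pmdval))
		return 0;	/* caller takes the huge-PMD path under lock */
	if (unlikely(pmd_bad(pmdval))) {
		pmd_clear_bad(pmd);
		return 1;
	}
	return 0;
}

In change_pmd_range() such a helper would stand in for the separate
pmd_trans_huge() and pmd_none_or_clear_bad() calls, so the two tests can no
longer observe two different values of the same PMD.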