On Wed, Apr 12, 2023 at 11:14 AM Mike Kravetz wrote:

> On 04/11/23 17:27, Liu Shixin wrote:
> > Patch a873dfe1032a ("mm, hwpoison: try to recover from copy-on write faults")
> > introduced a new copy_mc_user_highpage() function, and fixed the kernel crash
> > when the kernel is copying a normal page as the result of a copy-on-write
> > fault and runs into an uncorrectable error. But it doesn't work for HugeTLB.
>
> Andrew asked about user-visible effects. Perhaps, a better way of
> stating this in the commit message might be:
>
> Commit a873dfe1032a ("mm, hwpoison: try to recover from copy-on write
> faults") introduced the routine copy_mc_user_highpage() to gracefully
> handle copying of user pages with uncorrectable errors. Previously,
> such copies would result in a kernel crash. hugetlb has separate code
> paths for copy-on-write and does not benefit from the changes made in
> commit a873dfe1032a.
>
> Modify hugetlb copy-on-write code paths to use copy_mc_user_highpage()
> so that they can also gracefully handle uncorrectable errors in user
> pages. This involves changing the hugetlb specific routine
> "copy_user_folio()" from type void to int so that it can return an error.
> Modify the hugetlb userfaultfd code in the same way so that it can return
> -EHWPOISON if it encounters an uncorrectable error.
>
> NOTE - There is still some churn in the series that introduces
> copy_user_folio() and the name may change.
>
> > This is to support HugeTLB by using copy_mc_user_highpage() in copy_subpage()
> > and copy_user_gigantic_page() too.
> >
> > Moreover, this is also used by userfaultfd, it will return -EHWPOISON if
> > running into an uncorrectable error.
> >
> > Signed-off-by: Liu Shixin
> > ---
> >  include/linux/mm.h |  6 ++---
> >  mm/hugetlb.c       | 19 +++++++++++----
> >  mm/memory.c        | 59 +++++++++++++++++++++++++++++-----------------
> >  3 files changed, 56 insertions(+), 28 deletions(-)
>
> Code changes look good to me.
>
> Acked-by: Mike Kravetz
>
> Related question perhaps for Tony not directly impacting this patch.
> This patch touches the hugetlb clear page paths without consequence.
>
> Just wondering if we can/should create something like clear_mc_user_highpage
> to address clearing pages as well? Apologies if this was previously
> discussed.

Tony may have better answers, but allow me to chime in on this question:
a memory-related #MC only happens when a kernel read hits an uncorrectable
hardware memory error. Writes (such as clearing a memory page) are "safe"
for the kernel, at least in that they generate no #MC. So I don't think
clear_user_highpage() needs a #MC-handling version (or that one is even
possible).

> --
> Mike Kravetz
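
The read/write asymmetry discussed above can be sketched in user space.
This is a minimal illustration under stated assumptions, not kernel code:
struct sketch_page, its poisoned flag, and the sketch_* functions are
hypothetical stand-ins, and a poisoned source is simulated with a flag
rather than a real #MC. The point is only that the copy routine reads its
source and so needs an int return, while the clear routine only writes and
can stay void.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#ifndef EHWPOISON
#define EHWPOISON 133 /* Linux generic value; only used if libc lacks it */
#endif

#define SKETCH_PAGE_SIZE 4096

/* Hypothetical stand-in for a user page. "poisoned" simulates memory
 * that would raise an uncorrectable error when read. */
struct sketch_page {
	unsigned char data[SKETCH_PAGE_SIZE];
	bool poisoned;
};

/* Copy one page. The source is read, so the copy can fail:
 * returns 0 on success or -EHWPOISON if the source is poisoned. */
static int sketch_copy_page_mc(struct sketch_page *dst,
			       const struct sketch_page *src)
{
	assert(dst && src);
	if (src->poisoned)
		return -EHWPOISON;
	memcpy(dst->data, src->data, SKETCH_PAGE_SIZE);
	return 0;
}

/* Clear one page. Nothing is read from the page, so in this model
 * there is no failure to report and the function stays void. */
static void sketch_clear_page(struct sketch_page *page)
{
	assert(page);
	memset(page->data, 0, SKETCH_PAGE_SIZE);
}

/* Copy a multi-subpage region, stopping at the first failing subpage
 * and handing the error back to the caller. */
static int sketch_copy_huge_page(struct sketch_page *dst,
				 const struct sketch_page *src,
				 size_t nr_subpages)
{
	for (size_t i = 0; i < nr_subpages; i++) {
		int ret = sketch_copy_page_mc(&dst[i], &src[i]);

		if (ret)
			return ret;
	}
	return 0;
}
```

A caller modelled on a fault handler would propagate the non-zero return
(here -EHWPOISON) instead of crashing, while callers of the clear routine
have nothing to check.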