> On Aug 2, 2019, at 12:14 PM, Michal Hocko wrote:
>
> On Fri 02-08-19 11:00:55, Masoud Sharbiani wrote:
>>
>>
>>> On Aug 2, 2019, at 7:41 AM, Michal Hocko wrote:
>>>
>>> On Fri 02-08-19 07:18:17, Masoud Sharbiani wrote:
>>>>
>>>>
>>>>> On Aug 2, 2019, at 12:40 AM, Michal Hocko wrote:
>>>>>
>>>>> On Thu 01-08-19 11:04:14, Masoud Sharbiani wrote:
>>>>>> Hey folks,
>>>>>> I’ve come across an issue that affects most of 4.19, 4.20 and 5.2 linux-stable kernels that has only been fixed in 5.3-rc1.
>>>>>> It was introduced by
>>>>>>
>>>>>> 29ef680 memcg, oom: move out_of_memory back to the charge path
>>>>>
>>>>> This commit shouldn't really change the OOM behavior for your particular
>>>>> test case. It would have changed MAP_POPULATE behavior but your usage is
>>>>> triggering the standard page fault path. The only difference with
>>>>> 29ef680 is that the OOM killer is invoked during the charge path rather
>>>>> than on the way out of the page fault.
>>>>>
>>>>> Anyway, I tried to run your test case in a loop and leaker always ends
>>>>> up being killed as expected with 5.2. See the below oom report. There
>>>>> must be something else going on. How much swap do you have on your
>>>>> system?
>>>>
>>>> I do not have swap defined.
>>>
>>> OK, I have retested with swap disabled and again everything seems to be
>>> working as expected. The oom happens earlier because I do not have to
>>> wait for the swap to get full.
>>>
>>
>> In my tests (with the script provided), it only loops 11 iterations before hanging, and uttering the soft lockup message.
>>
>>
>>> Which fs do you use to write the file that you mmap?
>>
>> /dev/sda3 on / type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)
>>
>> Part of the soft lockup path actually specifies that it is going through __xfs_filemap_fault():
>
> Right, I have just missed that.
>
> [...]
>
>> If I switch the backing file to a ext4 filesystem (separate hard drive), it OOMs.
>>
>>
>> If I switch the file used to /dev/zero, it OOMs:
>> …
>> Todal sum was 0. Loop count is 11
>> Buffer is @ 0x7f2b66c00000
>> ./test-script-devzero.sh: line 16: 3561 Killed ./leaker -p 10240 -c 100000
>>
>>
>>> Or could you try to
>>> simplify your test even further? E.g. does everything work as expected
>>> when doing anonymous mmap rather than file backed one?
>>
>> It also OOMs with MAP_ANON.
>>
>> Hope that helps.
>
> It helps to focus more on the xfs reclaim path. Just to be sure, is
> there any difference if you use cgroup v2? I do not expect to be but
> just to be sure there are no v1 artifacts.

I was unable to use cgroups2. I’ve created the new control group, but the attempt to move a running process into it fails with ‘Device or resource busy’.

Masoud

> --
> Michal Hocko
> SUSE Labs