On Wed, Aug 10, 2016 at 06:00:24PM -0700, Linus Torvalds wrote:
> The biggest difference is that we have "mark_page_accessed()" show up
> after, and not before. There was also a lot of LRU noise in the
> non-profile data. I wonder if that is the reason here: the old model
> of using generic_perform_write/block_page_mkwrite didn't mark the
> pages accessed, and now with iomap_file_buffered_write() they get
> marked as active and that screws up the LRU list, and makes us not
> flush out the dirty pages well (because they are seen as active and
> not good for writeback), and then you get bad memory use.

And that's actually a "bug" in the new code - mostly because I failed
to pick up changes to the core code happening after we 'forked' it,
in this case commit 2457ae ("mm: non-atomically mark page accessed
during page cache allocation where possible").

The one liner below (not tested yet) to simply remove it should fix
that up. I also noticed we have a spurious pagefault_disable/enable,
I need to dig into the history of that first, though.

diff --git a/fs/iomap.c b/fs/iomap.c
index 48141b8..f39c318 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -199,7 +199,6 @@ again:
 		pagefault_enable();
 
 		flush_dcache_page(page);
-		mark_page_accessed(page);
 
 		status = iomap_write_end(inode, pos, bytes, copied, page);
 		if (unlikely(status < 0))