From: Johannes Weiner <hannes@cmpxchg.org>
To: John Stultz <john.stultz@linaro.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
    Andrew Morton <akpm@linux-foundation.org>,
    Android Kernel Team <kernel-team@android.com>,
    Robert Love <rlove@google.com>, Mel Gorman <mel@csn.ul.ie>,
    Hugh Dickins <hughd@google.com>, Dave Hansen <dave@sr71.net>,
    Rik van Riel <riel@redhat.com>,
    Dmitry Adamushko <dmitry.adamushko@gmail.com>,
    Neil Brown <neilb@suse.de>,
    Andrea Arcangeli <aarcange@redhat.com>,
    Mike Hommey <mh@glandium.org>, Taras Glek <tglek@mozilla.com>,
    Jan Kara <jack@suse.cz>,
    KOSAKI Motohiro <kosaki.motohiro@gmail.com>,
    Michel Lespinasse <walken@google.com>,
    Minchan Kim <minchan@kernel.org>,
    Keith Packard <keithp@keithp.com>,
    "linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [PATCH 0/4] Volatile Ranges (v14 - madvise reborn edition!)
Date: Tue, 3 Jun 2014 10:57:10 -0400
Message-ID: <20140603145710.GQ2878@cmpxchg.org>
In-Reply-To: <536BBB08.3000503@linaro.org>

On Thu, May 08, 2014 at 10:12:40AM -0700, John Stultz wrote:
> On 04/29/2014 02:21 PM, John Stultz wrote:
> > Another few weeks and another volatile ranges patchset...
> >
> > After getting the sense that the a major objection to the earlier
> > patches was the introduction of a new syscall (and its somewhat
> > strange dual length/purged-bit return values), I spent some time
> > trying to rework the vma manipulations so we can be we won't fail
> > mid-way through changing volatility (basically making it atomic).
> > I think I have it working, and thus, there is no longer the
> > need for a new syscall, and we can go back to using madvise()
> > to set and unset pages as volatile.
>
> Johannes: To get some feedback, maybe I'll needle you directly here a
> bit. :)
>
> Does moving this interface to madvise help reduce your objections?  I
> feel like your cleaning-the-dirty-bit idea didn't work out, but I was
> hoping that by reworking the vma manipulations to be atomic, we could
> move to madvise and still avoid the new syscall that you seemed bothered
> by.  But I've not really heard much from you recently so I worry your
> concerns on this were actually elsewhere, and I'm just churning the
> patch needlessly.

My objection was not the syscall.

From a reclaim perspective, using the dirty state to denote whether a
swap-backed page needs writeback before reclaim is quite natural and I
much prefer Minchan's changes to the reclaim code over yours.

From an interface point of view, I would prefer the simplicity of
cleaning dirty bits to invalidate pages, and a default of zero-filling
invalidated pages instead of sending SIGBUS.  This also is quite
natural when you think of anon/shmem mappings as cache pages on top of
/dev/zero (see mmap_zero() and shmem_zero_setup()).  And it translates
well to tmpfs.

At the same time, I acknowledge that there are usecases that want
SIGBUS delivery for more than just convenience in order to implement
userspace fault handling, and this is the only place where I see a
real divergence in actual functionality from Minchan's code.

That, however, truly is a separate virtual memory feature.  Would it
be possible for you to take MADV_FREE and MADV_REVIVE as a base and
implement an madvise op that switches the no-page behavior of a VMA
from zero-filling to SIGBUS delivery?
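[Editorial illustration, not part of the original message.] The
zero-fill default described above can be observed from userspace with
the MADV_FREE flag that was later merged into mainline Linux (4.5).
The sketch below is only an approximation of the idea under
discussion: it assumes the mainline semantics rather than those of
the 2014 patch set, it does not use MADV_REVIVE (which exists only as
a proposal in this thread), and the helper names all_bytes_equal()
and madv_free_demo() are hypothetical. The fallback value 8 for
MADV_FREE is the asm-generic one and is an assumption on other
architectures.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_FREE
#define MADV_FREE 8	/* assumption: asm-generic value, Linux >= 4.5 */
#endif

/* Return 1 if every byte in buf[0..len) equals value, else 0. */
static int all_bytes_equal(const unsigned char *buf, size_t len,
			   unsigned char value)
{
	for (size_t i = 0; i < len; i++)
		if (buf[i] != value)
			return 0;
	return 1;
}

/*
 * Dirty one anonymous page, hand it back with MADV_FREE, then read it.
 * Returns  1 if the old contents were still there,
 *          0 if the kernel had already dropped the page (zero-filled),
 *         -1 on error, e.g. a kernel without MADV_FREE.
 */
static int madv_free_demo(void)
{
	long page = sysconf(_SC_PAGESIZE);
	if (page <= 0)
		return -1;

	size_t len = (size_t)page;
	unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
				  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	memset(buf, 0xAB, len);		/* make the page dirty */

	int result = -1;
	if (madvise(buf, len, MADV_FREE) == 0) {
		/* No SIGBUS here: a read sees old data or zeroes. */
		if (all_bytes_equal(buf, len, 0xAB))
			result = 1;
		else if (all_bytes_equal(buf, len, 0x00))
			result = 0;
	}

	munmap(buf, len);
	return result;
}
```

Calling madv_free_demo() without memory pressure will usually report
that the old contents survived; which outcome occurs is up to reclaim,
so a caller has to treat both as valid. A SIGBUS-on-access mode, as
asked for in the last paragraph of the message, would replace the
zero-filled outcome with a signal.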