From: John Stultz <john.stultz@linaro.org> To: Johannes Weiner <hannes@cmpxchg.org> Cc: LKML <linux-kernel@vger.kernel.org>, Andrew Morton <akpm@linux-foundation.org>, Android Kernel Team <kernel-team@android.com>, Robert Love <rlove@google.com>, Mel Gorman <mel@csn.ul.ie>, Hugh Dickins <hughd@google.com>, Dave Hansen <dave@sr71.net>, Rik van Riel <riel@redhat.com>, Dmitry Adamushko <dmitry.adamushko@gmail.com>, Neil Brown <neilb@suse.de>, Andrea Arcangeli <aarcange@redhat.com>, Mike Hommey <mh@glandium.org>, Taras Glek <tglek@mozilla.com>, Jan Kara <jack@suse.cz>, KOSAKI Motohiro <kosaki.motohiro@gmail.com>, Michel Lespinasse <walken@google.com>, Minchan Kim <minchan@kernel.org>, Keith Packard <keithp@keithp.com>, "linux-mm@kvack.org" <linux-mm@kvack.org> Subject: Re: [PATCH 0/4] Volatile Ranges (v14 - madvise reborn edition!) Date: Mon, 16 Jun 2014 13:12:41 -0700 [thread overview] Message-ID: <CALAqxLXBs0scEEb7-rYbq9vHHs8VWUg-9vFXDoK4mzUt4smbYw@mail.gmail.com> (raw) In-Reply-To: <20140603145710.GQ2878@cmpxchg.org> On Tue, Jun 3, 2014 at 7:57 AM, Johannes Weiner <hannes@cmpxchg.org> wrote: > On Thu, May 08, 2014 at 10:12:40AM -0700, John Stultz wrote: >> On 04/29/2014 02:21 PM, John Stultz wrote: >> > Another few weeks and another volatile ranges patchset... >> > >> > After getting the sense that the a major objection to the earlier >> > patches was the introduction of a new syscall (and its somewhat >> > strange dual length/purged-bit return values), I spent some time >> > trying to rework the vma manipulations so we can be we won't fail >> > mid-way through changing volatility (basically making it atomic). >> > I think I have it working, and thus, there is no longer the >> > need for a new syscall, and we can go back to using madvise() >> > to set and unset pages as volatile. >> >> Johannes: To get some feedback, maybe I'll needle you directly here a >> bit. :) >> >> Does moving this interface to madvise help reduce your objections? I >> feel like your cleaning-the-dirty-bit idea didn't work out, but I was >> hoping that by reworking the vma manipulations to be atomic, we could >> move to madvise and still avoid the new syscall that you seemed bothered >> by. But I've not really heard much from you recently so I worry your >> concerns on this were actually elsewhere, and I'm just churning the >> patch needlessly. > > My objection was not the syscall. > > From a reclaim perspective, using the dirty state to denote whether a > swap-backed page needs writeback before reclaim is quite natural and I > much prefer Minchan's changes to the reclaim code over yours. > > From an interface point of view, I would prefer the simplicity of > cleaning dirty bits to invalidate pages, and a default of zero-filling > invalidated pages instead of sending SIGBUS. This also is quite > natural when you think of anon/shmem mappings as cache pages on top of > /dev/zero (see mmap_zero() and shmem_zero_setup()). And it translates > well to tmpfs. > > At the same time, I acknowledge that there are usecases that want > SIGBUS delivery for more than just convenience in order to implement > userspace fault handling, and this is the only place where I see a > real divergence in actual functionality from Minchan's code. Thanks for the clarification and feedback. Sorry for my slow response, as I was on vacation for a week and am just now catching up on this. So again, SIGBUS for userspace fault handling is really of a side-effect of having more userspace friendly semantics, and isn't really the primary goal/usage model. Zerofill semantics are mostly problematic because they make userspace mistakes harder to find and diagnose. Android's ashmem actually uses zerofill semantics, so while I see it as less ideal, technically zerofill would work here. However, combining zerofill with your preferred overloading of the dirty state is particularly problematic because it makes any dirtying of volatile data clear both the volatile state as well as the purged state for the entire page. The volatile state is surprising, but less problematic, but the clearing of the purged state means applications would possibly get a partial zero page (for whatever wasn't written) and no warning that their data was lost. This is a very surprising and unfriendly side-effect from a userspace perspective. For context, Android's ashmem preserves both the volatile and purged state on volatile page dirtying (since the volatility and purged state are kept in their own range structure independently from the VM). > That, however, truly is a separate virtual memory feature. Would it > be possible for you to take MADV_FREE and MADV_REVIVE as a base and > implement an madvise op that switches the no-page behavior of a VMA > from zero-filling to SIGBUS delivery? I'll see if I can look into it if I get some time. However, I suspect its more likely I'll just have to admit defeat on this one and let someone else champion the effort. Interest and reviews have seemingly dropped again here and with other work ramping up, I'm not sure if I'll be able to justify further work on this. :( thanks -john
WARNING: multiple messages have this Message-ID (diff)
From: John Stultz <john.stultz@linaro.org> To: Johannes Weiner <hannes@cmpxchg.org> Cc: LKML <linux-kernel@vger.kernel.org>, Andrew Morton <akpm@linux-foundation.org>, Android Kernel Team <kernel-team@android.com>, Robert Love <rlove@google.com>, Mel Gorman <mel@csn.ul.ie>, Hugh Dickins <hughd@google.com>, Dave Hansen <dave@sr71.net>, Rik van Riel <riel@redhat.com>, Dmitry Adamushko <dmitry.adamushko@gmail.com>, Neil Brown <neilb@suse.de>, Andrea Arcangeli <aarcange@redhat.com>, Mike Hommey <mh@glandium.org>, Taras Glek <tglek@mozilla.com>, Jan Kara <jack@suse.cz>, KOSAKI Motohiro <kosaki.motohiro@gmail.com>, Michel Lespinasse <walken@google.com>, Minchan Kim <minchan@kernel.org>, Keith Packard <keithp@keithp.com>, "linux-mm@kvack.org" <linux-mm@kvack.org> Subject: Re: [PATCH 0/4] Volatile Ranges (v14 - madvise reborn edition!) Date: Mon, 16 Jun 2014 13:12:41 -0700 [thread overview] Message-ID: <CALAqxLXBs0scEEb7-rYbq9vHHs8VWUg-9vFXDoK4mzUt4smbYw@mail.gmail.com> (raw) In-Reply-To: <20140603145710.GQ2878@cmpxchg.org> On Tue, Jun 3, 2014 at 7:57 AM, Johannes Weiner <hannes@cmpxchg.org> wrote: > On Thu, May 08, 2014 at 10:12:40AM -0700, John Stultz wrote: >> On 04/29/2014 02:21 PM, John Stultz wrote: >> > Another few weeks and another volatile ranges patchset... >> > >> > After getting the sense that the a major objection to the earlier >> > patches was the introduction of a new syscall (and its somewhat >> > strange dual length/purged-bit return values), I spent some time >> > trying to rework the vma manipulations so we can be we won't fail >> > mid-way through changing volatility (basically making it atomic). >> > I think I have it working, and thus, there is no longer the >> > need for a new syscall, and we can go back to using madvise() >> > to set and unset pages as volatile. >> >> Johannes: To get some feedback, maybe I'll needle you directly here a >> bit. :) >> >> Does moving this interface to madvise help reduce your objections? I >> feel like your cleaning-the-dirty-bit idea didn't work out, but I was >> hoping that by reworking the vma manipulations to be atomic, we could >> move to madvise and still avoid the new syscall that you seemed bothered >> by. But I've not really heard much from you recently so I worry your >> concerns on this were actually elsewhere, and I'm just churning the >> patch needlessly. > > My objection was not the syscall. > > From a reclaim perspective, using the dirty state to denote whether a > swap-backed page needs writeback before reclaim is quite natural and I > much prefer Minchan's changes to the reclaim code over yours. > > From an interface point of view, I would prefer the simplicity of > cleaning dirty bits to invalidate pages, and a default of zero-filling > invalidated pages instead of sending SIGBUS. This also is quite > natural when you think of anon/shmem mappings as cache pages on top of > /dev/zero (see mmap_zero() and shmem_zero_setup()). And it translates > well to tmpfs. > > At the same time, I acknowledge that there are usecases that want > SIGBUS delivery for more than just convenience in order to implement > userspace fault handling, and this is the only place where I see a > real divergence in actual functionality from Minchan's code. Thanks for the clarification and feedback. Sorry for my slow response, as I was on vacation for a week and am just now catching up on this. So again, SIGBUS for userspace fault handling is really of a side-effect of having more userspace friendly semantics, and isn't really the primary goal/usage model. Zerofill semantics are mostly problematic because they make userspace mistakes harder to find and diagnose. Android's ashmem actually uses zerofill semantics, so while I see it as less ideal, technically zerofill would work here. However, combining zerofill with your preferred overloading of the dirty state is particularly problematic because it makes any dirtying of volatile data clear both the volatile state as well as the purged state for the entire page. The volatile state is surprising, but less problematic, but the clearing of the purged state means applications would possibly get a partial zero page (for whatever wasn't written) and no warning that their data was lost. This is a very surprising and unfriendly side-effect from a userspace perspective. For context, Android's ashmem preserves both the volatile and purged state on volatile page dirtying (since the volatility and purged state are kept in their own range structure independently from the VM). > That, however, truly is a separate virtual memory feature. Would it > be possible for you to take MADV_FREE and MADV_REVIVE as a base and > implement an madvise op that switches the no-page behavior of a VMA > from zero-filling to SIGBUS delivery? I'll see if I can look into it if I get some time. However, I suspect its more likely I'll just have to admit defeat on this one and let someone else champion the effort. Interest and reviews have seemingly dropped again here and with other work ramping up, I'm not sure if I'll be able to justify further work on this. :( thanks -john -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2014-06-16 20:12 UTC|newest] Thread overview: 48+ messages / expand[flat|nested] mbox.gz Atom feed top 2014-04-29 21:21 [PATCH 0/4] Volatile Ranges (v14 - madvise reborn edition!) John Stultz 2014-04-29 21:21 ` John Stultz 2014-04-29 21:21 ` [PATCH 1/4] swap: Cleanup how special swap file numbers are defined John Stultz 2014-04-29 21:21 ` John Stultz 2014-04-29 21:21 ` [PATCH 2/4] MADV_VOLATILE: Add MADV_VOLATILE/NONVOLATILE hooks and handle marking vmas John Stultz 2014-04-29 21:21 ` John Stultz 2014-05-08 1:21 ` Minchan Kim 2014-05-08 1:21 ` Minchan Kim 2014-05-08 16:38 ` John Stultz 2014-05-08 16:38 ` John Stultz 2014-05-08 23:12 ` Minchan Kim 2014-05-08 23:12 ` Minchan Kim 2014-05-08 23:43 ` John Stultz 2014-05-08 23:43 ` John Stultz 2014-05-09 0:07 ` Minchan Kim 2014-05-09 0:07 ` Minchan Kim 2014-05-09 0:24 ` John Stultz 2014-05-09 0:24 ` John Stultz 2014-05-09 0:41 ` Minchan Kim 2014-05-09 0:41 ` Minchan Kim 2014-04-29 21:21 ` [PATCH 3/4] MADV_VOLATILE: Add purged page detection on setting memory non-volatile John Stultz 2014-04-29 21:21 ` John Stultz 2014-05-08 1:51 ` Minchan Kim 2014-05-08 1:51 ` Minchan Kim 2014-05-08 21:45 ` John Stultz 2014-05-08 21:45 ` John Stultz 2014-05-08 23:45 ` Minchan Kim 2014-05-08 23:45 ` Minchan Kim 2014-04-29 21:21 ` [PATCH 4/4] MADV_VOLATILE: Add page purging logic & SIGBUS trap John Stultz 2014-04-29 21:21 ` John Stultz 2014-05-08 5:16 ` Minchan Kim 2014-05-08 5:16 ` Minchan Kim 2014-05-08 16:39 ` John Stultz 2014-05-08 16:39 ` John Stultz 2014-05-08 5:58 ` [PATCH 0/4] Volatile Ranges (v14 - madvise reborn edition!) Minchan Kim 2014-05-08 5:58 ` Minchan Kim 2014-05-08 17:04 ` John Stultz 2014-05-08 17:04 ` John Stultz 2014-05-08 23:29 ` Minchan Kim 2014-05-08 23:29 ` Minchan Kim 2014-05-08 17:12 ` John Stultz 2014-05-08 17:12 ` John Stultz 2014-06-03 14:57 ` Johannes Weiner 2014-06-03 14:57 ` Johannes Weiner 2014-06-16 20:12 ` John Stultz [this message] 2014-06-16 20:12 ` John Stultz 2014-06-16 22:24 ` Andrea Arcangeli 2014-06-16 22:24 ` Andrea Arcangeli
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=CALAqxLXBs0scEEb7-rYbq9vHHs8VWUg-9vFXDoK4mzUt4smbYw@mail.gmail.com \ --to=john.stultz@linaro.org \ --cc=aarcange@redhat.com \ --cc=akpm@linux-foundation.org \ --cc=dave@sr71.net \ --cc=dmitry.adamushko@gmail.com \ --cc=hannes@cmpxchg.org \ --cc=hughd@google.com \ --cc=jack@suse.cz \ --cc=keithp@keithp.com \ --cc=kernel-team@android.com \ --cc=kosaki.motohiro@gmail.com \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-mm@kvack.org \ --cc=mel@csn.ul.ie \ --cc=mh@glandium.org \ --cc=minchan@kernel.org \ --cc=neilb@suse.de \ --cc=riel@redhat.com \ --cc=rlove@google.com \ --cc=tglek@mozilla.com \ --cc=walken@google.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.