From: Michal Hocko <email@example.com>
To: Daniel Colascione <firstname.lastname@example.org>
Cc: Minchan Kim <email@example.com>,
	Andrew Morton <firstname.lastname@example.org>,
	LKML <email@example.com>,
	linux-mm <firstname.lastname@example.org>,
	Johannes Weiner <email@example.com>,
	Tim Murray <firstname.lastname@example.org>,
	Joel Fernandes <email@example.com>,
	Suren Baghdasaryan <firstname.lastname@example.org>,
	Shakeel Butt <email@example.com>,
	Sonny Rao <firstname.lastname@example.org>,
	Brian Geffon <email@example.com>,
	Linux API <firstname.lastname@example.org>
Subject: Re: [RFC 7/7] mm: madvise support MADV_ANONYMOUS_FILTER and MADV_FILE_FILTER
Date: Tue, 28 May 2019 14:32:08 +0200
Message-ID: <20190528123208.GC1658@dhcp22.suse.cz>
In-Reply-To: <CAKOZueuerHTCPbQqowSxi-_sRsqxYQQqgyi1UOh7EkZcS3DCnA@mail.gmail.com>

On Tue 28-05-19 05:11:16, Daniel Colascione wrote:
> On Tue, May 28, 2019 at 4:49 AM Michal Hocko <email@example.com> wrote:
[...]
> > > We have various system calls that provide hints for open files, but
> > > the memory operations are distinct. Modeling anonymous memory as a
> > > kind of file-backed memory for purposes of VMA manipulation would also
> > > be a departure from existing practice. Can you help me understand why
> > > you seem to favor the FD-per-VMA approach so heavily? I don't see any
> > > arguments *for* an FD-per-VMA model for remove memory manipulation and
> > > I see a lot of arguments against it. Is there some compelling
> > > advantage I'm missing?
> >
> > First and foremost it provides an easy cookie to the userspace to
> > guarantee time-to-check-time-to-use consistency.
>
> But only for one VMA at a time.

Which is the unit we operate on, right?

> > It also naturally
> > extend an existing fadvise interface that achieves madvise semantic on
> > files.
>
> There are lots of things that madvise can do that fadvise can't and
> that don't even really make sense for fadvise, e.g., MADV_FREE. It
> seems odd to me to duplicate much of the madvise interface into
> fadvise so that we can use file APIs to give madvise hints. It seems
> simpler to me to just provide a mechanism to put the madvise hints
> where they're needed.

I do not see why we would duplicate. I confess I haven't tried to
implement this so I might be overlooking something but it seems to me
that we could simply reuse the same functionality from both APIs.

> > I am not really pushing hard for this particular API but I really
> > do care about a programming model that would be sane.
>
> You've used "sane" twice so far in this message. Can you specify more
> precisely what you mean by that word?

Well, I would consider a model which would prevent from unintended side
effects (e.g. working on a completely different object) without a tricky
synchronization sane.

> I agree that there needs to be
> some defense against TOCTOU races when doing remote memory management,
> but I don't think providing this robustness via a file descriptor is
> any more sane than alternative approaches. A file descriptor comes
> with a lot of other features --- e.g., SCM_RIGHTS, fstat, and a
> concept of owning a resource --- that aren't needed to achieve
> robustness.
>
> Normally, a file descriptor refers to some resource that the kernel
> holds as long as the file descriptor (well, the open file description
> or struct file) lives -- things like graphics buffers, files, and
> sockets. If we're using an FD *just* as a cookie and not a resource,
> I'd rather just expose the cookie directly.

You are absolutely right. But doesn't that apply to any other
revalidation method that would be tracking VMA status as well. As I've
said I am not married to this approach as long as there are better
alternatives. So far we are in a discussion what should be the actual
semantic of the operation and how much do we want to tolerate races. And
it seems that we are diving into implementation details rather than
landing with a firm decision that the current proposed API is suitable
or not.

> > If we have a
> > different means to achieve the same then all fine by me but so far I
> > haven't heard any sound arguments to invent something completely new
> > when we have established APIs to use.
>
> Doesn't the next sentence describe something profoundly new? :-)
>
> > Exporting anonymous mappings via
> > proc the same way we do for file mappings doesn't seem to be stepping
> > outside of the current practice way too much.
>
> It seems like a radical departure from existing practice to provide
> filesystem interfaces to anonymous memory regions, e.g., anon_vma.
> You've never been able to refer to those memory regions with file
> descriptors.
>
> All I'm suggesting is that we take the existing madvise mechanism,
> make it work cross-process, and make it robust against TOCTOU
> problems, all one step at a time. Maybe my sense of API "size" is
> miscalibrated, but adding a new type of FD to refer to anonymous VMA
> regions feels like a bigger departure and so requires stronger
> justification, especially if the result of the FD approach is probably
> something less efficient than a cookie-based one.

Feel free to propose the way to achieve that in the respective email
thread.

> > and we should focus on discussing whether this is a
> > sane model. And I think it would be much better to discuss that under
> > the respective patch which introduces that API rather than here.
>
> I think it's important to discuss what that API should look like. :-)

It will be fun to follow this discussion and make some sense of
different parallel threads.

-- 
Michal Hocko
SUSE Labs