From: Linus Torvalds <email@example.com>
To: Joel Becker <Joel.Becker@oracle.com>
Cc: Chris Friesen <firstname.lastname@example.org>,
    Jamie Lokier <email@example.com>,
    Trond Myklebust <firstname.lastname@example.org>,
    Ulrich Drepper <email@example.com>,
    Linux Kernel <firstname.lastname@example.org>
Subject: Re: statfs() / statvfs() syscall ballsup...
Date: Fri, 10 Oct 2003 10:40:40 -0700 (PDT)
Message-ID: <Pine.LNX.email@example.com>
In-Reply-To: <20031010172001.GA29301@ca-server1.us.oracle.com>

On Fri, 10 Oct 2003, Joel Becker wrote:
>
> msync() forces write(), like fsync(). It doesn't force read().

Actually, the kernel has a "readahead(fd, offset, size)" system call that
will start asynchronous read-ahead on any mapping. After that, just
touching the page will obviously map in and synchronize the result.

I don't think anybody uses it, and the interface may be broken, but it was
literally 20 lines of code, and I had a trivial test program that
populated the cache for a directory structure really quickly using it.

In general, it would be really nice to have more Oracle people discussing
what their particular pet horror is, and what they'd really like to do. I
know you're more used to just doing your own thing and working with
vendors, but even just people getting used to doing the unofficial "this is
what we do, and it sucks because xxx" would make people more aware of what
you want to do, and maybe it would suggest novel ways of doing things.

I suspect most of the things would get shot down as being impractical, but
there has always been a lot of discussion about more direct control of the
page cache for programs that really want it, and I'm more than willing to
discuss things (obviously 2.7.x material, but still.. A lot of it is
trivial and could be back-ported to 2.6.x if people start using it).

For example, things we can do, but don't, partly because of interface
issues and because there is no point in doing it if people wouldn't use
it:

 - moving a page back and forth between user space.

   It's _trivial_ to do, with a fallback on copying if the page happens to
   be busy (ie we can often just replace the existing page cache page, but
   if somebody else has it mapped, we'd have to copy the contents instead)

   We can't do this for "regular" read and write, because the resulting
   copy-on-write situation makes it less than desirable in most cases, but
   if the user space specifically says "you can throw these pages away
   after moving them to the page cache", that avoids a lot of horror.

   The "remap_file_pages()" thing kind of does this on the read side (ie
   it says "map in this page cache entry into my virtual address space"),
   but we don't have the reverse aka "take this page in the virtual
   address space and map it into the page cache".

   Interfaces like these would also allow things like zero-copy file
   copies with smaller page cache footprints - at the expense of
   invalidating the cache for the source file as a result of the copy.
   Which is why it can't be a _regular_ read - but it's one of those
   things where if the user knows what he wants..

 - dirty mapping control (ie controlling partial page dirty state, and
   also _delaying_ writeout if it needs to be ordered). Possibly by having
   a separate backing store (ie a mmap that says "read from this file, but
   write back to that other file") to avoid the nasty memory management
   problems.

A lot of these are really easy to do, but the usage and the interfaces are
non-obvious.

		Linus
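
As an illustration of the readahead() call described at the top of the
message, here is a minimal C sketch of the "start read-ahead, then touch
the pages" pattern. It is not the test program mentioned above, which is
not shown in this thread. The function name prefetch_and_touch() and the
choice to map the whole file are assumptions made for the example;
readahead(), mmap() and sysconf() are the standard Linux/glibc calls.

```c
#define _GNU_SOURCE            /* readahead() is Linux-specific */
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel to start read-ahead on the whole
 * file, then touch one byte per page through a read-only mapping.
 * Returns the number of pages touched, or -1 on error.
 */
long prefetch_and_touch(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }

    /*
     * Populate the page cache. The result is treated as a hint only:
     * some filesystems reject the call, and the page touches below
     * will still fault the data in.
     */
    (void)readahead(fd, 0, (size_t)st.st_size);

    unsigned char *p = mmap(NULL, (size_t)st.st_size, PROT_READ,
                            MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    long page_size = sysconf(_SC_PAGESIZE);
    long pages = 0;
    volatile unsigned char sink = 0;   /* keeps the loads from being elided */

    for (off_t off = 0; off < st.st_size; off += page_size) {
        sink = p[off];                 /* maps in the cached page */
        pages++;
    }
    (void)sink;

    munmap(p, (size_t)st.st_size);
    close(fd);
    return pages;
}
```

Calling prefetch_and_touch() on each file found while walking a directory
tree would be one way to approximate the cache-populating test described
in the message; whether it is faster than plain reads depends on the
filesystem and the kernel version.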