From: Linus Torvalds <firstname.lastname@example.org> To: Andrea Arcangeli <email@example.com> Cc: Jamie Lokier <firstname.lastname@example.org>, Trond Myklebust <email@example.com>, Ulrich Drepper <firstname.lastname@example.org>, Linux Kernel <email@example.com> Subject: Re: statfs() / statvfs() syscall ballsup... Date: Fri, 10 Oct 2003 11:36:29 -0700 (PDT) [thread overview] Message-ID: <Pine.LNX.firstname.lastname@example.org> (raw) In-Reply-To: <20031010182021.GF16013@velociraptor.random> On Fri, 10 Oct 2003, Andrea Arcangeli wrote: > > O_DIRECT only walk the pagetables, no pte mangling, no tlb flushes, the > TLB is preserved fully. Yes. However, it's even _nicer_ if you don't need to walk the page tables at all. Quite a lot of operations could be done directly on the page cache. I'm not a huge fan of mmap() myself - the biggest advantage of mmap is when you don't know your access patterns, and you have reasonably good locality. In many other cases mmap is just a total loss, because the page table walking is often more expensive than even a memcpy(). That's _especially_ true if you have to move mappings around, and you have to invalidate TLB's. memcpy() often gets a bad name. Yeah, memory is slow, but especially if you copy something you just worked on, you're actually often better off letting the CPU cache do its job, rather than walking page tables and trying to be clever. Just as an example: copying often means that you don't need nearly as much locking and synchronization - which in turn avoids one whole big mess (yes, the memcpy() will look very hot in profiles, but then doing extra work to avoid the memcpy() will cause spread-out overhead that is a lot worse and harder to think about). This is why a simple read()/write() loop often _beats_ mmap approaches. And often it's actually better to not even have big buffers (ie the old "avoid system calls by aggregation" approach) because that just blows your cache away. Right now, the fastest way to copy a file is apparently by doing lots of ~8kB read/write pairs (that data may be slightly stale, but it was true at some point). Never mind the system call overhead - just having the extra buffer stay in the L1 cache and avoiding page faults from mmap is a bigger win. And I don't think mmap _can_ beat that. It's fundamental. In contrast, direct page cache accesses really can do so. Exactly because they don't touch any page tables at all, and because they can take advantage of internal kernel data structure layout and move pages around without any cost.. Linus
next prev parent reply other threads:[~2003-10-10 18:36 UTC|newest] Thread overview: 64+ messages / expand[flat|nested] mbox.gz Atom feed top 2003-10-09 22:16 Trond Myklebust 2003-10-09 22:26 ` Linus Torvalds 2003-10-09 23:19 ` Ulrich Drepper 2003-10-10 0:22 ` viro 2003-10-10 4:49 ` Jamie Lokier 2003-10-10 5:26 ` Trond Myklebust 2003-10-10 12:37 ` Jamie Lokier 2003-10-10 13:46 ` Trond Myklebust 2003-10-10 14:35 ` Jamie Lokier 2003-10-10 15:32 ` Misc NFSv4 (was Re: statfs() / statvfs() syscall ballsup...) Trond Myklebust 2003-10-10 15:53 ` Jamie Lokier 2003-10-10 16:07 ` Trond Myklebust 2003-10-10 15:55 ` Michael Shuey 2003-10-10 16:20 ` Trond Myklebust 2003-10-10 16:45 ` J. Bruce Fields 2003-10-10 14:39 ` statfs() / statvfs() syscall ballsup Jamie Lokier 2003-10-09 23:31 ` Trond Myklebust 2003-10-10 12:27 ` Joel Becker 2003-10-10 14:59 ` Linus Torvalds 2003-10-10 15:27 ` Joel Becker 2003-10-10 16:00 ` Linus Torvalds 2003-10-10 16:26 ` Joel Becker 2003-10-10 16:50 ` Linus Torvalds 2003-10-10 17:33 ` Joel Becker 2003-10-10 17:51 ` Linus Torvalds 2003-10-10 18:13 ` Joel Becker 2003-10-10 16:27 ` Valdis.Kletnieks 2003-10-10 16:33 ` Chris Friesen 2003-10-10 17:04 ` Linus Torvalds 2003-10-10 17:07 ` Linus Torvalds 2003-10-10 17:21 ` Joel Becker 2003-10-10 16:01 ` Jamie Lokier 2003-10-10 16:33 ` Joel Becker 2003-10-10 16:58 ` Chris Friesen 2003-10-10 17:05 ` Trond Myklebust 2003-10-10 17:20 ` Joel Becker 2003-10-10 17:33 ` Chris Friesen 2003-10-10 17:40 ` Linus Torvalds 2003-10-10 17:54 ` Trond Myklebust 2003-10-10 18:05 ` Linus Torvalds 2003-10-10 20:40 ` Trond Myklebust 2003-10-10 21:09 ` Linus Torvalds 2003-10-10 22:17 ` Trond Myklebust 2003-10-11 2:53 ` Andrew Morton 2003-10-11 3:47 ` Trond Myklebust 2003-10-10 18:05 ` Joel Becker 2003-10-10 18:31 ` Andrea Arcangeli 2003-10-10 20:33 ` Helge Hafting 2003-10-10 20:07 ` Jamie Lokier 2003-10-12 15:31 ` Greg Stark 2003-10-12 16:13 ` Linus Torvalds 2003-10-12 22:09 ` Greg Stark 2003-10-13 8:45 ` Helge Hafting 2003-10-15 13:25 ` Ingo Oeser 2003-10-15 15:03 ` Greg Stark 2003-10-15 18:37 ` Helge Hafting 2003-10-16 10:29 ` Ingo Oeser 2003-10-16 14:02 ` Greg Stark 2003-10-21 11:47 ` Ingo Oeser 2003-10-10 18:20 ` Andrea Arcangeli 2003-10-10 18:36 ` Linus Torvalds [this message] 2003-10-10 19:03 ` Andrea Arcangeli 2003-10-09 23:16 ` Andreas Dilger 2003-10-09 23:24 ` Linus Torvalds
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=Pine.LNX.email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --subject='Re: statfs() / statvfs() syscall ballsup...' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.