From: Greg Stark <firstname.lastname@example.org>
To: Ingo Oeser <email@example.com>
Cc: Greg Stark <firstname.lastname@example.org>,
    Helge Hafting <email@example.com>,
    Joel Becker <Joel.Becker@oracle.com>,
    Jamie Lokier <firstname.lastname@example.org>,
    Trond Myklebust <email@example.com>,
    Ulrich Drepper <firstname.lastname@example.org>,
    Linux Kernel <email@example.com>
Subject: Re: statfs() / statvfs() syscall ballsup...
Date: 16 Oct 2003 10:02:27 -0400
Message-ID: <firstname.lastname@example.org>
In-Reply-To: <email@example.com>

Ingo Oeser <firstname.lastname@example.org> writes:

> Hi there,
>
> first: I think the problem is solvable with mixing blocking and
> non-blocking IO or simply AIO, which will be supported nicely by 2.6.0,
> is a POSIX standard and is meant for doing your own IO scheduling.

I think aio could be very useful for databases, but not in this area.

I think it's useful as a more fine-grained tool than sync/fsync. Currently
the database has to fsync a file to commit a transaction, which means
flushing _all_ writes to the file, even ones from other transactions.

If aio inserted write barriers to the disk controller then it would provide
a way to ensure the current transaction is synced without having to flush
all other transactions' writes at the same time.

But I don't see how it's useful for the problem I'm describing.

> On Wednesday 15 October 2003 17:03, Greg Stark wrote:
> > Ingo Oeser <email@example.com> writes:
> > > On Monday 13 October 2003 10:45, Helge Hafting wrote:
> > > > This is easier than trying to tell the kernel that the job is
> > > > less important, that goes wrong whether the job runs too much
> > > > or too little. Let that job sleep a little when its services
> > > > aren't needed, or when you need the disk bandwidth elsewhere.
> >
> > Actually I think that's exactly backwards. The problem is that if the
> > user-space tries to throttle the process it doesn't know how much or when.
> > The kernel knows exactly when there are other higher priority writes, it
> > can schedule just enough writes from vacuum to not interfere.
>
> On dedicated servers this might be true. But on these you could also
> solve it in user space by measuring disk bandwidth and issuing just
> enough IO to keep up roughly with it.

Indeed we're discussing methods for doing that now. But this seems like an
awkward way to accomplish what the kernel could do very precisely.

I don't see why non-dedicated servers would make priorities any less
useful; in fact I think that's exactly where they would shine.

> > So if vacuum slept a bit, say every 64k of data vacuumed. It could end up
> > sleeping when the disks are actually idle. Or it could be not sleeping
> > enough and still be interfering with transactions.
>
> The vacuum io is submitted (via AIO or simulation of it) normally in a
> unit U and waiting ALWAYS for U to complete, before submitting a new one.
> Between submitting units, the vacuum checks for outstanding transactions
> and stops, when we have one.
>
> Now a transaction is submitted and the submitting from vacuum is stopped
> by it existing. The transaction waits for completion (e.g. aio_suspend())
> and signals vacuum to continue.

User-space has no idea if disk i/o is occurring. The data the transaction
needs could be cached, or it could be on a different disk.

Besides, I think this is far more coarse-grained than what's needed.
Transactions sometimes run for seconds, minutes, or hours; some of that time
is spent doing disk i/o and some of it doing cpu calculations. A transaction
can't stop and signal another process every time it finishes reading a block
and needs to do a bit of calculation, then context switch again a
millisecond later so it can read the next block...

And besides, this would only be useful on dedicated servers.

--
greg
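
For readers following the thread, here is a minimal C sketch of the "sleep a bit every 64k" throttle that the message argues against. It is an illustration only, not code from PostgreSQL or from anyone in this thread: the function name throttled_scan, the unit size, and the pause length are all hypothetical. It shows why the approach is blind. The pause is a fixed guess, because nothing available to the loop says whether the disk is idle or busy with higher-priority I/O.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): read fd until EOF in reads
 * of at most `unit` bytes, pausing `pause_ms` milliseconds after each
 * read. Returns the total number of bytes read, or -1 on a read error.
 *
 * The pause is unconditional: user space cannot tell whether the disk
 * was idle (the sleep is wasted) or saturated (the sleep is too short).
 */
long long throttled_scan(int fd, size_t unit, long pause_ms)
{
    static char buf[64 * 1024];          /* one 64k unit at most */
    long long total = 0;
    struct timespec pause = {
        .tv_sec  = pause_ms / 1000,
        .tv_nsec = (pause_ms % 1000) * 1000000L,
    };

    assert(unit > 0 && unit <= sizeof buf);

    for (;;) {
        ssize_t n = read(fd, buf, unit);
        if (n < 0) {
            if (errno == EINTR)
                continue;                /* interrupted, retry the read */
            return -1;
        }
        if (n == 0)
            break;                       /* end of file */
        total += n;
        /* Real maintenance work on buf[0..n) would go here. */
        nanosleep(&pause, NULL);         /* fixed, blind back-off */
    }
    return total;
}
```

A caller would invoke something like throttled_scan(fd, 64 * 1024, 10) on a table file. Picking the 10 is exactly the tuning problem described above: any fixed value is wrong some of the time, whereas an I/O priority understood by the kernel could be applied per request.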