From: Ingo Oeser <ioe-lkml@rameria.de>
To: Greg Stark <gsstark@mit.edu>
Cc: Helge Hafting <helgehaf@aitel.hist.no>,
    Joel Becker <Joel.Becker@oracle.com>,
    Jamie Lokier <jamie@shareable.org>,
    Trond Myklebust <trond.myklebust@fys.uio.no>,
    Ulrich Drepper <drepper@redhat.com>,
    Linux Kernel <linux-kernel@vger.kernel.org>
Subject: Re: statfs() / statvfs() syscall ballsup...
Date: Thu, 16 Oct 2003 12:29:44 +0200
Message-ID: <200310161229.44861.ioe-lkml@rameria.de>
In-Reply-To: <87llrmbl1g.fsf@stark.dyndns.tv>

Hi there,

First: I think the problem is solvable by mixing blocking and
non-blocking IO, or simply with AIO, which will be supported nicely by
2.6.0, is a POSIX standard, and is meant for doing your own IO
scheduling.

On Wednesday 15 October 2003 17:03, Greg Stark wrote:
> Ingo Oeser <ioe-lkml@rameria.de> writes:
> > On Monday 13 October 2003 10:45, Helge Hafting wrote:
> > > This is easier than trying to tell the kernel that the job is
> > > less important, that goes wrong whether the job runs too much
> > > or too little. Let that job sleep a little when its services
> > > aren't needed, or when you need the disk bandwidth elsewhere.
>
> Actually I think that's exactly backwards. The problem is that if the
> user-space tries to throttle the process it doesn't know how much or when.
> The kernel knows exactly when there are other higher priority writes, it
> can schedule just enough writes from vacuum to not interfere.

On dedicated servers this might be true. But on these you could also
solve it in user space by measuring disk bandwidth and issuing just
enough IO to roughly keep up with it.

> So if vacuum slept a bit, say every 64k of data vacuumed. It could end up
> sleeping when the disks are actually idle. Or it could be not sleeping
> enough and still be interfering with transactions.

The vacuum IO is submitted (via AIO or a simulation of it) normally in
a unit U, and vacuum ALWAYS waits for U to complete before submitting a
new one.

Between submitting units, vacuum checks for outstanding transactions
and stops when there is one. So when a transaction is submitted, its
mere existence stops further submission from vacuum. The transaction
waits for completion (e.g. with aio_suspend()) and then signals vacuum
to continue.

So the disk(s) should always be in good use.

I don't know much about the design internals of your database, but this
sounds promising and is portable.

> > The questions are: How IO-intensive is vacuum? How fast can
> > throttling free disk bandwidth (and memory)?
>
> It's purely i/o bound on large sequential reads. Ideally it should still
> have large enough sequential reads to not lose the streaming advantage, but
> not so large that it preempts the more random-access transactions.

OK, so we can ignore the processing time and the above should just work.

Regards

Ingo Oeser
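
Editorial illustration (not part of the original message): the
unit-by-unit scheme described above can be sketched roughly as follows.
This is only a minimal simulation in Python, using blocking reads and a
condition variable in place of real POSIX AIO. The names IoGate, vacuum
and run_transaction are hypothetical and do not come from any database
or kernel API.

```python
import threading


class IoGate:
    """Hypothetical helper: counts outstanding transactions."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = 0

    def transaction_begin(self):
        # A transaction announces itself before submitting its IO.
        with self._cond:
            self._pending += 1

    def transaction_end(self):
        # The transaction's IO completed: signal vacuum to continue.
        with self._cond:
            self._pending -= 1
            if self._pending == 0:
                self._cond.notify_all()

    def wait_until_idle(self):
        # Vacuum stops here while any transaction is outstanding.
        with self._cond:
            self._cond.wait_for(lambda: self._pending == 0)


def vacuum(path, unit, gate):
    """Read a file sequentially, one unit of `unit` bytes at a time.

    Each blocking read stands in for "submit unit U and wait for it to
    complete". Returns the number of units read.
    """
    units_done = 0
    with open(path, "rb") as f:
        while True:
            gate.wait_until_idle()   # check between units
            chunk = f.read(unit)     # submit one unit and wait for it
            if not chunk:
                break
            units_done += 1
    return units_done


def run_transaction(gate, do_io):
    """Run the callable `do_io` while holding vacuum back."""
    gate.transaction_begin()
    try:
        return do_io()
    finally:
        gate.transaction_end()
```

In this sketch at most one vacuum unit is ever in flight, so a newly
arrived transaction waits behind at most one unit of vacuum IO. Picking
the unit size is the trade-off discussed in the thread: large enough to
keep the streaming advantage, small enough not to delay transactions.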