* Comment on patch to remove nr_async_pages limit
@ 2001-06-05  1:04 Marcelo Tosatti
  2001-06-05  7:38 ` Mike Galbraith
  2001-06-05 15:56 ` Zlatko Calusic
  0 siblings, 2 replies; 14+ messages in thread

From: Marcelo Tosatti @ 2001-06-05  1:04 UTC (permalink / raw)
To: Zlatko Calusic; +Cc: lkml, linux-mm

Zlatko,

I've read your patch to remove nr_async_pages limit while reading an
archive on the web.  (I have to figure out why lkml is not being
delivered correctly to me...)

Quoting your message:

"That artificial limit hurts both swap out and swap in path as it
introduces synchronization points (and/or weakens swapin readahead),
which I think are not necessary."

If we are under low memory, we cannot simply writeout a whole bunch of
swap data.  Remember the writeout operations will potentially allocate
buffer_head's for the swapcache pages before doing real IO, which takes
_more memory_: OOM deadlock.
* Re: Comment on patch to remove nr_async_pages limit
  2001-06-05  1:04 Comment on patch to remove nr_async_pages limit Marcelo Tosatti
@ 2001-06-05  7:38 ` Mike Galbraith
  2001-06-05  6:18   ` Marcelo Tosatti
  2001-06-05 15:57   ` Zlatko Calusic
  2001-06-05 15:56 ` Zlatko Calusic
  1 sibling, 2 replies; 14+ messages in thread

From: Mike Galbraith @ 2001-06-05  7:38 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Zlatko Calusic, lkml, linux-mm

On Mon, 4 Jun 2001, Marcelo Tosatti wrote:

> Zlatko,
>
> I've read your patch to remove nr_async_pages limit while reading an
> archive on the web.  (I have to figure out why lkml is not being
> delivered correctly to me...)
>
> Quoting your message:
>
> "That artificial limit hurts both swap out and swap in path as it
> introduces synchronization points (and/or weakens swapin readahead),
> which I think are not necessary."
>
> If we are under low memory, we cannot simply writeout a whole bunch of
> swap data.  Remember the writeout operations will potentially allocate
> buffer_head's for the swapcache pages before doing real IO, which
> takes _more memory_: OOM deadlock.

What's the point of creating swapcache pages, and then avoiding doing
the IO until it becomes _dangerous_ to do so?  That's what we're doing
right now.  This is a problem because we guarantee it will become one.
We guarantee that the pagecache will become almost pure swapcache by
delaying the writeout so long that everything else is consumed.

In experiments, speeding swapcache pages on their way helps.  Special
handling (swapcache bean counting) also helps.  (was _really ugly_
code.. putting them on a separate list would be a lot easier on the
stomach:)

	$.02

	-Mike
* Re: Comment on patch to remove nr_async_pages limit
  2001-06-05  7:38 ` Mike Galbraith
@ 2001-06-05  6:18 ` Marcelo Tosatti
  2001-06-05 10:32   ` Mike Galbraith
  2001-06-05 16:05   ` Comment on patch to remove nr_async_pages limit Zlatko Calusic
  2001-06-05 15:57 ` Zlatko Calusic
  1 sibling, 2 replies; 14+ messages in thread

From: Marcelo Tosatti @ 2001-06-05  6:18 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Zlatko Calusic, lkml, linux-mm

On Tue, 5 Jun 2001, Mike Galbraith wrote:

> On Mon, 4 Jun 2001, Marcelo Tosatti wrote:
>
> > Zlatko,
> >
> > I've read your patch to remove nr_async_pages limit while reading an
> > archive on the web.  (I have to figure out why lkml is not being
> > delivered correctly to me...)
> >
> > Quoting your message:
> >
> > "That artificial limit hurts both swap out and swap in path as it
> > introduces synchronization points (and/or weakens swapin readahead),
> > which I think are not necessary."
> >
> > If we are under low memory, we cannot simply writeout a whole bunch
> > of swap data.  Remember the writeout operations will potentially
> > allocate buffer_head's for the swapcache pages before doing real IO,
> > which takes _more memory_: OOM deadlock.
>
> What's the point of creating swapcache pages, and then avoiding doing
> the IO until it becomes _dangerous_ to do so?

It's not dangerous to do the IO.  Now it _is_ dangerous to do the IO
without having any sane limit on the amount of data being written out at
the same time.

> That's what we're doing right now.  This is a problem because we
> guarantee it will become one.

It's not really about swapcache pages --- it's about anonymous memory.

If your memory is full of anonymous data, you have to push some of this
data to disk.  (conceptually it does not really matter if it's swapcache
or not, think about anonymous memory)

> We guarantee that the pagecache will become almost pure swapcache by
> delaying the writeout so long that everything else is consumed.

Exactly.  And when we reach a low watermark of memory, we start writing
out the anonymous memory.

> In experiments, speeding swapcache pages on their way helps.  Special
> handling (swapcache bean counting) also helps.  (was _really ugly_
> code.. putting them on a separate list would be a lot easier on the
> stomach:)

I agree that the current way of limiting in-flight swapout can be
changed to perform better.

Removing the limit on the amount of data being written to disk when we
have a memory shortage is not nice.
* Re: Comment on patch to remove nr_async_pages limit
  2001-06-05  6:18 ` Marcelo Tosatti
@ 2001-06-05 10:32 ` Mike Galbraith
  2001-06-05 11:42   ` Ed Tomlinson
  2001-06-05 19:21   ` Benjamin C.R. LaHaise
  2001-06-05 16:05 ` Comment on patch to remove nr_async_pages limit Zlatko Calusic
  1 sibling, 2 replies; 14+ messages in thread

From: Mike Galbraith @ 2001-06-05 10:32 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Zlatko Calusic, lkml, linux-mm

On Tue, 5 Jun 2001, Marcelo Tosatti wrote:

> On Tue, 5 Jun 2001, Mike Galbraith wrote:
>
> > On Mon, 4 Jun 2001, Marcelo Tosatti wrote:
> >
> > > Zlatko,
> > >
> > > I've read your patch to remove nr_async_pages limit while reading
> > > an archive on the web.  (I have to figure out why lkml is not
> > > being delivered correctly to me...)
> > >
> > > Quoting your message:
> > >
> > > "That artificial limit hurts both swap out and swap in path as it
> > > introduces synchronization points (and/or weakens swapin
> > > readahead), which I think are not necessary."
> > >
> > > If we are under low memory, we cannot simply writeout a whole
> > > bunch of swap data.  Remember the writeout operations will
> > > potentially allocate buffer_head's for the swapcache pages before
> > > doing real IO, which takes _more memory_: OOM deadlock.
> >
> > What's the point of creating swapcache pages, and then avoiding
> > doing the IO until it becomes _dangerous_ to do so?
>
> It's not dangerous to do the IO.  Now it _is_ dangerous to do the IO
> without having any sane limit on the amount of data being written out
> at the same time.

Yes.  If we start writing out sooner, we aren't stuck with pushing a ton
of IO all at once and can use prudent limits.  Not only because of
potential allocation problems, but because our situation is changing
rapidly, so small corrections done often are more precise than whopping
big ones can be.

> > That's what we're doing right now.  This is a problem because we
> > guarantee it will become one.
>
> It's not really about swapcache pages --- it's about anonymous memory.

(swapcache is the biggest pain in the butt for the portion of the
spectrum I'm hammering on though)

> If your memory is full of anonymous data, you have to push some of
> this data to disk.  (conceptually it does not really matter if it's
> swapcache or not, think about anonymous memory)
>
> > We guarantee that the pagecache will become almost pure swapcache by
> > delaying the writeout so long that everything else is consumed.
>
> Exactly.  And when we reach a low watermark of memory, we start
> writing out the anonymous memory.
>
> > In experiments, speeding swapcache pages on their way helps.
> > Special handling (swapcache bean counting) also helps.  (was
> > _really ugly_ code.. putting them on a separate list would be a lot
> > easier on the stomach:)
>
> I agree that the current way of limiting in-flight swapout can be
> changed to perform better.
>
> Removing the limit on the amount of data being written to disk when we
> have a memory shortage is not nice.

Here, that doesn't make any real difference.  We can have too many pages
completing IO too late or too few.. the problem is that they start
coming out of the pipe too late.  I'd rather see my poor disk saturated
than partly idle when my box is choking on dirtclods ;-)

	-Mike
* Re: Comment on patch to remove nr_async_pages limit
  2001-06-05 10:32 ` Mike Galbraith
@ 2001-06-05 11:42 ` Ed Tomlinson
  2001-06-05 16:08   ` Zlatko Calusic
  2001-06-05 19:21 ` Benjamin C.R. LaHaise
  1 sibling, 1 reply; 14+ messages in thread

From: Ed Tomlinson @ 2001-06-05 11:42 UTC (permalink / raw)
To: Mike Galbraith, Marcelo Tosatti; +Cc: Zlatko Calusic, lkml, linux-mm

Hi,

To paraphrase Mike: we defer doing IO until we are short of storage.
Doing IO uses storage.  So delaying IO as much as we do forces us to
impose limits.  If we did the IO earlier, we would not need this limit
often, if at all.

Does this make any sense?

Maybe we can have the best of both worlds.  Is it possible to allocate
the BH early and then defer the IO?  The idea being to make IO possible
without having to allocate.  This would let us remove the async page
limit but would ensure we could still free.

Thoughts?

Ed Tomlinson
* Re: Comment on patch to remove nr_async_pages limit
  2001-06-05 11:42 ` Ed Tomlinson
@ 2001-06-05 16:08 ` Zlatko Calusic
  0 siblings, 0 replies; 14+ messages in thread

From: Zlatko Calusic @ 2001-06-05 16:08 UTC (permalink / raw)
To: Ed Tomlinson; +Cc: Mike Galbraith, Marcelo Tosatti, lkml, linux-mm

Ed Tomlinson <tomlins@cam.org> writes:

[snip]
> Maybe we can have the best of both worlds.  Is it possible to allocate
> the BH early and then defer the IO?  The idea being to make IO
> possible without having to allocate.  This would let us remove the
> async page limit but would ensure we could still free.

Yes, this is a good idea if you ask me.  Basically, to remove as many
limits as we can, and also to secure us from deadlocks.  With just a few
pages of extra memory for the reserved buffer heads, I think it's a fair
game.

Still, pending further analysis...
--
Zlatko
* Re: Comment on patch to remove nr_async_pages limit
  2001-06-05 10:32 ` Mike Galbraith
  2001-06-05 11:42   ` Ed Tomlinson
@ 2001-06-05 19:21 ` Benjamin C.R. LaHaise
  2001-06-05 21:00   ` Comment on patch to remove nr_async_pages limitA Mike Galbraith
  1 sibling, 1 reply; 14+ messages in thread

From: Benjamin C.R. LaHaise @ 2001-06-05 19:21 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Marcelo Tosatti, Zlatko Calusic, lkml, linux-mm

On Tue, 5 Jun 2001, Mike Galbraith wrote:

> Yes.  If we start writing out sooner, we aren't stuck with pushing a
> ton of IO all at once and can use prudent limits.  Not only because of
> potential allocation problems, but because our situation is changing
> rapidly, so small corrections done often are more precise than
> whopping big ones can be.

Hold on there big boy, writing out sooner is not better.  What if the
memory shortage is because real data is being written out to disk?
Swapping early causes many more problems than swapping late, as
extraneous seeks to the swap partition severely degrade performance.

		-ben
* Re: Comment on patch to remove nr_async_pages limitA
  2001-06-05 19:21 ` Benjamin C.R. LaHaise
@ 2001-06-05 21:00 ` Mike Galbraith
  2001-06-05 22:21   ` Daniel Phillips
  0 siblings, 1 reply; 14+ messages in thread

From: Mike Galbraith @ 2001-06-05 21:00 UTC (permalink / raw)
To: Benjamin C.R. LaHaise; +Cc: Marcelo Tosatti, Zlatko Calusic, lkml, linux-mm

On Tue, 5 Jun 2001, Benjamin C.R. LaHaise wrote:

> On Tue, 5 Jun 2001, Mike Galbraith wrote:
>
> > Yes.  If we start writing out sooner, we aren't stuck with pushing a
> > ton of IO all at once and can use prudent limits.  Not only because
> > of potential allocation problems, but because our situation is
> > changing rapidly, so small corrections done often are more precise
> > than whopping big ones can be.
>
> Hold on there big boy, writing out sooner is not better.  What if the

(do definitely beat my thoughts up, please don't use condescending
terms)

In some cases, it definitely is.  I can routinely improve throughput by
writing more.. that is a measurable and reproducible fact.  I know also
from measurement that it is not _always_ the right thing to do.

> memory shortage is because real data is being written out to disk?

(I would hope that we're doing our best to always be writing real data
to disk.  I also know that this isn't always the case.)

> Swapping early causes many more problems than swapping late as
> extraneous seeks to the swap partition severely degrade performance.

That is not the case here at the spot in the performance curve I'm
looking at (transition to throughput).

Does this mean the block layer and/or elevator is having problems?  Why
would using available disk bandwidth vs letting it lie dormant be a
generically bad thing?.. this I just can't understand.  The elevator
deals with seeks; the vm is flat out not equipped to do so.. it contains
no such concept.  Avoiding a write is great, delaying a write is not at
_all_ great.

	-Mike
* Re: Comment on patch to remove nr_async_pages limitA
  2001-06-05 21:00 ` Comment on patch to remove nr_async_pages limitA Mike Galbraith
@ 2001-06-05 22:21 ` Daniel Phillips
  0 siblings, 0 replies; 14+ messages in thread

From: Daniel Phillips @ 2001-06-05 22:21 UTC (permalink / raw)
To: Mike Galbraith, Benjamin C.R. LaHaise
Cc: Marcelo Tosatti, Zlatko Calusic, lkml, linux-mm

On Tuesday 05 June 2001 23:00, Mike Galbraith wrote:
> On Tue, 5 Jun 2001, Benjamin C.R. LaHaise wrote:
> > Swapping early causes many more problems than swapping late as
> > extraneous seeks to the swap partition severely degrade performance.
>
> That is not the case here at the spot in the performance curve I'm
> looking at (transition to throughput).
>
> Does this mean the block layer and/or elevator is having problems?
> Why would using available disk bandwidth vs letting it lie dormant be
> a generically bad thing?.. this I just can't understand.  The elevator
> deals with seeks; the vm is flat out not equipped to do so.. it
> contains no such concept.

Clearly, if the spindle a dirty file page belongs to is idle, we have
goofed.  With process data the situation is a little different because
the natural home of the data is not the swap device but main memory.
The following gets pretty close to the truth: when there is memory
pressure, if the spindle a dirty process page belongs to is idle, we
have goofed.

Well, as soon as I wrote those obvious truths I started thinking of
exceptions, but they are silly exceptions such as:

  - read disk block 0
  - dirty the last block of the disk
  - dirty 1,000 blocks starting at block 0

For good measure, delete the file the last block of the disk belongs to.
We have just sent the head off on a wild goose chase, but we had to work
at it.  To handle such a set of events without requiring prescience we
need to be able to cancel disk writes, but just ignoring such oddball
situations is the next best thing.

That's all by way of saying I agree with you.

--
Daniel
* Re: Comment on patch to remove nr_async_pages limit
  2001-06-05  6:18 ` Marcelo Tosatti
  2001-06-05 10:32   ` Mike Galbraith
@ 2001-06-05 16:05 ` Zlatko Calusic
  2001-06-09  3:09   ` Rik van Riel
  1 sibling, 1 reply; 14+ messages in thread

From: Zlatko Calusic @ 2001-06-05 16:05 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Mike Galbraith, lkml, linux-mm

Marcelo Tosatti <marcelo@conectiva.com.br> writes:

[snip]
> Exactly.  And when we reach a low watermark of memory, we start
> writing out the anonymous memory.

Hm, my observations are a little bit different.  I find that writeouts
happen sooner than the moment we reach the low watermark, and many times
just in time to interact badly with some read I/O workload that created
a virtual shortage of memory in the first place.  The net effect is poor
performance and too much stuff in the swap.

> > In experiments, speeding swapcache pages on their way helps.
> > Special handling (swapcache bean counting) also helps.  (was
> > _really ugly_ code.. putting them on a separate list would be a lot
> > easier on the stomach:)
>
> I agree that the current way of limiting in-flight swapout can be
> changed to perform better.
>
> Removing the limit on the amount of data being written to disk when we
> have a memory shortage is not nice.

OK, then we basically agree that there is room for improvement, and you
also agree that we must be careful while trying to achieve that.

I'll admit that my patch is mostly experimental, and its best effect is
this discussion, which I enjoy very much. :)
--
Zlatko
* Re: Comment on patch to remove nr_async_pages limit
  2001-06-05 16:05 ` Comment on patch to remove nr_async_pages limit Zlatko Calusic
@ 2001-06-09  3:09 ` Rik van Riel
  2001-06-09  6:07   ` Mike Galbraith
  0 siblings, 1 reply; 14+ messages in thread

From: Rik van Riel @ 2001-06-09  3:09 UTC (permalink / raw)
To: Zlatko Calusic; +Cc: Marcelo Tosatti, Mike Galbraith, lkml, linux-mm

On 5 Jun 2001, Zlatko Calusic wrote:

> Marcelo Tosatti <marcelo@conectiva.com.br> writes:
>
> [snip]
> > Exactly.  And when we reach a low watermark of memory, we start
> > writing out the anonymous memory.
>
> Hm, my observations are a little bit different.  I find that writeouts
> happen sooner than the moment we reach the low watermark, and many
> times just in time to interact badly with some read I/O workload that
> created a virtual shortage of memory in the first place.

I have a patch that tries to address this by not reordering the inactive
list whenever we scan through it.  I'll post it right now ...

(yes, I've done some recreational patching while on holidays)

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/		http://distro.conectiva.com/

Send all your spam to aardvark@nl.linux.org (spam digging piggy)
* Re: Comment on patch to remove nr_async_pages limit
  2001-06-09  3:09 ` Rik van Riel
@ 2001-06-09  6:07 ` Mike Galbraith
  0 siblings, 0 replies; 14+ messages in thread

From: Mike Galbraith @ 2001-06-09  6:07 UTC (permalink / raw)
To: Rik van Riel; +Cc: Zlatko Calusic, Marcelo Tosatti, lkml, linux-mm

On Sat, 9 Jun 2001, Rik van Riel wrote:

> On 5 Jun 2001, Zlatko Calusic wrote:
> > Marcelo Tosatti <marcelo@conectiva.com.br> writes:
> >
> > [snip]
> > > Exactly.  And when we reach a low watermark of memory, we start
> > > writing out the anonymous memory.
> >
> > Hm, my observations are a little bit different.  I find that
> > writeouts happen sooner than the moment we reach the low watermark,
> > and many times just in time to interact badly with some read I/O
> > workload that created a virtual shortage of memory in the first
> > place.
>
> I have a patch that tries to address this by not reordering the
> inactive list whenever we scan through it.  I'll post it right now ...

Excellent.  I've done some of that (crude but effective) and have had
nice encouraging results.  If the dirty list is long enough, this most
definitely improves behavior under heavy load.

	-Mike
* Re: Comment on patch to remove nr_async_pages limit
  2001-06-05  7:38 ` Mike Galbraith
  2001-06-05  6:18   ` Marcelo Tosatti
@ 2001-06-05 15:57 ` Zlatko Calusic
  1 sibling, 0 replies; 14+ messages in thread

From: Zlatko Calusic @ 2001-06-05 15:57 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Marcelo Tosatti, lkml, linux-mm

Mike Galbraith <mikeg@wen-online.de> writes:

> On Mon, 4 Jun 2001, Marcelo Tosatti wrote:
>
> > Zlatko,
> >
> > I've read your patch to remove nr_async_pages limit while reading an
> > archive on the web.  (I have to figure out why lkml is not being
> > delivered correctly to me...)
> >
> > Quoting your message:
> >
> > "That artificial limit hurts both swap out and swap in path as it
> > introduces synchronization points (and/or weakens swapin readahead),
> > which I think are not necessary."
> >
> > If we are under low memory, we cannot simply writeout a whole bunch
> > of swap data.  Remember the writeout operations will potentially
> > allocate buffer_head's for the swapcache pages before doing real IO,
> > which takes _more memory_: OOM deadlock.
>
> What's the point of creating swapcache pages, and then avoiding doing
> the IO until it becomes _dangerous_ to do so?  That's what we're doing
> right now.  This is a problem because we guarantee it will become one.
> We guarantee that the pagecache will become almost pure swapcache by
> delaying the writeout so long that everything else is consumed.

Huh, this looks just like my argument, just put in different words.  I
should have read this sooner. :)
--
Zlatko
* Re: Comment on patch to remove nr_async_pages limit
  2001-06-05  1:04 Comment on patch to remove nr_async_pages limit Marcelo Tosatti
  2001-06-05  7:38 ` Mike Galbraith
@ 2001-06-05 15:56 ` Zlatko Calusic
  1 sibling, 0 replies; 14+ messages in thread

From: Zlatko Calusic @ 2001-06-05 15:56 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: lkml, linux-mm

Marcelo Tosatti <marcelo@conectiva.com.br> writes:

> Zlatko,
>
> I've read your patch to remove nr_async_pages limit while reading an
> archive on the web.  (I have to figure out why lkml is not being
> delivered correctly to me...)
>
> Quoting your message:
>
> "That artificial limit hurts both swap out and swap in path as it
> introduces synchronization points (and/or weakens swapin readahead),
> which I think are not necessary."
>
> If we are under low memory, we cannot simply writeout a whole bunch of
> swap data.  Remember the writeout operations will potentially allocate
> buffer_head's for the swapcache pages before doing real IO, which
> takes _more memory_: OOM deadlock.

My question is: if we defer writing and in a way "lose" those 4096 bytes
of memory (because we decide to keep the page in memory for some more
time), how can a much smaller buffer_head be a problem?

I think we could always make a bigger reserve of buffer heads just for
this purpose, to make swapout more robust, and then not impose any
limits on the number of outstanding async IO pages in flight.

Does this make any sense?
--
Zlatko
end of thread, other threads:[~2001-06-09  6:08 UTC | newest]

Thread overview: 14+ messages
2001-06-05  1:04 Comment on patch to remove nr_async_pages limit Marcelo Tosatti
2001-06-05  7:38 ` Mike Galbraith
2001-06-05  6:18   ` Marcelo Tosatti
2001-06-05 10:32     ` Mike Galbraith
2001-06-05 11:42       ` Ed Tomlinson
2001-06-05 16:08         ` Zlatko Calusic
2001-06-05 19:21       ` Benjamin C.R. LaHaise
2001-06-05 21:00         ` Comment on patch to remove nr_async_pages limitA Mike Galbraith
2001-06-05 22:21           ` Daniel Phillips
2001-06-05 16:05     ` Comment on patch to remove nr_async_pages limit Zlatko Calusic
2001-06-09  3:09       ` Rik van Riel
2001-06-09  6:07         ` Mike Galbraith
2001-06-05 15:57   ` Zlatko Calusic
2001-06-05 15:56 ` Zlatko Calusic