* swapcache size oddness
@ 2012-04-27 20:27 Dan Magenheimer
  2012-04-28  3:58 ` Hugh Dickins
  0 siblings, 1 reply; 3+ messages in thread
From: Dan Magenheimer @ 2012-04-27 20:27 UTC
  To: linux-mm

In continuing digging through the swap code (with the
overall objective of improving zcache policy), I was
looking at the size of the swapcache.

My understanding was that the swapcache is simply a
buffer cache for pages that are actively in the process
of being swapped in or swapped out. And keeping pages
around in the swapcache is inefficient because every
process access to a page in the swapcache causes a
minor page fault.

So I was surprised to see that, under a memory intensive
workload, the swapcache can grow quite large. I have
seen it grow to almost half of the size of RAM.

Digging into this oddity, I re-discovered the definition
for "vm_swap_full()" which, in scan_swap_map() is a
pre-condition for calling __try_to_reclaim_swap().
But vm_swap_full() compares how much free swap space
there is "on disk", with the total swap space available
"on disk" with no regard to how much RAM there is.
So on my system, which is running with 1GB RAM and
10GB swap, I think this is the reason that swapcache
is growing so large.

Am I misunderstanding something? Or is this code
making some (possibly false) assumptions about how
swap is/should be sized relative to RAM? Or maybe the
size of swapcache is harmless as long as it doesn't
approach total "on disk" size?

(Sorry if this is a silly question again...)

Thanks,
Dan

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: swapcache size oddness
  2012-04-27 20:27 swapcache size oddness Dan Magenheimer
@ 2012-04-28  3:58 ` Hugh Dickins
  2012-04-28 16:48   ` Dan Magenheimer
  0 siblings, 1 reply; 3+ messages in thread
From: Hugh Dickins @ 2012-04-28 3:58 UTC
  To: Dan Magenheimer; +Cc: linux-mm

On Fri, 27 Apr 2012, Dan Magenheimer wrote:

> In continuing digging through the swap code (with the
> overall objective of improving zcache policy), I was
> looking at the size of the swapcache.
>
> My understanding was that the swapcache is simply a
> buffer cache for pages that are actively in the process
> of being swapped in or swapped out.

It's that part of the pagecache for pages on swap.

Once written out, as with other pagecache pages written out under
reclaim, we do expect to reclaim them fairly soon (they're moved to
the bottom of the inactive list). But when read back in, we read a
cluster at a time, hoping to pick up some more useful pages while the
disk head is there (though of course it may be a headless disk). We
don't disassociate those from swap until they're dirtied (or swap
looks fullish), why should we?

> And keeping pages
> around in the swapcache is inefficient because every
> process access to a page in the swapcache causes a
> minor page fault.

What's inefficient about that? A minor fault is much less
costly than the major fault of reading them back from disk.

>
> So I was surprised to see that, under a memory intensive
> workload, the swapcache can grow quite large. I have
> seen it grow to almost half of the size of RAM.

Nothing wrong with that, so long as they can be freed and
used for better purpose when needed.

>
> Digging into this oddity, I re-discovered the definition
> for "vm_swap_full()" which, in scan_swap_map() is a
> pre-condition for calling __try_to_reclaim_swap().
> But vm_swap_full() compares how much free swap space
> there is "on disk", with the total swap space available
> "on disk" with no regard to how much RAM there is.
> So on my system, which is running with 1GB RAM and
> 10GB swap, I think this is the reason that swapcache
> is growing so large.
>
> Am I misunderstanding something? Or is this code
> making some (possibly false) assumptions about how
> swap is/should be sized relative to RAM? Or maybe the
> size of swapcache is harmless as long as it doesn't
> approach total "on disk" size?

The size of swapcache is harmless: we break those pages' association
with swap once a better use for the page comes up. But the size of
swapcache does (of course) represent a duplication of what's on swap.

As swap becomes full, that duplication becomes wasteful: we may need
some of the swap already in memory for saving other pages; so break
the association, freeing the swap for reuse but keeping the page
(but now it's no longer swapcache).

That's what the vm_swap_full() tests are about: choosing to free swap
when it's duplicated in memory, once it's becoming a scarce resource.

Hugh
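For readers following the vm_swap_full() discussion: to my understanding, kernels of this era define it as a one-line macro that is true once less than half of the total swap pages remain free. The sketch below is a userspace model of that comparison only, not kernel code. The names swap_looks_full() and pages_from_gib() are made up for illustration, the counters are passed as arguments rather than read from kernel globals, and 4 KiB pages are assumed.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 4 KiB pages, so 1 GiB is 262144 pages. */
long pages_from_gib(long gib)
{
    return gib * 262144L;
}

/*
 * Userspace model of the swap-fullness test discussed in the thread.
 * It looks only at swap: free slots versus total slots. The amount
 * of RAM never enters the comparison, which is the point Dan raised.
 */
bool swap_looks_full(long free_swap_pages, long total_swap_pages)
{
    assert(free_swap_pages >= 0 && free_swap_pages <= total_swap_pages);
    /* True once fewer than half of the swap slots are still free. */
    return free_swap_pages * 2 < total_swap_pages;
}
```

With the 1GB RAM / 10GB swap setup from the thread, a swapcache covering half of RAM accounts for 131072 swap slots out of 2621440, about 5% of swap. So swap_looks_full(2621440 - 131072, 2621440) is false, and the model only turns true once more than half of the swap slots are in use, which matches Hugh's description of freeing duplicated swap only when swap is becoming scarce.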
* RE: swapcache size oddness
  2012-04-28  3:58 ` Hugh Dickins
@ 2012-04-28 16:48   ` Dan Magenheimer
  0 siblings, 0 replies; 3+ messages in thread
From: Dan Magenheimer @ 2012-04-28 16:48 UTC
  To: Hugh Dickins; +Cc: linux-mm

> From: Hugh Dickins [mailto:hughd@google.com]
> Subject: Re: swapcache size oddness

Hi Hugh --

Thanks for your, as usual, quick and thorough response!

> On Fri, 27 Apr 2012, Dan Magenheimer wrote:
>
> > In continuing digging through the swap code (with the
> > overall objective of improving zcache policy), I was
> > looking at the size of the swapcache.
> >
> > My understanding was that the swapcache is simply a
> > buffer cache for pages that are actively in the process
> > of being swapped in or swapped out.
>
> It's that part of the pagecache for pages on swap.
>
> Once written out, as with other pagecache pages written out under
> reclaim, we do expect to reclaim them fairly soon (they're moved to
> the bottom of the inactive list). But when read back in, we read a
> cluster at a time, hoping to pick up some more useful pages while the
> disk head is there (though of course it may be a headless disk). We
> don't disassociate those from swap until they're dirtied (or swap
> looks fullish), why should we?

OK. Yes, I forgot about the pages that are swapped in
"speculatively" rather than on demand. This will certainly
result in an increase in the size of the swapcache
(especially with Rik's recent change that increases the
average effective cluster size).

> > And keeping pages
> > around in the swapcache is inefficient because every
> > process access to a page in the swapcache causes a
> > minor page fault.
>
> What's inefficient about that? A minor fault is much less
> costly than the major fault of reading them back from disk.

Yes, but a minor fault is much more costly than a read/write.
I guess I was under the mistaken assumption that a page in
the swapcache can never be directly accessed because the
page table would always have it marked as non-present, in
order to avoid races due to multiple process accesses and
I/O. But I think I see how that is avoided now (at least
for non-shared-memory pages).

> > So I was surprised to see that, under a memory intensive
> > workload, the swapcache can grow quite large. I have
> > seen it grow to almost half of the size of RAM.
>
> Nothing wrong with that, so long as they can be freed and
> used for better purpose when needed.

Due to my mistaken assumption above, I thought a page in
the swap cache was "worse" than a normal anonymous page
(i.e. for system performance). So really the primary
difference between an anonymous page that is NOT in the
swap cache, and an anonymous page that IS in the swap cache,
is that the latter already has a slot reserved on the swap
disk. (Flags and mapping differences too of course.)

> > Digging into this oddity, I re-discovered the definition
> > for "vm_swap_full()" which, in scan_swap_map() is a
> > pre-condition for calling __try_to_reclaim_swap().
> > But vm_swap_full() compares how much free swap space
> > there is "on disk", with the total swap space available
> > "on disk" with no regard to how much RAM there is.
> > So on my system, which is running with 1GB RAM and
> > 10GB swap, I think this is the reason that swapcache
> > is growing so large.
> >
> > Am I misunderstanding something? Or is this code
> > making some (possibly false) assumptions about how
> > swap is/should be sized relative to RAM? Or maybe the
> > size of swapcache is harmless as long as it doesn't
> > approach total "on disk" size?
>
> The size of swapcache is harmless: we break those pages' association
> with swap once a better use for the page comes up. But the size of
> swapcache does (of course) represent a duplication of what's on swap.
>
> As swap becomes full, that duplication becomes wasteful: we may need
> some of the swap already in memory for saving other pages; so break
> the association, freeing the swap for reuse but keeping the page
> (but now it's no longer swapcache).
>
> That's what the vm_swap_full() tests are about: choosing to free swap
> when it's duplicated in memory, once it's becoming a scarce resource.

Got it. Thanks!
Dan
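The observation that started the thread (swapcache at almost half of RAM while swap itself is mostly empty) can be checked on a running system from /proc/meminfo, which reports MemTotal, SwapCached, SwapTotal and SwapFree in kB. The following is a minimal sketch, assuming that "Key: value kB" layout; the helper names meminfo_kb(), percent_of() and load_text() are hypothetical and not part of any kernel or libc interface.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Find "Key:   <number> kB" at the start of a line in meminfo-style
 * text. Returns 1 and stores the number on success, 0 otherwise.
 */
int meminfo_kb(const char *text, const char *key, long long *kb)
{
    size_t klen = strlen(key);
    const char *p = text;

    while (p && *p) {
        if (strncmp(p, key, klen) == 0 && p[klen] == ':')
            return sscanf(p + klen + 1, "%lld", kb) == 1;
        p = strchr(p, '\n');    /* move to the next line, if any */
        if (p)
            p++;
    }
    return 0;
}

/* part as a whole-number percentage of whole, or -1 if whole <= 0. */
long long percent_of(long long part, long long whole)
{
    return whole > 0 ? part * 100 / whole : -1;
}

/* Read a small text file into buf, NUL-terminated. 0 on success. */
int load_text(const char *path, char *buf, size_t cap)
{
    FILE *fp = fopen(path, "r");
    size_t n;

    assert(cap > 0);
    if (!fp)
        return -1;
    n = fread(buf, 1, cap - 1, fp);
    buf[n] = '\0';
    fclose(fp);
    return 0;
}
```

A caller would load /proc/meminfo into a buffer of a few kilobytes with load_text(), look up the four fields with meminfo_kb(), and then compare percent_of(SwapCached, MemTotal) against percent_of(SwapTotal - SwapFree, SwapTotal). On a system sized like Dan's, the first number can be large while the second stays far below the point where swap is considered full.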