Rik van Riel wrote:
>
> Sorry I am replying to a really old email, but exactly
> what information do you believe would be more useful to
> extract from vmscan.c with tracepoints?
>
> What are the kinds of problems that customer systems
> (which cannot be rebooted into experimental kernels)
> run into, that can be tracked down with tracepoints?
>
> I can think of a few:
> - excessive CPU use in page reclaim code
> - excessive reclaim latency in page reclaim code
> - unbalanced memory allocation between zones/nodes
> - strange balance problems between reclaiming of page
>   cache and swapping out process pages
>
> I suspect we would need fairly fine grained tracepoints
> to track down these kinds of problems, with filtering
> and/or interpretation in userspace, but I am always
> interested in easier ways of tracking down these kinds
> of problems :)
>
> What kinds of tracepoints do you believe we would need?
>
> Or, using Larry's patch as a starting point, what do you
> believe should be changed?
>

Rik,

I know these mm tracepoint patches produce a lot of output in the
trace buffer. In a nutshell, I have added them at critical locations
in the code that allocates memory, maps that memory into user space,
unmaps it from user space, and frees it. In addition, I have added
tracepoints to important places in the memory allocation and reclaim
paths so we can see failures, stalls, and high latencies as well as
normal behavior. Finally, I added them to the pdflush operations so we
can determine how much memory is written back to disk there versus in
the swapout paths.

If this is too many tracepoints all at once, perhaps we could focus
mainly on those specific to the page reclaim code path, since that is
where most contention occurs?

Anonymous memory tracepoints:
1.) mm_anon_fault - initial anonymous pagefault.
2.) mm_anon_unmap - anonymous unmap triggered by page reclaim.
3.) mm_anon_userfree - anonymous memory unmap by user.
4.) mm_anon_cow - anonymous COW fault.
5.) mm_anon_pgin - anonymous pagein from swap.

Filemap memory tracepoints:
1.) mm_filemap_fault - initial filemap fault.
2.) mm_filemap_cow - filemap COW fault.
3.) mm_filemap_userunmap - filemap unmap by user.
4.) mm_filemap_unmap - filemap unmap triggered by page reclaim.

Page allocation failure tracepoints:
1.) mm_page_allocation - page allocation that fails and causes page
    reclaim.

Page kswapd and direct reclaim tracepoints:
1.) mm_kswapd_ran - kswapd ran, tells us how many pages it reclaimed.
2.) mm_directreclaim_reclaimall - direct reclaim because free lists
    were below min.
3.) mm_directreclaim_reclaimzone - direct reclaim of a specific NUMA
    node.

Inner workings of the page reclaim tracepoints:
1.) mm_pagereclaim_shrinkzone - shrink zone, tells us how many pages
    were scanned.
2.) mm_pagereclaim_shrinkinactive - shrink inactive list, tells us how
    many pages were deactivated.
3.) mm_pagereclaim_shrinkactive - shrink active list, tells us how
    many pages were processed.
4.) mm_pagereclaim_pgout - pageout, tells us which pages were paged
    out.
5.) mm_pagereclaim_free - tells us how many pages were freed in each
    page reclaim invocation.

Pagecache flushing tracepoints:
1.) mm_balance_dirty - tells us how many pages were written when dirty
    was above dirty_ratio.
2.) mm_pdflush_bgwriteout - tells us how many pages were written when
    dirty was above dirty_background_ratio.
3.) mm_pdflush_kupdate - tells us how many pages kupdate wrote.
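
Since the volume of output is the main concern, here is a rough sketch of the kind of userspace filtering Rik mentioned. It only counts how often each tracepoint fires, grouped by the categories listed above. The one assumption about the trace format is that each line contains the tracepoint name as a separate word; the per-event fields are not parsed, because their layout is not described here. The function name count_events and the group labels are made up for illustration and are not part of the patch.

```python
import re
from collections import Counter

# Tracepoint names copied from the list above, grouped the same way.
# The group labels themselves are hypothetical, chosen for this sketch.
GROUPS = {
    "anon": ["mm_anon_fault", "mm_anon_unmap", "mm_anon_userfree",
             "mm_anon_cow", "mm_anon_pgin"],
    "filemap": ["mm_filemap_fault", "mm_filemap_cow",
                "mm_filemap_userunmap", "mm_filemap_unmap"],
    "alloc_failure": ["mm_page_allocation"],
    "kswapd_direct": ["mm_kswapd_ran", "mm_directreclaim_reclaimall",
                      "mm_directreclaim_reclaimzone"],
    "reclaim_inner": ["mm_pagereclaim_shrinkzone",
                      "mm_pagereclaim_shrinkinactive",
                      "mm_pagereclaim_shrinkactive",
                      "mm_pagereclaim_pgout", "mm_pagereclaim_free"],
    "writeback": ["mm_balance_dirty", "mm_pdflush_bgwriteout",
                  "mm_pdflush_kupdate"],
}

# Reverse lookup: tracepoint name -> group label.
EVENT_TO_GROUP = {name: group
                  for group, names in GROUPS.items()
                  for name in names}

# Any word starting with "mm_"; only names in EVENT_TO_GROUP are counted.
EVENT_RE = re.compile(r"\bmm_[a-z_]+\b")


def count_events(lines):
    """Count known mm tracepoints per event name and per group.

    Assumes (not confirmed by the patch) that a trace line carries the
    tracepoint name as a standalone word. At most one event is counted
    per line; lines without a known name are ignored.
    """
    per_event = Counter()
    per_group = Counter()
    for line in lines:
        for name in EVENT_RE.findall(line):
            group = EVENT_TO_GROUP.get(name)
            if group is not None:
                per_event[name] += 1
                per_group[group] += 1
                break  # count only the first known event on a line
    return per_event, per_group
```

Passing an open trace file (or any iterable of lines) to count_events() returns two Counter objects. Restricting GROUPS to the "kswapd_direct" and "reclaim_inner" entries would give the reclaim-only view suggested above without touching the kernel side.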