From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1756460Ab0HPUVM (ORCPT ); Mon, 16 Aug 2010 16:21:12 -0400
Received: from mailout-de.gmx.net ([213.165.64.23]:47976 "HELO mail.gmx.net"
        rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with SMTP
        id S1754842Ab0HPUVL (ORCPT ); Mon, 16 Aug 2010 16:21:11 -0400
X-Authenticated: #1045983
X-Provags-ID: V01U2FsdGVkX1/nRWUnbhvnvW/Fy45uoUet20kd74NSv1VIBTfJn+
        WnpfXr4IBUMqPC
Message-ID: <4C699DB1.3010905@gmx.de>
Date: Mon, 16 Aug 2010 22:21:05 +0200
From: Helge Deller
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.7) Gecko/20100720 Fedora/3.1.1-1.fc13 Lightning/1.0b2pre Thunderbird/3.1.1
MIME-Version: 1.0
To: Hugh Dickins
CC: linux-mm@kvack.org, Andrew Morton, linux-kernel@vger.kernel.org, Manfred Spraul
Subject: Re: [PATCH][RFC] Fix up rss/swap usage of shm segments in /proc/pid/smaps
References: <20100811201345.GA11304@p100.box> <20100812131005.e466a9fd.akpm@linux-foundation.org> <4C6468A9.7090503@gmx.de> <20100813195252.GA2450@p100.box>
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Y-GMX-Trusted: 0
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 08/14/2010 12:45 AM, Hugh Dickins wrote:
> On Fri, 13 Aug 2010, Helge Deller wrote:
>>
>> I tried quite hard to implement rss/swap accounting for shm segments inside
>> smaps_pte_range() which is a callback function of walk_page_range() in
>> show_smap().
>
> Sorry, I think the short answer will be that you should give up on this:
> reasons below.
>
>>
>> Given the fact that I'm no linux-mm expert, I might have overseen other
>> possibilities, but my experiments inside smaps_pte_range() were not
>> very successful:
>> From my tests, a swapped-out shm segment
>> - fails on the "is_swap_pte()" test, and
>> - succeeds on the "!pte_present()" test (since it's swapped
>> out).
>
> Yes.
>
>> So, here would it be possible to add such accounting for swap, but how
>> can I then see that this pte is
>> a) belonging to a shm segment?, and
>> b) see if this page/pte was really swapped out and not just not
>> yet written to at all?
>
> You would have to add a function in mm/shmem.c to do this: it would
> need to check vma->vm_file to work out if this vma belongs to it,
> and use shmem_swp_alloc() to check if the page there is on swap. OTOH
> I'm not sure if you could call it while holding page table lock or not.
>
>> As answers I found:
>> a) (vma->vm_flags & VM_MAYSHARE) is true for shm segments (is
>> this check sufficient?)
>
> No, VM_MAYSHARE is set on many other kinds of mapping too; and is not
> set on all mappings of shmem objects - there is no good reason to
> include SysV shm segments here, yet omit other kinds of shmem object
> (/dev/shm POSIX shared memory, shared-anonymous mappings, mappings of
> tmpfs files).
>
>> b) no idea.
>>
>> But if I add this page to the mss.swap entry, all pages including such
>> which haven't been touched yet at all are suddenly counted as
>> swapped-out...?
>>
>> Any hints here would be great...
>>
>>
>> As an alternative solution, I created the following patch.
>> This one works nicely, but it's just a fix-up of the mss.resident and
>> mss.swap values after walk_page_range() was called.
>> It's mostly a copy of the shm_add_rss_swap() function from
>> my previous patch (http://marc.info/?l=linux-mm&m=128171161101817&w=2).
>> Do you think such a fix-up-afterwards-approach is acceptable at all?
>> If yes, a new patch on top of my ipc/shm.c patch would be easy (and
>> small).
>
> Not acceptable, I'm afraid. Nothing wrong with a fix-up-afterwards
> approach as such, but it's assuming that the vma covers the full extent
> of the shmem object. That is very often the case, but by no means
> necessarily so (whereas it is always the case that one vma cannot cover
> more than one object).
> So you do have to count pageslot by pageslot.
>
> There are two reasons why I think you have to abandon this. One is
> that /proc/<pid>/smaps is reporting on the userspace mappings, saying
> where swap is instanced in them. Some of those mappings may be of
> shmem objects, and some of those shmem objects may use swap backing
> themselves, but that's different from the mapping using swap directly.
>
> One can argue about that distinction, but it is how all this is
> designed, and blurring that distinction tends to get into trouble.
> (It's reasonable to think of anonymous mappings as mappings of anon
> objects, which just happen to find room for the swp_entry in the page
> table: but then it's a happy accident that smaps can see them.)
>
> The second reason is that since 2.6.34, /proc/<pid>/status shows
> VmSwap: we would not want a huge discrepancy between what it shows
> in swap and what /proc/<pid>/smaps shows in swap, but nor would we
> want to make /proc/<pid>/status scan through page tables enquiring
> of shmem.
>
> All this stands in contrast to your /proc/sysvipc/shm patch, which
> is rightly dealing with one class of shmem object, not via mappings
> of those objects.
>
> There is a case for a "where has my swap gone" tool, which examines
> the different kinds of object involved (anonymous mappings as well
> as shmem objects), and shows them all somehow. But that's a lot
> more work than just extending an existing stats display.

Hugh, thanks for the good and comprehensive summary!
Seems that I have to live with the /proc/sysvipc/shm overview then :-(

Thanks,
Helge
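
Hugh's second reason, that the swap figure in /proc/<pid>/status should not diverge wildly from the one in /proc/<pid>/smaps, can be checked from userspace. What follows is only a rough Python sketch and is not part of the thread or of any patch. The function names and the sample excerpts are made up for illustration; the only things taken from the discussion are the per-mapping "Swap:" lines in smaps and the "VmSwap:" line in status.

```python
import re
from typing import Optional

# Made-up excerpts in the /proc/<pid>/smaps and /proc/<pid>/status layouts.
# The addresses, names and numbers are illustrative only, not measurements.
SAMPLE_SMAPS = """\
00400000-0040b000 r-xp 00000000 08:01 1234 /bin/example
Rss:                  40 kB
Swap:                  0 kB
7f0000000000-7f0000100000 rw-s 00000000 00:04 5678 /SYSV00000000 (deleted)
Rss:                 128 kB
Swap:                  0 kB
7ffd00000000-7ffd00021000 rw-p 00000000 00:00 0 [stack]
Rss:                  16 kB
Swap:                 12 kB
"""

SAMPLE_STATUS = """\
Name:\texample
VmRSS:\t     184 kB
VmSwap:\t      12 kB
"""


def smaps_swap_kb(smaps_text: str) -> int:
    """Sum the per-mapping 'Swap:' fields of an smaps dump, in kB."""
    matches = re.finditer(r"^Swap:[ \t]+(\d+) kB[ \t]*$", smaps_text, re.MULTILINE)
    return sum(int(m.group(1)) for m in matches)


def status_vmswap_kb(status_text: str) -> Optional[int]:
    """Return the 'VmSwap:' value in kB, or None if the field is absent."""
    m = re.search(r"^VmSwap:[ \t]+(\d+) kB", status_text, re.MULTILINE)
    return int(m.group(1)) if m else None


def swap_discrepancy_kb(smaps_text: str, status_text: str) -> Optional[int]:
    """smaps total minus VmSwap, or None when VmSwap is not reported."""
    vmswap = status_vmswap_kb(status_text)
    if vmswap is None:
        return None
    return smaps_swap_kb(smaps_text) - vmswap


def read_proc(pid: str, name: str) -> str:
    """Read /proc/<pid>/<name>; Linux only, not exercised by the samples."""
    with open(f"/proc/{pid}/{name}") as f:
        return f.read()
```

On a live system one might call swap_discrepancy_kb(read_proc("self", "smaps"), read_proc("self", "status")). Following Hugh's explanation, swap used as backing by a shmem object is not swap "instanced" in the mapping, so neither number would be expected to include it on the kernels discussed here; that is the gap /proc/sysvipc/shm covers for SysV segments.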