From: Steven Sistare <steven.sistare@oracle.com>
To: Prakash Sangappa <prakash.sangappa@oracle.com>,
	Michal Hocko <mhocko@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	dave.hansen@intel.com, nao.horiguchi@gmail.com,
	akpm@linux-foundation.org, kirill.shutemov@linux.intel.com,
	khandual@linux.vnet.ibm.com
Subject: Re: [PATCH V2 0/6] VA to numa node information
Date: Mon, 26 Nov 2018 14:20:10 -0500
Message-ID: <79d5e991-d9f6-65e2-cb77-0f999fa512fe@oracle.com>
In-Reply-To: <41af45a9-c428-ccd8-ca10-c355d22c56a7@oracle.com>

On 11/9/2018 11:48 PM, Prakash Sangappa wrote:
> On 9/24/18 10:14 AM, Michal Hocko wrote:
>> On Fri 14-09-18 12:01:18, Steven Sistare wrote:
>>> On 9/14/2018 1:56 AM, Michal Hocko wrote:
>> [...]
>>>> Why does this matter for something that is for analysis purposes?
>>>> Reading the file for the whole address space is far from a free
>>>> operation. Is the page walk optimization really essential for usability?
>>>> Moreover, what prevents the move_pages implementation from being clever
>>>> about the page walk itself? In other words, why would we want to add a
>>>> new API rather than make the existing one faster for everybody?
>>> One could optimize move_pages.  If the caller passes a consecutive range
>>> of small pages, and the page walk sees that a VA is mapped by a huge page,
>>> then it can return the same numa node for each of the following VAs that fall
>>> into the huge page range. It would be faster than 55 nsec per small page, but
>>> it is hard to say how much faster, and the cost is still driven by the number
>>> of small pages.
>> This is exactly what I was arguing for. There is some room for
>> improvement in the existing interface. I have yet to hear an explicit
>> use case which would require even better performance that cannot be
>> achieved by the existing API.
>>
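
The huge-page shortcut discussed above can be sketched in user space. This
is only a Python simulation of the idea, not kernel code and not the
move_pages() implementation: the walk callback, the nodes_for_range helper,
and the 4 KiB / 2 MiB page sizes are all assumptions made for illustration.

```python
SMALL_PAGE = 4096            # assumed base page size, in bytes
HUGE_SIZE = 2 * 1024 * 1024  # assumed huge page size, in bytes
HUGE_BASE = HUGE_SIZE        # hypothetical, huge-page-aligned start address


def nodes_for_range(start, npages, walk):
    """Return (nodes, walks) for npages consecutive small pages from start.

    walk(va) is a hypothetical stand-in for a page table walk. It returns
    (node, page_size), where page_size is the size of the page mapping va.
    The result of one walk is reused for every following VA that falls
    inside the same page, so a huge page costs a single walk.
    """
    nodes = []
    walks = 0
    cached_start = cached_end = 0
    cached_node = None
    for i in range(npages):
        va = start + i * SMALL_PAGE
        if not (cached_start <= va < cached_end):
            cached_node, page_size = walk(va)
            walks += 1
            cached_start = va - (va % page_size)
            cached_end = cached_start + page_size
        # One result per small page is still produced, so the total cost
        # remains proportional to the number of small pages requested.
        nodes.append(cached_node)
    return nodes, walks


def fake_walk(va):
    """Toy layout: one huge page on node 1, small pages on node 0 elsewhere."""
    if HUGE_BASE <= va < HUGE_BASE + HUGE_SIZE:
        return 1, HUGE_SIZE
    return 0, SMALL_PAGE


# 512 small-page slots covered by the huge page, plus 2 ordinary small pages.
nodes, walks = nodes_for_range(HUGE_BASE, 512 + 2, fake_walk)
print(len(nodes), "results from", walks, "page table walks")
```

In this toy layout the walk count drops from one per small page to one per
mapped page, but the loop still visits every small page in the range, which
matches the point that the cost stays driven by the number of small pages.
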
> 
> The above-mentioned optimization to the move_pages() API helps when scanning
> mapped huge pages, but does not help if there are large sparse mappings
> with few pages mapped. Otherwise, consider adding page walk support in
> the move_pages() implementation and enhancing the API (new flag?) to return
> address range to numa node information. The page walk optimization
> would certainly make a difference for usability.
> 
> We can have applications (like Oracle DB) having processes with large sparse
> mappings (in TBs) with only some areas of these mapped address ranges
> being accessed, basically large portions not having page tables backing them.
> This can become more prevalent on newer systems with multiple TBs of
> memory.
> 
> Here is some data from pmap using the move_pages() API with the optimization.
> The following table compares the time pmap takes to print the address mapping
> of a large process with numa node information, using the move_pages() API vs.
> pmap using the /proc numa_vamaps file.
> 
> Running the pmap command on a process with 1.3 TB of address space, with
> sparse mappings.
> 
>                        ~1.3 TB sparse    250G dense segment with hugepages
> move_pages              8.33s             3.14s
> optimized move_pages    6.29s             0.92s
> /proc numa_vamaps       0.08s             0.04s
> 
>  
> The second column is the pmap time on a 250G address range of this process,
> which maps hugepages (THP & hugetlb).
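
To put the table in relative terms, the ratios can be worked out directly
from the quoted timings. This is plain arithmetic on the numbers above; it
assumes the unit-less values in the second column are also seconds, and the
names used in the code are illustrative only.

```python
# Timings in seconds, copied from the quoted table.
# Assumption: the second-column values without a unit are also seconds.
TIMINGS = {
    "move_pages": {"sparse": 8.33, "dense": 3.14},
    "optimized move_pages": {"sparse": 6.29, "dense": 0.92},
}
NUMA_VAMAPS = {"sparse": 0.08, "dense": 0.04}


def slowdown_vs_baseline(timings, baseline):
    """Return, per method and case, how many times slower it is than baseline."""
    return {
        method: {case: t / baseline[case] for case, t in cases.items()}
        for method, cases in timings.items()
    }


ratios = slowdown_vs_baseline(TIMINGS, NUMA_VAMAPS)
for method, cases in ratios.items():
    print(f"{method:22s} sparse: {cases['sparse']:6.1f}x   "
          f"dense: {cases['dense']:6.1f}x")
```

On these numbers the numa_vamaps run is roughly 100x and 80x faster than the
plain and optimized move_pages runs on the sparse address space, and roughly
80x and 23x faster on the dense hugepage segment.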

The data look compelling to me.  numa_vamaps provides a much smoother user experience
for the analyst who is casting a wide net looking for the root of a performance issue.
There is almost no waiting to see the data.

- Steve

