From: David Hildenbrand <david@redhat.com>
To: Dave Hansen <dave.hansen@intel.com>,
	Nitesh Narayan Lal <nitesh@redhat.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, pbonzini@redhat.com, lcapitulino@redhat.com,
	pagupta@redhat.com, wei.w.wang@intel.com,
	yang.zhang.wz@gmail.com, riel@surriel.com, mst@redhat.com,
	dodgen@google.com, konrad.wilk@oracle.com, dhildenb@redhat.com,
	aarcange@redhat.com, alexander.duyck@gmail.com,
	john.starks@microsoft.com, mhocko@suse.com
Subject: Re: [RFC][Patch v11 1/2] mm: page_hinting: core infrastructure
Date: Mon, 15 Jul 2019 16:40:30 +0200
Message-ID: <c978542a-6535-634f-b07a-0a158993bada@redhat.com>
In-Reply-To: <46336efb-3243-0083-1d20-7e8578131679@redhat.com>

On 15.07.19 11:33, David Hildenbrand wrote:
> On 11.07.19 20:21, Dave Hansen wrote:
>> On 7/10/19 12:51 PM, Nitesh Narayan Lal wrote:
>>> +static void bm_set_pfn(struct page *page)
>>> +{
>>> +	struct zone *zone = page_zone(page);
>>> +	int zone_idx = page_zonenum(page);
>>> +	unsigned long bitnr = 0;
>>> +
>>> +	lockdep_assert_held(&zone->lock);
>>> +	bitnr = pfn_to_bit(page, zone_idx);
>>> +	/*
>>> +	 * TODO: fix possible underflows.
>>> +	 */
>>> +	if (free_area[zone_idx].bitmap &&
>>> +	    bitnr < free_area[zone_idx].nbits &&
>>> +	    !test_and_set_bit(bitnr, free_area[zone_idx].bitmap))
>>> +		atomic_inc(&free_area[zone_idx].free_pages);
>>> +}
>>
>> Let's say I have two NUMA nodes, each with ZONE_NORMAL and ZONE_MOVABLE
>> and each zone with 1GB of memory:
>>
>> Node:         0        1
>> NORMAL   0->1GB   2->3GB
>> MOVABLE  1->2GB   3->4GB
>>
>> This code will allocate two bitmaps.  The ZONE_NORMAL bitmap will
>> represent data from 0->3GB and the ZONE_MOVABLE bitmap will represent
>> data from 1->4GB.  That's the result of this code:
>>
>>> +			if (free_area[zone_idx].base_pfn) {
>>> +				free_area[zone_idx].base_pfn =
>>> +					min(free_area[zone_idx].base_pfn,
>>> +					    zone->zone_start_pfn);
>>> +				free_area[zone_idx].end_pfn =
>>> +					max(free_area[zone_idx].end_pfn,
>>> +					    zone->zone_start_pfn +
>>> +					    zone->spanned_pages);
>>
>> But that means that both bitmaps will have space for PFNs in the other
>> zone type, which is completely bogus.  This is fundamental because the
>> data structures are incorrectly built per zone *type* instead of per zone.
>>
> 
> I don't think it's incorrect, it's just not optimal in all scenarios.
> E.g., in your example, this approach would "waste" 2 * 1GB of tracking
> data for the holes (2 * 64 bytes when using 1 bit for 2MB).
> 
> FWIW, this is not NUMA-specific. We can easily have sparse zones
> on single-node systems.
> 
> Node:                 0
> NORMAL   0->1GB, 2->3GB
> MOVABLE  1->2GB, 3->4GB
> 
> So tracking it per zone instead of per zone type is only one part
> of the story.
> 

Oh, and FWIW,

in setups like

Node:                 0               1
NORMAL   4->5GB, 6->7GB  5->6GB, 8->9GB

What Nitesh proposes is actually better. So it really depends on the use
case - but in general sparsity is the issue.
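
To put numbers on the layouts discussed in this thread, here is a minimal
sketch that is not part of the patch. It assumes 1 bit per 2MB (512 bits per
GB) and that each bitmap covers one contiguous span from the lowest start to
the highest end of the ranges it tracks. The names struct mem_range,
span_bits(), bits_per_zone_type() and bits_per_zone() are hypothetical and
exist only for this illustration.

```c
#include <stddef.h>

/* Hypothetical types and helpers for illustration; none are from the patch. */
enum zone_kind { KIND_NORMAL, KIND_MOVABLE };

struct mem_range {
	int node;			/* NUMA node id */
	enum zone_kind kind;		/* zone type */
	unsigned long start_gb;		/* inclusive */
	unsigned long end_gb;		/* exclusive */
};

#define BITS_PER_GB 512UL	/* 1 bit per 2MB, as assumed in the thread */

/*
 * Bits needed for one bitmap spanning [lowest start, highest end) of all
 * ranges matching @kind and, unless @node is negative, @node.
 */
static unsigned long span_bits(const struct mem_range *r, size_t n,
			       enum zone_kind kind, int node)
{
	unsigned long lo = 0, hi = 0;
	int found = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (r[i].kind != kind || (node >= 0 && r[i].node != node))
			continue;
		if (!found || r[i].start_gb < lo)
			lo = r[i].start_gb;
		if (!found || r[i].end_gb > hi)
			hi = r[i].end_gb;
		found = 1;
	}
	return found ? (hi - lo) * BITS_PER_GB : 0;
}

/* One bitmap per zone type, spanning all nodes. */
unsigned long bits_per_zone_type(const struct mem_range *r, size_t n,
				 enum zone_kind kind)
{
	return span_bits(r, n, kind, -1);
}

/* One bitmap per zone, i.e. per (node, zone type) pair. */
unsigned long bits_per_zone(const struct mem_range *r, size_t n,
			    enum zone_kind kind, int nr_nodes)
{
	unsigned long total = 0;
	int node;

	for (node = 0; node < nr_nodes; node++)
		total += span_bits(r, n, kind, node);
	return total;
}

/* Two nodes, each with 1GB of NORMAL and 1GB of MOVABLE. */
static const struct mem_range layout_two_nodes[] = {
	{ 0, KIND_NORMAL, 0, 1 }, { 0, KIND_MOVABLE, 1, 2 },
	{ 1, KIND_NORMAL, 2, 3 }, { 1, KIND_MOVABLE, 3, 4 },
};

/* NORMAL ranges of two nodes interleaved with each other. */
static const struct mem_range layout_interleaved[] = {
	{ 0, KIND_NORMAL, 4, 5 }, { 0, KIND_NORMAL, 6, 7 },
	{ 1, KIND_NORMAL, 5, 6 }, { 1, KIND_NORMAL, 8, 9 },
};
```

Under these assumptions, the two-node layout needs 1536 bits for NORMAL when
tracked per zone type (0->3GB) and 1024 bits when tracked per zone; the
512-bit difference is the 64 bytes for the hole. For the interleaved layout
the per-type bitmap covers 4->9GB (2560 bits), while the per-zone bitmaps
cover 4->7GB and 5->9GB (3584 bits in total), so the per-type bitmap is the
smaller one there.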

-- 

Thanks,

David / dhildenb
