linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Li, Liang Z" <liang.z.li@intel.com>
To: "Hansen, Dave" <dave.hansen@intel.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>
Cc: "virtualization@lists.linux-foundation.org" 
	<virtualization@lists.linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"virtio-dev@lists.oasis-open.org"
	<virtio-dev@lists.oasis-open.org>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"quintela@redhat.com" <quintela@redhat.com>,
	"dgilbert@redhat.com" <dgilbert@redhat.com>,
	"mst@redhat.com" <mst@redhat.com>,
	"jasowang@redhat.com" <jasowang@redhat.com>,
	"kirill.shutemov@linux.intel.com"
	<kirill.shutemov@linux.intel.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"mhocko@suse.com" <mhocko@suse.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Cornelia Huck <cornelia.huck@de.ibm.com>,
	"Amit Shah" <amit.shah@redhat.com>
Subject: RE: [PATCH kernel v5 5/5] virtio-balloon: tell host vm's unused page info
Date: Sun, 4 Dec 2016 13:13:23 +0000	[thread overview]
Message-ID: <F2CBF3009FA73547804AE4C663CAB28E3A12D814@shsmsx102.ccr.corp.intel.com> (raw)
In-Reply-To: <438dd41a-fdf1-2a77-ef9c-8c103f492b2f@intel.com>

> On 11/30/2016 12:43 AM, Liang Li wrote:
> > +static void send_unused_pages_info(struct virtio_balloon *vb,
> > +				unsigned long req_id)
> > +{
> > +	struct scatterlist sg_in;
> > +	unsigned long pos = 0;
> > +	struct virtqueue *vq = vb->req_vq;
> > +	struct virtio_balloon_resp_hdr *hdr = vb->resp_hdr;
> > +	int ret, order;
> > +
> > +	mutex_lock(&vb->balloon_lock);
> > +
> > +	for (order = MAX_ORDER - 1; order >= 0; order--) {
> 
> I scratched my head for a bit on this one.  Why are you walking over orders,
> *then* zones.  I *think* you're doing it because you can efficiently fill the
> bitmaps at a given order for all zones, then move to a new bitmap.  But, it
> would be interesting to document this.
> 

Yes, use the order is somewhat strange, but it's helpful to keep the API simple. 
Do you think it's acceptable?

> > +		pos = 0;
> > +		ret = get_unused_pages(vb->resp_data,
> > +			 vb->resp_buf_size / sizeof(unsigned long),
> > +			 order, &pos);
> 
> FWIW, get_unsued_pages() is a pretty bad name.  "get" usually implies
> bumping reference counts or consuming something.  You're just "recording"
> or "marking" them.
> 

Will change to mark_unused_pages().

> > +		if (ret == -ENOSPC) {
> > +			void *new_resp_data;
> > +
> > +			new_resp_data = kmalloc(2 * vb->resp_buf_size,
> > +						GFP_KERNEL);
> > +			if (new_resp_data) {
> > +				kfree(vb->resp_data);
> > +				vb->resp_data = new_resp_data;
> > +				vb->resp_buf_size *= 2;
> 
> What happens to the data in ->resp_data at this point?  Doesn't this just
> throw it away?
> 

Yes, so we should make sure the data in resp_data is not inuse.

> ...
> > +struct page_info_item {
> > +	__le64 start_pfn : 52; /* start pfn for the bitmap */
> > +	__le64 page_shift : 6; /* page shift width, in bytes */
> > +	__le64 bmap_len : 6;  /* bitmap length, in bytes */ };
> 
> Is 'bmap_len' too short?  a 64-byte buffer is a bit tiny.  Right?
> 

Currently, we just use the 8 bytes and 0 bytes bitmap, should we support more than 64 bytes?

> > +static int  mark_unused_pages(struct zone *zone,
> > +		unsigned long *unused_pages, unsigned long size,
> > +		int order, unsigned long *pos)
> > +{
> > +	unsigned long pfn, flags;
> > +	unsigned int t;
> > +	struct list_head *curr;
> > +	struct page_info_item *info;
> > +
> > +	if (zone_is_empty(zone))
> > +		return 0;
> > +
> > +	spin_lock_irqsave(&zone->lock, flags);
> > +
> > +	if (*pos + zone->free_area[order].nr_free > size)
> > +		return -ENOSPC;
> 
> Urg, so this won't partially fill?  So, what the nr_free pages limit where we no
> longer fit in the kmalloc()'d buffer where this simply won't work?
> 

Yes.  My initial implementation is partially fill, it's better for the worst case.
I thought the above code is more efficient for most case ...
Do you think partially fill the bitmap is better?
 
> > +	for (t = 0; t < MIGRATE_TYPES; t++) {
> > +		list_for_each(curr, &zone->free_area[order].free_list[t]) {
> > +			pfn = page_to_pfn(list_entry(curr, struct page, lru));
> > +			info = (struct page_info_item *)(unused_pages +
> *pos);
> > +			info->start_pfn = pfn;
> > +			info->page_shift = order + PAGE_SHIFT;
> > +			*pos += 1;
> > +		}
> > +	}
> 
> Do we need to fill in ->bmap_len here?

For integrity, the bmap_len should be filled, will add.
Omit this step just because QEMU assume the ->bmp_len is 0 and ignore this field.

Thanks for your comment!

Liang

  reply	other threads:[~2016-12-04 13:13 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-30  8:43 [PATCH kernel v5 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration Liang Li
2016-11-30  8:43 ` [PATCH kernel v5 1/5] virtio-balloon: rework deflate to add page to a list Liang Li
2016-11-30  8:43 ` [PATCH kernel v5 2/5] virtio-balloon: define new feature bit and head struct Liang Li
2016-11-30  8:43 ` [PATCH kernel v5 3/5] virtio-balloon: speed up inflate/deflate process Liang Li
2016-11-30  8:43 ` [PATCH kernel v5 4/5] virtio-balloon: define flags and head for host request vq Liang Li
2016-11-30  8:43 ` [PATCH kernel v5 5/5] virtio-balloon: tell host vm's unused page info Liang Li
2016-11-30 19:15   ` Dave Hansen
2016-12-04 13:13     ` Li, Liang Z [this message]
2016-12-05 17:22       ` Dave Hansen
2016-12-06  4:47         ` Li, Liang Z
2016-12-06  8:40 ` [PATCH kernel v5 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration David Hildenbrand
2016-12-07 13:35   ` Li, Liang Z
2016-12-07 15:34     ` Dave Hansen
2016-12-09  3:09       ` Li, Liang Z
2016-12-07 15:42     ` David Hildenbrand
2016-12-07 15:45       ` Dave Hansen
2016-12-07 16:21         ` David Hildenbrand
2016-12-07 16:57           ` Dave Hansen
2016-12-07 18:38             ` [Qemu-devel] " Andrea Arcangeli
2016-12-07 18:44               ` Dave Hansen
2016-12-07 18:58                 ` Andrea Arcangeli
2016-12-07 19:54               ` Dave Hansen
2016-12-07 20:28                 ` Andrea Arcangeli
2016-12-09  4:45                   ` Li, Liang Z
2016-12-09  4:53                     ` Dave Hansen
2016-12-09  5:35                       ` Li, Liang Z
2016-12-09 16:42                         ` Andrea Arcangeli
2016-12-14  8:20                           ` Li, Liang Z
2016-12-14  8:59                       ` Li, Liang Z
2016-12-15 15:34                         ` Dave Hansen
2016-12-15 15:54                           ` Michael S. Tsirkin
2016-12-16  1:12                             ` Li, Liang Z
2016-12-16 15:40                               ` Andrea Arcangeli
2016-12-17 11:56                                 ` Li, Liang Z
2016-12-16  0:48                           ` Li, Liang Z
2016-12-16  1:09                             ` Dave Hansen
2016-12-16  1:38                               ` Li, Liang Z
2016-12-16  1:40                                 ` Dave Hansen
2016-12-16  1:43                                   ` Li, Liang Z
2016-12-16 16:01                                   ` Andrea Arcangeli
2016-12-17 12:39                                     ` Li, Liang Z

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=F2CBF3009FA73547804AE4C663CAB28E3A12D814@shsmsx102.ccr.corp.intel.com \
    --to=liang.z.li@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=amit.shah@redhat.com \
    --cc=cornelia.huck@de.ibm.com \
    --cc=dave.hansen@intel.com \
    --cc=dgilbert@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@suse.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=virtio-dev@lists.oasis-open.org \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).