From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1758748AbcK3TPy (ORCPT ); Wed, 30 Nov 2016 14:15:54 -0500
Received: from mga01.intel.com ([192.55.52.88]:6860 "EHLO mga01.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1757704AbcK3TPY (ORCPT ); Wed, 30 Nov 2016 14:15:24 -0500
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.31,574,1473145200"; d="scan'208";a="35940261"
Subject: Re: [PATCH kernel v5 5/5] virtio-balloon: tell host vm's unused page info
To: Liang Li, kvm@vger.kernel.org
References: <1480495397-23225-1-git-send-email-liang.z.li@intel.com>
	<1480495397-23225-6-git-send-email-liang.z.li@intel.com>
Cc: virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, virtio-dev@lists.oasis-open.org, qemu-devel@nongnu.org,
	quintela@redhat.com, dgilbert@redhat.com, mst@redhat.com,
	jasowang@redhat.com, kirill.shutemov@linux.intel.com,
	akpm@linux-foundation.org, mhocko@suse.com, pbonzini@redhat.com,
	Mel Gorman, Cornelia Huck, Amit Shah
From: Dave Hansen
Message-ID: <438dd41a-fdf1-2a77-ef9c-8c103f492b2f@intel.com>
Date: Wed, 30 Nov 2016 11:15:23 -0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0
MIME-Version: 1.0
In-Reply-To: <1480495397-23225-6-git-send-email-liang.z.li@intel.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 11/30/2016 12:43 AM, Liang Li wrote:
> +static void send_unused_pages_info(struct virtio_balloon *vb,
> +				unsigned long req_id)
> +{
> +	struct scatterlist sg_in;
> +	unsigned long pos = 0;
> +	struct virtqueue *vq = vb->req_vq;
> +	struct virtio_balloon_resp_hdr *hdr = vb->resp_hdr;
> +	int ret, order;
> +
> +	mutex_lock(&vb->balloon_lock);
> +
> +	for (order = MAX_ORDER - 1; order >= 0; order--) {

I scratched my head for a bit on this one.
Why are you walking over orders, *then* zones?  I *think* you're doing
it because you can efficiently fill the bitmaps at a given order for all
zones, then move to a new bitmap.  But, it would be interesting to
document this.

> +		pos = 0;
> +		ret = get_unused_pages(vb->resp_data,
> +			 vb->resp_buf_size / sizeof(unsigned long),
> +			 order, &pos);

FWIW, get_unused_pages() is a pretty bad name.  "get" usually implies
bumping reference counts or consuming something.  You're just
"recording" or "marking" them.

> +		if (ret == -ENOSPC) {
> +			void *new_resp_data;
> +
> +			new_resp_data = kmalloc(2 * vb->resp_buf_size,
> +						GFP_KERNEL);
> +			if (new_resp_data) {
> +				kfree(vb->resp_data);
> +				vb->resp_data = new_resp_data;
> +				vb->resp_buf_size *= 2;

What happens to the data in ->resp_data at this point?  Doesn't this
just throw it away?

...
> +struct page_info_item {
> +	__le64 start_pfn : 52; /* start pfn for the bitmap */
> +	__le64 page_shift : 6; /* page shift width, in bytes */
> +	__le64 bmap_len : 6;  /* bitmap length, in bytes */
> +};

Is 'bmap_len' too short?  Six bits tops out at 63 bytes, which is a bit
tiny.  Right?

> +static int mark_unused_pages(struct zone *zone,
> +		unsigned long *unused_pages, unsigned long size,
> +		int order, unsigned long *pos)
> +{
> +	unsigned long pfn, flags;
> +	unsigned int t;
> +	struct list_head *curr;
> +	struct page_info_item *info;
> +
> +	if (zone_is_empty(zone))
> +		return 0;
> +
> +	spin_lock_irqsave(&zone->lock, flags);
> +
> +	if (*pos + zone->free_area[order].nr_free > size)
> +		return -ENOSPC;

Urg, so this won't partially fill?  So, what is the nr_free limit at
which we no longer fit in the kmalloc()'d buffer and this simply won't
work?
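
For a rough sense of the two size limits raised above (the 6-bit
'bmap_len' and the point where -ENOSPC stops being recoverable), here is
a small userspace C sketch of the arithmetic.  It is not from the patch:
the helper names are hypothetical, and it assumes each entry is one
8-byte word, as 'unsigned long' is on 64-bit.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value an unsigned bit-field of 'bits' bits can hold. */
static uint64_t bitfield_max(unsigned int bits)
{
	assert(bits > 0 && bits < 64);
	return (UINT64_C(1) << bits) - 1;
}

/* Number of 8-byte entries that fit in a buffer of buf_bytes bytes. */
static uint64_t entries_that_fit(uint64_t buf_bytes)
{
	return buf_bytes / sizeof(uint64_t);
}

/*
 * Bytes of free memory described by 'entries' free blocks of the given
 * order, with one entry per block.
 */
static uint64_t bytes_covered(uint64_t entries, unsigned int order,
			      unsigned int page_shift)
{
	return entries << (order + page_shift);
}
```

Assuming a 4 MiB kmalloc() ceiling (an assumption; the real limit is
KMALLOC_MAX_SIZE and depends on the config) and 4 KiB pages,
entries_that_fit(4 << 20) is 524288, so at order 0 the buffer runs out
at roughly 2 GiB worth of free blocks.  bitfield_max(6) is 63.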

> +	for (t = 0; t < MIGRATE_TYPES; t++) {
> +		list_for_each(curr, &zone->free_area[order].free_list[t]) {
> +			pfn = page_to_pfn(list_entry(curr, struct page, lru));
> +			info = (struct page_info_item *)(unused_pages + *pos);
> +			info->start_pfn = pfn;
> +			info->page_shift = order + PAGE_SHIFT;
> +			*pos += 1;
> +		}
> +	}

Do we need to fill in ->bmap_len here?
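
To make the last two questions concrete, below is a minimal userspace C
sketch of a fill routine that sets all three fields of each entry and
fills partially instead of failing the whole order.  Everything in it is
an assumption for illustration, not the patch's code: pack_item() and
record_free_blocks() are hypothetical names, the field order (pfn in the
low bits) is a guess, 'bmap_len = 0' is assumed to mean "no bitmap
follows", and the value is left in host byte order where the kernel
would need cpu_to_le64().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PFN_BITS   52
#define SHIFT_BITS 6
#define LEN_BITS   6

/* Pack one entry: pfn in the low 52 bits, then page_shift, then bmap_len. */
static uint64_t pack_item(uint64_t pfn, unsigned int page_shift,
			  unsigned int bmap_len)
{
	assert(pfn < (UINT64_C(1) << PFN_BITS));
	assert(page_shift < (1u << SHIFT_BITS));
	assert(bmap_len < (1u << LEN_BITS));
	return pfn | ((uint64_t)page_shift << PFN_BITS) |
	       ((uint64_t)bmap_len << (PFN_BITS + SHIFT_BITS));
}

/*
 * Record up to (size - *pos) entries from pfns[].  Returns how many were
 * written, so the caller can send the buffer and resume from there
 * rather than giving up on the whole order.
 */
static size_t record_free_blocks(uint64_t *buf, size_t size, size_t *pos,
				 const uint64_t *pfns, size_t nr,
				 unsigned int order, unsigned int page_shift)
{
	size_t done = 0;

	while (done < nr && *pos < size) {
		/* bmap_len is set explicitly; 0 = no bitmap (assumed). */
		buf[*pos] = pack_item(pfns[done], order + page_shift, 0);
		(*pos)++;
		done++;
	}
	return done;
}
```

With a two-entry buffer and three free blocks, record_free_blocks()
writes two entries and returns 2, leaving the third for the next pass.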