All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] xen/grant-table: correctly initialize grant table version 1
@ 2012-12-19 20:20 Matt Wilson
  2012-12-20  3:42 ` ANNIE LI
  2012-12-20 10:01 ` Ian Campbell
  0 siblings, 2 replies; 29+ messages in thread
From: Matt Wilson @ 2012-12-19 20:20 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Annie Li, Steven Noonan, Matt Wilson, xen-devel

Commit 85ff6acb075a484780b3d763fdf41596d8fc0970 (xen/granttable: Grant
tables V2 implementation) changed the GREFS_PER_GRANT_FRAME macro from
a constant to a conditional expression. The expression depends on
grant_table_version being appropriately set. Unfortunately, at init
time grant_table_version will be 0. The GREFS_PER_GRANT_FRAME
conditional expression checks for "grant_table_version == 1", and
therefore returns the number of grant references per frame for v2.

This causes gnttab_init() to allocate fewer pages for gnttab_list, as
a frame can old half the number of v2 entries than v1 entries. After
gnttab_resume() is called, grant_table_version is appropriately
set. nr_init_grefs will then be miscalculated and gnttab_free_count
will hold a value larger than the actual number of free gref entries.

If a guest is heavily utilizing improperly initialized v1 grant
tables, memory corruption can occur. One common manifestation is
corruption of the vmalloc list, resulting in a poisoned pointer
derefrence when accessing /proc/meminfo or /proc/vmallocinfo:

[   40.770064] BUG: unable to handle kernel paging request at 0000200200001407
[   40.770083] IP: [<ffffffff811a6fb0>] get_vmalloc_info+0x70/0x110
[   40.770102] PGD 0
[   40.770107] Oops: 0000 [#1] SMP
[   40.770114] CPU 10

This patch introduces a static variable, grefs_per_grant_frame, to
cache the calculated value. gnttab_init() now calls
gnttab_request_version() early so that grant_table_version and
grefs_per_grant_frame can be appropriately set. A few BUG_ON()s have
been added to prevent this type of bug from reoccurring in the future.

Signed-off-by: Matt Wilson <msw@amazon.com>
Reviewed-and-Tested-by: Steven Noonan <snoonan@amazon.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Annie Li <annie.li@oracle.com>
Cc: xen-devel@lists.xen.org
---
 drivers/xen/grant-table.c |   41 +++++++++++++++++++++++------------------
 1 files changed, 23 insertions(+), 18 deletions(-)

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 043bf07..011fdc3 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -55,10 +55,6 @@
 /* External tools reserve first few grant table entries. */
 #define NR_RESERVED_ENTRIES 8
 #define GNTTAB_LIST_END 0xffffffff
-#define GREFS_PER_GRANT_FRAME \
-(grant_table_version == 1 ?                      \
-(PAGE_SIZE / sizeof(struct grant_entry_v1)) :   \
-(PAGE_SIZE / sizeof(union grant_entry_v2)))
 
 static grant_ref_t **gnttab_list;
 static unsigned int nr_grant_frames;
@@ -153,6 +149,7 @@ static struct gnttab_ops *gnttab_interface;
 static grant_status_t *grstatus;
 
 static int grant_table_version;
+static int grefs_per_grant_frame;
 
 static struct gnttab_free_callback *gnttab_free_callback_list;
 
@@ -766,12 +763,14 @@ static int grow_gnttab_list(unsigned int more_frames)
 	unsigned int new_nr_grant_frames, extra_entries, i;
 	unsigned int nr_glist_frames, new_nr_glist_frames;
 
+	BUG_ON(grefs_per_grant_frame == 0);
+
 	new_nr_grant_frames = nr_grant_frames + more_frames;
-	extra_entries       = more_frames * GREFS_PER_GRANT_FRAME;
+	extra_entries       = more_frames * grefs_per_grant_frame;
 
-	nr_glist_frames = (nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+	nr_glist_frames = (nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
 	new_nr_glist_frames =
-		(new_nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+		(new_nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
 	for (i = nr_glist_frames; i < new_nr_glist_frames; i++) {
 		gnttab_list[i] = (grant_ref_t *)__get_free_page(GFP_ATOMIC);
 		if (!gnttab_list[i])
@@ -779,12 +778,12 @@ static int grow_gnttab_list(unsigned int more_frames)
 	}
 
 
-	for (i = GREFS_PER_GRANT_FRAME * nr_grant_frames;
-	     i < GREFS_PER_GRANT_FRAME * new_nr_grant_frames - 1; i++)
+	for (i = grefs_per_grant_frame * nr_grant_frames;
+	     i < grefs_per_grant_frame * new_nr_grant_frames - 1; i++)
 		gnttab_entry(i) = i + 1;
 
 	gnttab_entry(i) = gnttab_free_head;
-	gnttab_free_head = GREFS_PER_GRANT_FRAME * nr_grant_frames;
+	gnttab_free_head = grefs_per_grant_frame * nr_grant_frames;
 	gnttab_free_count += extra_entries;
 
 	nr_grant_frames = new_nr_grant_frames;
@@ -904,7 +903,8 @@ EXPORT_SYMBOL_GPL(gnttab_unmap_refs);
 
 static unsigned nr_status_frames(unsigned nr_grant_frames)
 {
-	return (nr_grant_frames * GREFS_PER_GRANT_FRAME + SPP - 1) / SPP;
+	BUG_ON(grefs_per_grant_frame == 0);
+	return (nr_grant_frames * grefs_per_grant_frame + SPP - 1) / SPP;
 }
 
 static int gnttab_map_frames_v1(xen_pfn_t *frames, unsigned int nr_gframes)
@@ -1068,6 +1068,7 @@ static void gnttab_request_version(void)
 	rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1);
 	if (rc == 0 && gsv.version == 2) {
 		grant_table_version = 2;
+		grefs_per_grant_frame = PAGE_SIZE / sizeof(union grant_entry_v2);
 		gnttab_interface = &gnttab_v2_ops;
 	} else if (grant_table_version == 2) {
 		/*
@@ -1080,10 +1081,9 @@ static void gnttab_request_version(void)
 		panic("we need grant tables version 2, but only version 1 is available");
 	} else {
 		grant_table_version = 1;
+		grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
 		gnttab_interface = &gnttab_v1_ops;
 	}
-	printk(KERN_INFO "Grant tables using version %d layout.\n",
-		grant_table_version);
 }
 
 int gnttab_resume(void)
@@ -1092,6 +1092,8 @@ int gnttab_resume(void)
 	char *kmsg = "Failed to kmalloc pages for pv in hvm grant frames\n";
 
 	gnttab_request_version();
+	printk(KERN_INFO "Grant tables using version %d layout.\n",
+		grant_table_version);
 	max_nr_gframes = gnttab_max_grant_frames();
 	if (max_nr_gframes < nr_grant_frames)
 		return -ENOSYS;
@@ -1137,9 +1139,10 @@ static int gnttab_expand(unsigned int req_entries)
 	int rc;
 	unsigned int cur, extra;
 
+	BUG_ON(grefs_per_grant_frame == 0);
 	cur = nr_grant_frames;
-	extra = ((req_entries + (GREFS_PER_GRANT_FRAME-1)) /
-		 GREFS_PER_GRANT_FRAME);
+	extra = ((req_entries + (grefs_per_grant_frame-1)) /
+		 grefs_per_grant_frame);
 	if (cur + extra > gnttab_max_grant_frames())
 		return -ENOSPC;
 
@@ -1157,21 +1160,23 @@ int gnttab_init(void)
 	unsigned int nr_init_grefs;
 	int ret;
 
+	gnttab_request_version();
 	nr_grant_frames = 1;
 	boot_max_nr_grant_frames = __max_nr_grant_frames();
 
 	/* Determine the maximum number of frames required for the
 	 * grant reference free list on the current hypervisor.
 	 */
+	BUG_ON(grefs_per_grant_frame == 0);
 	max_nr_glist_frames = (boot_max_nr_grant_frames *
-			       GREFS_PER_GRANT_FRAME / RPP);
+			       grefs_per_grant_frame / RPP);
 
 	gnttab_list = kmalloc(max_nr_glist_frames * sizeof(grant_ref_t *),
 			      GFP_KERNEL);
 	if (gnttab_list == NULL)
 		return -ENOMEM;
 
-	nr_glist_frames = (nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+	nr_glist_frames = (nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
 	for (i = 0; i < nr_glist_frames; i++) {
 		gnttab_list[i] = (grant_ref_t *)__get_free_page(GFP_KERNEL);
 		if (gnttab_list[i] == NULL) {
@@ -1185,7 +1190,7 @@ int gnttab_init(void)
 		goto ini_nomem;
 	}
 
-	nr_init_grefs = nr_grant_frames * GREFS_PER_GRANT_FRAME;
+	nr_init_grefs = nr_grant_frames * grefs_per_grant_frame;
 
 	for (i = NR_RESERVED_ENTRIES; i < nr_init_grefs - 1; i++)
 		gnttab_entry(i) = i + 1;
-- 
1.7.4.5

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH] xen/grant-table: correctly initialize grant table version 1
  2012-12-19 20:20 [PATCH] xen/grant-table: correctly initialize grant table version 1 Matt Wilson
@ 2012-12-20  3:42 ` ANNIE LI
  2012-12-20 21:24   ` Matt Wilson
  2012-12-20 10:01 ` Ian Campbell
  1 sibling, 1 reply; 29+ messages in thread
From: ANNIE LI @ 2012-12-20  3:42 UTC (permalink / raw)
  To: Matt Wilson; +Cc: Steven Noonan, xen-devel, Konrad Rzeszutek Wilk



On 2012-12-20 4:20, Matt Wilson wrote:
> Commit 85ff6acb075a484780b3d763fdf41596d8fc0970 (xen/granttable: Grant
> tables V2 implementation) changed the GREFS_PER_GRANT_FRAME macro from
> a constant to a conditional expression. The expression depends on
> grant_table_version being appropriately set. Unfortunately, at init
> time grant_table_version will be 0. The GREFS_PER_GRANT_FRAME
> conditional expression checks for "grant_table_version == 1", and
> therefore returns the number of grant references per frame for v2.
>
> This causes gnttab_init() to allocate fewer pages for gnttab_list, as
> a frame can old half the number of v2 entries than v1 entries. After
> gnttab_resume() is called, grant_table_version is appropriately
> set. nr_init_grefs will then be miscalculated and gnttab_free_count
> will hold a value larger than the actual number of free gref entries.

Correct.

>
> If a guest is heavily utilizing improperly initialized v1 grant
> tables, memory corruption can occur. One common manifestation is
> corruption of the vmalloc list, resulting in a poisoned pointer
> derefrence when accessing /proc/meminfo or /proc/vmallocinfo:
>
> [   40.770064] BUG: unable to handle kernel paging request at 0000200200001407
> [   40.770083] IP: [<ffffffff811a6fb0>] get_vmalloc_info+0x70/0x110
> [   40.770102] PGD 0
> [   40.770107] Oops: 0000 [#1] SMP
> [   40.770114] CPU 10
>
> This patch introduces a static variable, grefs_per_grant_frame, to
> cache the calculated value. gnttab_init() now calls
> gnttab_request_version() early so that grant_table_version and
> grefs_per_grant_frame can be appropriately set. A few BUG_ON()s have
> been added to prevent this type of bug from reoccurring in the future.

Thanks for posting this.
This is caused by not initializing grant_table_version in gnttab_init() 
before gnttab_resume(). How about only adding gnttab_request_version() 
and BUG_ON() to check grant_table_version in gnttab_init(), then no need 
to change GREFS_PER_GRANT_FRAME into grefs_per_grant_frame and add more 
BUG_ON() to check grefs_per_grant_frame?

For example, in gnttab_init()
....

+gnttab_request_version();
+BUG_ON(grant_table_version == 0);

....

Thanks
Annie
>
> Signed-off-by: Matt Wilson<msw@amazon.com>
> Reviewed-and-Tested-by: Steven Noonan<snoonan@amazon.com>
> Cc: Konrad Rzeszutek Wilk<konrad.wilk@oracle.com>
> Cc: Annie Li<annie.li@oracle.com>
> Cc: xen-devel@lists.xen.org
> ---
>   drivers/xen/grant-table.c |   41 +++++++++++++++++++++++------------------
>   1 files changed, 23 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
> index 043bf07..011fdc3 100644
> --- a/drivers/xen/grant-table.c
> +++ b/drivers/xen/grant-table.c
> @@ -55,10 +55,6 @@
>   /* External tools reserve first few grant table entries. */
>   #define NR_RESERVED_ENTRIES 8
>   #define GNTTAB_LIST_END 0xffffffff
> -#define GREFS_PER_GRANT_FRAME \
> -(grant_table_version == 1 ?                      \
> -(PAGE_SIZE / sizeof(struct grant_entry_v1)) :   \
> -(PAGE_SIZE / sizeof(union grant_entry_v2)))
>
>   static grant_ref_t **gnttab_list;
>   static unsigned int nr_grant_frames;
> @@ -153,6 +149,7 @@ static struct gnttab_ops *gnttab_interface;
>   static grant_status_t *grstatus;
>
>   static int grant_table_version;
> +static int grefs_per_grant_frame;
>
>   static struct gnttab_free_callback *gnttab_free_callback_list;
>
> @@ -766,12 +763,14 @@ static int grow_gnttab_list(unsigned int more_frames)
>   	unsigned int new_nr_grant_frames, extra_entries, i;
>   	unsigned int nr_glist_frames, new_nr_glist_frames;
>
> +	BUG_ON(grefs_per_grant_frame == 0);
> +
>   	new_nr_grant_frames = nr_grant_frames + more_frames;
> -	extra_entries       = more_frames * GREFS_PER_GRANT_FRAME;
> +	extra_entries       = more_frames * grefs_per_grant_frame;
>
> -	nr_glist_frames = (nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
> +	nr_glist_frames = (nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
>   	new_nr_glist_frames =
> -		(new_nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
> +		(new_nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
>   	for (i = nr_glist_frames; i<  new_nr_glist_frames; i++) {
>   		gnttab_list[i] = (grant_ref_t *)__get_free_page(GFP_ATOMIC);
>   		if (!gnttab_list[i])
> @@ -779,12 +778,12 @@ static int grow_gnttab_list(unsigned int more_frames)
>   	}
>
>
> -	for (i = GREFS_PER_GRANT_FRAME * nr_grant_frames;
> -	     i<  GREFS_PER_GRANT_FRAME * new_nr_grant_frames - 1; i++)
> +	for (i = grefs_per_grant_frame * nr_grant_frames;
> +	     i<  grefs_per_grant_frame * new_nr_grant_frames - 1; i++)
>   		gnttab_entry(i) = i + 1;
>
>   	gnttab_entry(i) = gnttab_free_head;
> -	gnttab_free_head = GREFS_PER_GRANT_FRAME * nr_grant_frames;
> +	gnttab_free_head = grefs_per_grant_frame * nr_grant_frames;
>   	gnttab_free_count += extra_entries;
>
>   	nr_grant_frames = new_nr_grant_frames;
> @@ -904,7 +903,8 @@ EXPORT_SYMBOL_GPL(gnttab_unmap_refs);
>
>   static unsigned nr_status_frames(unsigned nr_grant_frames)
>   {
> -	return (nr_grant_frames * GREFS_PER_GRANT_FRAME + SPP - 1) / SPP;
> +	BUG_ON(grefs_per_grant_frame == 0);
> +	return (nr_grant_frames * grefs_per_grant_frame + SPP - 1) / SPP;
>   }
>
>   static int gnttab_map_frames_v1(xen_pfn_t *frames, unsigned int nr_gframes)
> @@ -1068,6 +1068,7 @@ static void gnttab_request_version(void)
>   	rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version,&gsv, 1);
>   	if (rc == 0&&  gsv.version == 2) {
>   		grant_table_version = 2;
> +		grefs_per_grant_frame = PAGE_SIZE / sizeof(union grant_entry_v2);
>   		gnttab_interface =&gnttab_v2_ops;
>   	} else if (grant_table_version == 2) {
>   		/*
> @@ -1080,10 +1081,9 @@ static void gnttab_request_version(void)
>   		panic("we need grant tables version 2, but only version 1 is available");
>   	} else {
>   		grant_table_version = 1;
> +		grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
>   		gnttab_interface =&gnttab_v1_ops;
>   	}
> -	printk(KERN_INFO "Grant tables using version %d layout.\n",
> -		grant_table_version);
>   }
>
>   int gnttab_resume(void)
> @@ -1092,6 +1092,8 @@ int gnttab_resume(void)
>   	char *kmsg = "Failed to kmalloc pages for pv in hvm grant frames\n";
>
>   	gnttab_request_version();
> +	printk(KERN_INFO "Grant tables using version %d layout.\n",
> +		grant_table_version);
>   	max_nr_gframes = gnttab_max_grant_frames();
>   	if (max_nr_gframes<  nr_grant_frames)
>   		return -ENOSYS;
> @@ -1137,9 +1139,10 @@ static int gnttab_expand(unsigned int req_entries)
>   	int rc;
>   	unsigned int cur, extra;
>
> +	BUG_ON(grefs_per_grant_frame == 0);
>   	cur = nr_grant_frames;
> -	extra = ((req_entries + (GREFS_PER_GRANT_FRAME-1)) /
> -		 GREFS_PER_GRANT_FRAME);
> +	extra = ((req_entries + (grefs_per_grant_frame-1)) /
> +		 grefs_per_grant_frame);
>   	if (cur + extra>  gnttab_max_grant_frames())
>   		return -ENOSPC;
>
> @@ -1157,21 +1160,23 @@ int gnttab_init(void)
>   	unsigned int nr_init_grefs;
>   	int ret;
>
> +	gnttab_request_version();
>   	nr_grant_frames = 1;
>   	boot_max_nr_grant_frames = __max_nr_grant_frames();
>
>   	/* Determine the maximum number of frames required for the
>   	 * grant reference free list on the current hypervisor.
>   	 */
> +	BUG_ON(grefs_per_grant_frame == 0);
>   	max_nr_glist_frames = (boot_max_nr_grant_frames *
> -			       GREFS_PER_GRANT_FRAME / RPP);
> +			       grefs_per_grant_frame / RPP);
>
>   	gnttab_list = kmalloc(max_nr_glist_frames * sizeof(grant_ref_t *),
>   			      GFP_KERNEL);
>   	if (gnttab_list == NULL)
>   		return -ENOMEM;
>
> -	nr_glist_frames = (nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
> +	nr_glist_frames = (nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
>   	for (i = 0; i<  nr_glist_frames; i++) {
>   		gnttab_list[i] = (grant_ref_t *)__get_free_page(GFP_KERNEL);
>   		if (gnttab_list[i] == NULL) {
> @@ -1185,7 +1190,7 @@ int gnttab_init(void)
>   		goto ini_nomem;
>   	}
>
> -	nr_init_grefs = nr_grant_frames * GREFS_PER_GRANT_FRAME;
> +	nr_init_grefs = nr_grant_frames * grefs_per_grant_frame;
>
>   	for (i = NR_RESERVED_ENTRIES; i<  nr_init_grefs - 1; i++)
>   		gnttab_entry(i) = i + 1;

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] xen/grant-table: correctly initialize grant table version 1
  2012-12-19 20:20 [PATCH] xen/grant-table: correctly initialize grant table version 1 Matt Wilson
  2012-12-20  3:42 ` ANNIE LI
@ 2012-12-20 10:01 ` Ian Campbell
  2012-12-20 21:19   ` Matt Wilson
  1 sibling, 1 reply; 29+ messages in thread
From: Ian Campbell @ 2012-12-20 10:01 UTC (permalink / raw)
  To: Matt Wilson; +Cc: Annie Li, Steven Noonan, xen-devel, Konrad Rzeszutek Wilk

On Wed, 2012-12-19 at 20:20 +0000, Matt Wilson wrote:
> Commit 85ff6acb075a484780b3d763fdf41596d8fc0970 (xen/granttable: Grant
> tables V2 implementation) changed the GREFS_PER_GRANT_FRAME macro from
> a constant to a conditional expression. The expression depends on
> grant_table_version being appropriately set. Unfortunately, at init
> time grant_table_version will be 0. The GREFS_PER_GRANT_FRAME
> conditional expression checks for "grant_table_version == 1", and
> therefore returns the number of grant references per frame for v2.
> 
> This causes gnttab_init() to allocate fewer pages for gnttab_list, as
> a frame can old half the number of v2 entries than v1 entries. After
> gnttab_resume() is called, grant_table_version is appropriately
> set. nr_init_grefs will then be miscalculated and gnttab_free_count
> will hold a value larger than the actual number of free gref entries.
> 
> If a guest is heavily utilizing improperly initialized v1 grant
> tables, memory corruption can occur.

Good catch!

[...]
@@ -1080,10 +1081,9 @@ static void gnttab_request_version(void)
>  		panic("we need grant tables version 2, but only version 1 is available");
>  	} else {
>  		grant_table_version = 1;
> +		grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
>  		gnttab_interface = &gnttab_v1_ops;
>  	}
> -	printk(KERN_INFO "Grant tables using version %d layout.\n",
> -		grant_table_version);

You remove this here, and re-add it in the gnttab_resume callsite but I
don't  see anything added at the gnttab_init callsite. It would be
useful to keep this print at start of day too.

Oh, I see, gnttab_init also calls gnttab_resume later so we get the
message from there, with gnttab_request_version getting called twice.

Can we avoid doing this twice by moving the tail of gnttab_resume
(everything except the gnttab_request_version) into a new gnttab_setup
and calling that from gnttab_resume and gnttab_init (instead of calling
resume)?

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] xen/grant-table: correctly initialize grant table version 1
  2012-12-20 10:01 ` Ian Campbell
@ 2012-12-20 21:19   ` Matt Wilson
  0 siblings, 0 replies; 29+ messages in thread
From: Matt Wilson @ 2012-12-20 21:19 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Annie Li, Steven Noonan, xen-devel, Konrad Rzeszutek Wilk

On Thu, Dec 20, 2012 at 10:01:50AM +0000, Ian Campbell wrote:
> On Wed, 2012-12-19 at 20:20 +0000, Matt Wilson wrote:
> > Commit 85ff6acb075a484780b3d763fdf41596d8fc0970 (xen/granttable: Grant
> > tables V2 implementation) changed the GREFS_PER_GRANT_FRAME macro from
> > a constant to a conditional expression. The expression depends on
> > grant_table_version being appropriately set. Unfortunately, at init
> > time grant_table_version will be 0. The GREFS_PER_GRANT_FRAME
> > conditional expression checks for "grant_table_version == 1", and
> > therefore returns the number of grant references per frame for v2.
> > 
> > This causes gnttab_init() to allocate fewer pages for gnttab_list, as
> > a frame can old half the number of v2 entries than v1 entries. After
> > gnttab_resume() is called, grant_table_version is appropriately
> > set. nr_init_grefs will then be miscalculated and gnttab_free_count
> > will hold a value larger than the actual number of free gref entries.
> > 
> > If a guest is heavily utilizing improperly initialized v1 grant
> > tables, memory corruption can occur.
> 
> Good catch!
> 
> [...]
> @@ -1080,10 +1081,9 @@ static void gnttab_request_version(void)
> >  		panic("we need grant tables version 2, but only version 1 is available");
> >  	} else {
> >  		grant_table_version = 1;
> > +		grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
> >  		gnttab_interface = &gnttab_v1_ops;
> >  	}
> > -	printk(KERN_INFO "Grant tables using version %d layout.\n",
> > -		grant_table_version);
> 
> You remove this here, and re-add it in the gnttab_resume callsite but I
> don't  see anything added at the gnttab_init callsite. It would be
> useful to keep this print at start of day too.
>
> Oh, I see, gnttab_init also calls gnttab_resume later so we get the
> message from there, with gnttab_request_version getting called twice.

Right, I was just avoiding printing the "Grant tables using version 1
layout." message at init time.
 
> Can we avoid doing this twice by moving the tail of gnttab_resume
> (everything except the gnttab_request_version) into a new gnttab_setup
> and calling that from gnttab_resume and gnttab_init (instead of calling
> resume)?

Sure, I can do that.

Matt

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] xen/grant-table: correctly initialize grant table version 1
  2012-12-20  3:42 ` ANNIE LI
@ 2012-12-20 21:24   ` Matt Wilson
  0 siblings, 0 replies; 29+ messages in thread
From: Matt Wilson @ 2012-12-20 21:24 UTC (permalink / raw)
  To: ANNIE LI; +Cc: Steven Noonan, xen-devel, Konrad Rzeszutek Wilk

On Thu, Dec 20, 2012 at 11:42:05AM +0800, ANNIE LI wrote:
> 
> Thanks for posting this.
> This is caused by not initializing grant_table_version in
> gnttab_init() before gnttab_resume(). How about only adding
> gnttab_request_version() and BUG_ON() to check grant_table_version
> in gnttab_init(), then no need to change GREFS_PER_GRANT_FRAME into
> grefs_per_grant_frame and add more BUG_ON() to check
> grefs_per_grant_frame?
> 
> For example, in gnttab_init()
> ....
> 
> +gnttab_request_version();
> +BUG_ON(grant_table_version == 0);
> 

Having GREFS_PER_GRANT_FRAME evaluate a conditional every time it's
used makes me uneasy. I know I added excessive BUG_ON()s, but my
intent was just to be defensive. GREFS_PER_GRANT_FRAME is used in the
conditional part of for (...) loops in several locations. I think it's
better to have this as a variable instead of a conditional expression.

Matt

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v2] xen/grant-table: correctly initialize grant table version 1
       [not found] <Message-ID: <1355997710.26722.12.camel@zakaz.uk.xensource.com>
  2013-01-06 11:14 ` [PATCH v2] " Matt Wilson
@ 2013-01-06 11:14 ` Matt Wilson
  2013-01-09  2:40   ` ANNIE LI
                     ` (3 more replies)
  1 sibling, 4 replies; 29+ messages in thread
From: Matt Wilson @ 2013-01-06 11:14 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: stable, Matt Wilson, Ian Campbell, Annie Li, xen-devel, linux-kernel

Commit 85ff6acb075a484780b3d763fdf41596d8fc0970 (xen/granttable: Grant
tables V2 implementation) changed the GREFS_PER_GRANT_FRAME macro from
a constant to a conditional expression. The expression depends on
grant_table_version being appropriately set. Unfortunately, at init
time grant_table_version will be 0. The GREFS_PER_GRANT_FRAME
conditional expression checks for "grant_table_version == 1", and
therefore returns the number of grant references per frame for v2.

This causes gnttab_init() to allocate fewer pages for gnttab_list, as
a frame can old half the number of v2 entries than v1 entries. After
gnttab_resume() is called, grant_table_version is appropriately
set. nr_init_grefs will then be miscalculated and gnttab_free_count
will hold a value larger than the actual number of free gref entries.

If a guest is heavily utilizing improperly initialized v1 grant
tables, memory corruption can occur. One common manifestation is
corruption of the vmalloc list, resulting in a poisoned pointer
derefrence when accessing /proc/meminfo or /proc/vmallocinfo:

[   40.770064] BUG: unable to handle kernel paging request at 0000200200001407
[   40.770083] IP: [<ffffffff811a6fb0>] get_vmalloc_info+0x70/0x110
[   40.770102] PGD 0
[   40.770107] Oops: 0000 [#1] SMP
[   40.770114] CPU 10

This patch introduces a static variable, grefs_per_grant_frame, to
cache the calculated value. gnttab_init() now calls
gnttab_request_version() early so that grant_table_version and
grefs_per_grant_frame can be appropriately set. A few BUG_ON()s have
been added to prevent this type of bug from reoccurring in the future.

Signed-off-by: Matt Wilson <msw@amazon.com>
Reviewed-and-Tested-by: Steven Noonan <snoonan@amazon.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Annie Li <annie.li@oracle.com>
Cc: xen-devel@lists.xen.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # v3.3 and newer
---
Changes since v1:
* introduced a new gnttab_setup() function and moved all of the
  initialization code from gnttab_resume() there.
---
 drivers/xen/grant-table.c |   52 ++++++++++++++++++++++++++------------------
 1 files changed, 31 insertions(+), 21 deletions(-)

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 043bf07..53715de 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -55,10 +55,6 @@
 /* External tools reserve first few grant table entries. */
 #define NR_RESERVED_ENTRIES 8
 #define GNTTAB_LIST_END 0xffffffff
-#define GREFS_PER_GRANT_FRAME \
-(grant_table_version == 1 ?                      \
-(PAGE_SIZE / sizeof(struct grant_entry_v1)) :   \
-(PAGE_SIZE / sizeof(union grant_entry_v2)))
 
 static grant_ref_t **gnttab_list;
 static unsigned int nr_grant_frames;
@@ -153,6 +149,7 @@ static struct gnttab_ops *gnttab_interface;
 static grant_status_t *grstatus;
 
 static int grant_table_version;
+static int grefs_per_grant_frame;
 
 static struct gnttab_free_callback *gnttab_free_callback_list;
 
@@ -766,12 +763,14 @@ static int grow_gnttab_list(unsigned int more_frames)
 	unsigned int new_nr_grant_frames, extra_entries, i;
 	unsigned int nr_glist_frames, new_nr_glist_frames;
 
+	BUG_ON(grefs_per_grant_frame == 0);
+
 	new_nr_grant_frames = nr_grant_frames + more_frames;
-	extra_entries       = more_frames * GREFS_PER_GRANT_FRAME;
+	extra_entries       = more_frames * grefs_per_grant_frame;
 
-	nr_glist_frames = (nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+	nr_glist_frames = (nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
 	new_nr_glist_frames =
-		(new_nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+		(new_nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
 	for (i = nr_glist_frames; i < new_nr_glist_frames; i++) {
 		gnttab_list[i] = (grant_ref_t *)__get_free_page(GFP_ATOMIC);
 		if (!gnttab_list[i])
@@ -779,12 +778,12 @@ static int grow_gnttab_list(unsigned int more_frames)
 	}
 
 
-	for (i = GREFS_PER_GRANT_FRAME * nr_grant_frames;
-	     i < GREFS_PER_GRANT_FRAME * new_nr_grant_frames - 1; i++)
+	for (i = grefs_per_grant_frame * nr_grant_frames;
+	     i < grefs_per_grant_frame * new_nr_grant_frames - 1; i++)
 		gnttab_entry(i) = i + 1;
 
 	gnttab_entry(i) = gnttab_free_head;
-	gnttab_free_head = GREFS_PER_GRANT_FRAME * nr_grant_frames;
+	gnttab_free_head = grefs_per_grant_frame * nr_grant_frames;
 	gnttab_free_count += extra_entries;
 
 	nr_grant_frames = new_nr_grant_frames;
@@ -904,7 +903,8 @@ EXPORT_SYMBOL_GPL(gnttab_unmap_refs);
 
 static unsigned nr_status_frames(unsigned nr_grant_frames)
 {
-	return (nr_grant_frames * GREFS_PER_GRANT_FRAME + SPP - 1) / SPP;
+	BUG_ON(grefs_per_grant_frame == 0);
+	return (nr_grant_frames * grefs_per_grant_frame + SPP - 1) / SPP;
 }
 
 static int gnttab_map_frames_v1(xen_pfn_t *frames, unsigned int nr_gframes)
@@ -1068,6 +1068,7 @@ static void gnttab_request_version(void)
 	rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1);
 	if (rc == 0 && gsv.version == 2) {
 		grant_table_version = 2;
+		grefs_per_grant_frame = PAGE_SIZE / sizeof(union grant_entry_v2);
 		gnttab_interface = &gnttab_v2_ops;
 	} else if (grant_table_version == 2) {
 		/*
@@ -1080,18 +1081,18 @@ static void gnttab_request_version(void)
 		panic("we need grant tables version 2, but only version 1 is available");
 	} else {
 		grant_table_version = 1;
+		grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
 		gnttab_interface = &gnttab_v1_ops;
 	}
-	printk(KERN_INFO "Grant tables using version %d layout.\n",
-		grant_table_version);
 }
 
-int gnttab_resume(void)
+static int gnttab_setup(void)
 {
 	unsigned int max_nr_gframes;
 	char *kmsg = "Failed to kmalloc pages for pv in hvm grant frames\n";
 
-	gnttab_request_version();
+	printk(KERN_INFO "Grant tables using version %d layout.\n",
+		grant_table_version);
 	max_nr_gframes = gnttab_max_grant_frames();
 	if (max_nr_gframes < nr_grant_frames)
 		return -ENOSYS;
@@ -1126,6 +1127,12 @@ int gnttab_resume(void)
 	return 0;
 }
 
+int gnttab_resume(void)
+{
+	gnttab_request_version();
+	return gnttab_setup();
+}
+
 int gnttab_suspend(void)
 {
 	gnttab_interface->unmap_frames();
@@ -1137,9 +1144,10 @@ static int gnttab_expand(unsigned int req_entries)
 	int rc;
 	unsigned int cur, extra;
 
+	BUG_ON(grefs_per_grant_frame == 0);
 	cur = nr_grant_frames;
-	extra = ((req_entries + (GREFS_PER_GRANT_FRAME-1)) /
-		 GREFS_PER_GRANT_FRAME);
+	extra = ((req_entries + (grefs_per_grant_frame-1)) /
+		 grefs_per_grant_frame);
 	if (cur + extra > gnttab_max_grant_frames())
 		return -ENOSPC;
 
@@ -1157,21 +1165,23 @@ int gnttab_init(void)
 	unsigned int nr_init_grefs;
 	int ret;
 
+	gnttab_request_version();
 	nr_grant_frames = 1;
 	boot_max_nr_grant_frames = __max_nr_grant_frames();
 
 	/* Determine the maximum number of frames required for the
 	 * grant reference free list on the current hypervisor.
 	 */
+	BUG_ON(grefs_per_grant_frame == 0);
 	max_nr_glist_frames = (boot_max_nr_grant_frames *
-			       GREFS_PER_GRANT_FRAME / RPP);
+			       grefs_per_grant_frame / RPP);
 
 	gnttab_list = kmalloc(max_nr_glist_frames * sizeof(grant_ref_t *),
 			      GFP_KERNEL);
 	if (gnttab_list == NULL)
 		return -ENOMEM;
 
-	nr_glist_frames = (nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+	nr_glist_frames = (nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
 	for (i = 0; i < nr_glist_frames; i++) {
 		gnttab_list[i] = (grant_ref_t *)__get_free_page(GFP_KERNEL);
 		if (gnttab_list[i] == NULL) {
@@ -1180,12 +1190,12 @@ int gnttab_init(void)
 		}
 	}
 
-	if (gnttab_resume() < 0) {
+	if (gnttab_setup() < 0) {
 		ret = -ENODEV;
 		goto ini_nomem;
 	}
 
-	nr_init_grefs = nr_grant_frames * GREFS_PER_GRANT_FRAME;
+	nr_init_grefs = nr_grant_frames * grefs_per_grant_frame;
 
 	for (i = NR_RESERVED_ENTRIES; i < nr_init_grefs - 1; i++)
 		gnttab_entry(i) = i + 1;
-- 
1.7.4.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2] xen/grant-table: correctly initialize grant table version 1
       [not found] <Message-ID: <1355997710.26722.12.camel@zakaz.uk.xensource.com>
@ 2013-01-06 11:14 ` Matt Wilson
  2013-01-06 11:14 ` Matt Wilson
  1 sibling, 0 replies; 29+ messages in thread
From: Matt Wilson @ 2013-01-06 11:14 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Ian Campbell, linux-kernel, stable, xen-devel, Annie Li, Matt Wilson

Commit 85ff6acb075a484780b3d763fdf41596d8fc0970 (xen/granttable: Grant
tables V2 implementation) changed the GREFS_PER_GRANT_FRAME macro from
a constant to a conditional expression. The expression depends on
grant_table_version being appropriately set. Unfortunately, at init
time grant_table_version will be 0. The GREFS_PER_GRANT_FRAME
conditional expression checks for "grant_table_version == 1", and
therefore returns the number of grant references per frame for v2.

This causes gnttab_init() to allocate fewer pages for gnttab_list, as
a frame can old half the number of v2 entries than v1 entries. After
gnttab_resume() is called, grant_table_version is appropriately
set. nr_init_grefs will then be miscalculated and gnttab_free_count
will hold a value larger than the actual number of free gref entries.

If a guest is heavily utilizing improperly initialized v1 grant
tables, memory corruption can occur. One common manifestation is
corruption of the vmalloc list, resulting in a poisoned pointer
derefrence when accessing /proc/meminfo or /proc/vmallocinfo:

[   40.770064] BUG: unable to handle kernel paging request at 0000200200001407
[   40.770083] IP: [<ffffffff811a6fb0>] get_vmalloc_info+0x70/0x110
[   40.770102] PGD 0
[   40.770107] Oops: 0000 [#1] SMP
[   40.770114] CPU 10

This patch introduces a static variable, grefs_per_grant_frame, to
cache the calculated value. gnttab_init() now calls
gnttab_request_version() early so that grant_table_version and
grefs_per_grant_frame can be appropriately set. A few BUG_ON()s have
been added to prevent this type of bug from reoccurring in the future.

Signed-off-by: Matt Wilson <msw@amazon.com>
Reviewed-and-Tested-by: Steven Noonan <snoonan@amazon.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Annie Li <annie.li@oracle.com>
Cc: xen-devel@lists.xen.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # v3.3 and newer
---
Changes since v1:
* introduced a new gnttab_setup() function and moved all of the
  initialization code from gnttab_resume() there.
---
 drivers/xen/grant-table.c |   52 ++++++++++++++++++++++++++------------------
 1 files changed, 31 insertions(+), 21 deletions(-)

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 043bf07..53715de 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -55,10 +55,6 @@
 /* External tools reserve first few grant table entries. */
 #define NR_RESERVED_ENTRIES 8
 #define GNTTAB_LIST_END 0xffffffff
-#define GREFS_PER_GRANT_FRAME \
-(grant_table_version == 1 ?                      \
-(PAGE_SIZE / sizeof(struct grant_entry_v1)) :   \
-(PAGE_SIZE / sizeof(union grant_entry_v2)))
 
 static grant_ref_t **gnttab_list;
 static unsigned int nr_grant_frames;
@@ -153,6 +149,7 @@ static struct gnttab_ops *gnttab_interface;
 static grant_status_t *grstatus;
 
 static int grant_table_version;
+static int grefs_per_grant_frame;
 
 static struct gnttab_free_callback *gnttab_free_callback_list;
 
@@ -766,12 +763,14 @@ static int grow_gnttab_list(unsigned int more_frames)
 	unsigned int new_nr_grant_frames, extra_entries, i;
 	unsigned int nr_glist_frames, new_nr_glist_frames;
 
+	BUG_ON(grefs_per_grant_frame == 0);
+
 	new_nr_grant_frames = nr_grant_frames + more_frames;
-	extra_entries       = more_frames * GREFS_PER_GRANT_FRAME;
+	extra_entries       = more_frames * grefs_per_grant_frame;
 
-	nr_glist_frames = (nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+	nr_glist_frames = (nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
 	new_nr_glist_frames =
-		(new_nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+		(new_nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
 	for (i = nr_glist_frames; i < new_nr_glist_frames; i++) {
 		gnttab_list[i] = (grant_ref_t *)__get_free_page(GFP_ATOMIC);
 		if (!gnttab_list[i])
@@ -779,12 +778,12 @@ static int grow_gnttab_list(unsigned int more_frames)
 	}
 
 
-	for (i = GREFS_PER_GRANT_FRAME * nr_grant_frames;
-	     i < GREFS_PER_GRANT_FRAME * new_nr_grant_frames - 1; i++)
+	for (i = grefs_per_grant_frame * nr_grant_frames;
+	     i < grefs_per_grant_frame * new_nr_grant_frames - 1; i++)
 		gnttab_entry(i) = i + 1;
 
 	gnttab_entry(i) = gnttab_free_head;
-	gnttab_free_head = GREFS_PER_GRANT_FRAME * nr_grant_frames;
+	gnttab_free_head = grefs_per_grant_frame * nr_grant_frames;
 	gnttab_free_count += extra_entries;
 
 	nr_grant_frames = new_nr_grant_frames;
@@ -904,7 +903,8 @@ EXPORT_SYMBOL_GPL(gnttab_unmap_refs);
 
 static unsigned nr_status_frames(unsigned nr_grant_frames)
 {
-	return (nr_grant_frames * GREFS_PER_GRANT_FRAME + SPP - 1) / SPP;
+	BUG_ON(grefs_per_grant_frame == 0);
+	return (nr_grant_frames * grefs_per_grant_frame + SPP - 1) / SPP;
 }
 
 static int gnttab_map_frames_v1(xen_pfn_t *frames, unsigned int nr_gframes)
@@ -1068,6 +1068,7 @@ static void gnttab_request_version(void)
 	rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1);
 	if (rc == 0 && gsv.version == 2) {
 		grant_table_version = 2;
+		grefs_per_grant_frame = PAGE_SIZE / sizeof(union grant_entry_v2);
 		gnttab_interface = &gnttab_v2_ops;
 	} else if (grant_table_version == 2) {
 		/*
@@ -1080,18 +1081,18 @@ static void gnttab_request_version(void)
 		panic("we need grant tables version 2, but only version 1 is available");
 	} else {
 		grant_table_version = 1;
+		grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
 		gnttab_interface = &gnttab_v1_ops;
 	}
-	printk(KERN_INFO "Grant tables using version %d layout.\n",
-		grant_table_version);
 }
 
-int gnttab_resume(void)
+static int gnttab_setup(void)
 {
 	unsigned int max_nr_gframes;
 	char *kmsg = "Failed to kmalloc pages for pv in hvm grant frames\n";
 
-	gnttab_request_version();
+	printk(KERN_INFO "Grant tables using version %d layout.\n",
+		grant_table_version);
 	max_nr_gframes = gnttab_max_grant_frames();
 	if (max_nr_gframes < nr_grant_frames)
 		return -ENOSYS;
@@ -1126,6 +1127,12 @@ int gnttab_resume(void)
 	return 0;
 }
 
+int gnttab_resume(void)
+{
+	gnttab_request_version();
+	return gnttab_setup();
+}
+
 int gnttab_suspend(void)
 {
 	gnttab_interface->unmap_frames();
@@ -1137,9 +1144,10 @@ static int gnttab_expand(unsigned int req_entries)
 	int rc;
 	unsigned int cur, extra;
 
+	BUG_ON(grefs_per_grant_frame == 0);
 	cur = nr_grant_frames;
-	extra = ((req_entries + (GREFS_PER_GRANT_FRAME-1)) /
-		 GREFS_PER_GRANT_FRAME);
+	extra = ((req_entries + (grefs_per_grant_frame-1)) /
+		 grefs_per_grant_frame);
 	if (cur + extra > gnttab_max_grant_frames())
 		return -ENOSPC;
 
@@ -1157,21 +1165,23 @@ int gnttab_init(void)
 	unsigned int nr_init_grefs;
 	int ret;
 
+	gnttab_request_version();
 	nr_grant_frames = 1;
 	boot_max_nr_grant_frames = __max_nr_grant_frames();
 
 	/* Determine the maximum number of frames required for the
 	 * grant reference free list on the current hypervisor.
 	 */
+	BUG_ON(grefs_per_grant_frame == 0);
 	max_nr_glist_frames = (boot_max_nr_grant_frames *
-			       GREFS_PER_GRANT_FRAME / RPP);
+			       grefs_per_grant_frame / RPP);
 
 	gnttab_list = kmalloc(max_nr_glist_frames * sizeof(grant_ref_t *),
 			      GFP_KERNEL);
 	if (gnttab_list == NULL)
 		return -ENOMEM;
 
-	nr_glist_frames = (nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+	nr_glist_frames = (nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
 	for (i = 0; i < nr_glist_frames; i++) {
 		gnttab_list[i] = (grant_ref_t *)__get_free_page(GFP_KERNEL);
 		if (gnttab_list[i] == NULL) {
@@ -1180,12 +1190,12 @@ int gnttab_init(void)
 		}
 	}
 
-	if (gnttab_resume() < 0) {
+	if (gnttab_setup() < 0) {
 		ret = -ENODEV;
 		goto ini_nomem;
 	}
 
-	nr_init_grefs = nr_grant_frames * GREFS_PER_GRANT_FRAME;
+	nr_init_grefs = nr_grant_frames * grefs_per_grant_frame;
 
 	for (i = NR_RESERVED_ENTRIES; i < nr_init_grefs - 1; i++)
 		gnttab_entry(i) = i + 1;
-- 
1.7.4.5

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [Xen-devel] [PATCH v2] xen/grant-table: correctly initialize grant table version 1
  2013-01-06 11:14 ` Matt Wilson
  2013-01-09  2:40   ` ANNIE LI
@ 2013-01-09  2:40   ` ANNIE LI
  2013-01-09 12:02     ` Ian Campbell
  2013-01-09 12:02     ` Ian Campbell
  2013-01-11 21:35   ` Konrad Rzeszutek Wilk
  2013-01-11 21:35   ` Konrad Rzeszutek Wilk
  3 siblings, 2 replies; 29+ messages in thread
From: ANNIE LI @ 2013-01-09  2:40 UTC (permalink / raw)
  To: Matt Wilson
  Cc: Konrad Rzeszutek Wilk, Ian Campbell, linux-kernel, stable, xen-devel

Thanks so much for posting this.

On 2013-1-6 19:14, Matt Wilson wrote:
> Commit 85ff6acb075a484780b3d763fdf41596d8fc0970 (xen/granttable: Grant
> tables V2 implementation) changed the GREFS_PER_GRANT_FRAME macro from
> a constant to a conditional expression. The expression depends on
> grant_table_version being appropriately set. Unfortunately, at init
> time grant_table_version will be 0. The GREFS_PER_GRANT_FRAME
> conditional expression checks for "grant_table_version == 1", and
> therefore returns the number of grant references per frame for v2.
>
> This causes gnttab_init() to allocate fewer pages for gnttab_list, as
> a frame can old half the number of v2 entries than v1 entries. After
> gnttab_resume() is called, grant_table_version is appropriately
> set. nr_init_grefs will then be miscalculated and gnttab_free_count
> will hold a value larger than the actual number of free gref entries.
>
> If a guest is heavily utilizing improperly initialized v1 grant
> tables, memory corruption can occur. One common manifestation is
> corruption of the vmalloc list, resulting in a poisoned pointer
> derefrence when accessing /proc/meminfo or /proc/vmallocinfo:
>
> [   40.770064] BUG: unable to handle kernel paging request at 0000200200001407
> [   40.770083] IP: [<ffffffff811a6fb0>] get_vmalloc_info+0x70/0x110
> [   40.770102] PGD 0
> [   40.770107] Oops: 0000 [#1] SMP
> [   40.770114] CPU 10
>
> This patch introduces a static variable, grefs_per_grant_frame, to
> cache the calculated value. gnttab_init() now calls
> gnttab_request_version() early so that grant_table_version and
> grefs_per_grant_frame can be appropriately set. A few BUG_ON()s have
> been added to prevent this type of bug from reoccurring in the future.
>
> Signed-off-by: Matt Wilson<msw@amazon.com>
> Reviewed-and-Tested-by: Steven Noonan<snoonan@amazon.com>
> Cc: Ian Campbell<Ian.Campbell@citrix.com>
> Cc: Konrad Rzeszutek Wilk<konrad.wilk@oracle.com>
> Cc: Annie Li<annie.li@oracle.com>
> Cc: xen-devel@lists.xen.org
> Cc: linux-kernel@vger.kernel.org
> Cc: stable@vger.kernel.org # v3.3 and newer
> ---
> Changes since v1:
> * introduced a new gnttab_setup() function and moved all of the
>    initialization code from gnttab_resume() there.
> ---
>   drivers/xen/grant-table.c |   52 ++++++++++++++++++++++++++------------------
>   1 files changed, 31 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
> index 043bf07..53715de 100644
> --- a/drivers/xen/grant-table.c
> +++ b/drivers/xen/grant-table.c
> @@ -55,10 +55,6 @@
>   /* External tools reserve first few grant table entries. */
>   #define NR_RESERVED_ENTRIES 8
>   #define GNTTAB_LIST_END 0xffffffff
> -#define GREFS_PER_GRANT_FRAME \
> -(grant_table_version == 1 ?                      \
> -(PAGE_SIZE / sizeof(struct grant_entry_v1)) :   \
> -(PAGE_SIZE / sizeof(union grant_entry_v2)))
>
>   static grant_ref_t **gnttab_list;
>   static unsigned int nr_grant_frames;
> @@ -153,6 +149,7 @@ static struct gnttab_ops *gnttab_interface;
>   static grant_status_t *grstatus;
>
>   static int grant_table_version;
> +static int grefs_per_grant_frame;
>
>   static struct gnttab_free_callback *gnttab_free_callback_list;
>
> @@ -766,12 +763,14 @@ static int grow_gnttab_list(unsigned int more_frames)
>   	unsigned int new_nr_grant_frames, extra_entries, i;
>   	unsigned int nr_glist_frames, new_nr_glist_frames;
>
> +	BUG_ON(grefs_per_grant_frame == 0);
> +
>   	new_nr_grant_frames = nr_grant_frames + more_frames;
> -	extra_entries       = more_frames * GREFS_PER_GRANT_FRAME;
> +	extra_entries       = more_frames * grefs_per_grant_frame;
>
> -	nr_glist_frames = (nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
> +	nr_glist_frames = (nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
>   	new_nr_glist_frames =
> -		(new_nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
> +		(new_nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
>   	for (i = nr_glist_frames; i<  new_nr_glist_frames; i++) {
>   		gnttab_list[i] = (grant_ref_t *)__get_free_page(GFP_ATOMIC);
>   		if (!gnttab_list[i])
> @@ -779,12 +778,12 @@ static int grow_gnttab_list(unsigned int more_frames)
>   	}
>
>
> -	for (i = GREFS_PER_GRANT_FRAME * nr_grant_frames;
> -	     i<  GREFS_PER_GRANT_FRAME * new_nr_grant_frames - 1; i++)
> +	for (i = grefs_per_grant_frame * nr_grant_frames;
> +	     i<  grefs_per_grant_frame * new_nr_grant_frames - 1; i++)
>   		gnttab_entry(i) = i + 1;
>
>   	gnttab_entry(i) = gnttab_free_head;
> -	gnttab_free_head = GREFS_PER_GRANT_FRAME * nr_grant_frames;
> +	gnttab_free_head = grefs_per_grant_frame * nr_grant_frames;
>   	gnttab_free_count += extra_entries;
>
>   	nr_grant_frames = new_nr_grant_frames;
> @@ -904,7 +903,8 @@ EXPORT_SYMBOL_GPL(gnttab_unmap_refs);
>
>   static unsigned nr_status_frames(unsigned nr_grant_frames)
>   {
> -	return (nr_grant_frames * GREFS_PER_GRANT_FRAME + SPP - 1) / SPP;
> +	BUG_ON(grefs_per_grant_frame == 0);
> +	return (nr_grant_frames * grefs_per_grant_frame + SPP - 1) / SPP;
>   }
>
>   static int gnttab_map_frames_v1(xen_pfn_t *frames, unsigned int nr_gframes)
> @@ -1068,6 +1068,7 @@ static void gnttab_request_version(void)
>   	rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version,&gsv, 1);
>   	if (rc == 0&&  gsv.version == 2) {
>   		grant_table_version = 2;
> +		grefs_per_grant_frame = PAGE_SIZE / sizeof(union grant_entry_v2);
>   		gnttab_interface =&gnttab_v2_ops;
>   	} else if (grant_table_version == 2) {
>   		/*
> @@ -1080,18 +1081,18 @@ static void gnttab_request_version(void)
>   		panic("we need grant tables version 2, but only version 1 is available");
>   	} else {
>   		grant_table_version = 1;
> +		grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
>   		gnttab_interface =&gnttab_v1_ops;
>   	}
> -	printk(KERN_INFO "Grant tables using version %d layout.\n",
> -		grant_table_version);
>   }
>   

Is it better to keep printk here? In your last patch, you removed it 
because gnttab_request_version and gnttab_resume are all called in 
gnttab_init. and gnttab_resume also contains calling of 
gnttab_request_version. But in this patch, gnttab_setup is used, and 
does not have this issue now.

> -int gnttab_resume(void)
> +static int gnttab_setup(void)
>   {
>   	unsigned int max_nr_gframes;
>   	char *kmsg = "Failed to kmalloc pages for pv in hvm grant frames\n";
>
> -	gnttab_request_version();
> +	printk(KERN_INFO "Grant tables using version %d layout.\n",
> +		grant_table_version);

See comments above.

Thanks
Annie
>   	max_nr_gframes = gnttab_max_grant_frames();
>   	if (max_nr_gframes<  nr_grant_frames)
>   		return -ENOSYS;
> @@ -1126,6 +1127,12 @@ int gnttab_resume(void)
>   	return 0;
>   }
>
> +int gnttab_resume(void)
> +{
> +	gnttab_request_version();
> +	return gnttab_setup();
> +}
> +
>   int gnttab_suspend(void)
>   {
>   	gnttab_interface->unmap_frames();
> @@ -1137,9 +1144,10 @@ static int gnttab_expand(unsigned int req_entries)
>   	int rc;
>   	unsigned int cur, extra;
>
> +	BUG_ON(grefs_per_grant_frame == 0);
>   	cur = nr_grant_frames;
> -	extra = ((req_entries + (GREFS_PER_GRANT_FRAME-1)) /
> -		 GREFS_PER_GRANT_FRAME);
> +	extra = ((req_entries + (grefs_per_grant_frame-1)) /
> +		 grefs_per_grant_frame);
>   	if (cur + extra>  gnttab_max_grant_frames())
>   		return -ENOSPC;
>
> @@ -1157,21 +1165,23 @@ int gnttab_init(void)
>   	unsigned int nr_init_grefs;
>   	int ret;
>
> +	gnttab_request_version();
>   	nr_grant_frames = 1;
>   	boot_max_nr_grant_frames = __max_nr_grant_frames();
>
>   	/* Determine the maximum number of frames required for the
>   	 * grant reference free list on the current hypervisor.
>   	 */
> +	BUG_ON(grefs_per_grant_frame == 0);
>   	max_nr_glist_frames = (boot_max_nr_grant_frames *
> -			       GREFS_PER_GRANT_FRAME / RPP);
> +			       grefs_per_grant_frame / RPP);
>
>   	gnttab_list = kmalloc(max_nr_glist_frames * sizeof(grant_ref_t *),
>   			      GFP_KERNEL);
>   	if (gnttab_list == NULL)
>   		return -ENOMEM;
>
> -	nr_glist_frames = (nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
> +	nr_glist_frames = (nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
>   	for (i = 0; i<  nr_glist_frames; i++) {
>   		gnttab_list[i] = (grant_ref_t *)__get_free_page(GFP_KERNEL);
>   		if (gnttab_list[i] == NULL) {
> @@ -1180,12 +1190,12 @@ int gnttab_init(void)
>   		}
>   	}
>
> -	if (gnttab_resume()<  0) {
> +	if (gnttab_setup()<  0) {
>   		ret = -ENODEV;
>   		goto ini_nomem;
>   	}
>
> -	nr_init_grefs = nr_grant_frames * GREFS_PER_GRANT_FRAME;
> +	nr_init_grefs = nr_grant_frames * grefs_per_grant_frame;
>
>   	for (i = NR_RESERVED_ENTRIES; i<  nr_init_grefs - 1; i++)
>   		gnttab_entry(i) = i + 1;

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] xen/grant-table: correctly initialize grant table version 1
  2013-01-06 11:14 ` Matt Wilson
@ 2013-01-09  2:40   ` ANNIE LI
  2013-01-09  2:40   ` [Xen-devel] " ANNIE LI
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 29+ messages in thread
From: ANNIE LI @ 2013-01-09  2:40 UTC (permalink / raw)
  To: Matt Wilson
  Cc: linux-kernel, xen-devel, Ian Campbell, stable, Konrad Rzeszutek Wilk

Thanks so much for posting this.

On 2013-1-6 19:14, Matt Wilson wrote:
> Commit 85ff6acb075a484780b3d763fdf41596d8fc0970 (xen/granttable: Grant
> tables V2 implementation) changed the GREFS_PER_GRANT_FRAME macro from
> a constant to a conditional expression. The expression depends on
> grant_table_version being appropriately set. Unfortunately, at init
> time grant_table_version will be 0. The GREFS_PER_GRANT_FRAME
> conditional expression checks for "grant_table_version == 1", and
> therefore returns the number of grant references per frame for v2.
>
> This causes gnttab_init() to allocate fewer pages for gnttab_list, as
> a frame can old half the number of v2 entries than v1 entries. After
> gnttab_resume() is called, grant_table_version is appropriately
> set. nr_init_grefs will then be miscalculated and gnttab_free_count
> will hold a value larger than the actual number of free gref entries.
>
> If a guest is heavily utilizing improperly initialized v1 grant
> tables, memory corruption can occur. One common manifestation is
> corruption of the vmalloc list, resulting in a poisoned pointer
> derefrence when accessing /proc/meminfo or /proc/vmallocinfo:
>
> [   40.770064] BUG: unable to handle kernel paging request at 0000200200001407
> [   40.770083] IP: [<ffffffff811a6fb0>] get_vmalloc_info+0x70/0x110
> [   40.770102] PGD 0
> [   40.770107] Oops: 0000 [#1] SMP
> [   40.770114] CPU 10
>
> This patch introduces a static variable, grefs_per_grant_frame, to
> cache the calculated value. gnttab_init() now calls
> gnttab_request_version() early so that grant_table_version and
> grefs_per_grant_frame can be appropriately set. A few BUG_ON()s have
> been added to prevent this type of bug from reoccurring in the future.
>
> Signed-off-by: Matt Wilson<msw@amazon.com>
> Reviewed-and-Tested-by: Steven Noonan<snoonan@amazon.com>
> Cc: Ian Campbell<Ian.Campbell@citrix.com>
> Cc: Konrad Rzeszutek Wilk<konrad.wilk@oracle.com>
> Cc: Annie Li<annie.li@oracle.com>
> Cc: xen-devel@lists.xen.org
> Cc: linux-kernel@vger.kernel.org
> Cc: stable@vger.kernel.org # v3.3 and newer
> ---
> Changes since v1:
> * introduced a new gnttab_setup() function and moved all of the
>    initialization code from gnttab_resume() there.
> ---
>   drivers/xen/grant-table.c |   52 ++++++++++++++++++++++++++------------------
>   1 files changed, 31 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
> index 043bf07..53715de 100644
> --- a/drivers/xen/grant-table.c
> +++ b/drivers/xen/grant-table.c
> @@ -55,10 +55,6 @@
>   /* External tools reserve first few grant table entries. */
>   #define NR_RESERVED_ENTRIES 8
>   #define GNTTAB_LIST_END 0xffffffff
> -#define GREFS_PER_GRANT_FRAME \
> -(grant_table_version == 1 ?                      \
> -(PAGE_SIZE / sizeof(struct grant_entry_v1)) :   \
> -(PAGE_SIZE / sizeof(union grant_entry_v2)))
>
>   static grant_ref_t **gnttab_list;
>   static unsigned int nr_grant_frames;
> @@ -153,6 +149,7 @@ static struct gnttab_ops *gnttab_interface;
>   static grant_status_t *grstatus;
>
>   static int grant_table_version;
> +static int grefs_per_grant_frame;
>
>   static struct gnttab_free_callback *gnttab_free_callback_list;
>
> @@ -766,12 +763,14 @@ static int grow_gnttab_list(unsigned int more_frames)
>   	unsigned int new_nr_grant_frames, extra_entries, i;
>   	unsigned int nr_glist_frames, new_nr_glist_frames;
>
> +	BUG_ON(grefs_per_grant_frame == 0);
> +
>   	new_nr_grant_frames = nr_grant_frames + more_frames;
> -	extra_entries       = more_frames * GREFS_PER_GRANT_FRAME;
> +	extra_entries       = more_frames * grefs_per_grant_frame;
>
> -	nr_glist_frames = (nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
> +	nr_glist_frames = (nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
>   	new_nr_glist_frames =
> -		(new_nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
> +		(new_nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
>   	for (i = nr_glist_frames; i<  new_nr_glist_frames; i++) {
>   		gnttab_list[i] = (grant_ref_t *)__get_free_page(GFP_ATOMIC);
>   		if (!gnttab_list[i])
> @@ -779,12 +778,12 @@ static int grow_gnttab_list(unsigned int more_frames)
>   	}
>
>
> -	for (i = GREFS_PER_GRANT_FRAME * nr_grant_frames;
> -	     i<  GREFS_PER_GRANT_FRAME * new_nr_grant_frames - 1; i++)
> +	for (i = grefs_per_grant_frame * nr_grant_frames;
> +	     i<  grefs_per_grant_frame * new_nr_grant_frames - 1; i++)
>   		gnttab_entry(i) = i + 1;
>
>   	gnttab_entry(i) = gnttab_free_head;
> -	gnttab_free_head = GREFS_PER_GRANT_FRAME * nr_grant_frames;
> +	gnttab_free_head = grefs_per_grant_frame * nr_grant_frames;
>   	gnttab_free_count += extra_entries;
>
>   	nr_grant_frames = new_nr_grant_frames;
> @@ -904,7 +903,8 @@ EXPORT_SYMBOL_GPL(gnttab_unmap_refs);
>
>   static unsigned nr_status_frames(unsigned nr_grant_frames)
>   {
> -	return (nr_grant_frames * GREFS_PER_GRANT_FRAME + SPP - 1) / SPP;
> +	BUG_ON(grefs_per_grant_frame == 0);
> +	return (nr_grant_frames * grefs_per_grant_frame + SPP - 1) / SPP;
>   }
>
>   static int gnttab_map_frames_v1(xen_pfn_t *frames, unsigned int nr_gframes)
> @@ -1068,6 +1068,7 @@ static void gnttab_request_version(void)
>   	rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version,&gsv, 1);
>   	if (rc == 0&&  gsv.version == 2) {
>   		grant_table_version = 2;
> +		grefs_per_grant_frame = PAGE_SIZE / sizeof(union grant_entry_v2);
>   		gnttab_interface =&gnttab_v2_ops;
>   	} else if (grant_table_version == 2) {
>   		/*
> @@ -1080,18 +1081,18 @@ static void gnttab_request_version(void)
>   		panic("we need grant tables version 2, but only version 1 is available");
>   	} else {
>   		grant_table_version = 1;
> +		grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
>   		gnttab_interface =&gnttab_v1_ops;
>   	}
> -	printk(KERN_INFO "Grant tables using version %d layout.\n",
> -		grant_table_version);
>   }
>   

Is it better to keep printk here? In your last patch, you removed it 
because gnttab_request_version and gnttab_resume are all called in 
gnttab_init. and gnttab_resume also contains calling of 
gnttab_request_version. But in this patch, gnttab_setup is used, and 
does not have this issue now.

> -int gnttab_resume(void)
> +static int gnttab_setup(void)
>   {
>   	unsigned int max_nr_gframes;
>   	char *kmsg = "Failed to kmalloc pages for pv in hvm grant frames\n";
>
> -	gnttab_request_version();
> +	printk(KERN_INFO "Grant tables using version %d layout.\n",
> +		grant_table_version);

See comments above.

Thanks
Annie
>   	max_nr_gframes = gnttab_max_grant_frames();
>   	if (max_nr_gframes<  nr_grant_frames)
>   		return -ENOSYS;
> @@ -1126,6 +1127,12 @@ int gnttab_resume(void)
>   	return 0;
>   }
>
> +int gnttab_resume(void)
> +{
> +	gnttab_request_version();
> +	return gnttab_setup();
> +}
> +
>   int gnttab_suspend(void)
>   {
>   	gnttab_interface->unmap_frames();
> @@ -1137,9 +1144,10 @@ static int gnttab_expand(unsigned int req_entries)
>   	int rc;
>   	unsigned int cur, extra;
>
> +	BUG_ON(grefs_per_grant_frame == 0);
>   	cur = nr_grant_frames;
> -	extra = ((req_entries + (GREFS_PER_GRANT_FRAME-1)) /
> -		 GREFS_PER_GRANT_FRAME);
> +	extra = ((req_entries + (grefs_per_grant_frame-1)) /
> +		 grefs_per_grant_frame);
>   	if (cur + extra>  gnttab_max_grant_frames())
>   		return -ENOSPC;
>
> @@ -1157,21 +1165,23 @@ int gnttab_init(void)
>   	unsigned int nr_init_grefs;
>   	int ret;
>
> +	gnttab_request_version();
>   	nr_grant_frames = 1;
>   	boot_max_nr_grant_frames = __max_nr_grant_frames();
>
>   	/* Determine the maximum number of frames required for the
>   	 * grant reference free list on the current hypervisor.
>   	 */
> +	BUG_ON(grefs_per_grant_frame == 0);
>   	max_nr_glist_frames = (boot_max_nr_grant_frames *
> -			       GREFS_PER_GRANT_FRAME / RPP);
> +			       grefs_per_grant_frame / RPP);
>
>   	gnttab_list = kmalloc(max_nr_glist_frames * sizeof(grant_ref_t *),
>   			      GFP_KERNEL);
>   	if (gnttab_list == NULL)
>   		return -ENOMEM;
>
> -	nr_glist_frames = (nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
> +	nr_glist_frames = (nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
>   	for (i = 0; i<  nr_glist_frames; i++) {
>   		gnttab_list[i] = (grant_ref_t *)__get_free_page(GFP_KERNEL);
>   		if (gnttab_list[i] == NULL) {
> @@ -1180,12 +1190,12 @@ int gnttab_init(void)
>   		}
>   	}
>
> -	if (gnttab_resume()<  0) {
> +	if (gnttab_setup()<  0) {
>   		ret = -ENODEV;
>   		goto ini_nomem;
>   	}
>
> -	nr_init_grefs = nr_grant_frames * GREFS_PER_GRANT_FRAME;
> +	nr_init_grefs = nr_grant_frames * grefs_per_grant_frame;
>
>   	for (i = NR_RESERVED_ENTRIES; i<  nr_init_grefs - 1; i++)
>   		gnttab_entry(i) = i + 1;

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Xen-devel] [PATCH v2] xen/grant-table: correctly initialize grant table version 1
  2013-01-09  2:40   ` [Xen-devel] " ANNIE LI
@ 2013-01-09 12:02     ` Ian Campbell
  2013-01-09 15:02       ` annie li
                         ` (3 more replies)
  2013-01-09 12:02     ` Ian Campbell
  1 sibling, 4 replies; 29+ messages in thread
From: Ian Campbell @ 2013-01-09 12:02 UTC (permalink / raw)
  To: ANNIE LI
  Cc: Matt Wilson, Konrad Rzeszutek Wilk, linux-kernel, stable, xen-devel

On Wed, 2013-01-09 at 02:40 +0000, ANNIE LI wrote:
> > @@ -1080,18 +1081,18 @@ static void gnttab_request_version(void)
> >   		panic("we need grant tables version 2, but only version 1 is available");
> >   	} else {
> >   		grant_table_version = 1;
> > +		grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
> >   		gnttab_interface =&gnttab_v1_ops;
> >   	}
> > -	printk(KERN_INFO "Grant tables using version %d layout.\n",
> > -		grant_table_version);
> >   }
> >   
> 
> Is it better to keep printk here? In your last patch, you removed it 
> because gnttab_request_version and gnttab_resume are all called in 
> gnttab_init. and gnttab_resume also contains calling of 
> gnttab_request_version. But in this patch, gnttab_setup is used, and 
> does not have this issue now.

Yes, I think we want to print this at both start of day and resume?

Either by adding a print to gnttab_resume() or by keeping the existing
one here in preference to moving it to gnttab_setup(). I'd prefer the
latter to avoid the duplication, unless I'm mistaken and request_version
is called in more than those two locations.

Ian.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] xen/grant-table: correctly initialize grant table version 1
  2013-01-09  2:40   ` [Xen-devel] " ANNIE LI
  2013-01-09 12:02     ` Ian Campbell
@ 2013-01-09 12:02     ` Ian Campbell
  1 sibling, 0 replies; 29+ messages in thread
From: Ian Campbell @ 2013-01-09 12:02 UTC (permalink / raw)
  To: ANNIE LI
  Cc: stable, xen-devel, linux-kernel, Matt Wilson, Konrad Rzeszutek Wilk

On Wed, 2013-01-09 at 02:40 +0000, ANNIE LI wrote:
> > @@ -1080,18 +1081,18 @@ static void gnttab_request_version(void)
> >   		panic("we need grant tables version 2, but only version 1 is available");
> >   	} else {
> >   		grant_table_version = 1;
> > +		grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
> >   		gnttab_interface =&gnttab_v1_ops;
> >   	}
> > -	printk(KERN_INFO "Grant tables using version %d layout.\n",
> > -		grant_table_version);
> >   }
> >   
> 
> Is it better to keep printk here? In your last patch, you removed it 
> because gnttab_request_version and gnttab_resume are all called in 
> gnttab_init. and gnttab_resume also contains calling of 
> gnttab_request_version. But in this patch, gnttab_setup is used, and 
> does not have this issue now.

Yes, I think we want to print this at both start of day and resume?

Either by adding a print to gnttab_resume() or by keeping the existing
one here in preference to moving it to gnttab_setup(). I'd prefer the
latter to avoid the duplication, unless I'm mistaken and request_version
is called in more than those two locations.

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Xen-devel] [PATCH v2] xen/grant-table: correctly initialize grant table version 1
  2013-01-09 12:02     ` Ian Campbell
@ 2013-01-09 15:02       ` annie li
  2013-01-09 15:02       ` annie li
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 29+ messages in thread
From: annie li @ 2013-01-09 15:02 UTC (permalink / raw)
  To: Ian Campbell
  Cc: stable, xen-devel, linux-kernel, Matt Wilson, Konrad Rzeszutek Wilk



On 2013-1-9 20:02, Ian Campbell wrote:
> On Wed, 2013-01-09 at 02:40 +0000, ANNIE LI wrote:
>   
>>> @@ -1080,18 +1081,18 @@ static void gnttab_request_version(void)
>>>   		panic("we need grant tables version 2, but only version 1 is available");
>>>   	} else {
>>>   		grant_table_version = 1;
>>> +		grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
>>>   		gnttab_interface =&gnttab_v1_ops;
>>>   	}
>>> -	printk(KERN_INFO "Grant tables using version %d layout.\n",
>>> -		grant_table_version);
>>>   }
>>>   
>>>       
>> Is it better to keep printk here? In your last patch, you removed it 
>> because gnttab_request_version and gnttab_resume are all called in 
>> gnttab_init. and gnttab_resume also contains calling of 
>> gnttab_request_version. But in this patch, gnttab_setup is used, and 
>> does not have this issue now.
>>     
>
> Yes, I think we want to print this at both start of day and resume?
>   
Right.
> Either by adding a print to gnttab_resume()
Only adding a print into gnttab_resume() would miss this print at start 
of day. In gnttab_init, gnttab_request_version and gnttab_setup are 
called, not gnttab_resume.
>  or by keeping the existing
> one here in preference to moving it to gnttab_setup(). I'd prefer the
> latter to avoid the duplication, unless I'm mistaken and request_version
> is called in more than those two locations.
>   
Yes, I'd like the latter. Request_version is only called in those two 
locations.

Thanks
Annie
> Ian.
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
>   

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] xen/grant-table: correctly initialize grant table version 1
  2013-01-09 12:02     ` Ian Campbell
  2013-01-09 15:02       ` annie li
@ 2013-01-09 15:02       ` annie li
  2013-01-10  9:16       ` Matt Wilson
  2013-01-10  9:16       ` [Xen-devel] " Matt Wilson
  3 siblings, 0 replies; 29+ messages in thread
From: annie li @ 2013-01-09 15:02 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Konrad Rzeszutek Wilk, Matt Wilson, linux-kernel, stable, xen-devel



On 2013-1-9 20:02, Ian Campbell wrote:
> On Wed, 2013-01-09 at 02:40 +0000, ANNIE LI wrote:
>   
>>> @@ -1080,18 +1081,18 @@ static void gnttab_request_version(void)
>>>   		panic("we need grant tables version 2, but only version 1 is available");
>>>   	} else {
>>>   		grant_table_version = 1;
>>> +		grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
>>>   		gnttab_interface =&gnttab_v1_ops;
>>>   	}
>>> -	printk(KERN_INFO "Grant tables using version %d layout.\n",
>>> -		grant_table_version);
>>>   }
>>>   
>>>       
>> Is it better to keep printk here? In your last patch, you removed it 
>> because gnttab_request_version and gnttab_resume are all called in 
>> gnttab_init. and gnttab_resume also contains calling of 
>> gnttab_request_version. But in this patch, gnttab_setup is used, and 
>> does not have this issue now.
>>     
>
> Yes, I think we want to print this at both start of day and resume?
>   
Right.
> Either by adding a print to gnttab_resume()
Only adding a print into gnttab_resume() would miss this print at start 
of day. In gnttab_init, gnttab_request_version and gnttab_setup are 
called, not gnttab_resume.
>  or by keeping the existing
> one here in preference to moving it to gnttab_setup(). I'd prefer the
> latter to avoid the duplication, unless I'm mistaken and request_version
> is called in more than those two locations.
>   
Yes, I'd like the latter. Request_version is only called in those two 
locations.

Thanks
Annie
> Ian.
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
>   

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Xen-devel] [PATCH v2] xen/grant-table: correctly initialize grant table version 1
  2013-01-09 12:02     ` Ian Campbell
                         ` (2 preceding siblings ...)
  2013-01-10  9:16       ` Matt Wilson
@ 2013-01-10  9:16       ` Matt Wilson
  2013-01-10 10:32         ` Ian Campbell
                           ` (3 more replies)
  3 siblings, 4 replies; 29+ messages in thread
From: Matt Wilson @ 2013-01-10  9:16 UTC (permalink / raw)
  To: Ian Campbell
  Cc: ANNIE LI, Konrad Rzeszutek Wilk, linux-kernel, stable, xen-devel

On Wed, Jan 09, 2013 at 12:02:09PM +0000, Ian Campbell wrote:
> On Wed, 2013-01-09 at 02:40 +0000, ANNIE LI wrote:
> > > @@ -1080,18 +1081,18 @@ static void gnttab_request_version(void)
> > >   		panic("we need grant tables version 2, but only version 1 is available");
> > >   	} else {
> > >   		grant_table_version = 1;
> > > +		grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
> > >   		gnttab_interface =&gnttab_v1_ops;
> > >   	}
> > > -	printk(KERN_INFO "Grant tables using version %d layout.\n",
> > > -		grant_table_version);
> > >   }
> > >   
> > 
> > Is it better to keep printk here? In your last patch, you removed it 
> > because gnttab_request_version and gnttab_resume are all called in 
> > gnttab_init. and gnttab_resume also contains calling of 
> > gnttab_request_version. But in this patch, gnttab_setup is used, and 
> > does not have this issue now.
> 
> Yes, I think we want to print this at both start of day and resume?
> 
> Either by adding a print to gnttab_resume() or by keeping the existing
> one here in preference to moving it to gnttab_setup(). I'd prefer the
> latter to avoid the duplication, unless I'm mistaken and request_version
> is called in more than those two locations.

In this version the printk() is in gnttab_setup(), which is called
both at start of day (gnttab_init()) and resume (gnttab_resume()). I
can move it back to gnttab_request_version() if you'd like, but this
patch should be behaving as expected already.

Matt

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] xen/grant-table: correctly initialize grant table version 1
  2013-01-09 12:02     ` Ian Campbell
  2013-01-09 15:02       ` annie li
  2013-01-09 15:02       ` annie li
@ 2013-01-10  9:16       ` Matt Wilson
  2013-01-10  9:16       ` [Xen-devel] " Matt Wilson
  3 siblings, 0 replies; 29+ messages in thread
From: Matt Wilson @ 2013-01-10  9:16 UTC (permalink / raw)
  To: Ian Campbell
  Cc: ANNIE LI, xen-devel, linux-kernel, stable, Konrad Rzeszutek Wilk

On Wed, Jan 09, 2013 at 12:02:09PM +0000, Ian Campbell wrote:
> On Wed, 2013-01-09 at 02:40 +0000, ANNIE LI wrote:
> > > @@ -1080,18 +1081,18 @@ static void gnttab_request_version(void)
> > >   		panic("we need grant tables version 2, but only version 1 is available");
> > >   	} else {
> > >   		grant_table_version = 1;
> > > +		grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
> > >   		gnttab_interface =&gnttab_v1_ops;
> > >   	}
> > > -	printk(KERN_INFO "Grant tables using version %d layout.\n",
> > > -		grant_table_version);
> > >   }
> > >   
> > 
> > Is it better to keep printk here? In your last patch, you removed it 
> > because gnttab_request_version and gnttab_resume are all called in 
> > gnttab_init. and gnttab_resume also contains calling of 
> > gnttab_request_version. But in this patch, gnttab_setup is used, and 
> > does not have this issue now.
> 
> Yes, I think we want to print this at both start of day and resume?
> 
> Either by adding a print to gnttab_resume() or by keeping the existing
> one here in preference to moving it to gnttab_setup(). I'd prefer the
> latter to avoid the duplication, unless I'm mistaken and request_version
> is called in more than those two locations.

In this version the printk() is in gnttab_setup(), which is called
both at start of day (gnttab_init()) and resume (gnttab_resume()). I
can move it back to gnttab_request_version() if you'd like, but this
patch should be behaving as expected already.

Matt

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Xen-devel] [PATCH v2] xen/grant-table: correctly initialize grant table version 1
  2013-01-10  9:16       ` [Xen-devel] " Matt Wilson
@ 2013-01-10 10:32         ` Ian Campbell
  2013-01-10 10:32         ` Ian Campbell
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 29+ messages in thread
From: Ian Campbell @ 2013-01-10 10:32 UTC (permalink / raw)
  To: Matt Wilson
  Cc: ANNIE LI, Konrad Rzeszutek Wilk, linux-kernel, stable, xen-devel

On Thu, 2013-01-10 at 09:16 +0000, Matt Wilson wrote:
> On Wed, Jan 09, 2013 at 12:02:09PM +0000, Ian Campbell wrote:
> > On Wed, 2013-01-09 at 02:40 +0000, ANNIE LI wrote:
> > > > @@ -1080,18 +1081,18 @@ static void gnttab_request_version(void)
> > > >   		panic("we need grant tables version 2, but only version 1 is available");
> > > >   	} else {
> > > >   		grant_table_version = 1;
> > > > +		grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
> > > >   		gnttab_interface =&gnttab_v1_ops;
> > > >   	}
> > > > -	printk(KERN_INFO "Grant tables using version %d layout.\n",
> > > > -		grant_table_version);
> > > >   }
> > > >   
> > > 
> > > Is it better to keep printk here? In your last patch, you removed it 
> > > because gnttab_request_version and gnttab_resume are all called in 
> > > gnttab_init. and gnttab_resume also contains calling of 
> > > gnttab_request_version. But in this patch, gnttab_setup is used, and 
> > > does not have this issue now.
> > 
> > Yes, I think we want to print this at both start of day and resume?
> > 
> > Either by adding a print to gnttab_resume() or by keeping the existing
> > one here in preference to moving it to gnttab_setup(). I'd prefer the
> > latter to avoid the duplication, unless I'm mistaken and request_version
> > is called in more than those two locations.
> 
> In this version the printk() is in gnttab_setup(), which is called
> both at start of day (gnttab_init()) and resume (gnttab_resume()). I
> can move it back to gnttab_request_version() if you'd like, but this
> patch should be behaving as expected already.

I misread the patch, you are absolutely right.

Acked-by: Ian Campbell <ian.campbell@citrix.com>



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] xen/grant-table: correctly initialize grant table version 1
  2013-01-10  9:16       ` [Xen-devel] " Matt Wilson
  2013-01-10 10:32         ` Ian Campbell
@ 2013-01-10 10:32         ` Ian Campbell
  2013-01-10 11:02         ` ANNIE LI
  2013-01-10 11:02         ` [Xen-devel] " ANNIE LI
  3 siblings, 0 replies; 29+ messages in thread
From: Ian Campbell @ 2013-01-10 10:32 UTC (permalink / raw)
  To: Matt Wilson
  Cc: ANNIE LI, xen-devel, linux-kernel, stable, Konrad Rzeszutek Wilk

On Thu, 2013-01-10 at 09:16 +0000, Matt Wilson wrote:
> On Wed, Jan 09, 2013 at 12:02:09PM +0000, Ian Campbell wrote:
> > On Wed, 2013-01-09 at 02:40 +0000, ANNIE LI wrote:
> > > > @@ -1080,18 +1081,18 @@ static void gnttab_request_version(void)
> > > >   		panic("we need grant tables version 2, but only version 1 is available");
> > > >   	} else {
> > > >   		grant_table_version = 1;
> > > > +		grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
> > > >   		gnttab_interface =&gnttab_v1_ops;
> > > >   	}
> > > > -	printk(KERN_INFO "Grant tables using version %d layout.\n",
> > > > -		grant_table_version);
> > > >   }
> > > >   
> > > 
> > > Is it better to keep printk here? In your last patch, you removed it 
> > > because gnttab_request_version and gnttab_resume are all called in 
> > > gnttab_init. and gnttab_resume also contains calling of 
> > > gnttab_request_version. But in this patch, gnttab_setup is used, and 
> > > does not have this issue now.
> > 
> > Yes, I think we want to print this at both start of day and resume?
> > 
> > Either by adding a print to gnttab_resume() or by keeping the existing
> > one here in preference to moving it to gnttab_setup(). I'd prefer the
> > latter to avoid the duplication, unless I'm mistaken and request_version
> > is called in more than those two locations.
> 
> In this version the printk() is in gnttab_setup(), which is called
> both at start of day (gnttab_init()) and resume (gnttab_resume()). I
> can move it back to gnttab_request_version() if you'd like, but this
> patch should be behaving as expected already.

I misread the patch, you are absolutely right.

Acked-by: Ian Campbell <ian.campbell@citrix.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Xen-devel] [PATCH v2] xen/grant-table: correctly initialize grant table version 1
  2013-01-10  9:16       ` [Xen-devel] " Matt Wilson
                           ` (2 preceding siblings ...)
  2013-01-10 11:02         ` ANNIE LI
@ 2013-01-10 11:02         ` ANNIE LI
  3 siblings, 0 replies; 29+ messages in thread
From: ANNIE LI @ 2013-01-10 11:02 UTC (permalink / raw)
  To: Matt Wilson
  Cc: Ian Campbell, xen-devel, linux-kernel, stable, Konrad Rzeszutek Wilk



On 2013-1-10 17:16, Matt Wilson wrote:
> On Wed, Jan 09, 2013 at 12:02:09PM +0000, Ian Campbell wrote:
>> On Wed, 2013-01-09 at 02:40 +0000, ANNIE LI wrote:
>>>> @@ -1080,18 +1081,18 @@ static void gnttab_request_version(void)
>>>>    		panic("we need grant tables version 2, but only version 1 is available");
>>>>    	} else {
>>>>    		grant_table_version = 1;
>>>> +		grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
>>>>    		gnttab_interface =&gnttab_v1_ops;
>>>>    	}
>>>> -	printk(KERN_INFO "Grant tables using version %d layout.\n",
>>>> -		grant_table_version);
>>>>    }
>>>>
>>> Is it better to keep printk here? In your last patch, you removed it
>>> because gnttab_request_version and gnttab_resume are all called in
>>> gnttab_init. and gnttab_resume also contains calling of
>>> gnttab_request_version. But in this patch, gnttab_setup is used, and
>>> does not have this issue now.
>> Yes, I think we want to print this at both start of day and resume?
>>
>> Either by adding a print to gnttab_resume() or by keeping the existing
>> one here in preference to moving it to gnttab_setup(). I'd prefer the
>> latter to avoid the duplication, unless I'm mistaken and request_version
>> is called in more than those two locations.
> In this version the printk() is in gnttab_setup(), which is called
> both at start of day (gnttab_init()) and resume (gnttab_resume()). I
> can move it back to gnttab_request_version() if you'd like, but this
> patch should be behaving as expected already.

I was thinking to change less from original source code, this patch 
moves printk without specific purpose and shows same behavior as 
original code. It is OK if you'd like to keep it.

Thanks
Annie
>
> Matt
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] xen/grant-table: correctly initialize grant table version 1
  2013-01-10  9:16       ` [Xen-devel] " Matt Wilson
  2013-01-10 10:32         ` Ian Campbell
  2013-01-10 10:32         ` Ian Campbell
@ 2013-01-10 11:02         ` ANNIE LI
  2013-01-10 11:02         ` [Xen-devel] " ANNIE LI
  3 siblings, 0 replies; 29+ messages in thread
From: ANNIE LI @ 2013-01-10 11:02 UTC (permalink / raw)
  To: Matt Wilson
  Cc: linux-kernel, Konrad Rzeszutek Wilk, Ian Campbell, stable, xen-devel



On 2013-1-10 17:16, Matt Wilson wrote:
> On Wed, Jan 09, 2013 at 12:02:09PM +0000, Ian Campbell wrote:
>> On Wed, 2013-01-09 at 02:40 +0000, ANNIE LI wrote:
>>>> @@ -1080,18 +1081,18 @@ static void gnttab_request_version(void)
>>>>    		panic("we need grant tables version 2, but only version 1 is available");
>>>>    	} else {
>>>>    		grant_table_version = 1;
>>>> +		grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
>>>>    		gnttab_interface =&gnttab_v1_ops;
>>>>    	}
>>>> -	printk(KERN_INFO "Grant tables using version %d layout.\n",
>>>> -		grant_table_version);
>>>>    }
>>>>
>>> Is it better to keep printk here? In your last patch, you removed it
>>> because gnttab_request_version and gnttab_resume are all called in
>>> gnttab_init. and gnttab_resume also contains calling of
>>> gnttab_request_version. But in this patch, gnttab_setup is used, and
>>> does not have this issue now.
>> Yes, I think we want to print this at both start of day and resume?
>>
>> Either by adding a print to gnttab_resume() or by keeping the existing
>> one here in preference to moving it to gnttab_setup(). I'd prefer the
>> latter to avoid the duplication, unless I'm mistaken and request_version
>> is called in more than those two locations.
> In this version the printk() is in gnttab_setup(), which is called
> both at start of day (gnttab_init()) and resume (gnttab_resume()). I
> can move it back to gnttab_request_version() if you'd like, but this
> patch should be behaving as expected already.

I was thinking to change less from original source code, this patch 
moves printk without specific purpose and shows same behavior as 
original code. It is OK if you'd like to keep it.

Thanks
Annie
>
> Matt
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] xen/grant-table: correctly initialize grant table version 1
  2013-01-06 11:14 ` Matt Wilson
                     ` (2 preceding siblings ...)
  2013-01-11 21:35   ` Konrad Rzeszutek Wilk
@ 2013-01-11 21:35   ` Konrad Rzeszutek Wilk
  2013-01-14  9:29     ` Matt Wilson
  2013-01-14  9:29     ` Matt Wilson
  3 siblings, 2 replies; 29+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-01-11 21:35 UTC (permalink / raw)
  To: Matt Wilson; +Cc: stable, Ian Campbell, Annie Li, xen-devel, linux-kernel

On Sun, Jan 06, 2013 at 11:14:42AM +0000, Matt Wilson wrote:
> Commit 85ff6acb075a484780b3d763fdf41596d8fc0970 (xen/granttable: Grant
> tables V2 implementation) changed the GREFS_PER_GRANT_FRAME macro from
> a constant to a conditional expression. The expression depends on
> grant_table_version being appropriately set. Unfortunately, at init
> time grant_table_version will be 0. The GREFS_PER_GRANT_FRAME
> conditional expression checks for "grant_table_version == 1", and
> therefore returns the number of grant references per frame for v2.
> 
> This causes gnttab_init() to allocate fewer pages for gnttab_list, as
> a frame can old half the number of v2 entries than v1 entries. After
> gnttab_resume() is called, grant_table_version is appropriately
> set. nr_init_grefs will then be miscalculated and gnttab_free_count
> will hold a value larger than the actual number of free gref entries.
> 
> If a guest is heavily utilizing improperly initialized v1 grant
> tables, memory corruption can occur. One common manifestation is
> corruption of the vmalloc list, resulting in a poisoned pointer
> derefrence when accessing /proc/meminfo or /proc/vmallocinfo:
> 
> [   40.770064] BUG: unable to handle kernel paging request at 0000200200001407
> [   40.770083] IP: [<ffffffff811a6fb0>] get_vmalloc_info+0x70/0x110
> [   40.770102] PGD 0
> [   40.770107] Oops: 0000 [#1] SMP
> [   40.770114] CPU 10

With this patch I get:

[    2.555087] GHES: HEST is not enabled!
[    2.560522] ioatdma: Intel(R) QuickData Technology Driver 4.00
[    2.569651] ------------[ cut here ]------------
[    2.570529] kernel BUG at /home/konrad/linux/drivers/xen/grant-table.c:1175!
[    2.570529] invalid opcode: 0000 [#1] SMP 
[    2.570529] Modules linked in:
[    2.570529] Pid: 1, comm: swapper/0 Not tainted 3.8.0-rc3upstream-00236-g8a3fed1 #1 Xen HVM domU
[    2.570529] EIP: 0060:[<c1330668>] EFLAGS: 00010246 CPU: 1
[    2.570529] EIP is at gnttab_init+0x1b8/0x1c0
[    2.570529] EAX: 00000000 EBX: 00000006 ECX: f2479e24 EDX: 00000020
[    2.570529] ESI: f25a5800 EDI: 00000000 EBP: f2479e40 ESP: f2479e1c
[    2.570529]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[    2.570529] CR0: 8005003b CR2: 00000000 CR3: 017fb000 CR4: 000006f0
[    2.570529] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[    2.570529] DR6: ffff0ff0 DR7: 00000400
[    2.570529] Process swapper/0 (pid: 1, ti=f2478000 task=f2462a70 task.ti=f2478000)
[    2.570529] Stack:
[    2.570529]  000000ff c132f0e7 f25a7ff0 00000004 00000020 00000000 00000000 f25a5800
[    2.570529]  00000000 f2479e6c c133bcec 00000000 f2479e70 c10a945f 01000000 f2479e60
[    2.570529]  f2000000 f25a5800 c1744700 c1744734 f2479e78 c12b5284 f25a5800 f2479e94
[    2.570529] Call Trace:
[    2.570529]  [<c132f0e7>] ? gnttab_max_grant_frames+0x27/0x60
[    2.570529]  [<c133bcec>] platform_pci_init+0x1ac/0x260
[    2.570529]  [<c10a945f>] ? __blocking_notifier_call_chain+0x4f/0x60
[    2.570529]  [<c12b5284>] local_pci_probe+0x14/0x30
[    2.570529]  [<c12b63b8>] pci_device_probe+0x58/0x70
[    2.570529]  [<c139c8a9>] driver_probe_device+0x69/0x1e0
[    2.570529]  [<c12b555e>] ? pci_match_device+0xbe/0xd0
[    2.570529]  [<c139ca99>] __driver_attach+0x79/0x80
[    2.570529]  [<c139b348>] bus_for_each_dev+0x48/0x70
[    2.570529]  [<c139c739>] driver_attach+0x19/0x20
[    2.570529]  [<c139ca20>] ? driver_probe_device+0x1e0/0x1e0
[    2.570529]  [<c139c1b4>] bus_add_driver+0xa4/0x220
[    2.570529]  [<c12b6300>] ? pci_dev_put+0x20/0x20
[    2.570529]  [<c12b6300>] ? pci_dev_put+0x20/0x20
[    2.570529]  [<c139d025>] driver_register+0x65/0x130
[    2.570529]  [<c12b646e>] __pci_register_driver+0x2e/0x40
[    2.570529]  [<c1796d92>] platform_pci_module_init+0x14/0x16
[    2.570529]  [<c100302f>] do_one_initcall+0x2f/0x170
[    2.570529]  [<c1796d7e>] ? hyper_sysfs_init+0xc4/0xc4
[    2.570529]  [<c1576e12>] kernel_init+0x142/0x2a0
[    2.570529]  [<c176b442>] ? parse_early_options+0x35/0x35
[    2.570529]  [<c158e477>] ret_from_kernel_thread+0x1b/0x28
[    2.570529]  [<c1576cd0>] ? rest_init+0x80/0x80
[    2.570529] Code: ff ff ff 89 15 98 54 89 c1 c7 05 9c 54 89 c1 08 00 00 00 c7 04 24 4e 22 6c c1 e8 b9 48 25 00 c7 45 e0 00 00 00 00 e9 58 ff ff ff <0f> 0b eb fe 8d 74 26 00 8b 15 00 e1 7f c1 31 c0 55 89 e5 83 fa
[    2.570529] EIP: [<c1330668>] gnttab_init+0x1b8/0x1c0 SS:ESP 0068:f2479e1c
[    2.922510] ---[ end trace 14cd1ffa03240f7d ]---
[    2.928972] swapper/0 (1) used greatest stack depth: 5196 bytes left
[    2.935947] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    2.935947] 

This with a 64-bit dom0 and 64-bit and 32-bit PVHVM.

I had to fix a merge conflict with this patch so perhaps I missed something:

(The tree is #linux-next-test-3.8-rc3 on my xen.git tree);

commit 31d1172faa705443aec03915bdc809ec6a8ebdd7
Author: Matt Wilson <msw@amazon.com>
Date:   Sun Jan 6 11:14:42 2013 +0000

    xen/grant-table: correctly initialize grant table version 1
    
    Commit 85ff6acb075a484780b3d763fdf41596d8fc0970 (xen/granttable: Grant
    tables V2 implementation) changed the GREFS_PER_GRANT_FRAME macro from
    a constant to a conditional expression. The expression depends on
    grant_table_version being appropriately set. Unfortunately, at init
    time grant_table_version will be 0. The GREFS_PER_GRANT_FRAME
    conditional expression checks for "grant_table_version == 1", and
    therefore returns the number of grant references per frame for v2.
    
    This causes gnttab_init() to allocate fewer pages for gnttab_list, as
    a frame can old half the number of v2 entries than v1 entries. After
    gnttab_resume() is called, grant_table_version is appropriately
    set. nr_init_grefs will then be miscalculated and gnttab_free_count
    will hold a value larger than the actual number of free gref entries.
    
    If a guest is heavily utilizing improperly initialized v1 grant
    tables, memory corruption can occur. One common manifestation is
    corruption of the vmalloc list, resulting in a poisoned pointer
    derefrence when accessing /proc/meminfo or /proc/vmallocinfo:
    
    [   40.770064] BUG: unable to handle kernel paging request at 0000200200001407
    [   40.770083] IP: [<ffffffff811a6fb0>] get_vmalloc_info+0x70/0x110
    [   40.770102] PGD 0
    [   40.770107] Oops: 0000 [#1] SMP
    [   40.770114] CPU 10
    
    This patch introduces a static variable, grefs_per_grant_frame, to
    cache the calculated value. gnttab_init() now calls
    gnttab_request_version() early so that grant_table_version and
    grefs_per_grant_frame can be appropriately set. A few BUG_ON()s have
    been added to prevent this type of bug from reoccurring in the future.
    
    Signed-off-by: Matt Wilson <msw@amazon.com>
    Reviewed-and-Tested-by: Steven Noonan <snoonan@amazon.com>
    Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
    Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Cc: Annie Li <annie.li@oracle.com>
    Cc: xen-devel@lists.xen.org
    Cc: linux-kernel@vger.kernel.org
    Cc: stable@vger.kernel.org # v3.3 and newer
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    [v1: Fixed merge conflict]

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index b91f14e..d49c9fb 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -56,10 +56,6 @@
 /* External tools reserve first few grant table entries. */
 #define NR_RESERVED_ENTRIES 8
 #define GNTTAB_LIST_END 0xffffffff
-#define GREFS_PER_GRANT_FRAME \
-(grant_table_version == 1 ?                      \
-(PAGE_SIZE / sizeof(struct grant_entry_v1)) :   \
-(PAGE_SIZE / sizeof(union grant_entry_v2)))
 
 static grant_ref_t **gnttab_list;
 static unsigned int nr_grant_frames;
@@ -154,6 +150,7 @@ static struct gnttab_ops *gnttab_interface;
 static grant_status_t *grstatus;
 
 static int grant_table_version;
+static int grefs_per_grant_frame;
 
 static struct gnttab_free_callback *gnttab_free_callback_list;
 
@@ -767,12 +764,14 @@ static int grow_gnttab_list(unsigned int more_frames)
 	unsigned int new_nr_grant_frames, extra_entries, i;
 	unsigned int nr_glist_frames, new_nr_glist_frames;
 
+	BUG_ON(grefs_per_grant_frame == 0);
+
 	new_nr_grant_frames = nr_grant_frames + more_frames;
-	extra_entries       = more_frames * GREFS_PER_GRANT_FRAME;
+	extra_entries       = more_frames * grefs_per_grant_frame;
 
-	nr_glist_frames = (nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+	nr_glist_frames = (nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
 	new_nr_glist_frames =
-		(new_nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+		(new_nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
 	for (i = nr_glist_frames; i < new_nr_glist_frames; i++) {
 		gnttab_list[i] = (grant_ref_t *)__get_free_page(GFP_ATOMIC);
 		if (!gnttab_list[i])
@@ -780,12 +779,12 @@ static int grow_gnttab_list(unsigned int more_frames)
 	}
 
 
-	for (i = GREFS_PER_GRANT_FRAME * nr_grant_frames;
-	     i < GREFS_PER_GRANT_FRAME * new_nr_grant_frames - 1; i++)
+	for (i = grefs_per_grant_frame * nr_grant_frames;
+	     i < grefs_per_grant_frame * new_nr_grant_frames - 1; i++)
 		gnttab_entry(i) = i + 1;
 
 	gnttab_entry(i) = gnttab_free_head;
-	gnttab_free_head = GREFS_PER_GRANT_FRAME * nr_grant_frames;
+	gnttab_free_head = grefs_per_grant_frame * nr_grant_frames;
 	gnttab_free_count += extra_entries;
 
 	nr_grant_frames = new_nr_grant_frames;
@@ -957,7 +956,8 @@ EXPORT_SYMBOL_GPL(gnttab_unmap_refs);
 
 static unsigned nr_status_frames(unsigned nr_grant_frames)
 {
-	return (nr_grant_frames * GREFS_PER_GRANT_FRAME + SPP - 1) / SPP;
+	BUG_ON(grefs_per_grant_frame == 0);
+	return (nr_grant_frames * grefs_per_grant_frame + SPP - 1) / SPP;
 }
 
 static int gnttab_map_frames_v1(xen_pfn_t *frames, unsigned int nr_gframes)
@@ -1115,6 +1115,7 @@ static void gnttab_request_version(void)
 	rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1);
 	if (rc == 0 && gsv.version == 2) {
 		grant_table_version = 2;
+		grefs_per_grant_frame = PAGE_SIZE / sizeof(union grant_entry_v2);
 		gnttab_interface = &gnttab_v2_ops;
 	} else if (grant_table_version == 2) {
 		/*
@@ -1129,15 +1130,15 @@ static void gnttab_request_version(void)
 		grant_table_version = 1;
 		gnttab_interface = &gnttab_v1_ops;
 	}
-	printk(KERN_INFO "Grant tables using version %d layout.\n",
-		grant_table_version);
 }
 
-int gnttab_resume(void)
+static int gnttab_setup(void)
 {
 	unsigned int max_nr_gframes;
 
-	gnttab_request_version();
+	printk(KERN_INFO "Grant tables using version %d layout.\n",
+		grant_table_version);
+
 	max_nr_gframes = gnttab_max_grant_frames();
 	if (max_nr_gframes < nr_grant_frames)
 		return -ENOSYS;
@@ -1160,6 +1161,12 @@ int gnttab_resume(void)
 	return 0;
 }
 
+int gnttab_resume(void)
+{
+	gnttab_request_version();
+	return gnttab_setup();
+}
+
 int gnttab_suspend(void)
 {
 	gnttab_interface->unmap_frames();
@@ -1171,9 +1178,10 @@ static int gnttab_expand(unsigned int req_entries)
 	int rc;
 	unsigned int cur, extra;
 
+	BUG_ON(grefs_per_grant_frame == 0);
 	cur = nr_grant_frames;
-	extra = ((req_entries + (GREFS_PER_GRANT_FRAME-1)) /
-		 GREFS_PER_GRANT_FRAME);
+	extra = ((req_entries + (grefs_per_grant_frame-1)) /
+		 grefs_per_grant_frame);
 	if (cur + extra > gnttab_max_grant_frames())
 		return -ENOSPC;
 
@@ -1191,21 +1199,23 @@ int gnttab_init(void)
 	unsigned int nr_init_grefs;
 	int ret;
 
+	gnttab_request_version();
 	nr_grant_frames = 1;
 	boot_max_nr_grant_frames = __max_nr_grant_frames();
 
 	/* Determine the maximum number of frames required for the
 	 * grant reference free list on the current hypervisor.
 	 */
+	BUG_ON(grefs_per_grant_frame == 0);
 	max_nr_glist_frames = (boot_max_nr_grant_frames *
-			       GREFS_PER_GRANT_FRAME / RPP);
+			       grefs_per_grant_frame / RPP);
 
 	gnttab_list = kmalloc(max_nr_glist_frames * sizeof(grant_ref_t *),
 			      GFP_KERNEL);
 	if (gnttab_list == NULL)
 		return -ENOMEM;
 
-	nr_glist_frames = (nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+	nr_glist_frames = (nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
 	for (i = 0; i < nr_glist_frames; i++) {
 		gnttab_list[i] = (grant_ref_t *)__get_free_page(GFP_KERNEL);
 		if (gnttab_list[i] == NULL) {
@@ -1214,12 +1224,12 @@ int gnttab_init(void)
 		}
 	}
 
-	if (gnttab_resume() < 0) {
+	if (gnttab_setup() < 0) {
 		ret = -ENODEV;
 		goto ini_nomem;
 	}
 
-	nr_init_grefs = nr_grant_frames * GREFS_PER_GRANT_FRAME;
+	nr_init_grefs = nr_grant_frames * grefs_per_grant_frame;
 
 	for (i = NR_RESERVED_ENTRIES; i < nr_init_grefs - 1; i++)
 		gnttab_entry(i) = i + 1;

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] xen/grant-table: correctly initialize grant table version 1
  2013-01-06 11:14 ` Matt Wilson
  2013-01-09  2:40   ` ANNIE LI
  2013-01-09  2:40   ` [Xen-devel] " ANNIE LI
@ 2013-01-11 21:35   ` Konrad Rzeszutek Wilk
  2013-01-11 21:35   ` Konrad Rzeszutek Wilk
  3 siblings, 0 replies; 29+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-01-11 21:35 UTC (permalink / raw)
  To: Matt Wilson; +Cc: linux-kernel, Annie Li, Ian Campbell, stable, xen-devel

On Sun, Jan 06, 2013 at 11:14:42AM +0000, Matt Wilson wrote:
> Commit 85ff6acb075a484780b3d763fdf41596d8fc0970 (xen/granttable: Grant
> tables V2 implementation) changed the GREFS_PER_GRANT_FRAME macro from
> a constant to a conditional expression. The expression depends on
> grant_table_version being appropriately set. Unfortunately, at init
> time grant_table_version will be 0. The GREFS_PER_GRANT_FRAME
> conditional expression checks for "grant_table_version == 1", and
> therefore returns the number of grant references per frame for v2.
> 
> This causes gnttab_init() to allocate fewer pages for gnttab_list, as
> a frame can old half the number of v2 entries than v1 entries. After
> gnttab_resume() is called, grant_table_version is appropriately
> set. nr_init_grefs will then be miscalculated and gnttab_free_count
> will hold a value larger than the actual number of free gref entries.
> 
> If a guest is heavily utilizing improperly initialized v1 grant
> tables, memory corruption can occur. One common manifestation is
> corruption of the vmalloc list, resulting in a poisoned pointer
> derefrence when accessing /proc/meminfo or /proc/vmallocinfo:
> 
> [   40.770064] BUG: unable to handle kernel paging request at 0000200200001407
> [   40.770083] IP: [<ffffffff811a6fb0>] get_vmalloc_info+0x70/0x110
> [   40.770102] PGD 0
> [   40.770107] Oops: 0000 [#1] SMP
> [   40.770114] CPU 10

With this patch I get:

[    2.555087] GHES: HEST is not enabled!
[    2.560522] ioatdma: Intel(R) QuickData Technology Driver 4.00
[    2.569651] ------------[ cut here ]------------
[    2.570529] kernel BUG at /home/konrad/linux/drivers/xen/grant-table.c:1175!
[    2.570529] invalid opcode: 0000 [#1] SMP 
[    2.570529] Modules linked in:
[    2.570529] Pid: 1, comm: swapper/0 Not tainted 3.8.0-rc3upstream-00236-g8a3fed1 #1 Xen HVM domU
[    2.570529] EIP: 0060:[<c1330668>] EFLAGS: 00010246 CPU: 1
[    2.570529] EIP is at gnttab_init+0x1b8/0x1c0
[    2.570529] EAX: 00000000 EBX: 00000006 ECX: f2479e24 EDX: 00000020
[    2.570529] ESI: f25a5800 EDI: 00000000 EBP: f2479e40 ESP: f2479e1c
[    2.570529]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[    2.570529] CR0: 8005003b CR2: 00000000 CR3: 017fb000 CR4: 000006f0
[    2.570529] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[    2.570529] DR6: ffff0ff0 DR7: 00000400
[    2.570529] Process swapper/0 (pid: 1, ti=f2478000 task=f2462a70 task.ti=f2478000)
[    2.570529] Stack:
[    2.570529]  000000ff c132f0e7 f25a7ff0 00000004 00000020 00000000 00000000 f25a5800
[    2.570529]  00000000 f2479e6c c133bcec 00000000 f2479e70 c10a945f 01000000 f2479e60
[    2.570529]  f2000000 f25a5800 c1744700 c1744734 f2479e78 c12b5284 f25a5800 f2479e94
[    2.570529] Call Trace:
[    2.570529]  [<c132f0e7>] ? gnttab_max_grant_frames+0x27/0x60
[    2.570529]  [<c133bcec>] platform_pci_init+0x1ac/0x260
[    2.570529]  [<c10a945f>] ? __blocking_notifier_call_chain+0x4f/0x60
[    2.570529]  [<c12b5284>] local_pci_probe+0x14/0x30
[    2.570529]  [<c12b63b8>] pci_device_probe+0x58/0x70
[    2.570529]  [<c139c8a9>] driver_probe_device+0x69/0x1e0
[    2.570529]  [<c12b555e>] ? pci_match_device+0xbe/0xd0
[    2.570529]  [<c139ca99>] __driver_attach+0x79/0x80
[    2.570529]  [<c139b348>] bus_for_each_dev+0x48/0x70
[    2.570529]  [<c139c739>] driver_attach+0x19/0x20
[    2.570529]  [<c139ca20>] ? driver_probe_device+0x1e0/0x1e0
[    2.570529]  [<c139c1b4>] bus_add_driver+0xa4/0x220
[    2.570529]  [<c12b6300>] ? pci_dev_put+0x20/0x20
[    2.570529]  [<c12b6300>] ? pci_dev_put+0x20/0x20
[    2.570529]  [<c139d025>] driver_register+0x65/0x130
[    2.570529]  [<c12b646e>] __pci_register_driver+0x2e/0x40
[    2.570529]  [<c1796d92>] platform_pci_module_init+0x14/0x16
[    2.570529]  [<c100302f>] do_one_initcall+0x2f/0x170
[    2.570529]  [<c1796d7e>] ? hyper_sysfs_init+0xc4/0xc4
[    2.570529]  [<c1576e12>] kernel_init+0x142/0x2a0
[    2.570529]  [<c176b442>] ? parse_early_options+0x35/0x35
[    2.570529]  [<c158e477>] ret_from_kernel_thread+0x1b/0x28
[    2.570529]  [<c1576cd0>] ? rest_init+0x80/0x80
[    2.570529] Code: ff ff ff 89 15 98 54 89 c1 c7 05 9c 54 89 c1 08 00 00 00 c7 04 24 4e 22 6c c1 e8 b9 48 25 00 c7 45 e0 00 00 00 00 e9 58 ff ff ff <0f> 0b eb fe 8d 74 26 00 8b 15 00 e1 7f c1 31 c0 55 89 e5 83 fa
[    2.570529] EIP: [<c1330668>] gnttab_init+0x1b8/0x1c0 SS:ESP 0068:f2479e1c
[    2.922510] ---[ end trace 14cd1ffa03240f7d ]---
[    2.928972] swapper/0 (1) used greatest stack depth: 5196 bytes left
[    2.935947] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    2.935947] 

This with a 64-bit dom0 and 64-bit and 32-bit PVHVM.

I had to fix a merge conflict with this patch so perhaps I missed something:

(The tree is #linux-next-test-3.8-rc3 on my xen.git tree);

commit 31d1172faa705443aec03915bdc809ec6a8ebdd7
Author: Matt Wilson <msw@amazon.com>
Date:   Sun Jan 6 11:14:42 2013 +0000

    xen/grant-table: correctly initialize grant table version 1
    
    Commit 85ff6acb075a484780b3d763fdf41596d8fc0970 (xen/granttable: Grant
    tables V2 implementation) changed the GREFS_PER_GRANT_FRAME macro from
    a constant to a conditional expression. The expression depends on
    grant_table_version being appropriately set. Unfortunately, at init
    time grant_table_version will be 0. The GREFS_PER_GRANT_FRAME
    conditional expression checks for "grant_table_version == 1", and
    therefore returns the number of grant references per frame for v2.
    
    This causes gnttab_init() to allocate fewer pages for gnttab_list, as
    a frame can old half the number of v2 entries than v1 entries. After
    gnttab_resume() is called, grant_table_version is appropriately
    set. nr_init_grefs will then be miscalculated and gnttab_free_count
    will hold a value larger than the actual number of free gref entries.
    
    If a guest is heavily utilizing improperly initialized v1 grant
    tables, memory corruption can occur. One common manifestation is
    corruption of the vmalloc list, resulting in a poisoned pointer
    derefrence when accessing /proc/meminfo or /proc/vmallocinfo:
    
    [   40.770064] BUG: unable to handle kernel paging request at 0000200200001407
    [   40.770083] IP: [<ffffffff811a6fb0>] get_vmalloc_info+0x70/0x110
    [   40.770102] PGD 0
    [   40.770107] Oops: 0000 [#1] SMP
    [   40.770114] CPU 10
    
    This patch introduces a static variable, grefs_per_grant_frame, to
    cache the calculated value. gnttab_init() now calls
    gnttab_request_version() early so that grant_table_version and
    grefs_per_grant_frame can be appropriately set. A few BUG_ON()s have
    been added to prevent this type of bug from reoccurring in the future.
    
    Signed-off-by: Matt Wilson <msw@amazon.com>
    Reviewed-and-Tested-by: Steven Noonan <snoonan@amazon.com>
    Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
    Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Cc: Annie Li <annie.li@oracle.com>
    Cc: xen-devel@lists.xen.org
    Cc: linux-kernel@vger.kernel.org
    Cc: stable@vger.kernel.org # v3.3 and newer
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    [v1: Fixed merge conflict]

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index b91f14e..d49c9fb 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -56,10 +56,6 @@
 /* External tools reserve first few grant table entries. */
 #define NR_RESERVED_ENTRIES 8
 #define GNTTAB_LIST_END 0xffffffff
-#define GREFS_PER_GRANT_FRAME \
-(grant_table_version == 1 ?                      \
-(PAGE_SIZE / sizeof(struct grant_entry_v1)) :   \
-(PAGE_SIZE / sizeof(union grant_entry_v2)))
 
 static grant_ref_t **gnttab_list;
 static unsigned int nr_grant_frames;
@@ -154,6 +150,7 @@ static struct gnttab_ops *gnttab_interface;
 static grant_status_t *grstatus;
 
 static int grant_table_version;
+static int grefs_per_grant_frame;
 
 static struct gnttab_free_callback *gnttab_free_callback_list;
 
@@ -767,12 +764,14 @@ static int grow_gnttab_list(unsigned int more_frames)
 	unsigned int new_nr_grant_frames, extra_entries, i;
 	unsigned int nr_glist_frames, new_nr_glist_frames;
 
+	BUG_ON(grefs_per_grant_frame == 0);
+
 	new_nr_grant_frames = nr_grant_frames + more_frames;
-	extra_entries       = more_frames * GREFS_PER_GRANT_FRAME;
+	extra_entries       = more_frames * grefs_per_grant_frame;
 
-	nr_glist_frames = (nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+	nr_glist_frames = (nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
 	new_nr_glist_frames =
-		(new_nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+		(new_nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
 	for (i = nr_glist_frames; i < new_nr_glist_frames; i++) {
 		gnttab_list[i] = (grant_ref_t *)__get_free_page(GFP_ATOMIC);
 		if (!gnttab_list[i])
@@ -780,12 +779,12 @@ static int grow_gnttab_list(unsigned int more_frames)
 	}
 
 
-	for (i = GREFS_PER_GRANT_FRAME * nr_grant_frames;
-	     i < GREFS_PER_GRANT_FRAME * new_nr_grant_frames - 1; i++)
+	for (i = grefs_per_grant_frame * nr_grant_frames;
+	     i < grefs_per_grant_frame * new_nr_grant_frames - 1; i++)
 		gnttab_entry(i) = i + 1;
 
 	gnttab_entry(i) = gnttab_free_head;
-	gnttab_free_head = GREFS_PER_GRANT_FRAME * nr_grant_frames;
+	gnttab_free_head = grefs_per_grant_frame * nr_grant_frames;
 	gnttab_free_count += extra_entries;
 
 	nr_grant_frames = new_nr_grant_frames;
@@ -957,7 +956,8 @@ EXPORT_SYMBOL_GPL(gnttab_unmap_refs);
 
 static unsigned nr_status_frames(unsigned nr_grant_frames)
 {
-	return (nr_grant_frames * GREFS_PER_GRANT_FRAME + SPP - 1) / SPP;
+	BUG_ON(grefs_per_grant_frame == 0);
+	return (nr_grant_frames * grefs_per_grant_frame + SPP - 1) / SPP;
 }
 
 static int gnttab_map_frames_v1(xen_pfn_t *frames, unsigned int nr_gframes)
@@ -1115,6 +1115,7 @@ static void gnttab_request_version(void)
 	rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1);
 	if (rc == 0 && gsv.version == 2) {
 		grant_table_version = 2;
+		grefs_per_grant_frame = PAGE_SIZE / sizeof(union grant_entry_v2);
 		gnttab_interface = &gnttab_v2_ops;
 	} else if (grant_table_version == 2) {
 		/*
@@ -1129,15 +1130,15 @@ static void gnttab_request_version(void)
 		grant_table_version = 1;
 		gnttab_interface = &gnttab_v1_ops;
 	}
-	printk(KERN_INFO "Grant tables using version %d layout.\n",
-		grant_table_version);
 }
 
-int gnttab_resume(void)
+static int gnttab_setup(void)
 {
 	unsigned int max_nr_gframes;
 
-	gnttab_request_version();
+	printk(KERN_INFO "Grant tables using version %d layout.\n",
+		grant_table_version);
+
 	max_nr_gframes = gnttab_max_grant_frames();
 	if (max_nr_gframes < nr_grant_frames)
 		return -ENOSYS;
@@ -1160,6 +1161,12 @@ int gnttab_resume(void)
 	return 0;
 }
 
+int gnttab_resume(void)
+{
+	gnttab_request_version();
+	return gnttab_setup();
+}
+
 int gnttab_suspend(void)
 {
 	gnttab_interface->unmap_frames();
@@ -1171,9 +1178,10 @@ static int gnttab_expand(unsigned int req_entries)
 	int rc;
 	unsigned int cur, extra;
 
+	BUG_ON(grefs_per_grant_frame == 0);
 	cur = nr_grant_frames;
-	extra = ((req_entries + (GREFS_PER_GRANT_FRAME-1)) /
-		 GREFS_PER_GRANT_FRAME);
+	extra = ((req_entries + (grefs_per_grant_frame-1)) /
+		 grefs_per_grant_frame);
 	if (cur + extra > gnttab_max_grant_frames())
 		return -ENOSPC;
 
@@ -1191,21 +1199,23 @@ int gnttab_init(void)
 	unsigned int nr_init_grefs;
 	int ret;
 
+	gnttab_request_version();
 	nr_grant_frames = 1;
 	boot_max_nr_grant_frames = __max_nr_grant_frames();
 
 	/* Determine the maximum number of frames required for the
 	 * grant reference free list on the current hypervisor.
 	 */
+	BUG_ON(grefs_per_grant_frame == 0);
 	max_nr_glist_frames = (boot_max_nr_grant_frames *
-			       GREFS_PER_GRANT_FRAME / RPP);
+			       grefs_per_grant_frame / RPP);
 
 	gnttab_list = kmalloc(max_nr_glist_frames * sizeof(grant_ref_t *),
 			      GFP_KERNEL);
 	if (gnttab_list == NULL)
 		return -ENOMEM;
 
-	nr_glist_frames = (nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+	nr_glist_frames = (nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
 	for (i = 0; i < nr_glist_frames; i++) {
 		gnttab_list[i] = (grant_ref_t *)__get_free_page(GFP_KERNEL);
 		if (gnttab_list[i] == NULL) {
@@ -1214,12 +1224,12 @@ int gnttab_init(void)
 		}
 	}
 
-	if (gnttab_resume() < 0) {
+	if (gnttab_setup() < 0) {
 		ret = -ENODEV;
 		goto ini_nomem;
 	}
 
-	nr_init_grefs = nr_grant_frames * GREFS_PER_GRANT_FRAME;
+	nr_init_grefs = nr_grant_frames * grefs_per_grant_frame;
 
 	for (i = NR_RESERVED_ENTRIES; i < nr_init_grefs - 1; i++)
 		gnttab_entry(i) = i + 1;

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] xen/grant-table: correctly initialize grant table version 1
  2013-01-11 21:35   ` Konrad Rzeszutek Wilk
@ 2013-01-14  9:29     ` Matt Wilson
  2013-01-14 15:31       ` Konrad Rzeszutek Wilk
  2013-01-14 15:31       ` Konrad Rzeszutek Wilk
  2013-01-14  9:29     ` Matt Wilson
  1 sibling, 2 replies; 29+ messages in thread
From: Matt Wilson @ 2013-01-14  9:29 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: stable, Ian Campbell, Annie Li, xen-devel, linux-kernel

On Fri, Jan 11, 2013 at 04:35:50PM -0500, Konrad Rzeszutek Wilk wrote:
> 
> With this patch I get:
> 
> [    2.555087] GHES: HEST is not enabled!
> [    2.560522] ioatdma: Intel(R) QuickData Technology Driver 4.00
> [    2.569651] ------------[ cut here ]------------
> [    2.570529] kernel BUG at /home/konrad/linux/drivers/xen/grant-table.c:1175!

You dropped part of a hunk when you applied the patch:

@@ -1127,17 +1128,17 @@ static void gnttab_request_version(void)
                panic("we need grant tables version 2, but only version 1 is a
        } else {
                grant_table_version = 1;
+               grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_
                gnttab_interface = &gnttab_v1_ops;
        }
        printk(KERN_INFO "Grant tables using version %d layout.\n",
                grant_table_version);
 }
 
I can rebase to v3.8-rc3 if that would help.

Matt

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] xen/grant-table: correctly initialize grant table version 1
  2013-01-11 21:35   ` Konrad Rzeszutek Wilk
  2013-01-14  9:29     ` Matt Wilson
@ 2013-01-14  9:29     ` Matt Wilson
  1 sibling, 0 replies; 29+ messages in thread
From: Matt Wilson @ 2013-01-14  9:29 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: linux-kernel, Annie Li, Ian Campbell, stable, xen-devel

On Fri, Jan 11, 2013 at 04:35:50PM -0500, Konrad Rzeszutek Wilk wrote:
> 
> With this patch I get:
> 
> [    2.555087] GHES: HEST is not enabled!
> [    2.560522] ioatdma: Intel(R) QuickData Technology Driver 4.00
> [    2.569651] ------------[ cut here ]------------
> [    2.570529] kernel BUG at /home/konrad/linux/drivers/xen/grant-table.c:1175!

You dropped part of a hunk when you applied the patch:

@@ -1127,17 +1128,17 @@ static void gnttab_request_version(void)
                panic("we need grant tables version 2, but only version 1 is a
        } else {
                grant_table_version = 1;
+               grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_
                gnttab_interface = &gnttab_v1_ops;
        }
        printk(KERN_INFO "Grant tables using version %d layout.\n",
                grant_table_version);
 }
 
I can rebase to v3.8-rc3 if that would help.

Matt

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] xen/grant-table: correctly initialize grant table version 1
  2013-01-14  9:29     ` Matt Wilson
@ 2013-01-14 15:31       ` Konrad Rzeszutek Wilk
  2013-01-14 15:31       ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 29+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-01-14 15:31 UTC (permalink / raw)
  To: Matt Wilson; +Cc: stable, Ian Campbell, Annie Li, xen-devel, linux-kernel

On Mon, Jan 14, 2013 at 10:29:41AM +0100, Matt Wilson wrote:
> On Fri, Jan 11, 2013 at 04:35:50PM -0500, Konrad Rzeszutek Wilk wrote:
> > 
> > With this patch I get:
> > 
> > [    2.555087] GHES: HEST is not enabled!
> > [    2.560522] ioatdma: Intel(R) QuickData Technology Driver 4.00
> > [    2.569651] ------------[ cut here ]------------
> > [    2.570529] kernel BUG at /home/konrad/linux/drivers/xen/grant-table.c:1175!
> 
> You dropped part of a hunk when you applied the patch:
> 
> @@ -1127,17 +1128,17 @@ static void gnttab_request_version(void)
>                 panic("we need grant tables version 2, but only version 1 is a
>         } else {
>                 grant_table_version = 1;
> +               grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_
>                 gnttab_interface = &gnttab_v1_ops;
>         }
>         printk(KERN_INFO "Grant tables using version %d layout.\n",
>                 grant_table_version);
>  }
>  
> I can rebase to v3.8-rc3 if that would help.

Yes pls.
> 
> Matt

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2] xen/grant-table: correctly initialize grant table version 1
  2013-01-14  9:29     ` Matt Wilson
  2013-01-14 15:31       ` Konrad Rzeszutek Wilk
@ 2013-01-14 15:31       ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 29+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-01-14 15:31 UTC (permalink / raw)
  To: Matt Wilson; +Cc: linux-kernel, Annie Li, Ian Campbell, stable, xen-devel

On Mon, Jan 14, 2013 at 10:29:41AM +0100, Matt Wilson wrote:
> On Fri, Jan 11, 2013 at 04:35:50PM -0500, Konrad Rzeszutek Wilk wrote:
> > 
> > With this patch I get:
> > 
> > [    2.555087] GHES: HEST is not enabled!
> > [    2.560522] ioatdma: Intel(R) QuickData Technology Driver 4.00
> > [    2.569651] ------------[ cut here ]------------
> > [    2.570529] kernel BUG at /home/konrad/linux/drivers/xen/grant-table.c:1175!
> 
> You dropped part of a hunk when you applied the patch:
> 
> @@ -1127,17 +1128,17 @@ static void gnttab_request_version(void)
>                 panic("we need grant tables version 2, but only version 1 is a
>         } else {
>                 grant_table_version = 1;
> +               grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_
>                 gnttab_interface = &gnttab_v1_ops;
>         }
>         printk(KERN_INFO "Grant tables using version %d layout.\n",
>                 grant_table_version);
>  }
>  
> I can rebase to v3.8-rc3 if that would help.

Yes pls.
> 
> Matt

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3] xen/grant-table: correctly initialize grant table version 1
       [not found] <Message-ID: <20130114153157.GF1156@phenom.dumpdata.com>
  2013-01-15 13:21 ` [PATCH v3] " Matt Wilson
@ 2013-01-15 13:21 ` Matt Wilson
  2013-01-15 13:24   ` Matt Wilson
  2013-01-15 13:24   ` Matt Wilson
  1 sibling, 2 replies; 29+ messages in thread
From: Matt Wilson @ 2013-01-15 13:21 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: stable, Matt Wilson, Ian Campbell, Annie Li, xen-devel, linux-kernel

Commit 85ff6acb075a484780b3d763fdf41596d8fc0970 (xen/granttable: Grant
tables V2 implementation) changed the GREFS_PER_GRANT_FRAME macro from
a constant to a conditional expression. The expression depends on
grant_table_version being appropriately set. Unfortunately, at init
time grant_table_version will be 0. The GREFS_PER_GRANT_FRAME
conditional expression checks for "grant_table_version == 1", and
therefore returns the number of grant references per frame for v2.

This causes gnttab_init() to allocate fewer pages for gnttab_list, as
a frame can old half the number of v2 entries than v1 entries. After
gnttab_resume() is called, grant_table_version is appropriately
set. nr_init_grefs will then be miscalculated and gnttab_free_count
will hold a value larger than the actual number of free gref entries.

If a guest is heavily utilizing improperly initialized v1 grant
tables, memory corruption can occur. One common manifestation is
corruption of the vmalloc list, resulting in a poisoned pointer
derefrence when accessing /proc/meminfo or /proc/vmallocinfo:

[   40.770064] BUG: unable to handle kernel paging request at 0000200200001407
[   40.770083] IP: [<ffffffff811a6fb0>] get_vmalloc_info+0x70/0x110
[   40.770102] PGD 0
[   40.770107] Oops: 0000 [#1] SMP
[   40.770114] CPU 10

This patch introduces a static variable, grefs_per_grant_frame, to
cache the calculated value. gnttab_init() now calls
gnttab_request_version() early so that grant_table_version and
grefs_per_grant_frame can be appropriately set. A few BUG_ON()s have
been added to prevent this type of bug from reoccurring in the future.

Signed-off-by: Matt Wilson <msw@amazon.com>
Reviewed-and-Tested-by: Steven Noonan <snoonan@amazon.com>
Signed-off-by: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Annie Li <annie.li@oracle.com>
Cc: xen-devel@lists.xen.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # v3.3 and newer
---
Changes since v1:
* introduced a new gnttab_setup() function and moved all of the
  initialization code from gnttab_resume() there.

Changes since v2:
* moved the "Grant tables using version %d layout." message back to
  gnttab_request_version()
* rebased on v3.8-rc3
---
 drivers/xen/grant-table.c |   48 +++++++++++++++++++++++++++-----------------
 1 files changed, 29 insertions(+), 19 deletions(-)

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 7038de5..157c0cc 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -56,10 +56,6 @@
 /* External tools reserve first few grant table entries. */
 #define NR_RESERVED_ENTRIES 8
 #define GNTTAB_LIST_END 0xffffffff
-#define GREFS_PER_GRANT_FRAME \
-(grant_table_version == 1 ?                      \
-(PAGE_SIZE / sizeof(struct grant_entry_v1)) :   \
-(PAGE_SIZE / sizeof(union grant_entry_v2)))
 
 static grant_ref_t **gnttab_list;
 static unsigned int nr_grant_frames;
@@ -154,6 +150,7 @@ static struct gnttab_ops *gnttab_interface;
 static grant_status_t *grstatus;
 
 static int grant_table_version;
+static int grefs_per_grant_frame;
 
 static struct gnttab_free_callback *gnttab_free_callback_list;
 
@@ -767,12 +764,14 @@ static int grow_gnttab_list(unsigned int more_frames)
 	unsigned int new_nr_grant_frames, extra_entries, i;
 	unsigned int nr_glist_frames, new_nr_glist_frames;
 
+	BUG_ON(grefs_per_grant_frame == 0);
+
 	new_nr_grant_frames = nr_grant_frames + more_frames;
-	extra_entries       = more_frames * GREFS_PER_GRANT_FRAME;
+	extra_entries       = more_frames * grefs_per_grant_frame;
 
-	nr_glist_frames = (nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+	nr_glist_frames = (nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
 	new_nr_glist_frames =
-		(new_nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+		(new_nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
 	for (i = nr_glist_frames; i < new_nr_glist_frames; i++) {
 		gnttab_list[i] = (grant_ref_t *)__get_free_page(GFP_ATOMIC);
 		if (!gnttab_list[i])
@@ -780,12 +779,12 @@ static int grow_gnttab_list(unsigned int more_frames)
 	}
 
 
-	for (i = GREFS_PER_GRANT_FRAME * nr_grant_frames;
-	     i < GREFS_PER_GRANT_FRAME * new_nr_grant_frames - 1; i++)
+	for (i = grefs_per_grant_frame * nr_grant_frames;
+	     i < grefs_per_grant_frame * new_nr_grant_frames - 1; i++)
 		gnttab_entry(i) = i + 1;
 
 	gnttab_entry(i) = gnttab_free_head;
-	gnttab_free_head = GREFS_PER_GRANT_FRAME * nr_grant_frames;
+	gnttab_free_head = grefs_per_grant_frame * nr_grant_frames;
 	gnttab_free_count += extra_entries;
 
 	nr_grant_frames = new_nr_grant_frames;
@@ -957,7 +956,8 @@ EXPORT_SYMBOL_GPL(gnttab_unmap_refs);
 
 static unsigned nr_status_frames(unsigned nr_grant_frames)
 {
-	return (nr_grant_frames * GREFS_PER_GRANT_FRAME + SPP - 1) / SPP;
+	BUG_ON(grefs_per_grant_frame == 0);
+	return (nr_grant_frames * grefs_per_grant_frame + SPP - 1) / SPP;
 }
 
 static int gnttab_map_frames_v1(xen_pfn_t *frames, unsigned int nr_gframes)
@@ -1115,6 +1115,7 @@ static void gnttab_request_version(void)
 	rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1);
 	if (rc == 0 && gsv.version == 2) {
 		grant_table_version = 2;
+		grefs_per_grant_frame = PAGE_SIZE / sizeof(union grant_entry_v2);
 		gnttab_interface = &gnttab_v2_ops;
 	} else if (grant_table_version == 2) {
 		/*
@@ -1127,17 +1128,17 @@ static void gnttab_request_version(void)
 		panic("we need grant tables version 2, but only version 1 is available");
 	} else {
 		grant_table_version = 1;
+		grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
 		gnttab_interface = &gnttab_v1_ops;
 	}
 	printk(KERN_INFO "Grant tables using version %d layout.\n",
 		grant_table_version);
 }
 
-int gnttab_resume(void)
+static int gnttab_setup(void)
 {
 	unsigned int max_nr_gframes;
 
-	gnttab_request_version();
 	max_nr_gframes = gnttab_max_grant_frames();
 	if (max_nr_gframes < nr_grant_frames)
 		return -ENOSYS;
@@ -1160,6 +1161,12 @@ int gnttab_resume(void)
 	return 0;
 }
 
+int gnttab_resume(void)
+{
+	gnttab_request_version();
+	return gnttab_setup();
+}
+
 int gnttab_suspend(void)
 {
 	gnttab_interface->unmap_frames();
@@ -1171,9 +1178,10 @@ static int gnttab_expand(unsigned int req_entries)
 	int rc;
 	unsigned int cur, extra;
 
+	BUG_ON(grefs_per_grant_frame == 0);
 	cur = nr_grant_frames;
-	extra = ((req_entries + (GREFS_PER_GRANT_FRAME-1)) /
-		 GREFS_PER_GRANT_FRAME);
+	extra = ((req_entries + (grefs_per_grant_frame-1)) /
+		 grefs_per_grant_frame);
 	if (cur + extra > gnttab_max_grant_frames())
 		return -ENOSPC;
 
@@ -1191,21 +1199,23 @@ int gnttab_init(void)
 	unsigned int nr_init_grefs;
 	int ret;
 
+	gnttab_request_version();
 	nr_grant_frames = 1;
 	boot_max_nr_grant_frames = __max_nr_grant_frames();
 
 	/* Determine the maximum number of frames required for the
 	 * grant reference free list on the current hypervisor.
 	 */
+	BUG_ON(grefs_per_grant_frame == 0);
 	max_nr_glist_frames = (boot_max_nr_grant_frames *
-			       GREFS_PER_GRANT_FRAME / RPP);
+			       grefs_per_grant_frame / RPP);
 
 	gnttab_list = kmalloc(max_nr_glist_frames * sizeof(grant_ref_t *),
 			      GFP_KERNEL);
 	if (gnttab_list == NULL)
 		return -ENOMEM;
 
-	nr_glist_frames = (nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+	nr_glist_frames = (nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
 	for (i = 0; i < nr_glist_frames; i++) {
 		gnttab_list[i] = (grant_ref_t *)__get_free_page(GFP_KERNEL);
 		if (gnttab_list[i] == NULL) {
@@ -1214,12 +1224,12 @@ int gnttab_init(void)
 		}
 	}
 
-	if (gnttab_resume() < 0) {
+	if (gnttab_setup() < 0) {
 		ret = -ENODEV;
 		goto ini_nomem;
 	}
 
-	nr_init_grefs = nr_grant_frames * GREFS_PER_GRANT_FRAME;
+	nr_init_grefs = nr_grant_frames * grefs_per_grant_frame;
 
 	for (i = NR_RESERVED_ENTRIES; i < nr_init_grefs - 1; i++)
 		gnttab_entry(i) = i + 1;
-- 
1.7.4.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3] xen/grant-table: correctly initialize grant table version 1
       [not found] <Message-ID: <20130114153157.GF1156@phenom.dumpdata.com>
@ 2013-01-15 13:21 ` Matt Wilson
  2013-01-15 13:21 ` Matt Wilson
  1 sibling, 0 replies; 29+ messages in thread
From: Matt Wilson @ 2013-01-15 13:21 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Ian Campbell, linux-kernel, stable, xen-devel, Annie Li, Matt Wilson

Commit 85ff6acb075a484780b3d763fdf41596d8fc0970 (xen/granttable: Grant
tables V2 implementation) changed the GREFS_PER_GRANT_FRAME macro from
a constant to a conditional expression. The expression depends on
grant_table_version being appropriately set. Unfortunately, at init
time grant_table_version will be 0. The GREFS_PER_GRANT_FRAME
conditional expression checks for "grant_table_version == 1", and
therefore returns the number of grant references per frame for v2.

This causes gnttab_init() to allocate fewer pages for gnttab_list, as
a frame can old half the number of v2 entries than v1 entries. After
gnttab_resume() is called, grant_table_version is appropriately
set. nr_init_grefs will then be miscalculated and gnttab_free_count
will hold a value larger than the actual number of free gref entries.

If a guest is heavily utilizing improperly initialized v1 grant
tables, memory corruption can occur. One common manifestation is
corruption of the vmalloc list, resulting in a poisoned pointer
derefrence when accessing /proc/meminfo or /proc/vmallocinfo:

[   40.770064] BUG: unable to handle kernel paging request at 0000200200001407
[   40.770083] IP: [<ffffffff811a6fb0>] get_vmalloc_info+0x70/0x110
[   40.770102] PGD 0
[   40.770107] Oops: 0000 [#1] SMP
[   40.770114] CPU 10

This patch introduces a static variable, grefs_per_grant_frame, to
cache the calculated value. gnttab_init() now calls
gnttab_request_version() early so that grant_table_version and
grefs_per_grant_frame can be appropriately set. A few BUG_ON()s have
been added to prevent this type of bug from reoccurring in the future.

Signed-off-by: Matt Wilson <msw@amazon.com>
Reviewed-and-Tested-by: Steven Noonan <snoonan@amazon.com>
Signed-off-by: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Annie Li <annie.li@oracle.com>
Cc: xen-devel@lists.xen.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # v3.3 and newer
---
Changes since v1:
* introduced a new gnttab_setup() function and moved all of the
  initialization code from gnttab_resume() there.

Changes since v2:
* moved the "Grant tables using version %d layout." message back to
  gnttab_request_version()
* rebased on v3.8-rc3
---
 drivers/xen/grant-table.c |   48 +++++++++++++++++++++++++++-----------------
 1 files changed, 29 insertions(+), 19 deletions(-)

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 7038de5..157c0cc 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -56,10 +56,6 @@
 /* External tools reserve first few grant table entries. */
 #define NR_RESERVED_ENTRIES 8
 #define GNTTAB_LIST_END 0xffffffff
-#define GREFS_PER_GRANT_FRAME \
-(grant_table_version == 1 ?                      \
-(PAGE_SIZE / sizeof(struct grant_entry_v1)) :   \
-(PAGE_SIZE / sizeof(union grant_entry_v2)))
 
 static grant_ref_t **gnttab_list;
 static unsigned int nr_grant_frames;
@@ -154,6 +150,7 @@ static struct gnttab_ops *gnttab_interface;
 static grant_status_t *grstatus;
 
 static int grant_table_version;
+static int grefs_per_grant_frame;
 
 static struct gnttab_free_callback *gnttab_free_callback_list;
 
@@ -767,12 +764,14 @@ static int grow_gnttab_list(unsigned int more_frames)
 	unsigned int new_nr_grant_frames, extra_entries, i;
 	unsigned int nr_glist_frames, new_nr_glist_frames;
 
+	BUG_ON(grefs_per_grant_frame == 0);
+
 	new_nr_grant_frames = nr_grant_frames + more_frames;
-	extra_entries       = more_frames * GREFS_PER_GRANT_FRAME;
+	extra_entries       = more_frames * grefs_per_grant_frame;
 
-	nr_glist_frames = (nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+	nr_glist_frames = (nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
 	new_nr_glist_frames =
-		(new_nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+		(new_nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
 	for (i = nr_glist_frames; i < new_nr_glist_frames; i++) {
 		gnttab_list[i] = (grant_ref_t *)__get_free_page(GFP_ATOMIC);
 		if (!gnttab_list[i])
@@ -780,12 +779,12 @@ static int grow_gnttab_list(unsigned int more_frames)
 	}
 
 
-	for (i = GREFS_PER_GRANT_FRAME * nr_grant_frames;
-	     i < GREFS_PER_GRANT_FRAME * new_nr_grant_frames - 1; i++)
+	for (i = grefs_per_grant_frame * nr_grant_frames;
+	     i < grefs_per_grant_frame * new_nr_grant_frames - 1; i++)
 		gnttab_entry(i) = i + 1;
 
 	gnttab_entry(i) = gnttab_free_head;
-	gnttab_free_head = GREFS_PER_GRANT_FRAME * nr_grant_frames;
+	gnttab_free_head = grefs_per_grant_frame * nr_grant_frames;
 	gnttab_free_count += extra_entries;
 
 	nr_grant_frames = new_nr_grant_frames;
@@ -957,7 +956,8 @@ EXPORT_SYMBOL_GPL(gnttab_unmap_refs);
 
 static unsigned nr_status_frames(unsigned nr_grant_frames)
 {
-	return (nr_grant_frames * GREFS_PER_GRANT_FRAME + SPP - 1) / SPP;
+	BUG_ON(grefs_per_grant_frame == 0);
+	return (nr_grant_frames * grefs_per_grant_frame + SPP - 1) / SPP;
 }
 
 static int gnttab_map_frames_v1(xen_pfn_t *frames, unsigned int nr_gframes)
@@ -1115,6 +1115,7 @@ static void gnttab_request_version(void)
 	rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1);
 	if (rc == 0 && gsv.version == 2) {
 		grant_table_version = 2;
+		grefs_per_grant_frame = PAGE_SIZE / sizeof(union grant_entry_v2);
 		gnttab_interface = &gnttab_v2_ops;
 	} else if (grant_table_version == 2) {
 		/*
@@ -1127,17 +1128,17 @@ static void gnttab_request_version(void)
 		panic("we need grant tables version 2, but only version 1 is available");
 	} else {
 		grant_table_version = 1;
+		grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
 		gnttab_interface = &gnttab_v1_ops;
 	}
 	printk(KERN_INFO "Grant tables using version %d layout.\n",
 		grant_table_version);
 }
 
-int gnttab_resume(void)
+static int gnttab_setup(void)
 {
 	unsigned int max_nr_gframes;
 
-	gnttab_request_version();
 	max_nr_gframes = gnttab_max_grant_frames();
 	if (max_nr_gframes < nr_grant_frames)
 		return -ENOSYS;
@@ -1160,6 +1161,12 @@ int gnttab_resume(void)
 	return 0;
 }
 
+int gnttab_resume(void)
+{
+	gnttab_request_version();
+	return gnttab_setup();
+}
+
 int gnttab_suspend(void)
 {
 	gnttab_interface->unmap_frames();
@@ -1171,9 +1178,10 @@ static int gnttab_expand(unsigned int req_entries)
 	int rc;
 	unsigned int cur, extra;
 
+	BUG_ON(grefs_per_grant_frame == 0);
 	cur = nr_grant_frames;
-	extra = ((req_entries + (GREFS_PER_GRANT_FRAME-1)) /
-		 GREFS_PER_GRANT_FRAME);
+	extra = ((req_entries + (grefs_per_grant_frame-1)) /
+		 grefs_per_grant_frame);
 	if (cur + extra > gnttab_max_grant_frames())
 		return -ENOSPC;
 
@@ -1191,21 +1199,23 @@ int gnttab_init(void)
 	unsigned int nr_init_grefs;
 	int ret;
 
+	gnttab_request_version();
 	nr_grant_frames = 1;
 	boot_max_nr_grant_frames = __max_nr_grant_frames();
 
 	/* Determine the maximum number of frames required for the
 	 * grant reference free list on the current hypervisor.
 	 */
+	BUG_ON(grefs_per_grant_frame == 0);
 	max_nr_glist_frames = (boot_max_nr_grant_frames *
-			       GREFS_PER_GRANT_FRAME / RPP);
+			       grefs_per_grant_frame / RPP);
 
 	gnttab_list = kmalloc(max_nr_glist_frames * sizeof(grant_ref_t *),
 			      GFP_KERNEL);
 	if (gnttab_list == NULL)
 		return -ENOMEM;
 
-	nr_glist_frames = (nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+	nr_glist_frames = (nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
 	for (i = 0; i < nr_glist_frames; i++) {
 		gnttab_list[i] = (grant_ref_t *)__get_free_page(GFP_KERNEL);
 		if (gnttab_list[i] == NULL) {
@@ -1214,12 +1224,12 @@ int gnttab_init(void)
 		}
 	}
 
-	if (gnttab_resume() < 0) {
+	if (gnttab_setup() < 0) {
 		ret = -ENODEV;
 		goto ini_nomem;
 	}
 
-	nr_init_grefs = nr_grant_frames * GREFS_PER_GRANT_FRAME;
+	nr_init_grefs = nr_grant_frames * grefs_per_grant_frame;
 
 	for (i = NR_RESERVED_ENTRIES; i < nr_init_grefs - 1; i++)
 		gnttab_entry(i) = i + 1;
-- 
1.7.4.5

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v3] xen/grant-table: correctly initialize grant table version 1
  2013-01-15 13:21 ` Matt Wilson
  2013-01-15 13:24   ` Matt Wilson
@ 2013-01-15 13:24   ` Matt Wilson
  1 sibling, 0 replies; 29+ messages in thread
From: Matt Wilson @ 2013-01-15 13:24 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: stable, Ian Campbell, Annie Li, xen-devel, linux-kernel

On Tue, Jan 15, 2013 at 01:21:27PM +0000, Matt Wilson wrote:
> Commit 85ff6acb075a484780b3d763fdf41596d8fc0970 (xen/granttable: Grant
> tables V2 implementation) changed the GREFS_PER_GRANT_FRAME macro from
> a constant to a conditional expression. The expression depends on
> grant_table_version being appropriately set. Unfortunately, at init
> time grant_table_version will be 0. The GREFS_PER_GRANT_FRAME
> conditional expression checks for "grant_table_version == 1", and
> therefore returns the number of grant references per frame for v2.
> 
> This causes gnttab_init() to allocate fewer pages for gnttab_list, as
> a frame can old half the number of v2 entries than v1 entries. After
> gnttab_resume() is called, grant_table_version is appropriately
> set. nr_init_grefs will then be miscalculated and gnttab_free_count
> will hold a value larger than the actual number of free gref entries.
> 
> If a guest is heavily utilizing improperly initialized v1 grant
> tables, memory corruption can occur. One common manifestation is
> corruption of the vmalloc list, resulting in a poisoned pointer
> derefrence when accessing /proc/meminfo or /proc/vmallocinfo:
> 
> [   40.770064] BUG: unable to handle kernel paging request at 0000200200001407
> [   40.770083] IP: [<ffffffff811a6fb0>] get_vmalloc_info+0x70/0x110
> [   40.770102] PGD 0
> [   40.770107] Oops: 0000 [#1] SMP
> [   40.770114] CPU 10
> 
> This patch introduces a static variable, grefs_per_grant_frame, to
> cache the calculated value. gnttab_init() now calls
> gnttab_request_version() early so that grant_table_version and
> grefs_per_grant_frame can be appropriately set. A few BUG_ON()s have
> been added to prevent this type of bug from reoccurring in the future.
> 
> Signed-off-by: Matt Wilson <msw@amazon.com>
> Reviewed-and-Tested-by: Steven Noonan <snoonan@amazon.com>
> Signed-off-by: Ian Campbell <Ian.Campbell@citrix.com>

Correction, that should have been "Acked-by: Ian Campbell
<Ian.Campbell@citrix.com>". My apologies.

Matt

> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Annie Li <annie.li@oracle.com>
> Cc: xen-devel@lists.xen.org
> Cc: linux-kernel@vger.kernel.org
> Cc: stable@vger.kernel.org # v3.3 and newer

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3] xen/grant-table: correctly initialize grant table version 1
  2013-01-15 13:21 ` Matt Wilson
@ 2013-01-15 13:24   ` Matt Wilson
  2013-01-15 13:24   ` Matt Wilson
  1 sibling, 0 replies; 29+ messages in thread
From: Matt Wilson @ 2013-01-15 13:24 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: linux-kernel, Annie Li, Ian Campbell, stable, xen-devel

On Tue, Jan 15, 2013 at 01:21:27PM +0000, Matt Wilson wrote:
> Commit 85ff6acb075a484780b3d763fdf41596d8fc0970 (xen/granttable: Grant
> tables V2 implementation) changed the GREFS_PER_GRANT_FRAME macro from
> a constant to a conditional expression. The expression depends on
> grant_table_version being appropriately set. Unfortunately, at init
> time grant_table_version will be 0. The GREFS_PER_GRANT_FRAME
> conditional expression checks for "grant_table_version == 1", and
> therefore returns the number of grant references per frame for v2.
> 
> This causes gnttab_init() to allocate fewer pages for gnttab_list, as
> a frame can old half the number of v2 entries than v1 entries. After
> gnttab_resume() is called, grant_table_version is appropriately
> set. nr_init_grefs will then be miscalculated and gnttab_free_count
> will hold a value larger than the actual number of free gref entries.
> 
> If a guest is heavily utilizing improperly initialized v1 grant
> tables, memory corruption can occur. One common manifestation is
> corruption of the vmalloc list, resulting in a poisoned pointer
> derefrence when accessing /proc/meminfo or /proc/vmallocinfo:
> 
> [   40.770064] BUG: unable to handle kernel paging request at 0000200200001407
> [   40.770083] IP: [<ffffffff811a6fb0>] get_vmalloc_info+0x70/0x110
> [   40.770102] PGD 0
> [   40.770107] Oops: 0000 [#1] SMP
> [   40.770114] CPU 10
> 
> This patch introduces a static variable, grefs_per_grant_frame, to
> cache the calculated value. gnttab_init() now calls
> gnttab_request_version() early so that grant_table_version and
> grefs_per_grant_frame can be appropriately set. A few BUG_ON()s have
> been added to prevent this type of bug from reoccurring in the future.
> 
> Signed-off-by: Matt Wilson <msw@amazon.com>
> Reviewed-and-Tested-by: Steven Noonan <snoonan@amazon.com>
> Signed-off-by: Ian Campbell <Ian.Campbell@citrix.com>

Correction, that should have been "Acked-by: Ian Campbell
<Ian.Campbell@citrix.com>". My apologies.

Matt

> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Annie Li <annie.li@oracle.com>
> Cc: xen-devel@lists.xen.org
> Cc: linux-kernel@vger.kernel.org
> Cc: stable@vger.kernel.org # v3.3 and newer

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2013-01-15 13:34 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-12-19 20:20 [PATCH] xen/grant-table: correctly initialize grant table version 1 Matt Wilson
2012-12-20  3:42 ` ANNIE LI
2012-12-20 21:24   ` Matt Wilson
2012-12-20 10:01 ` Ian Campbell
2012-12-20 21:19   ` Matt Wilson
     [not found] <Message-ID: <1355997710.26722.12.camel@zakaz.uk.xensource.com>
2013-01-06 11:14 ` [PATCH v2] " Matt Wilson
2013-01-06 11:14 ` Matt Wilson
2013-01-09  2:40   ` ANNIE LI
2013-01-09  2:40   ` [Xen-devel] " ANNIE LI
2013-01-09 12:02     ` Ian Campbell
2013-01-09 15:02       ` annie li
2013-01-09 15:02       ` annie li
2013-01-10  9:16       ` Matt Wilson
2013-01-10  9:16       ` [Xen-devel] " Matt Wilson
2013-01-10 10:32         ` Ian Campbell
2013-01-10 10:32         ` Ian Campbell
2013-01-10 11:02         ` ANNIE LI
2013-01-10 11:02         ` [Xen-devel] " ANNIE LI
2013-01-09 12:02     ` Ian Campbell
2013-01-11 21:35   ` Konrad Rzeszutek Wilk
2013-01-11 21:35   ` Konrad Rzeszutek Wilk
2013-01-14  9:29     ` Matt Wilson
2013-01-14 15:31       ` Konrad Rzeszutek Wilk
2013-01-14 15:31       ` Konrad Rzeszutek Wilk
2013-01-14  9:29     ` Matt Wilson
     [not found] <Message-ID: <20130114153157.GF1156@phenom.dumpdata.com>
2013-01-15 13:21 ` [PATCH v3] " Matt Wilson
2013-01-15 13:21 ` Matt Wilson
2013-01-15 13:24   ` Matt Wilson
2013-01-15 13:24   ` Matt Wilson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.