kvm.vger.kernel.org archive mirror
* Can't boot guest with more than 3585MB when using large pages
@ 2009-03-24 21:06 Alex Williamson
  2009-03-24 21:57 ` Ryan Harper
  0 siblings, 1 reply; 7+ messages in thread
From: Alex Williamson @ 2009-03-24 21:06 UTC
  To: kvm-devel


On a 2.6.29, x86_64 host/guest, what's special about specifying a guest
size of -m 3586 when using -mem-path backed by hugetlbfs?  3585 works,
3586 hangs here:

...
PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
Placing 64MB software IO TLB between ffff880020000000 - ffff880024000000
software IO TLB at phys 0x20000000 - 0x24000000
Memory: 3504832k/4196352k available (2926k kernel code, 524740k absent, 166780k reserved, 1260k data, 496k init)

I can back -mem-path by tmpfs or disk and it works fine.  Also works
with no -mem-path, but it would obviously be nice to benefit from large
pages on big guests.  The system has plenty of huge pages to back the
request, and booting with -mem-prealloc makes no difference.  Tested on
latest git as of today.  Thanks,

Alex



* Re: Can't boot guest with more than 3585MB when using large pages
  2009-03-24 21:06 Can't boot guest with more than 3585MB when using large pages Alex Williamson
@ 2009-03-24 21:57 ` Ryan Harper
  2009-03-25 16:10   ` Marcelo Tosatti
  2009-04-03 23:28   ` Marcelo Tosatti
  0 siblings, 2 replies; 7+ messages in thread
From: Ryan Harper @ 2009-03-24 21:57 UTC
  To: Alex Williamson; +Cc: kvm-devel

* Alex Williamson <alex.williamson@hp.com> [2009-03-24 16:07]:
> 
> On a 2.6.29, x86_64 host/guest, what's special about specifying a guest
> size of -m 3586 when using -mem-path backed by hugetlbfs?  3585 works,
> 3586 hangs here:
> 
> ...
> PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
> Placing 64MB software IO TLB between ffff880020000000 - ffff880024000000
> software IO TLB at phys 0x20000000 - 0x24000000
> Memory: 3504832k/4196352k available (2926k kernel code, 524740k absent, 166780k reserved, 1260k data, 496k init)
> 
> I can back -mem-path by tmpfs or disk and it works fine.  Also works
> with no -mem-path, but it would obviously be nice to benefit from large
> pages on big guests.  The system has plenty of huge pages to back the
> request, and booting with -mem-prealloc makes no difference.  Tested on
> latest git as of today.  Thanks,

I've seen this as well, haven't had a chance to dig into the issue yet
either.  Certainly can test patches if anyone has an idea of what's
wrong here.

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
ryanh@us.ibm.com


* Re: Can't boot guest with more than 3585MB when using large pages
  2009-03-24 21:57 ` Ryan Harper
@ 2009-03-25 16:10   ` Marcelo Tosatti
  2009-03-25 16:26     ` Alex Williamson
  2009-04-03 23:28   ` Marcelo Tosatti
  1 sibling, 1 reply; 7+ messages in thread
From: Marcelo Tosatti @ 2009-03-25 16:10 UTC
  To: Ryan Harper; +Cc: Alex Williamson, kvm-devel

On Tue, Mar 24, 2009 at 04:57:46PM -0500, Ryan Harper wrote:
> * Alex Williamson <alex.williamson@hp.com> [2009-03-24 16:07]:
> > 
> > On a 2.6.29, x86_64 host/guest, what's special about specifying a guest
> > size of -m 3586 when using -mem-path backed by hugetlbfs?  3585 works,
> > 3586 hangs here:
> > 
> > ...
> > PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
> > Placing 64MB software IO TLB between ffff880020000000 - ffff880024000000
> > software IO TLB at phys 0x20000000 - 0x24000000
> > Memory: 3504832k/4196352k available (2926k kernel code, 524740k absent, 166780k reserved, 1260k data, 496k init)
> > 
> > I can back -mem-path by tmpfs or disk and it works fine.  Also works
> > with no -mem-path, but it would obviously be nice to benefit from large
> > pages on big guests.  The system has plenty of huge pages to back the
> > request, and booting with -mem-prealloc makes no difference.  Tested on
> > latest git as of today.  Thanks,
> 
> I've seen this as well, haven't had a chance to dig into the issue yet
> either.  Certainly can test patches if anyone has an idea of what's
> wrong here.

Can you strace and see if the mmap on hugetlbfs is correctly sized?



* Re: Can't boot guest with more than 3585MB when using large pages
  2009-03-25 16:10   ` Marcelo Tosatti
@ 2009-03-25 16:26     ` Alex Williamson
  0 siblings, 0 replies; 7+ messages in thread
From: Alex Williamson @ 2009-03-25 16:26 UTC
  To: Marcelo Tosatti; +Cc: Ryan Harper, kvm-devel

On Wed, 2009-03-25 at 13:10 -0300, Marcelo Tosatti wrote:
> On Tue, Mar 24, 2009 at 04:57:46PM -0500, Ryan Harper wrote:
> > * Alex Williamson <alex.williamson@hp.com> [2009-03-24 16:07]:
> > > 
> > > On a 2.6.29, x86_64 host/guest, what's special about specifying a guest
> > > size of -m 3586 when using -mem-path backed by hugetlbfs?  3585 works,
> > > 3586 hangs here:
> > > 
> > > ...
> > > PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
> > > Placing 64MB software IO TLB between ffff880020000000 - ffff880024000000
> > > software IO TLB at phys 0x20000000 - 0x24000000
> > > Memory: 3504832k/4196352k available (2926k kernel code, 524740k absent, 166780k reserved, 1260k data, 496k init)
> > 
> > I've seen this as well, haven't had a chance to dig into the issue yet
> > either.  Certainly can test patches if anyone has an idea of what's
> > wrong here.
> 
> Can you strace and see if the mmap on hugetlbfs is correctly sized?

The sizes look reasonable, allowing for rounding up to the 2MB huge page size.

Failing case, -m 3586:

open("/hugepages//kvm.5fuuH5", O_RDWR|O_CREAT|O_EXCL, 0600) = 9
unlink("/hugepages//kvm.5fuuH5")        = 0
ftruncate(9, 3783262208)                = 0
mmap(NULL, 3783262208, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_POPULATE, 9, 0) = 0x7f37a5e00000

Working case, -m 3585:

open("/hugepages//kvm.Mv6Zgd", O_RDWR|O_CREAT|O_EXCL, 0600) = 9
unlink("/hugepages//kvm.Mv6Zgd")        = 0
ftruncate(9, 3781165056)                = 0
mmap(NULL, 3781165056, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_POPULATE, 9, 0) = 0x7fd44b800000

Working case using disk backing: -mem-path /tmp -mem-prealloc -m 3586:

open("/tmp/kvm.nPlxl1", O_RDWR|O_CREAT|O_EXCL, 0600) = 9
unlink("/tmp/kvm.nPlxl1")               = 0
ftruncate(9, 3783262208)                = 0
mmap(NULL, 3783262208, PROT_READ|PROT_WRITE, MAP_PRIVATE, 9, 0) = 0x7f432e055000
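The "2MB rounding" check on the traced sizes can be sketched in a few lines of C. This is only an illustration, not code from qemu: the 2 MiB huge page size is an assumption taken from the remark above, and HPAGE_SIZE, round_up_hpage() and hpages_needed() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed huge page size: 2 MiB, per the "2MB rounding" remark above. */
#define HPAGE_SIZE (UINT64_C(2) << 20)

/* Round size up to the next multiple of hpagesize.
 * hpagesize must be a non-zero power of two. */
static uint64_t round_up_hpage(uint64_t size, uint64_t hpagesize)
{
    assert(hpagesize != 0 && (hpagesize & (hpagesize - 1)) == 0);
    return (size + hpagesize - 1) & ~(hpagesize - 1);
}

/* Number of huge pages needed to back size bytes. */
static uint64_t hpages_needed(uint64_t size, uint64_t hpagesize)
{
    return round_up_hpage(size, hpagesize) / hpagesize;
}
```

Fed the sizes from the hugetlbfs traces, round_up_hpage() returns both values unchanged: 3783262208 bytes is exactly 1804 pages of 2 MiB and 3781165056 bytes is exactly 1803, so the failing and working cases differ by a single huge page.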




* Re: Can't boot guest with more than 3585MB when using large pages
  2009-03-24 21:57 ` Ryan Harper
  2009-03-25 16:10   ` Marcelo Tosatti
@ 2009-04-03 23:28   ` Marcelo Tosatti
  2009-04-04 17:56     ` Alex Williamson
  1 sibling, 1 reply; 7+ messages in thread
From: Marcelo Tosatti @ 2009-04-03 23:28 UTC
  To: Ryan Harper; +Cc: Alex Williamson, kvm-devel

On Tue, Mar 24, 2009 at 04:57:46PM -0500, Ryan Harper wrote:
> * Alex Williamson <alex.williamson@hp.com> [2009-03-24 16:07]:
> > 
> > On a 2.6.29, x86_64 host/guest, what's special about specifying a guest
> > size of -m 3586 when using -mem-path backed by hugetlbfs?  3585 works,
> > 3586 hangs here:
> > 
> > ...
> > PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
> > Placing 64MB software IO TLB between ffff880020000000 - ffff880024000000
> > software IO TLB at phys 0x20000000 - 0x24000000
> > Memory: 3504832k/4196352k available (2926k kernel code, 524740k absent, 166780k reserved, 1260k data, 496k init)
> > 
> > I can back -mem-path by tmpfs or disk and it works fine.  Also works
> > with no -mem-path, but it would obviously be nice to benefit from large
> > pages on big guests.  The system has plenty of huge pages to back the
> > request, and booting with -mem-prealloc makes no difference.  Tested on
> > latest git as of today.  Thanks,
> 
> I've seen this as well, haven't had a chance to dig into the issue yet
> either.  Certainly can test patches if anyone has an idea of what's
> wrong here.

Can you please try the following patch?

------

qemu: kvm: fixup 4GB+ memslot large page alignment

Need to align the 4GB+ memslot after we know its address, not before.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

diff --git a/qemu/hw/pc.c b/qemu/hw/pc.c
index d4a4320..cc84772 100644
--- a/qemu/hw/pc.c
+++ b/qemu/hw/pc.c
@@ -866,6 +866,7 @@ static void pc_init1(ram_addr_t ram_size, int vga_ram_size,
 
     /* above 4giga memory allocation */
     if (above_4g_mem_size > 0) {
+        ram_addr = qemu_ram_alloc(above_4g_mem_size);
         if (hpagesize) {
             if (ram_addr & (hpagesize-1)) {
                 unsigned long aligned_addr;
@@ -874,7 +875,6 @@ static void pc_init1(ram_addr_t ram_size, int vga_ram_size,
                 ram_addr = aligned_addr;
             }
         }
-        ram_addr = qemu_ram_alloc(above_4g_mem_size);
         cpu_register_physical_memory(0x100000000ULL,
                                      above_4g_mem_size,
                                      ram_addr);
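
Why the ordering matters can be shown with a toy model. This is a sketch under stated assumptions, not qemu code: toy_ram_alloc() is a hypothetical bump allocator standing in for qemu_ram_alloc(), and the padding allocation inside the alignment branch is an assumption, since the lines between the two hunks are not shown in the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bump allocator standing in for qemu_ram_alloc():
 * returns the current offset and advances it by size. */
static uint64_t toy_next_offset;

static uint64_t toy_ram_alloc(uint64_t size)
{
    uint64_t addr = toy_next_offset;
    toy_next_offset += size;
    return addr;
}

/* Round addr up to a multiple of hpagesize (a non-zero power of two). */
static uint64_t align_up(uint64_t addr, uint64_t hpagesize)
{
    assert(hpagesize != 0 && (hpagesize & (hpagesize - 1)) == 0);
    return (addr + hpagesize - 1) & ~(hpagesize - 1);
}

/* Pre-patch order: align a stale ram_addr, then overwrite it with the
 * address of the fresh allocation, so the alignment work is lost. */
static uint64_t align_then_alloc(uint64_t stale_ram_addr, uint64_t size,
                                 uint64_t hpagesize)
{
    uint64_t ram_addr = stale_ram_addr;

    if (ram_addr & (hpagesize - 1)) {
        uint64_t aligned_addr = align_up(ram_addr, hpagesize);
        toy_ram_alloc(aligned_addr - ram_addr); /* assumed padding step */
        ram_addr = aligned_addr;
    }
    ram_addr = toy_ram_alloc(size); /* discards the aligned value */
    return ram_addr;
}

/* Patched order: allocate first, then align the address actually returned. */
static uint64_t alloc_then_align(uint64_t size, uint64_t hpagesize)
{
    uint64_t ram_addr = toy_ram_alloc(size);

    if (ram_addr & (hpagesize - 1)) {
        uint64_t aligned_addr = align_up(ram_addr, hpagesize);
        toy_ram_alloc(aligned_addr - ram_addr); /* assumed padding step */
        ram_addr = aligned_addr;
    }
    return ram_addr;
}
```

With the toy allocator positioned at an offset that is not a multiple of the huge page size, align_then_alloc() hands back that unaligned offset, while alloc_then_align() returns the next huge page boundary and, given the assumed padding step, leaves the allocator pointing at the end of the aligned region.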


* Re: Can't boot guest with more than 3585MB when using large pages
  2009-04-03 23:28   ` Marcelo Tosatti
@ 2009-04-04 17:56     ` Alex Williamson
  2009-04-05 11:53       ` Avi Kivity
  0 siblings, 1 reply; 7+ messages in thread
From: Alex Williamson @ 2009-04-04 17:56 UTC
  To: Marcelo Tosatti; +Cc: Ryan Harper, kvm-devel

On Fri, 2009-04-03 at 20:28 -0300, Marcelo Tosatti wrote:
> 
> Can you please try the following

Thanks Marcelo, this seems to fix it.  I tested up to a 30G guest with
large pages.

Alex

> ------
> 
> qemu: kvm: fixup 4GB+ memslot large page alignment
> 
> Need to align the 4GB+ memslot after we know its address, not before.
> 
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Tested-by: Alex Williamson <alex.williamson@hp.com>

> diff --git a/qemu/hw/pc.c b/qemu/hw/pc.c
> index d4a4320..cc84772 100644
> --- a/qemu/hw/pc.c
> +++ b/qemu/hw/pc.c
> @@ -866,6 +866,7 @@ static void pc_init1(ram_addr_t ram_size, int vga_ram_size,
>  
>      /* above 4giga memory allocation */
>      if (above_4g_mem_size > 0) {
> +        ram_addr = qemu_ram_alloc(above_4g_mem_size);
>          if (hpagesize) {
>              if (ram_addr & (hpagesize-1)) {
>                  unsigned long aligned_addr;
> @@ -874,7 +875,6 @@ static void pc_init1(ram_addr_t ram_size, int vga_ram_size,
>                  ram_addr = aligned_addr;
>              }
>          }
> -        ram_addr = qemu_ram_alloc(above_4g_mem_size);
>          cpu_register_physical_memory(0x100000000ULL,
>                                       above_4g_mem_size,
>                                       ram_addr);
> 




* Re: Can't boot guest with more than 3585MB when using large pages
  2009-04-04 17:56     ` Alex Williamson
@ 2009-04-05 11:53       ` Avi Kivity
  0 siblings, 0 replies; 7+ messages in thread
From: Avi Kivity @ 2009-04-05 11:53 UTC
  To: Alex Williamson; +Cc: Marcelo Tosatti, Ryan Harper, kvm-devel

Alex Williamson wrote:
> On Fri, 2009-04-03 at 20:28 -0300, Marcelo Tosatti wrote:
>   
>> Can you please try the following
>>     
>
> Thanks Marcelo, this seems to fix it.  I tested up to a 30G guest with
> large pages.
>   

I've applied the patch, thanks.  I keep thinking we need to do
additional rounding when we allocate the file.

-- 
error compiling committee.c: too many arguments to function


