From: Cam Macdonell <cam@cs.ualberta.ca>
To: Isaku Yamahata <yamahata@valinux.co.jp>
Cc: Avi Kivity <avi@redhat.com>,
	seabios@seabios.org,
	"qemu-devel@nongnu.org Developers" <qemu-devel@nongnu.org>,
	"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [Qemu-devel] Re: Unusual physical address when using 64-bit BAR
Date: Tue, 24 Aug 2010 10:52:36 -0600
Message-ID: <AANLkTinw-3cErM-ZAfXK_-S4F3oEkn49HJgwAu5=y_YJ@mail.gmail.com>
In-Reply-To: <20100721034918.GA6285@valinux.co.jp>

On Tue, Jul 20, 2010 at 9:49 PM, Isaku Yamahata <yamahata@valinux.co.jp> wrote:
> Added Cc: seabios@seabios.org
>
> On Wed, Jul 21, 2010 at 06:31:01AM +0300, Michael S. Tsirkin wrote:
>> On Tue, Jul 20, 2010 at 06:52:23PM +0900, Isaku Yamahata wrote:
>> > On Wed, Jul 14, 2010 at 09:10:28AM -0600, Cam Macdonell wrote:
>> > > On Tue, Jul 13, 2010 at 8:52 PM, Isaku Yamahata <yamahata@valinux.co.jp> wrote:
>> > > > On Tue, Jul 13, 2010 at 04:48:19PM -0600, Cam Macdonell wrote:
>> > > >> On Tue, Jul 13, 2010 at 2:41 PM, Isaku Yamahata <yamahata@valinux.co.jp> wrote:
>> > > >> > On Tue, Jul 13, 2010 at 02:05:51PM -0600, Cam Macdonell wrote:
>> > > >> >> >> > Seabios completely ignores the 64-bitness of the BAR.  Looks like it also
>> > > >> >> >> > thinks the second half of the BAR is an I/O region instead of memory (hence
>> > > >> >> >> > the c200, that's part of the pci portio region).
>> > > >> >> >
>> > > >> >> > I've sent the patches to address it. But they haven't been merged yet.
>> > > >> >> > seabios doesn't map BARs beyond 4GB.
>> > > >> >> > If bar is mapped beyond 4GB, guest BIOS does it.
>> > > >> >>
>> > > >> >> Have those patches been merged yet?
>> > > >> >
>> > > >> > They have been merged into seabios upstream now.
>> > > >> > qemu seabios fork hasn't pulled for a while, though.
>> > > >> >
>> > > >> >
>> > > >> >> > To see how seabios works, it would help to increase CONFIG_DEBUG_LEVEL
>> > > >> >> > in config.h of seabios
>> > > >> >>
>> > > >> >> Where does the output from seabios end up?  Inside dmesg?
>> > > >> >
>> > > >> > It outputs them to the serial console which qemu emulates.
>> > > >> > seabios is out of kernel control, so dmesg doesn't show it.
>> > > >> >
>> > > >> >
>> > > >> >> >> pci_read_config: (val) 0x0 <- 0x1c (addr)
>> > > >> >> >> pci_write_config: (val) 0x0 -> 0x1c (addr)
>> > > >> >> >> pci_read_config: (val) 0xffffffff <- 0x1c (addr)
>> > > >> >> >> pci_write_config: (val) 0x0 -> 0x1c (addr)
>> > > >> >> >> pci_read_config: (val) 0x0 <- 0x1c (addr)
>> > > >> >> >> pci_write_config: (val) 0x0 -> 0x1c (addr)
>> > > >> >> >
>> > > >> >> > seabios BAR3. Not sure how it is mapped from this
>> > > >> >> > message.
>> > > >> >>
>> > > >> >> Isn't the BAR3 from the fact that a 64-bit BAR would use both BAR2 and
>> > > >> >> BAR3 to store all 64-bits?
>> > > >> >
>> > > >> > Yes. Seabios misbehaves. 64bit bar is(was) a missing feature.
>> > > >> > --
>> > > >> > yamahata
>> > > >> >
>> > > >> >
>> > > >>
>> > > >> With the latest seabios git passed via -bios, I no longer see the
>> > > >> 48-bit address, but instead a 32-bit address and then
>> > > >> ffffffff00000000.  This guest has 1GB of RAM so the address isn't being
>> > > >> mapped beyond 4GB.
>> > > >
>> > > > Can I see the debug log like before?
>> > > > (hopefully seabios with CONFIG_DEBUG_LEVEL enabled.)
>> > >
>> > > Here's the dump from SeaBIOS in the region related to the PCI devices.
>> > >  The SeaBIOS output is identical whether the BAR is 32-bit or 64-bit.
>> > >
>> > > PCI: bus=0 devfn=0x10: vendor_id=0x1013 device_id=0x00b8
>> > > region 0: 0xf0000000
>> > > region 1: 0xf2000000
>> > > region 6: 0xf2010000
>> > > PCI: bus=0 devfn=0x18: vendor_id=0x1af4 device_id=0x1000
>> > > region 0: 0x0000c020
>> > > region 1: 0xf2020000
>> > > region 6: 0xf2030000
>> > > PCI: bus=0 devfn=0x20: vendor_id=0x1af4 device_id=0x1110
>> > > region 0: 0xf2040000
>> > > region 1: 0xf2041000
>> > > region 2: 0x00000000
>> >
>> > Is this region (region 2 of devfn=0x20: vendor_id=0x1af4 device_id=0x1110)
>> > the BAR in question?
>> > The value 0 seems odd. Probably BAR address calculation overflowed.
>> > Currently seabios doesn't check overflow. I attached the patch.
>> >
>> >
>> > > > Do you know who sets the BAR to ffffffff00000000?
>> > >
>> > > Here are the config reads/writes related to the 0x18/1c, the 'IVSHMEM'
>> > > lines are from the map function passed to pci_register_bar().  It
>> > > looks like SeaBIOS sets the address to 0 and then the potentially
>> > > useful e0000000 address gets mangled into ffffffff00000000.
>> >
>> > There is something wrong with the debug message in the write case, I suppose.
>> > All written values are 0, but the resulting effect doesn't seem so.
>> >
>> > >
>> > > IVSHMEM: guest pci addr = 0, guest h/w addr = 1090912256, size = 536870912
>> > >
>> > > ...snip...
>> > >
>> > > pci_read_config: (val) 0x4 <- 0x18 (addr)
>> > > pci_write_config: (val) 0x0 -> 0x18 (addr)
>> > > IVSHMEM: guest pci addr = e0000000, guest h/w addr = 1090912256, size = 20000000
>> >
>> > If 0 is written to 0x18, the bar address should be 0, but it says e0000000.
>> >
>> > > pci_read_config: (val) 0xe0000004 <- 0x18 (addr)
>> >
>> > The read value isn't 0. and so on...
>> >
>> > > pci_write_config: (val) 0x0 -> 0x18 (addr)
>> > > pci_read_config: (val) 0x0 <- 0x1c (addr)
>> > > pci_write_config: (val) 0x0 -> 0x1c (addr)
>> > > IVSHMEM: guest pci addr = ffffffff00000000, guest h/w addr = 1090912256, size = 20000000
>> > > pci_read_config: (val) 0xffffffff <- 0x1c (addr)
>> > > pci_write_config: (val) 0x0 -> 0x1c (addr)
>> > >
>> > > and with the 64-bit guest I get this error as well (recall the guest
>> > > fails to boot on 64-bit)
>> > >
>> > > BUG: kvm_dirty_pages_log_change: invalid parameters
>> > > 00000000f0000000-00000000f0ffffff
>> >
>> >
>> > diff --git a/src/pciinit.c b/src/pciinit.c
>> > index b110531..6eca2ce 100644
>> > --- a/src/pciinit.c
>> > +++ b/src/pciinit.c
>> > @@ -90,7 +90,8 @@ static int pci_bios_allocate_region(u16 bdf, int region_num)
>> >                   /* If pci_bios_prefmem_addr == 0, keep old behaviour */
>> >                   pci_bios_prefmem_addr != 0) {
>> >              paddr = &pci_bios_prefmem_addr;
>> > -            if (ALIGN(*paddr, size) + size >= BUILD_PCIPREFMEM_END) {
>> > +            if (ALIGN(*paddr, size) + size < *paddr ||
>> > +                ALIGN(*paddr, size) + size >= BUILD_PCIPREFMEM_END) {
>> >                  dprintf(1,
>> >                          "prefmem region of (bdf 0x%x bar %d) can't be mapped. "
>> >                          "decrease BUILD_PCIMEM_SIZE and recompile. size %x\n",
>> > @@ -99,7 +100,8 @@ static int pci_bios_allocate_region(u16 bdf, int region_num)
>> >              }
>> >          } else {
>> >              paddr = &pci_bios_mem_addr;
>> > -            if (ALIGN(*paddr, size) + size >= BUILD_PCIMEM_END) {
>> > +            if (ALIGN(*paddr, size) + size < *paddr ||
>> > +                ALIGN(*paddr, size) + size >= BUILD_PCIMEM_END) {
>> >                  dprintf(1,
>> >                          "mem region of (bdf 0x%x bar %d) can't be mapped. "
>> >                          "increase BUILD_PCIMEM_SIZE and recompile. size %x\n",
>>
>> Looking at the source, all of the values like pci_bios_prefmem_addr seem to be
>> 32 bit. Since in the spec prefetchable memory is up to 64 bit,
>> can't the math overflow, here and elsewhere?
>> Maybe we should switch to 64 bit values all over ...
>
> Makes sense. I'll create a patch to convert them into u64.
>
>>
>> > @@ -116,12 +118,8 @@ static int pci_bios_allocate_region(u16 bdf, int region_num)
>> >
>> >      int is_64bit = !(val & PCI_BASE_ADDRESS_SPACE_IO) &&
>> >          (val & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == PCI_BASE_ADDRESS_MEM_TYPE_64;
>> > -    if (is_64bit) {
>> > -        if (size > 0) {
>> > -            pci_config_writel(bdf, ofs + 4, 0);
>> > -        } else {
>> > -            pci_config_writel(bdf, ofs + 4, ~0);
>> > -        }
>> > +    if (is_64bit && size > 0) {
>> > +        pci_config_writel(bdf, ofs + 4, 0);
>> >      }
>> >      return is_64bit;
>> >  }
>>
>>
>> Was there any reason we wrote all-ones there on size 0?
>> BAR sizing?
>
> No reason. It's just left over from debugging.
> So I'd like to remove it.
>
> --
> yamahata
>
>

Hi, 64-bit BARs still do not seem to be working.

With the latest seabios the guest no longer hits a "BUG:"
statement, but booting still fails with a divide error in hpet_alloc:

HPET: 1 timers in total, 0 timers will be used for per-cpu timer
divide error: 0000 [#1] SMP
last sysfs file:
CPU 0
Modules linked in:

Pid: 1, comm: swapper Not tainted 2.6.35+ #299 /Bochs
RIP: 0010:[<ffffffff812a9b5b>]  [<ffffffff812a9b5b>] hpet_alloc+0x12c/0x35b
RSP: 0018:ffff88007d7b3d80  EFLAGS: 00010246
RAX: 00038d7ea4c68000 RBX: ffff88007d062cc0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff817bb9b0
RBP: ffff88007d7b3dc0 R08: 00000000000080d0 R09: ffffc90000000000
R10: ffff88007d72b5a0 R11: 0000000000000000 R12: ffff88007d7b3dd0
R13: ffffc90000000000 R14: 0000000000000000 R15: ffffffff817a41c3
FS:  0000000000000000(0000) GS:ffff880002000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000001a42000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff88007d7b2000, task ffff88007d7b8000)
Stack:
 ffff88007f43ab90 ffff88007f43ab90 ffffffff81ca6174 ffffffff81b1f5e1
<0> 0000000000000000 0000000000000100 0000000000000100 0000000000000000
<0> ffff88007d7b3e80 ffffffff810294ea 00000000fed00000 ffffc90000000000
Call Trace:
 [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
 [<ffffffff810294ea>] hpet_reserve_platform_timers+0x10b/0x115
 [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
 [<ffffffff81b1f64c>] hpet_late_init+0x6b/0xea
 [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
 [<ffffffff81002069>] do_one_initcall+0x5e/0x159
 [<ffffffff81b0d72a>] kernel_init+0x19a/0x228
 [<ffffffff8100aa24>] kernel_thread_helper+0x4/0x10
 [<ffffffff81b0d590>] ? kernel_init+0x0/0x228
 [<ffffffff8100aa20>] ? kernel_thread_helper+0x0/0x10
Code: 89 1d ca b2 b3 00 48 c1 ea 21 8b 73 34 49 c7 c7 c3 41 7a 81 48
8d 04 02 4c 89 f2 48 c7 c7 b0 b9 7b 81 48 c1 ea 20 48 89 d1 31 d2 <48>
f7 f1 83 7b 30 01 48 c7 c1 86 1c 7d 81 49 0f 46 cf 48 89 43
RIP  [<ffffffff812a9b5b>] hpet_alloc+0x12c/0x35b
 RSP <ffff88007d7b3d80>
---[ end trace a7919e7f17c0a725 ]---
Kernel panic - not syncing: Attempted to kill init!
Pid: 1, comm: swapper Tainted: G      D     2.6.35+ #299
Call Trace:
 [<ffffffff81459a85>] panic+0x8b/0x10b
 [<ffffffff81056a83>] ? exit_ptrace+0x38/0x121
 [<ffffffff8104f9e8>] do_exit+0x7a/0x722
 [<ffffffff8104c3bd>] ? spin_unlock_irqrestore+0xe/0x10
 [<ffffffff8104cfd6>] ? kmsg_dump+0x12b/0x145
 [<ffffffff8145ccc8>] oops_end+0xbf/0xc7
 [<ffffffff8100d299>] die+0x5a/0x63
 [<ffffffff8145c6d2>] do_trap+0x121/0x130
 [<ffffffff8100b560>] do_divide_error+0x96/0x9f
 [<ffffffff812a9b5b>] ? hpet_alloc+0x12c/0x35b
 [<ffffffff8120cf80>] ? radix_tree_preload+0x34/0x88
 [<ffffffff8100a83b>] divide_error+0x1b/0x20
 [<ffffffff812a9b5b>] ? hpet_alloc+0x12c/0x35b
 [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
 [<ffffffff810294ea>] hpet_reserve_platform_timers+0x10b/0x115
 [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
 [<ffffffff81b1f64c>] hpet_late_init+0x6b/0xea
 [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
 [<ffffffff81002069>] do_one_initcall+0x5e/0x159
 [<ffffffff81b0d72a>] kernel_init+0x19a/0x228
 [<ffffffff8100aa24>] kernel_thread_helper+0x4/0x10
 [<ffffffff81b0d590>] ? kernel_init+0x0/0x228
 [<ffffffff8100aa20>] ? kernel_thread_helper+0x0/0x10
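
Reading the register dump: RAX holds 0x00038d7ea4c68000, which is
10^15, RCX is 0, and the marked bytes 48 f7 f1 decode to "div %rcx".
That looks like a femtoseconds-to-Hz conversion dividing by a clock
period of 0.  Below is a rough standalone sketch of that computation
with a guard, assuming the period is taken from bits 63:32 of the HPET
capabilities register as the HPET spec defines it.  The function name
hpet_freq_from_caps is hypothetical and this is not the kernel's code.

```c
#include <stdbool.h>
#include <stdint.h>

#define FEMTOSECONDS_PER_SECOND 1000000000000000ULL  /* 10^15 */

/* Hypothetical helper: convert the period field of an HPET capabilities
 * value (bits 63:32, in femtoseconds) into a frequency in Hz.
 * Returns false instead of dividing when the period reads as 0. */
static bool hpet_freq_from_caps(uint64_t caps, uint64_t *hz)
{
    uint64_t period_fs = caps >> 32;

    if (period_fs == 0)
        return false;  /* a 0 period would trigger a divide error */

    /* Add half the divisor first so the result is rounded to nearest. */
    *hz = (FEMTOSECONDS_PER_SECOND + (period_fs >> 1)) / period_fs;
    return true;
}
```

With a period of 10000000 fs the helper reports 100000000 Hz; with a
capabilities value of 0 it returns false rather than faulting.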

SeaBIOS output for the device:

PCI: bus=0 devfn=0x20: vendor_id=0x1af4 device_id=0x1110
region 0: 0xf1020000
region 2: 0x00000000
init smm
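
As a sanity check on the overflow theory from earlier in the thread,
here is a rough standalone sketch (not SeaBIOS code) of how 32-bit
alignment arithmetic can produce a base of 0 for a 512MB BAR.  The
helper names align_up32 and alloc_region32, the next-free address
0xf2042000 and the window end are assumptions for illustration; only
the BAR size 0x20000000 comes from the logs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round addr up to a multiple of size (size must be a power of two).
 * The arithmetic is deliberately 32-bit so that it can wrap around. */
static uint32_t align_up32(uint32_t addr, uint32_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (addr + size - 1) & ~(size - 1);
}

/* Hypothetical allocator step: returns true when a BAR of `size` bytes
 * fits between next_free and window_end, storing the aligned base in
 * *base.  Returns false when the region does not fit or the math wraps. */
static bool alloc_region32(uint32_t next_free, uint32_t size,
                           uint32_t window_end, uint32_t *base)
{
    uint32_t start = align_up32(next_free, size);
    uint32_t end = start + size;

    /* Wraparound test in the spirit of the patch: an end address below
     * the starting point means the 32-bit arithmetic overflowed. */
    if (end < next_free || end >= window_end)
        return false;

    *base = start;
    return true;
}
```

With next_free = 0xf2042000 and size = 0x20000000, align_up32() wraps
to 0 and alloc_region32() rejects the request because of the
"end < next_free" test.  Without that test the region would be
accepted at base 0, which is what "region 2: 0x00000000" looks like.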

With the latest seabios, the debug output shows the BAR being remapped
only twice, once with a potentially correct address of e0000000 and
once with ffffffff00000000:

pci_read_config: (val) 0xe0000004 <- 0x18 (addr)

...snip...

pci_default_write_config: (val) 0x0 -> 0x18 (addr)
IVSHMEM: guest pci addr = e0000000, guest h/w addr = 2164588544, size = 20000000
pci_read_config: (val) 0xe0000004 <- 0x18 (addr)
pci_default_write_config: (val) 0x0 -> 0x18 (addr)
pci_read_config: (val) 0x0 <- 0x1c (addr)
pci_default_write_config: (val) 0x0 -> 0x1c (addr)
IVSHMEM: guest pci addr = ffffffff00000000, guest h/w addr = 2164588544, size = 20000000
pci_read_config: (val) 0xffffffff <- 0x1c (addr)
pci_default_write_config: (val) 0x0 -> 0x1c (addr)
pci_read_config: (val) 0x0 <- 0x20 (addr)
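
For reading the log above: the dwords at 0x18 and 0x1c form one 64-bit
BAR, so all-ones in the upper dword on its own is enough to give an
address of ffffffff00000000.  A minimal sketch of the decoding,
assuming the standard PCI memory BAR layout (the low four bits of the
lower dword are flags, and a type field of 0x4 marks a 64-bit BAR).
The macro and function names are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define BAR_SPACE_IO       0x1u  /* bit 0: 1 = I/O space, 0 = memory */
#define BAR_MEM_TYPE_MASK  0x6u  /* bits 2:1: memory type */
#define BAR_MEM_TYPE_64    0x4u  /* type value for a 64-bit BAR */
#define BAR_MEM_FLAG_MASK  0xfu  /* bits 3:0 carry flags, not address */

/* True when the lower dword describes a 64-bit memory BAR. */
static bool bar_is_64bit(uint32_t lo)
{
    return !(lo & BAR_SPACE_IO) &&
           (lo & BAR_MEM_TYPE_MASK) == BAR_MEM_TYPE_64;
}

/* Combine the lower dword (e.g. offset 0x18) and the upper dword
 * (e.g. offset 0x1c) into the 64-bit base address, dropping the flags. */
static uint64_t bar64_address(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | (lo & ~BAR_MEM_FLAG_MASK);
}
```

bar64_address(0xe0000004, 0x00000000) gives 0xe0000000, while a lower
dword with no address bits set combined with 0xffffffff in the upper
dword gives 0xffffffff00000000, the two addresses in the IVSHMEM lines.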

The PCI writes are all still logged as 0, though I can't see how my
debug statements are incorrect.  Below is my trivial PCI config
debugging patch.

diff --git a/hw/pci.c b/hw/pci.c
index 70dbace..01087b1 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -1159,6 +1159,8 @@ static uint32_t pci_read_config(PCIDevice *d,

     len = MIN(len, pci_config_size(d) - address);
     memcpy(&val, d->config + address, len);
+    if (strncmp(d->name, "ivshmem", 7) == 0)
+        printf("pci_read_config: (val) 0x%x <- 0x%x (addr)\n", val, address);
     return le32_to_cpu(val);
 }

@@ -1219,6 +1221,8 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val, int l)
         d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
     }

+    if (strncmp(d->name, "ivshmem", 7) == 0)
+        printf("pci_write_config: (val) 0x%x -> 0x%x (addr)\n", val, addr);
 #ifdef CONFIG_KVM_DEVICE_ASSIGNMENT
     if (kvm_enabled() && kvm_irqchip_in_kernel() &&
         addr >= PIIX_CONFIG_IRQ_ROUTE &&

Cam
