* [Qemu-devel] host physical address width issues/questions for x86_64
@ 2017-10-13 16:17 Prasad Singamsetty
  2017-10-13 17:01 ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 16+ messages in thread
From: Prasad Singamsetty @ 2017-10-13 16:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, rth, ehabkost, Sunit Jain

Hi,

I am new to this list. I have some questions on this subject
and would appreciate clarification from the experts here.
I ran into a couple of issues when testing a large configuration
( >= 1TB memory, > 255 CPUs) for an x86_64 guest machine.

1. QEMU uses a default value of 40 (TCG_PHYS_ADDR_BITS) for the address
    width if the user has not specified the phys-bits or
    host-phys-bits=true property. The default is not sufficient
    (40 bits cover only 1TB of physical address space) and causes the
    guest kernel to crash if the guest is configured with >= 1TB of
    memory. Depending on the Linux kernel version in the guest, the
    panic occurred in different code paths. The workaround is for the
    user to specify the phys-bits property or set the property
    host-phys-bits=true.

    QUESTIONS:
    1) Could we change the default value to be the same as the host physical
       address width for x86_64 machines?  Are there any side effects of this?
    2) I would like to add a check that fails to boot the guest if phys-bits
       is not sufficient for the specified maxmem or if it is more than
       the host phys-bits value. Do you have any objections if I
       add a patch for this?
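
For illustration, the arithmetic behind the 1TB limit can be sketched as
below. This is a minimal sketch and not QEMU code; the helper names
addressable_bytes and min_phys_bits are hypothetical.

```python
# Illustrative arithmetic only; these helpers are hypothetical, not QEMU code.

TIB = 1 << 40


def addressable_bytes(phys_bits: int) -> int:
    """Size of the physical address space covered by phys_bits address bits."""
    return 1 << phys_bits


def min_phys_bits(highest_addr_exclusive: int) -> int:
    """Smallest address width that can address every byte below the given limit."""
    return max(1, (highest_addr_exclusive - 1).bit_length())


# The 40-bit default covers exactly 1 TiB of address space.
print(addressable_bytes(40) // TIB)
# A layout that ends exactly at 1 TiB still fits in 40 bits ...
print(min_phys_bits(1 * TIB))
# ... but anything mapped above 1 TiB needs at least 41 bits.
print(min_phys_bits(1 * TIB + 1))
```

Since guest RAM is usually not the only thing mapped into the guest
physical address space, a guest with 1TB of RAM can be expected to have
addresses above the 1 TiB mark, which is why 40 bits falls short here.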

2. host_address_width in DMAR table structure

    In this case, the default value is set to 39
    (VTD_HOST_ADDRESS_WIDTH - 1). With interrupt remapping
    enabled for the Intel IOMMU and the guest configured
    with > 255 CPUs and >= 1TB memory, the guest kernel hangs
    during boot. This needs to be fixed.

    QUESTION:
    The question here again is whether we can fix this to use the
    real address width from the host as the default.

Please let me know if you have any suggestions for fixing these
two problem cases to support large-config guests. Also, please
let me know if there are any other known limitations in the current
implementation.

Thanks.
--Prasad


* Re: [Qemu-devel] host physical address width issues/questions for x86_64
  2017-10-13 16:17 [Qemu-devel] host physical address width issues/questions for x86_64 Prasad Singamsetty
@ 2017-10-13 17:01 ` Dr. David Alan Gilbert
  2017-10-13 17:14   ` Alex Williamson
  2017-10-16 16:59   ` Prasad Singamsetty
  0 siblings, 2 replies; 16+ messages in thread
From: Dr. David Alan Gilbert @ 2017-10-13 17:01 UTC (permalink / raw)
  To: Prasad Singamsetty
  Cc: qemu-devel, pbonzini, alex.williamson, Sunit Jain, ehabkost, rth

* Prasad Singamsetty (prasad.singamsetty@oracle.com) wrote:
> Hi,
> 
> I am new to the alias. I have some questions on this subject
> and seek some clarifications from the experts in the team.
> I ran into a couple of issues when I tried with large configuration
> ( >= 1TB memory, > 255 CPUs) for x86_64 guest machine.
> 
> 1. QEMU uses the default value of 40 (TCG_PHYS_ADDR_BITS) for address
>    width if user has not specified phys-bits or host-phys-bits=true
>    property. The default value is obviously not sufficient and
>    causing guest kernel to crash if configured with >= 1TB
>    memory. Depending on the linux kernel version in the guest the
>    panic was in different code paths. The workaround is for the
>    user to specify the phys-bits property or set the property
>    host-phys-bits=true.
> 
>    QUESTIONS:
>    1) Could we change the default value to same as the host physcial
>       address for x86_64 machines?  Are there any side effects on this?

That's what we do in the RH downstream packages.

If you did that you wouldn't want to break existing machine-types,
so you'd have to tie it to a new machine type.

There's some fun with MTRRs, which have bits set based on the address
size, so you have to be careful if you migrate between hosts with
different physical address sizes, e.g. between a non-Xeon (or I think
a Xeon-E3) and the bigger boxes.  See fcc35e7 and commits around that;
tbh I can't remember the details.

>    2) Adding a check to fail to boot the guest if phys-bits is not
>       sufficient for the specified maxmem or if it is more than
>       the host phys bits value. Do you have any objections if I
>       add a patch for this?

It's a little more complicated, but good in principle.  You need
to take account of the address space allocated for hotplug
and, I think, the PCI address space;  I can't remember if we
ever figured out a good way of finding that out.
I think it might also depend on whether you're on SeaBIOS or OVMF
and what their defaults are for things like where PCI
gets allocated.
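
As a rough sketch of what such a check might look like once the extra
regions are taken into account (this is not QEMU's implementation; the
function names, parameters, and the back-to-back layout are assumptions
made purely for illustration):

```python
# Hypothetical sketch of the proposed sanity check; not QEMU's implementation.
# All region sizes are inputs the caller must supply. How to obtain the PCI
# window reliably is exactly the open question mentioned above.
from typing import List


def required_phys_bits(ram_top: int, hotplug_size: int, pci64_size: int) -> int:
    """Address bits needed to cover RAM, the hotplug area and a 64-bit PCI
    window, assuming (for illustration only) they are laid out back to back."""
    top = ram_top + hotplug_size + pci64_size
    return max(1, (top - 1).bit_length())


def check_phys_bits(phys_bits: int, host_phys_bits: int, needed_bits: int) -> List[str]:
    """Return a list of problems; an empty list means the config looks sane."""
    problems = []
    if phys_bits < needed_bits:
        problems.append(
            f"phys-bits={phys_bits} is too small; layout needs {needed_bits}")
    if phys_bits > host_phys_bits:
        problems.append(
            f"phys-bits={phys_bits} exceeds host width {host_phys_bits}")
    return problems


GIB = 1 << 30
# Made-up example layout: RAM ends at 1025 GiB, no hotplug area, 32 GiB PCI window.
needed = required_phys_bits(ram_top=1025 * GIB, hotplug_size=0, pci64_size=32 * GIB)
print(needed, check_phys_bits(40, 46, needed))
```

The numbers in the usage lines are invented; the real difficulty is
obtaining trustworthy values for the hotplug and PCI regions, which may
differ between firmware implementations.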

> 2. host_address_width in DMAR table structure
> 
>    In this case, the default value is set to 39
>    (VTD_HOST_ADDRESS_WIDTH - 1). With interrupt remapping
>    enabled for the intel iommu and the guest is configured
>    with > 255 cpus and >= 1TB memory, the guest kernel hangs
>    during boot up. This need to be fixed.
> 
>    QUESTION:
>    The question here again is can we fix this to use the
>    real address width from the host as the default?

I don't know DMAR stuff; chatting to Alex (cc'd) it does sound
like that's an omission that should be fixed.

> Please let me know if you have some suggestions in fixing these
> two problem cases for supporting large config guests. Also, please
> let me know if there are any other known limitations in the current
> implementation.

Dave

> 
> Thanks.
> --Prasad
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] host physical address width issues/questions for x86_64
  2017-10-13 17:01 ` Dr. David Alan Gilbert
@ 2017-10-13 17:14   ` Alex Williamson
  2017-10-15  3:53     ` Peter Xu
  2017-10-16 17:11     ` Prasad Singamsetty
  2017-10-16 16:59   ` Prasad Singamsetty
  1 sibling, 2 replies; 16+ messages in thread
From: Alex Williamson @ 2017-10-13 17:14 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Prasad Singamsetty, qemu-devel, pbonzini, Sunit Jain, ehabkost,
	rth, Peter Xu

On Fri, 13 Oct 2017 18:01:44 +0100
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Prasad Singamsetty (prasad.singamsetty@oracle.com) wrote:
> > Hi,
> > 
> > I am new to the alias. I have some questions on this subject
> > and seek some clarifications from the experts in the team.
> > I ran into a couple of issues when I tried with large configuration
> > ( >= 1TB memory, > 255 CPUs) for x86_64 guest machine.
> > 
> > 1. QEMU uses the default value of 40 (TCG_PHYS_ADDR_BITS) for address
> >    width if user has not specified phys-bits or host-phys-bits=true
> >    property. The default value is obviously not sufficient and
> >    causing guest kernel to crash if configured with >= 1TB
> >    memory. Depending on the linux kernel version in the guest the
> >    panic was in different code paths. The workaround is for the
> >    user to specify the phys-bits property or set the property
> >    host-phys-bits=true.
> > 
> >    QUESTIONS:
...
> > 2. host_address_width in DMAR table structure
> > 
> >    In this case, the default value is set to 39
> >    (VTD_HOST_ADDRESS_WIDTH - 1). With interrupt remapping
> >    enabled for the intel iommu and the guest is configured
> >    with > 255 cpus and >= 1TB memory, the guest kernel hangs
> >    during boot up. This need to be fixed.
> > 
> >    QUESTION:
> >    The question here again is can we fix this to use the
> >    real address width from the host as the default?  
> 
> I don't know DMAR stuff; chatting to Alex (cc'd) it does sound
> like that's an ommission that should be fixed.

[CC +Peter]

On physical hardware VT-d supports either 39 or 48 bit address widths
and generally you'd expect a sufficiently capable IOMMU to be matched
with the CPU.  Seems QEMU has only implemented a lower bit width and
it should probably be forcing phys bits of the VM to 39 to match until
the extended width can be implemented.  Thanks,

Alex
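
To make the two widths concrete, here is a small sketch relating an
address width to the space it covers and to the number of page-table
levels. It assumes 4 KiB pages (12 offset bits) and 9 index bits per
level; the helper names are hypothetical and this is not taken from the
QEMU sources.

```python
# Illustrative only: relates an IOMMU address width to the address space it
# covers and to the number of page-table levels, assuming 4 KiB pages
# (12 offset bits) and 9 index bits per level.

PAGE_SHIFT = 12
LEVEL_BITS = 9
GIB = 1 << 30


def page_table_levels(address_width: int) -> int:
    """Number of page-table levels needed for the given address width."""
    index_bits = address_width - PAGE_SHIFT
    if index_bits <= 0 or index_bits % LEVEL_BITS != 0:
        raise ValueError(f"unsupported address width: {address_width}")
    return index_bits // LEVEL_BITS


def fits(address: int, address_width: int) -> bool:
    """True if the address is representable within address_width bits."""
    return 0 <= address < (1 << address_width)


for width in (39, 48):
    print(width, page_table_levels(width), (1 << width) // GIB, "GiB")

# An address at the 1 TiB mark is out of reach for a 39-bit width.
print(fits(1 << 40, 39), fits(1 << 40, 48))
```

Under these assumptions a 39-bit width covers 512 GiB with 3 levels and a
48-bit width covers 256 TiB with 4 levels, so a guest with >= 1TB of
memory cannot be fully covered by the 39-bit setting.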


* Re: [Qemu-devel] host physical address width issues/questions for x86_64
  2017-10-13 17:14   ` Alex Williamson
@ 2017-10-15  3:53     ` Peter Xu
  2017-10-16 17:02       ` Prasad Singamsetty
  2017-10-16 17:11     ` Prasad Singamsetty
  1 sibling, 1 reply; 16+ messages in thread
From: Peter Xu @ 2017-10-15  3:53 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Dr. David Alan Gilbert, Prasad Singamsetty, qemu-devel, pbonzini,
	Sunit Jain, ehabkost, rth

On Fri, Oct 13, 2017 at 11:14:03AM -0600, Alex Williamson wrote:
> On Fri, 13 Oct 2017 18:01:44 +0100
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> 
> > * Prasad Singamsetty (prasad.singamsetty@oracle.com) wrote:
> > > Hi,
> > > 
> > > I am new to the alias. I have some questions on this subject
> > > and seek some clarifications from the experts in the team.
> > > I ran into a couple of issues when I tried with large configuration
> > > ( >= 1TB memory, > 255 CPUs) for x86_64 guest machine.
> > > 
> > > 1. QEMU uses the default value of 40 (TCG_PHYS_ADDR_BITS) for address
> > >    width if user has not specified phys-bits or host-phys-bits=true
> > >    property. The default value is obviously not sufficient and
> > >    causing guest kernel to crash if configured with >= 1TB
> > >    memory. Depending on the linux kernel version in the guest the
> > >    panic was in different code paths. The workaround is for the
> > >    user to specify the phys-bits property or set the property
> > >    host-phys-bits=true.
> > > 
> > >    QUESTIONS:
> ...
> > > 2. host_address_width in DMAR table structure
> > > 
> > >    In this case, the default value is set to 39
> > >    (VTD_HOST_ADDRESS_WIDTH - 1). With interrupt remapping
> > >    enabled for the intel iommu and the guest is configured
> > >    with > 255 cpus and >= 1TB memory, the guest kernel hangs
> > >    during boot up. This need to be fixed.
> > > 
> > >    QUESTION:
> > >    The question here again is can we fix this to use the
> > >    real address width from the host as the default?  
> > 
> > I don't know DMAR stuff; chatting to Alex (cc'd) it does sound
> > like that's an ommission that should be fixed.
> 
> [CC +Peter]
> 
> On physical hardware VT-d supports either 39 or 48 bit address widths
> and generally you'd expect a sufficiently capable IOMMU to be matched
> with the CPU.  Seems QEMU has only implemented a lower bit width and
> it should probably be forcing phys bits of the VM to 39 to match until
> the extended width can be implemented.  Thanks,
> 
> Alex

There were patches that tried to enable 48-bit GAW, but they were
not accepted for some reason:

  https://lists.gnu.org/archive/html/qemu-devel/2016-12/msg01886.html

Would this help in any way?

Thanks,

-- 
Peter Xu


* Re: [Qemu-devel] host physical address width issues/questions for x86_64
  2017-10-13 17:01 ` Dr. David Alan Gilbert
  2017-10-13 17:14   ` Alex Williamson
@ 2017-10-16 16:59   ` Prasad Singamsetty
  1 sibling, 0 replies; 16+ messages in thread
From: Prasad Singamsetty @ 2017-10-16 16:59 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: ehabkost, qemu-devel, alex.williamson, Sunit Jain, pbonzini, rth


On 10/13/2017 10:01 AM, Dr. David Alan Gilbert wrote:
> * Prasad Singamsetty (prasad.singamsetty@oracle.com) wrote:
>> Hi,
>>
>> I am new to the alias. I have some questions on this subject
>> and seek some clarifications from the experts in the team.
>> I ran into a couple of issues when I tried with large configuration
>> ( >= 1TB memory, > 255 CPUs) for x86_64 guest machine.
>>
>> 1. QEMU uses the default value of 40 (TCG_PHYS_ADDR_BITS) for address
>>     width if user has not specified phys-bits or host-phys-bits=true
>>     property. The default value is obviously not sufficient and
>>     causing guest kernel to crash if configured with >= 1TB
>>     memory. Depending on the linux kernel version in the guest the
>>     panic was in different code paths. The workaround is for the
>>     user to specify the phys-bits property or set the property
>>     host-phys-bits=true.
>>
>>     QUESTIONS:
>>     1) Could we change the default value to same as the host physcial
>>        address for x86_64 machines?  Are there any side effects on this?
> 
> That's what we do in the RH downstream packages.
> 
> If you did that you wouldn't want to break existing machine-types,
> so you'd have to tie it to a new machine type.

OK.

> There's some fun with MTRRs that have bits set based on the address
> size, and if you migrate between hosts with different physical address sizes; e.g. between
> a non-Xeon (or I think a Xeon-E3) and the bigger boxes you have
> to be careful.  See fcc35e7 and commits around that;  tbh I can't
> remember the details.

Right. The problem with migration between hosts is still there.

> 
>>     2) Adding a check to fail to boot the guest if phys-bits is not
>>        sufficient for the specified maxmem or if it is more than
>>        the host phys bits value. Do you have any objections if I
>>        add a patch for this?
> 
> It's a little more complicated, but good in principal.  You need
> to take account of the allocated address space for hotplug
> and I think the PCI address space;  I can't remember if we
> ever figured out a good way of finding that out.
> I think it might also depend if you're on SeaBIOS or OVMF
> about what they're defaults are for things like where PCI
> gets allocated.

Thanks for the suggestions. I will check with OVMF also.

> 
>> 2. host_address_width in DMAR table structure
>>
>>     In this case, the default value is set to 39
>>     (VTD_HOST_ADDRESS_WIDTH - 1). With interrupt remapping
>>     enabled for the intel iommu and the guest is configured
>>     with > 255 cpus and >= 1TB memory, the guest kernel hangs
>>     during boot up. This need to be fixed.
>>
>>     QUESTION:
>>     The question here again is can we fix this to use the
>>     real address width from the host as the default?
> 
> I don't know DMAR stuff; chatting to Alex (cc'd) it does sound
> like that's an ommission that should be fixed.

Thanks,
--Prasad

> 
>> Please let me know if you have some suggestions in fixing these
>> two problem cases for supporting large config guests. Also, please
>> let me know if there are any other known limitations in the current
>> implementation.
> 
> Dave
> 
>>
>> Thanks.
>> --Prasad
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 


* Re: [Qemu-devel] host physical address width issues/questions for x86_64
  2017-10-15  3:53     ` Peter Xu
@ 2017-10-16 17:02       ` Prasad Singamsetty
  2017-10-17  3:56         ` Peter Xu
  0 siblings, 1 reply; 16+ messages in thread
From: Prasad Singamsetty @ 2017-10-16 17:02 UTC (permalink / raw)
  To: Peter Xu
  Cc: Alex Williamson, Dr. David Alan Gilbert, qemu-devel, pbonzini,
	Sunit Jain, ehabkost, rth



On 10/14/2017 8:53 PM, Peter Xu wrote:
> On Fri, Oct 13, 2017 at 11:14:03AM -0600, Alex Williamson wrote:
>> On Fri, 13 Oct 2017 18:01:44 +0100
>> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
>>
>>> * Prasad Singamsetty (prasad.singamsetty@oracle.com) wrote:
>>>> Hi,
>>>>
>>>> I am new to the alias. I have some questions on this subject
>>>> and seek some clarifications from the experts in the team.
>>>> I ran into a couple of issues when I tried with large configuration
>>>> ( >= 1TB memory, > 255 CPUs) for x86_64 guest machine.
>>>>
>>>> 1. QEMU uses the default value of 40 (TCG_PHYS_ADDR_BITS) for address
>>>>     width if user has not specified phys-bits or host-phys-bits=true
>>>>     property. The default value is obviously not sufficient and
>>>>     causing guest kernel to crash if configured with >= 1TB
>>>>     memory. Depending on the linux kernel version in the guest the
>>>>     panic was in different code paths. The workaround is for the
>>>>     user to specify the phys-bits property or set the property
>>>>     host-phys-bits=true.
>>>>
>>>>     QUESTIONS:
>> ...
>>>> 2. host_address_width in DMAR table structure
>>>>
>>>>     In this case, the default value is set to 39
>>>>     (VTD_HOST_ADDRESS_WIDTH - 1). With interrupt remapping
>>>>     enabled for the intel iommu and the guest is configured
>>>>     with > 255 cpus and >= 1TB memory, the guest kernel hangs
>>>>     during boot up. This need to be fixed.
>>>>
>>>>     QUESTION:
>>>>     The question here again is can we fix this to use the
>>>>     real address width from the host as the default?
>>>
>>> I don't know DMAR stuff; chatting to Alex (cc'd) it does sound
>>> like that's an ommission that should be fixed.
>>
>> [CC +Peter]
>>
>> On physical hardware VT-d supports either 39 or 48 bit address widths
>> and generally you'd expect a sufficiently capable IOMMU to be matched
>> with the CPU.  Seems QEMU has only implemented a lower bit width and
>> it should probably be forcing phys bits of the VM to 39 to match until
>> the extended width can be implemented.  Thanks,
>>
>> Alex
> 
> There were patches that tried to enable 48 bits GAW but it was
> not accepted somehow:
> 
>    https://lists.gnu.org/archive/html/qemu-devel/2016-12/msg01886.html
> 
> Would this help in any way?
> 

Thanks Alex for the patch info. Just curious why the patch was not
accepted. Anyway, I will try it.

Thanks.
--Prasad

> Thanks,
> 


* Re: [Qemu-devel] host physical address width issues/questions for x86_64
  2017-10-13 17:14   ` Alex Williamson
  2017-10-15  3:53     ` Peter Xu
@ 2017-10-16 17:11     ` Prasad Singamsetty
  1 sibling, 0 replies; 16+ messages in thread
From: Prasad Singamsetty @ 2017-10-16 17:11 UTC (permalink / raw)
  To: Alex Williamson, Dr. David Alan Gilbert
  Cc: qemu-devel, pbonzini, Sunit Jain, ehabkost, rth, Peter Xu



On 10/13/2017 10:14 AM, Alex Williamson wrote:
> On Fri, 13 Oct 2017 18:01:44 +0100
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> 
>> * Prasad Singamsetty (prasad.singamsetty@oracle.com) wrote:
>>> Hi,
>>>
>>> I am new to the alias. I have some questions on this subject
>>> and seek some clarifications from the experts in the team.
>>> I ran into a couple of issues when I tried with large configuration
>>> ( >= 1TB memory, > 255 CPUs) for x86_64 guest machine.
>>>
>>> 1. QEMU uses the default value of 40 (TCG_PHYS_ADDR_BITS) for address
>>>     width if user has not specified phys-bits or host-phys-bits=true
>>>     property. The default value is obviously not sufficient and
>>>     causing guest kernel to crash if configured with >= 1TB
>>>     memory. Depending on the linux kernel version in the guest the
>>>     panic was in different code paths. The workaround is for the
>>>     user to specify the phys-bits property or set the property
>>>     host-phys-bits=true.
>>>
>>>     QUESTIONS:
> ...
>>> 2. host_address_width in DMAR table structure
>>>
>>>     In this case, the default value is set to 39
>>>     (VTD_HOST_ADDRESS_WIDTH - 1). With interrupt remapping
>>>     enabled for the intel iommu and the guest is configured
>>>     with > 255 cpus and >= 1TB memory, the guest kernel hangs
>>>     during boot up. This need to be fixed.
>>>
>>>     QUESTION:
>>>     The question here again is can we fix this to use the
>>>     real address width from the host as the default?
>>
>> I don't know DMAR stuff; chatting to Alex (cc'd) it does sound
>> like that's an ommission that should be fixed.
> 
> [CC +Peter]
> 
> On physical hardware VT-d supports either 39 or 48 bit address widths
> and generally you'd expect a sufficiently capable IOMMU to be matched
> with the CPU.  Seems QEMU has only implemented a lower bit width and
> it should probably be forcing phys bits of the VM to 39 to match until
> the extended width can be implemented.  Thanks,

Thanks Alex. Are there specific incompatibilities between the emulated
IOMMU features presented by QEMU and the CPU that can cause problems?

Thanks.
--Prasad

> 
> Alex
> 


* Re: [Qemu-devel] host physical address width issues/questions for x86_64
  2017-10-16 17:02       ` Prasad Singamsetty
@ 2017-10-17  3:56         ` Peter Xu
  2017-10-18  5:59           ` Fam Zheng
  2017-10-18 17:19           ` Prasad Singamsetty
  0 siblings, 2 replies; 16+ messages in thread
From: Peter Xu @ 2017-10-17  3:56 UTC (permalink / raw)
  To: Prasad Singamsetty
  Cc: Alex Williamson, Dr. David Alan Gilbert, qemu-devel, pbonzini,
	Sunit Jain, ehabkost, rth

On Mon, Oct 16, 2017 at 10:02:25AM -0700, Prasad Singamsetty wrote:
> 
> 
> On 10/14/2017 8:53 PM, Peter Xu wrote:
> >On Fri, Oct 13, 2017 at 11:14:03AM -0600, Alex Williamson wrote:
> >>On Fri, 13 Oct 2017 18:01:44 +0100
> >>"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> >>
> >>>* Prasad Singamsetty (prasad.singamsetty@oracle.com) wrote:
> >>>>Hi,
> >>>>
> >>>>I am new to the alias. I have some questions on this subject
> >>>>and seek some clarifications from the experts in the team.
> >>>>I ran into a couple of issues when I tried with large configuration
> >>>>( >= 1TB memory, > 255 CPUs) for x86_64 guest machine.
> >>>>
> >>>>1. QEMU uses the default value of 40 (TCG_PHYS_ADDR_BITS) for address
> >>>>    width if user has not specified phys-bits or host-phys-bits=true
> >>>>    property. The default value is obviously not sufficient and
> >>>>    causing guest kernel to crash if configured with >= 1TB
> >>>>    memory. Depending on the linux kernel version in the guest the
> >>>>    panic was in different code paths. The workaround is for the
> >>>>    user to specify the phys-bits property or set the property
> >>>>    host-phys-bits=true.
> >>>>
> >>>>    QUESTIONS:
> >>...
> >>>>2. host_address_width in DMAR table structure
> >>>>
> >>>>    In this case, the default value is set to 39
> >>>>    (VTD_HOST_ADDRESS_WIDTH - 1). With interrupt remapping
> >>>>    enabled for the intel iommu and the guest is configured
> >>>>    with > 255 cpus and >= 1TB memory, the guest kernel hangs
> >>>>    during boot up. This need to be fixed.
> >>>>
> >>>>    QUESTION:
> >>>>    The question here again is can we fix this to use the
> >>>>    real address width from the host as the default?
> >>>
> >>>I don't know DMAR stuff; chatting to Alex (cc'd) it does sound
> >>>like that's an ommission that should be fixed.
> >>
> >>[CC +Peter]
> >>
> >>On physical hardware VT-d supports either 39 or 48 bit address widths
> >>and generally you'd expect a sufficiently capable IOMMU to be matched
> >>with the CPU.  Seems QEMU has only implemented a lower bit width and
> >>it should probably be forcing phys bits of the VM to 39 to match until
> >>the extended width can be implemented.  Thanks,
> >>
> >>Alex
> >
> >There were patches that tried to enable 48 bits GAW but it was
> >not accepted somehow:
> >
> >   https://lists.gnu.org/archive/html/qemu-devel/2016-12/msg01886.html
> >
> >Would this help in any way?
> >
> 
> Thanks Alex for the patch info. Just curious why the patch was not
> accepted. Any way, I will try it.

I'm not sure I know the reason.  Anyway, it originated from one of
Fam's requests for some NVMe tests.  If it can really help for your use
case as well, please feel free to revive those patches, or let me know
so that I can respin.  Thanks,

-- 
Peter Xu


* Re: [Qemu-devel] host physical address width issues/questions for x86_64
  2017-10-17  3:56         ` Peter Xu
@ 2017-10-18  5:59           ` Fam Zheng
  2017-10-18 17:19           ` Prasad Singamsetty
  1 sibling, 0 replies; 16+ messages in thread
From: Fam Zheng @ 2017-10-18  5:59 UTC (permalink / raw)
  To: Peter Xu
  Cc: Prasad Singamsetty, ehabkost, Dr. David Alan Gilbert, qemu-devel,
	Alex Williamson, Sunit Jain, pbonzini, rth

On Tue, 10/17 11:56, Peter Xu wrote:
> I'm not sure I know the reason.  Anyway, it originated from one of
> Fam's requests for some NVMe tests.

FWIW, I was basically trying to test driver code under development with a
48-bit vIOMMU. I ended up using real hardware and also made the code
compatible with IOMMUs that have narrower address widths.

Fam


* Re: [Qemu-devel] host physical address width issues/questions for x86_64
  2017-10-17  3:56         ` Peter Xu
  2017-10-18  5:59           ` Fam Zheng
@ 2017-10-18 17:19           ` Prasad Singamsetty
  2017-10-19  3:33             ` Peter Xu
  1 sibling, 1 reply; 16+ messages in thread
From: Prasad Singamsetty @ 2017-10-18 17:19 UTC (permalink / raw)
  To: Peter Xu
  Cc: Alex Williamson, Dr. David Alan Gilbert, qemu-devel, pbonzini,
	Sunit Jain, ehabkost, rth



On 10/16/2017 8:56 PM, Peter Xu wrote:
> On Mon, Oct 16, 2017 at 10:02:25AM -0700, Prasad Singamsetty wrote:
>>
>>
>> On 10/14/2017 8:53 PM, Peter Xu wrote:
>>> On Fri, Oct 13, 2017 at 11:14:03AM -0600, Alex Williamson wrote:
>>>> On Fri, 13 Oct 2017 18:01:44 +0100
>>>> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
>>>>
>>>>> * Prasad Singamsetty (prasad.singamsetty@oracle.com) wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I am new to the alias. I have some questions on this subject
>>>>>> and seek some clarifications from the experts in the team.
>>>>>> I ran into a couple of issues when I tried with large configuration
>>>>>> ( >= 1TB memory, > 255 CPUs) for x86_64 guest machine.
>>>>>>
>>>>>> 1. QEMU uses the default value of 40 (TCG_PHYS_ADDR_BITS) for address
>>>>>>     width if user has not specified phys-bits or host-phys-bits=true
>>>>>>     property. The default value is obviously not sufficient and
>>>>>>     causing guest kernel to crash if configured with >= 1TB
>>>>>>     memory. Depending on the linux kernel version in the guest the
>>>>>>     panic was in different code paths. The workaround is for the
>>>>>>     user to specify the phys-bits property or set the property
>>>>>>     host-phys-bits=true.
>>>>>>
>>>>>>     QUESTIONS:
>>>> ...
>>>>>> 2. host_address_width in DMAR table structure
>>>>>>
>>>>>>     In this case, the default value is set to 39
>>>>>>     (VTD_HOST_ADDRESS_WIDTH - 1). With interrupt remapping
>>>>>>     enabled for the intel iommu and the guest is configured
>>>>>>     with > 255 cpus and >= 1TB memory, the guest kernel hangs
>>>>>>     during boot up. This need to be fixed.
>>>>>>
>>>>>>     QUESTION:
>>>>>>     The question here again is can we fix this to use the
>>>>>>     real address width from the host as the default?
>>>>>
>>>>> I don't know DMAR stuff; chatting to Alex (cc'd) it does sound
>>>>> like that's an ommission that should be fixed.
>>>>
>>>> [CC +Peter]
>>>>
>>>> On physical hardware VT-d supports either 39 or 48 bit address widths
>>>> and generally you'd expect a sufficiently capable IOMMU to be matched
>>>> with the CPU.  Seems QEMU has only implemented a lower bit width and
>>>> it should probably be forcing phys bits of the VM to 39 to match until
>>>> the extended width can be implemented.  Thanks,
>>>>
>>>> Alex
>>>
>>> There were patches that tried to enable 48 bits GAW but it was
>>> not accepted somehow:
>>>
>>>    https://lists.gnu.org/archive/html/qemu-devel/2016-12/msg01886.html
>>>
>>> Would this help in any way?
>>>
>>
>> Thanks Alex for the patch info. Just curious why the patch was not
>> accepted. Any way, I will try it.
> 
> I don't sure I know the reason.  Anyway, it originated from one of
> Fam's request for some NVMe tests.  If it can really help for your use
> case as well, please feel free to revive those patches, or let me know
> so that I can respin.  Thanks,
> 

Thanks Peter. I will start with your patch and see if I can get
it to work first.

A quick question. Looking at the code, it doesn't look like there
is a way to disable DMA remapping. A user may be interested only in
interrupt remapping (for > 255 CPUs) and not in DMA remapping. Has
that scenario been considered before?

Thanks.
--Prasad


* Re: [Qemu-devel] host physical address width issues/questions for x86_64
  2017-10-18 17:19           ` Prasad Singamsetty
@ 2017-10-19  3:33             ` Peter Xu
  2017-10-20 22:54               ` Prasad Singamsetty
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Xu @ 2017-10-19  3:33 UTC (permalink / raw)
  To: Prasad Singamsetty
  Cc: Alex Williamson, Dr. David Alan Gilbert, qemu-devel, pbonzini,
	Sunit Jain, ehabkost, rth

On Wed, Oct 18, 2017 at 10:19:31AM -0700, Prasad Singamsetty wrote:
> 
> 
> On 10/16/2017 8:56 PM, Peter Xu wrote:
> >On Mon, Oct 16, 2017 at 10:02:25AM -0700, Prasad Singamsetty wrote:
> >>
> >>
> >>On 10/14/2017 8:53 PM, Peter Xu wrote:
> >>>On Fri, Oct 13, 2017 at 11:14:03AM -0600, Alex Williamson wrote:
> >>>>On Fri, 13 Oct 2017 18:01:44 +0100
> >>>>"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> >>>>
> >>>>>* Prasad Singamsetty (prasad.singamsetty@oracle.com) wrote:
> >>>>>>Hi,
> >>>>>>
> >>>>>>I am new to the alias. I have some questions on this subject
> >>>>>>and seek some clarifications from the experts in the team.
> >>>>>>I ran into a couple of issues when I tried with large configuration
> >>>>>>( >= 1TB memory, > 255 CPUs) for x86_64 guest machine.
> >>>>>>
> >>>>>>1. QEMU uses the default value of 40 (TCG_PHYS_ADDR_BITS) for address
> >>>>>>    width if user has not specified phys-bits or host-phys-bits=true
> >>>>>>    property. The default value is obviously not sufficient and
> >>>>>>    causing guest kernel to crash if configured with >= 1TB
> >>>>>>    memory. Depending on the linux kernel version in the guest the
> >>>>>>    panic was in different code paths. The workaround is for the
> >>>>>>    user to specify the phys-bits property or set the property
> >>>>>>    host-phys-bits=true.
> >>>>>>
> >>>>>>    QUESTIONS:
> >>>>...
> >>>>>>2. host_address_width in DMAR table structure
> >>>>>>
> >>>>>>    In this case, the default value is set to 39
> >>>>>>    (VTD_HOST_ADDRESS_WIDTH - 1). With interrupt remapping
> >>>>>>    enabled for the intel iommu and the guest is configured
> >>>>>>    with > 255 cpus and >= 1TB memory, the guest kernel hangs
> >>>>>>    during boot up. This need to be fixed.
> >>>>>>
> >>>>>>    QUESTION:
> >>>>>>    The question here again is can we fix this to use the
> >>>>>>    real address width from the host as the default?
> >>>>>
> >>>>>I don't know DMAR stuff; chatting to Alex (cc'd) it does sound
> >>>>>like that's an ommission that should be fixed.
> >>>>
> >>>>[CC +Peter]
> >>>>
> >>>>On physical hardware VT-d supports either 39 or 48 bit address widths
> >>>>and generally you'd expect a sufficiently capable IOMMU to be matched
> >>>>with the CPU.  Seems QEMU has only implemented a lower bit width and
> >>>>it should probably be forcing phys bits of the VM to 39 to match until
> >>>>the extended width can be implemented.  Thanks,
> >>>>
> >>>>Alex
> >>>
> >>>There were patches that tried to enable 48 bits GAW but it was
> >>>not accepted somehow:
> >>>
> >>>   https://lists.gnu.org/archive/html/qemu-devel/2016-12/msg01886.html
> >>>
> >>>Would this help in any way?
> >>>
> >>
> >>Thanks Alex for the patch info. Just curious why the patch was not
> >>accepted. Any way, I will try it.
> >
> >I don't sure I know the reason.  Anyway, it originated from one of
> >Fam's request for some NVMe tests.  If it can really help for your use
> >case as well, please feel free to revive those patches, or let me know
> >so that I can respin.  Thanks,
> >
> 
> Thanks Peter. I will start with your patch and see if I can get
> it to work first.
> 
> A quick question. Looking at the code, it doesn't look like there
> is a way to disable dma remapping. User may have a case where he
> is interested only in interrupt remapping (for > 255 cpus) and
> not DMA remapping. Is that scenario considered before?

It can be done in the guest if the guest doesn't want DMAR.

Note that there are two independent kernel tunables for the VT-d device:

- intel_iommu: "on" to turn on DMAR, "off" to turn off DMAR
- intremap:    "on" to turn on IR, "off" to turn off IR

So even if the guest has "intel_iommu=off" in its boot parameters, IR
will still be on by default (or specify it explicitly using "intremap=on").

Thanks,

-- 
Peter Xu


* Re: [Qemu-devel] host physical address width issues/questions for x86_64
  2017-10-19  3:33             ` Peter Xu
@ 2017-10-20 22:54               ` Prasad Singamsetty
  2017-10-23  6:37                 ` Peter Xu
  0 siblings, 1 reply; 16+ messages in thread
From: Prasad Singamsetty @ 2017-10-20 22:54 UTC (permalink / raw)
  To: Peter Xu
  Cc: ehabkost, Dr. David Alan Gilbert, qemu-devel, Alex Williamson,
	Sunit Jain, pbonzini, rth



On 10/18/2017 8:33 PM, Peter Xu wrote:
> On Wed, Oct 18, 2017 at 10:19:31AM -0700, Prasad Singamsetty wrote:
>>
>>
>> On 10/16/2017 8:56 PM, Peter Xu wrote:
>>> On Mon, Oct 16, 2017 at 10:02:25AM -0700, Prasad Singamsetty wrote:
>>>>
>>>>
>>>> On 10/14/2017 8:53 PM, Peter Xu wrote:
>>>>> On Fri, Oct 13, 2017 at 11:14:03AM -0600, Alex Williamson wrote:
>>>>>> On Fri, 13 Oct 2017 18:01:44 +0100
>>>>>> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
>>>>>>
>>>>>>> * Prasad Singamsetty (prasad.singamsetty@oracle.com) wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am new to the alias. I have some questions on this subject
>>>>>>>> and seek some clarifications from the experts in the team.
>>>>>>>> I ran into a couple of issues when I tried with large configuration
>>>>>>>> ( >= 1TB memory, > 255 CPUs) for x86_64 guest machine.
>>>>>>>>
>>>>>>>> 1. QEMU uses the default value of 40 (TCG_PHYS_ADDR_BITS) for address
>>>>>>>>     width if user has not specified phys-bits or host-phys-bits=true
>>>>>>>>     property. The default value is obviously not sufficient and
>>>>>>>>     causing guest kernel to crash if configured with >= 1TB
>>>>>>>>     memory. Depending on the linux kernel version in the guest the
>>>>>>>>     panic was in different code paths. The workaround is for the
>>>>>>>>     user to specify the phys-bits property or set the property
>>>>>>>>     host-phys-bits=true.
>>>>>>>>
>>>>>>>>     QUESTIONS:
>>>>>> ...
>>>>>>>> 2. host_address_width in DMAR table structure
>>>>>>>>
>>>>>>>>     In this case, the default value is set to 39
>>>>>>>>     (VTD_HOST_ADDRESS_WIDTH - 1). With interrupt remapping
>>>>>>>>     enabled for the intel iommu and the guest is configured
>>>>>>>>     with > 255 cpus and >= 1TB memory, the guest kernel hangs
>>>>>>>>     during boot up. This need to be fixed.
>>>>>>>>
>>>>>>>>     QUESTION:
>>>>>>>>     The question here again is can we fix this to use the
>>>>>>>>     real address width from the host as the default?
>>>>>>>
>>>>>>> I don't know DMAR stuff; chatting to Alex (cc'd) it does sound
>>>>>>> like that's an ommission that should be fixed.
>>>>>>
>>>>>> [CC +Peter]
>>>>>>
>>>>>> On physical hardware VT-d supports either 39 or 48 bit address widths
>>>>>> and generally you'd expect a sufficiently capable IOMMU to be matched
>>>>>> with the CPU.  Seems QEMU has only implemented a lower bit width and
>>>>>> it should probably be forcing phys bits of the VM to 39 to match until
>>>>>> the extended width can be implemented.  Thanks,
>>>>>>
>>>>>> Alex
>>>>>
>>>>> There were patches that tried to enable 48 bits GAW but it was
>>>>> not accepted somehow:
>>>>>
>>>>>    https://lists.gnu.org/archive/html/qemu-devel/2016-12/msg01886.html
>>>>>
>>>>> Would this help in any way?
>>>>>
>>>>
>>>> Thanks Alex for the patch info. Just curious why the patch was not
>>>> accepted. Any way, I will try it.
>>>
>>> I don't sure I know the reason.  Anyway, it originated from one of
>>> Fam's request for some NVMe tests.  If it can really help for your use
>>> case as well, please feel free to revive those patches, or let me know
>>> so that I can respin.  Thanks,
>>>
>>
>> Thanks Peter. I will start with your patch and see if I can get
>> it to work first.
>>
>> A quick question. Looking at the code, it doesn't look like there
>> is a way to disable dma remapping. User may have a case where he
>> is interested only in interrupt remapping (for > 255 cpus) and
>> not DMA remapping. Is that scenario considered before?
> 
> It can be done in the guest if the guest doesn't want DMAR.
> 
> Note that there are two isolated kernel tunables for the VT-d device:
> 
> - intel_iommu: "on" to turn on DMAR, "off" to turn off DMAR
> - intremap:    "on" to turn on IR, "off" to turn off IR
> 
> So even if guest has "intel_iommu=off" in its boot parameter, IR will
> still be on by default (or specify it explicitly using "intremap=on").

Thanks Peter. I think I figured out the problem in my test case;
it is caused by VTD_HOST_ADDRESS_WIDTH.

Problem scenario:

The guest (machine type q35) is configured with 1TB of memory.
With interrupt remapping enabled, the guest kernel allocates the
interrupt remapping table, which can be anywhere in the available
physical memory. In my test case, the physical address of the
table is 0xfc3ec00000, and the vtd_interrupt_remap_table_setup()
function truncates it to 0x7c3ec00000 (bit 39 is dropped). As a
result the guest kernel gets invalid data later on and loops
forever in qi_submit_sync(), trying to check the fault status.
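
The truncation can be reproduced with plain arithmetic. This is only
a sketch in Python, not QEMU code: haw_mask() is a hypothetical
helper, and it assumes the emulation simply masks the table address
down to the host address width.

```python
def haw_mask(aw_bits: int) -> int:
    """Hypothetical helper: mask keeping the low aw_bits bits of an address."""
    return (1 << aw_bits) - 1


# Physical address of the interrupt remapping table from the test case.
irt_addr = 0xfc3ec00000

# Number of bits needed to represent the address.
bits_needed = irt_addr.bit_length()

# Address as seen through a 39-bit mask and through a 48-bit mask.
truncated_39 = irt_addr & haw_mask(39)
preserved_48 = irt_addr & haw_mask(48)

print(bits_needed)        # 40
print(hex(truncated_39))  # 0x7c3ec00000
print(hex(preserved_48))  # 0xfc3ec00000
```

The address needs 40 bits, so a 39-bit mask clears bit 39 and yields
the wrong table address, while a 48-bit mask leaves it intact.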

This is with the patch from Peter Xu applied. The patch is
incomplete: VTD_HAW_MASK is unchanged, so it is still defined
for 39 bits, and several other masks used to access the IOMMU
data structures are derived from it. So more changes are
needed to implement Peter's approach of providing an
x-aw-bits property.

Proposal:

We can simply change VTD_HOST_ADDRESS_WIDTH to 48 bits
without any other changes to the code. The current set of
features in the Intel IOMMU emulation code works for the q35
machine type, and the change does not have any other side effects.
Since the remapping tables are allocated by the guest kernel,
they are always within the phys-bits range, and as long as
the Intel IOMMU code in QEMU supports the same range it
works fine. For the current q35 machine type, all the
supported CPUs have a physical address width of <= 48 bits.
So for the short term, just changing VTD_HOST_ADDRESS_WIDTH
to 48 should work for q35. I tried this and it seems to
work fine.

For the long term, VTD_HOST_ADDRESS_WIDTH has to match the
host CPU address width. If necessary, we may need to define
a new machine type so that the VTD_HOST_ADDRESS_WIDTH value
matches the host CPU.

Please let me know if you have any comments or suggestions
on this.

Thanks.
--Prasad


> 
> Thanks,
> 


* Re: [Qemu-devel] host physical address width issues/questions for x86_64
  2017-10-20 22:54               ` Prasad Singamsetty
@ 2017-10-23  6:37                 ` Peter Xu
  2017-10-23 17:23                   ` Prasad Singamsetty
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Xu @ 2017-10-23  6:37 UTC (permalink / raw)
  To: Prasad Singamsetty
  Cc: ehabkost, Dr. David Alan Gilbert, qemu-devel, Alex Williamson,
	Sunit Jain, pbonzini, rth, Michael S. Tsirkin

On Fri, Oct 20, 2017 at 03:54:21PM -0700, Prasad Singamsetty wrote:
> 
> 
> On 10/18/2017 8:33 PM, Peter Xu wrote:
> >On Wed, Oct 18, 2017 at 10:19:31AM -0700, Prasad Singamsetty wrote:
> >>
> >>
> >>On 10/16/2017 8:56 PM, Peter Xu wrote:
> >>>On Mon, Oct 16, 2017 at 10:02:25AM -0700, Prasad Singamsetty wrote:
> >>>>
> >>>>
> >>>>On 10/14/2017 8:53 PM, Peter Xu wrote:
> >>>>>On Fri, Oct 13, 2017 at 11:14:03AM -0600, Alex Williamson wrote:
> >>>>>>On Fri, 13 Oct 2017 18:01:44 +0100
> >>>>>>"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> >>>>>>
> >>>>>>>* Prasad Singamsetty (prasad.singamsetty@oracle.com) wrote:
> >>>>>>>>Hi,
> >>>>>>>>
> >>>>>>>>I am new to the alias. I have some questions on this subject
> >>>>>>>>and seek some clarifications from the experts in the team.
> >>>>>>>>I ran into a couple of issues when I tried with large configuration
> >>>>>>>>( >= 1TB memory, > 255 CPUs) for x86_64 guest machine.
> >>>>>>>>
> >>>>>>>>1. QEMU uses the default value of 40 (TCG_PHYS_ADDR_BITS) for address
> >>>>>>>>    width if user has not specified phys-bits or host-phys-bits=true
> >>>>>>>>    property. The default value is obviously not sufficient and
> >>>>>>>>    causing guest kernel to crash if configured with >= 1TB
> >>>>>>>>    memory. Depending on the linux kernel version in the guest the
> >>>>>>>>    panic was in different code paths. The workaround is for the
> >>>>>>>>    user to specify the phys-bits property or set the property
> >>>>>>>>    host-phys-bits=true.
> >>>>>>>>
> >>>>>>>>    QUESTIONS:
> >>>>>>...
> >>>>>>>>2. host_address_width in DMAR table structure
> >>>>>>>>
> >>>>>>>>    In this case, the default value is set to 39
> >>>>>>>>    (VTD_HOST_ADDRESS_WIDTH - 1). With interrupt remapping
> >>>>>>>>    enabled for the intel iommu and the guest is configured
> >>>>>>>>    with > 255 cpus and >= 1TB memory, the guest kernel hangs
> >>>>>>>>    during boot up. This need to be fixed.
> >>>>>>>>
> >>>>>>>>    QUESTION:
> >>>>>>>>    The question here again is can we fix this to use the
> >>>>>>>>    real address width from the host as the default?
> >>>>>>>
> >>>>>>>I don't know DMAR stuff; chatting to Alex (cc'd) it does sound
> >>>>>>>like that's an ommission that should be fixed.
> >>>>>>
> >>>>>>[CC +Peter]
> >>>>>>
> >>>>>>On physical hardware VT-d supports either 39 or 48 bit address widths
> >>>>>>and generally you'd expect a sufficiently capable IOMMU to be matched
> >>>>>>with the CPU.  Seems QEMU has only implemented a lower bit width and
> >>>>>>it should probably be forcing phys bits of the VM to 39 to match until
> >>>>>>the extended width can be implemented.  Thanks,
> >>>>>>
> >>>>>>Alex
> >>>>>
> >>>>>There were patches that tried to enable 48 bits GAW but it was
> >>>>>not accepted somehow:
> >>>>>
> >>>>>   https://lists.gnu.org/archive/html/qemu-devel/2016-12/msg01886.html
> >>>>>
> >>>>>Would this help in any way?
> >>>>>
> >>>>
> >>>>Thanks Alex for the patch info. Just curious why the patch was not
> >>>>accepted. Any way, I will try it.
> >>>
> >>>I don't sure I know the reason.  Anyway, it originated from one of
> >>>Fam's request for some NVMe tests.  If it can really help for your use
> >>>case as well, please feel free to revive those patches, or let me know
> >>>so that I can respin.  Thanks,
> >>>
> >>
> >>Thanks Peter. I will start with your patch and see if I can get
> >>it to work first.
> >>
> >>A quick question. Looking at the code, it doesn't look like there
> >>is a way to disable dma remapping. User may have a case where he
> >>is interested only in interrupt remapping (for > 255 cpus) and
> >>not DMA remapping. Is that scenario considered before?
> >
> >It can be done in the guest if the guest doesn't want DMAR.
> >
> >Note that there are two isolated kernel tunables for the VT-d device:
> >
> >- intel_iommu: "on" to turn on DMAR, "off" to turn off DMAR
> >- intremap:    "on" to turn on IR, "off" to turn off IR
> >
> >So even if guest has "intel_iommu=off" in its boot parameter, IR will
> >still be on by default (or specify it explicitly using "intremap=on").
> 
> Thanks Peter. I think I figured out the problem in my test case
> due to VTD_HOST_ADDRESS_WIDTH.
> 
> Problem scenario:
> 
> Guest kernel (machine type q35) is configured with 1TB memory.
> With interrupt remapping enabled, the interrupt remapping
> table is allocated by the guest kernel which can be any
> where in the available physical memory. In my test case,
> the physical address of the table is 0xfc3ec00000. And
> this gets truncated by vtd_interrupt_remap_table_setup()
> function to 0x7c3ec00000. This causes guest kernel to
> get invalid data later on and it loops forever in
> qi_submit_sync() in the guest kernel trying check fault
> status.
> 
> This is after applying the patch from Peter Xu. The patch
> is incomplete as the VTD_HAW_MASK is unchanged so it is
> defined for 39 bits. There are several other masks defined
> based on this in accessing iommu data structures. So, more
> changes needed to implement Peter's approach of providing
> x-aw-bits property.

Indeed.

> 
> Proposal:
> 
> We can simply change the VTD_HOST_ADDRESS_WIDTH to 48 bits
> with out any other changes to the code. The current set of
> features in the intel iommu emulator code works for q35
> machine type and it doesn't have any other side effect.
> Since the remapping tables are allocated by the guest kernel
> they are always within the phys-bits range and as long
> as the same range supported by intel iommu code in QEMU
> it works fine. For the current q35 machine type, all the
> supported cpus have <= 48 bits as the physical address
> width. For short term, just changing the VTD_HOST_ADDRESS_WIDTH
> to 48 should work fine for q35. I tried this and it seems
> to work fine.

I'm fine with changing that macro, but IMHO changing only that line
may break backward compatibility for old guests (at least it will
change the max address width reported in ACPI).  So I am not sure
that's good.

I would still prefer using the new property ("x-aw-bits", or rename
it as you prefer) for when people really want the 48-bit address
width, or even bigger ones in the future.  That makes sure 39 bits
remains the default.
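
As an illustration only (this is Python, not the code from the
patch), an opt-in width with a fixed default could be modelled as
below. DEFAULT_AW_BITS, SUPPORTED_AW_BITS and resolve_aw_bits() are
hypothetical names, and the set of accepted widths is an assumption
based on the 39 and 48 bit widths mentioned earlier in this thread.

```python
DEFAULT_AW_BITS = 39            # assumed default, so existing guests see no change
SUPPORTED_AW_BITS = (39, 48)    # assumed set of widths the emulation accepts


def resolve_aw_bits(requested=None):
    """Return the address width to use; the default applies when none is requested."""
    if requested is None:
        return DEFAULT_AW_BITS
    if requested not in SUPPORTED_AW_BITS:
        raise ValueError(f"unsupported address width: {requested}")
    return requested


def haw_mask(aw_bits):
    """Derive the address mask from the chosen width instead of a fixed macro."""
    return (1 << aw_bits) - 1


print(resolve_aw_bits())                 # 39
print(hex(haw_mask(resolve_aw_bits())))  # 0x7fffffffff
print(hex(haw_mask(resolve_aw_bits(48))))  # 0xffffffffffff
```

The point of the sketch is that every mask follows from the selected
width, so the default behaviour stays at 39 bits unless the user asks
for more.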

CCing Michael, who maintains the VT-d emulation code.

> 
> For long term, the VTD_HOST_ADDRESS_WIDTH has to match with
> host cpu address width. If necessary we may need to define
> a new machine type to keep VTD_HOST_ADDRESS_WIDTH value to
> match with the host cpu.
> 
> Please let me know if you have any comments or suggestions
> on this.

Thanks,

-- 
Peter Xu


* Re: [Qemu-devel] host physical address width issues/questions for x86_64
  2017-10-23  6:37                 ` Peter Xu
@ 2017-10-23 17:23                   ` Prasad Singamsetty
  2017-10-26  8:30                     ` Peter Xu
  0 siblings, 1 reply; 16+ messages in thread
From: Prasad Singamsetty @ 2017-10-23 17:23 UTC (permalink / raw)
  To: Peter Xu
  Cc: ehabkost, Dr. David Alan Gilbert, qemu-devel, Alex Williamson,
	Sunit Jain, pbonzini, rth, Michael S. Tsirkin



On 10/22/2017 11:37 PM, Peter Xu wrote:
> On Fri, Oct 20, 2017 at 03:54:21PM -0700, Prasad Singamsetty wrote:
>>
>>
>> On 10/18/2017 8:33 PM, Peter Xu wrote:
>>> On Wed, Oct 18, 2017 at 10:19:31AM -0700, Prasad Singamsetty wrote:
>>>>
>>>>
>>>> On 10/16/2017 8:56 PM, Peter Xu wrote:
>>>>> On Mon, Oct 16, 2017 at 10:02:25AM -0700, Prasad Singamsetty wrote:
>>>>>>
>>>>>>
>>>>>> On 10/14/2017 8:53 PM, Peter Xu wrote:
>>>>>>> On Fri, Oct 13, 2017 at 11:14:03AM -0600, Alex Williamson wrote:
>>>>>>>> On Fri, 13 Oct 2017 18:01:44 +0100
>>>>>>>> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
>>>>>>>>
>>>>>>>>> * Prasad Singamsetty (prasad.singamsetty@oracle.com) wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I am new to the alias. I have some questions on this subject
>>>>>>>>>> and seek some clarifications from the experts in the team.
>>>>>>>>>> I ran into a couple of issues when I tried with large configuration
>>>>>>>>>> ( >= 1TB memory, > 255 CPUs) for x86_64 guest machine.
>>>>>>>>>>
>>>>>>>>>> 1. QEMU uses the default value of 40 (TCG_PHYS_ADDR_BITS) for address
>>>>>>>>>>     width if user has not specified phys-bits or host-phys-bits=true
>>>>>>>>>>     property. The default value is obviously not sufficient and
>>>>>>>>>>     causing guest kernel to crash if configured with >= 1TB
>>>>>>>>>>     memory. Depending on the linux kernel version in the guest the
>>>>>>>>>>     panic was in different code paths. The workaround is for the
>>>>>>>>>>     user to specify the phys-bits property or set the property
>>>>>>>>>>     host-phys-bits=true.
>>>>>>>>>>
>>>>>>>>>>     QUESTIONS:
>>>>>>>> ...
>>>>>>>>>> 2. host_address_width in DMAR table structure
>>>>>>>>>>
>>>>>>>>>>     In this case, the default value is set to 39
>>>>>>>>>>     (VTD_HOST_ADDRESS_WIDTH - 1). With interrupt remapping
>>>>>>>>>>     enabled for the intel iommu and the guest is configured
>>>>>>>>>>     with > 255 cpus and >= 1TB memory, the guest kernel hangs
>>>>>>>>>>     during boot up. This need to be fixed.
>>>>>>>>>>
>>>>>>>>>>     QUESTION:
>>>>>>>>>>     The question here again is can we fix this to use the
>>>>>>>>>>     real address width from the host as the default?
>>>>>>>>>
>>>>>>>>> I don't know DMAR stuff; chatting to Alex (cc'd) it does sound
>>>>>>>>> like that's an ommission that should be fixed.
>>>>>>>>
>>>>>>>> [CC +Peter]
>>>>>>>>
>>>>>>>> On physical hardware VT-d supports either 39 or 48 bit address widths
>>>>>>>> and generally you'd expect a sufficiently capable IOMMU to be matched
>>>>>>>> with the CPU.  Seems QEMU has only implemented a lower bit width and
>>>>>>>> it should probably be forcing phys bits of the VM to 39 to match until
>>>>>>>> the extended width can be implemented.  Thanks,
>>>>>>>>
>>>>>>>> Alex
>>>>>>>
>>>>>>> There were patches that tried to enable 48 bits GAW but it was
>>>>>>> not accepted somehow:
>>>>>>>
>>>>>>>    https://lists.gnu.org/archive/html/qemu-devel/2016-12/msg01886.html
>>>>>>>
>>>>>>> Would this help in any way?
>>>>>>>
>>>>>>
>>>>>> Thanks Alex for the patch info. Just curious why the patch was not
>>>>>> accepted. Any way, I will try it.
>>>>>
>>>>> I don't sure I know the reason.  Anyway, it originated from one of
>>>>> Fam's request for some NVMe tests.  If it can really help for your use
>>>>> case as well, please feel free to revive those patches, or let me know
>>>>> so that I can respin.  Thanks,
>>>>>
>>>>
>>>> Thanks Peter. I will start with your patch and see if I can get
>>>> it to work first.
>>>>
>>>> A quick question. Looking at the code, it doesn't look like there
>>>> is a way to disable dma remapping. User may have a case where he
>>>> is interested only in interrupt remapping (for > 255 cpus) and
>>>> not DMA remapping. Is that scenario considered before?
>>>
>>> It can be done in the guest if the guest doesn't want DMAR.
>>>
>>> Note that there are two isolated kernel tunables for the VT-d device:
>>>
>>> - intel_iommu: "on" to turn on DMAR, "off" to turn off DMAR
>>> - intremap:    "on" to turn on IR, "off" to turn off IR
>>>
>>> So even if guest has "intel_iommu=off" in its boot parameter, IR will
>>> still be on by default (or specify it explicitly using "intremap=on").
>>
>> Thanks Peter. I think I figured out the problem in my test case
>> due to VTD_HOST_ADDRESS_WIDTH.
>>
>> Problem scenario:
>>
>> Guest kernel (machine type q35) is configured with 1TB memory.
>> With interrupt remapping enabled, the interrupt remapping
>> table is allocated by the guest kernel which can be any
>> where in the available physical memory. In my test case,
>> the physical address of the table is 0xfc3ec00000. And
>> this gets truncated by vtd_interrupt_remap_table_setup()
>> function to 0x7c3ec00000. This causes guest kernel to
>> get invalid data later on and it loops forever in
>> qi_submit_sync() in the guest kernel trying check fault
>> status.
>>
>> This is after applying the patch from Peter Xu. The patch
>> is incomplete as the VTD_HAW_MASK is unchanged so it is
>> defined for 39 bits. There are several other masks defined
>> based on this in accessing iommu data structures. So, more
>> changes needed to implement Peter's approach of providing
>> x-aw-bits property.
> 
> Indeed.
> 
>>
>> Proposal:
>>
>> We can simply change the VTD_HOST_ADDRESS_WIDTH to 48 bits
>> with out any other changes to the code. The current set of
>> features in the intel iommu emulator code works for q35
>> machine type and it doesn't have any other side effect.
>> Since the remapping tables are allocated by the guest kernel
>> they are always within the phys-bits range and as long
>> as the same range supported by intel iommu code in QEMU
>> it works fine. For the current q35 machine type, all the
>> supported cpus have <= 48 bits as the physical address
>> width. For short term, just changing the VTD_HOST_ADDRESS_WIDTH
>> to 48 should work fine for q35. I tried this and it seems
>> to work fine.
> 
> I'm fine to change that macro, but IMHO only changing that line may
> break backward compatibility of old guests (at least it'll change the
> max address width reported in ACPI).  So I am not sure that's good.

Could you point to any specifics on compatibility issues with old
guests?  The guest Linux kernel doesn't seem to report any problem
when the address width reported in DMAR doesn't match what the
host CPU supports. I would like to better understand the gaps
we have here.

On what other machine types is the Intel IOMMU supported with the
current implementation? Is there any test suite that can test Intel
IOMMU functionality on the supported guest types?

> 
> I would prefer still using the new property ("x-aw-bits", or change
> the name as you prefer) when people really want the 48 bits address
> width, or even bigger ones in the future.  It makes sure that 39 bits
> are still the default.
> 
> CCing Michael who maintains VT-d emulation codes.

Thanks.
--Prasad

> 
>>
>> For long term, the VTD_HOST_ADDRESS_WIDTH has to match with
>> host cpu address width. If necessary we may need to define
>> a new machine type to keep VTD_HOST_ADDRESS_WIDTH value to
>> match with the host cpu.
>>
>> Please let me know if you have any comments or suggestions
>> on this.
> 
> Thanks,
> 


* Re: [Qemu-devel] host physical address width issues/questions for x86_64
  2017-10-23 17:23                   ` Prasad Singamsetty
@ 2017-10-26  8:30                     ` Peter Xu
  2017-10-26 15:04                       ` Michael S. Tsirkin
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Xu @ 2017-10-26  8:30 UTC (permalink / raw)
  To: Prasad Singamsetty
  Cc: ehabkost, Dr. David Alan Gilbert, qemu-devel, Alex Williamson,
	Sunit Jain, pbonzini, rth, Michael S. Tsirkin

On Mon, Oct 23, 2017 at 10:23:43AM -0700, Prasad Singamsetty wrote:

[...]

> >>Proposal:
> >>
> >>We can simply change the VTD_HOST_ADDRESS_WIDTH to 48 bits
> >>with out any other changes to the code. The current set of
> >>features in the intel iommu emulator code works for q35
> >>machine type and it doesn't have any other side effect.
> >>Since the remapping tables are allocated by the guest kernel
> >>they are always within the phys-bits range and as long
> >>as the same range supported by intel iommu code in QEMU
> >>it works fine. For the current q35 machine type, all the
> >>supported cpus have <= 48 bits as the physical address
> >>width. For short term, just changing the VTD_HOST_ADDRESS_WIDTH
> >>to 48 should work fine for q35. I tried this and it seems
> >>to work fine.
> >
> >I'm fine to change that macro, but IMHO only changing that line may
> >break backward compatibility of old guests (at least it'll change the
> >max address width reported in ACPI).  So I am not sure that's good.
> 
> Could you refer to any specifics on compatibility issues with old
> guests?  The guest linux kernel doesn't seem to report any problem
> with address width in DMAR reported doesn't match with what is
> supported in the host cpu. I would like to better understand the gaps
> we have here.

I mean, for example, when a vIOMMU-enabled guest migrates from an old
QEMU to this new QEMU.  When the guest first probed the DMAR device on
the old QEMU it would have seen a 39-bit GAW, but after the migration
to your new QEMU instance it becomes 48 bits.  This can confuse the
guest in some way.  I'm not sure whether that would be a real problem,
but I would rather just introduce the new property for 48 bits to
avoid it.

> 
> What other machine types intel iommu is supported with the current
> implementation? Is there any test suite that can test intel iommu
> functionality on supported guest types?

I don't think there are many tests for VT-d emulation.  There are a
few in kvm-unit-tests covering the simplest DMAR and IR cases, but I
don't think they check for compatibility problems.

Thanks,

> 
> >
> >I would prefer still using the new property ("x-aw-bits", or change
> >the name as you prefer) when people really want the 48 bits address
> >width, or even bigger ones in the future.  It makes sure that 39 bits
> >are still the default.
> >
> >CCing Michael who maintains VT-d emulation codes.
> 
> Thanks.
> --Prasad
> 
> >
> >>
> >>For long term, the VTD_HOST_ADDRESS_WIDTH has to match with
> >>host cpu address width. If necessary we may need to define
> >>a new machine type to keep VTD_HOST_ADDRESS_WIDTH value to
> >>match with the host cpu.
> >>
> >>Please let me know if you have any comments or suggestions
> >>on this.
> >
> >Thanks,
> >

-- 
Peter Xu


* Re: [Qemu-devel] host physical address width issues/questions for x86_64
  2017-10-26  8:30                     ` Peter Xu
@ 2017-10-26 15:04                       ` Michael S. Tsirkin
  0 siblings, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2017-10-26 15:04 UTC (permalink / raw)
  To: Peter Xu
  Cc: Prasad Singamsetty, ehabkost, Dr. David Alan Gilbert, qemu-devel,
	Alex Williamson, Sunit Jain, pbonzini, rth

On Thu, Oct 26, 2017 at 04:30:57PM +0800, Peter Xu wrote:
> On Mon, Oct 23, 2017 at 10:23:43AM -0700, Prasad Singamsetty wrote:
> 
> [...]
> 
> > >>Proposal:
> > >>
> > >>We can simply change the VTD_HOST_ADDRESS_WIDTH to 48 bits
> > >>with out any other changes to the code. The current set of
> > >>features in the intel iommu emulator code works for q35
> > >>machine type and it doesn't have any other side effect.
> > >>Since the remapping tables are allocated by the guest kernel
> > >>they are always within the phys-bits range and as long
> > >>as the same range supported by intel iommu code in QEMU
> > >>it works fine. For the current q35 machine type, all the
> > >>supported cpus have <= 48 bits as the physical address
> > >>width. For short term, just changing the VTD_HOST_ADDRESS_WIDTH
> > >>to 48 should work fine for q35. I tried this and it seems
> > >>to work fine.
> > >
> > >I'm fine to change that macro, but IMHO only changing that line may
> > >break backward compatibility of old guests (at least it'll change the
> > >max address width reported in ACPI).  So I am not sure that's good.
> > 
> > Could you refer to any specifics on compatibility issues with old
> > guests?  The guest linux kernel doesn't seem to report any problem
> > with address width in DMAR reported doesn't match with what is
> > supported in the host cpu. I would like to better understand the gaps
> > we have here.
> 
> I mean for example when an old vIOMMU-enabled QEMU migrates to this
> new QEMU.  When the guest first probes the DMAR device using the old
> QEMU it should have seen 39 bits GAW, but after the migration to your
> new QEMU instance it'll become 48 bits.  This can confuse the guest in
> some way.  I'm not sure whether that would be a real problem but I
> would rather just introduce the new property for 48 bits to avoid that
> problem.

Right.

> > 
> > What other machine types intel iommu is supported with the current
> > implementation? Is there any test suite that can test intel iommu
> > functionality on supported guest types?
> 
> I don't think there are lots of tests on VT-d emulation.  There are a
> few in kvm-unit-tests for the simplest DMAR and IR tests though, but I
> don't think they are checking against compatibility problems.
> 
> Thanks,
> 
> > 
> > >
> > >I would prefer still using the new property ("x-aw-bits", or change
> > >the name as you prefer) when people really want the 48 bits address
> > >width, or even bigger ones in the future.  It makes sure that 39 bits
> > >are still the default.
> > >
> > >CCing Michael who maintains VT-d emulation codes.
> > 
> > Thanks.
> > --Prasad
> > 
> > >
> > >>
> > >>For long term, the VTD_HOST_ADDRESS_WIDTH has to match with
> > >>host cpu address width. If necessary we may need to define
> > >>a new machine type to keep VTD_HOST_ADDRESS_WIDTH value to
> > >>match with the host cpu.
> > >>
> > >>Please let me know if you have any comments or suggestions
> > >>on this.
> > >
> > >Thanks,
> > >
> 
> -- 
> Peter Xu


Thread overview: 16+ messages
2017-10-13 16:17 [Qemu-devel] host physical address width issues/questions for x86_64 Prasad Singamsetty
2017-10-13 17:01 ` Dr. David Alan Gilbert
2017-10-13 17:14   ` Alex Williamson
2017-10-15  3:53     ` Peter Xu
2017-10-16 17:02       ` Prasad Singamsetty
2017-10-17  3:56         ` Peter Xu
2017-10-18  5:59           ` Fam Zheng
2017-10-18 17:19           ` Prasad Singamsetty
2017-10-19  3:33             ` Peter Xu
2017-10-20 22:54               ` Prasad Singamsetty
2017-10-23  6:37                 ` Peter Xu
2017-10-23 17:23                   ` Prasad Singamsetty
2017-10-26  8:30                     ` Peter Xu
2017-10-26 15:04                       ` Michael S. Tsirkin
2017-10-16 17:11     ` Prasad Singamsetty
2017-10-16 16:59   ` Prasad Singamsetty
