From: Joao Martins <joao.m.martins@oracle.com>
To: Igor Mammedov <imammedo@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	Richard Henderson <richard.henderson@linaro.org>,
	qemu-devel@nongnu.org, Daniel Jordan <daniel.m.jordan@oracle.com>,
	David Edmondson <david.edmondson@oracle.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Gerd Hoffmann <kraxel@redhat.com>,
	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
	Ani Sinha <ani@anisinha.ca>, Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH RFCv2 2/4] i386/pc: relocate 4g start to 1T where applicable
Date: Mon, 21 Feb 2022 15:28:11 +0000
Message-ID: <57e85450-e248-874e-78a0-06b74f8d3617@oracle.com>
In-Reply-To: <20220221075807.7415884a@redhat.com>



On 2/21/22 06:58, Igor Mammedov wrote:
> On Fri, 18 Feb 2022 17:12:21 +0000
> Joao Martins <joao.m.martins@oracle.com> wrote:
> 
>> On 2/14/22 15:31, Igor Mammedov wrote:
>>> On Mon, 14 Feb 2022 15:05:00 +0000
>>> Joao Martins <joao.m.martins@oracle.com> wrote:  
>>>> On 2/14/22 14:53, Igor Mammedov wrote:  
>>>>> On Mon,  7 Feb 2022 20:24:20 +0000
>>>>> Joao Martins <joao.m.martins@oracle.com> wrote:  
>>>>>> +{
>>>>>> +    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
>>>>>> +    X86MachineState *x86ms = X86_MACHINE(pcms);
>>>>>> +    ram_addr_t device_mem_size = 0;
>>>>>> +    uint32_t eax, vendor[3];
>>>>>> +
>>>>>> +    host_cpuid(0x0, 0, &eax, &vendor[0], &vendor[2], &vendor[1]);
>>>>>> +    if (!IS_AMD_VENDOR(vendor)) {
>>>>>> +        return;
>>>>>> +    }
>>>>>> +
>>>>>> +    if (pcmc->has_reserved_memory &&
>>>>>> +       (machine->ram_size < machine->maxram_size)) {
>>>>>> +        device_mem_size = machine->maxram_size - machine->ram_size;
>>>>>> +    }
>>>>>> +
>>>>>> +    if ((x86ms->above_4g_mem_start + x86ms->above_4g_mem_size +
>>>>>> +         device_mem_size) < AMD_HT_START) {    
>>>>>     
>>>> And I was at two minds on this one, whether to advertise *always*
>>>> the 1T hole, regardless of relocation. Or account the size
>>>> we advertise for the pci64 hole and make that part of the equation
>>>> above. Although that has the flaw that the firmware at admin request
>>>> may pick some ludicrous number (limited by maxphysaddr).
>>>
>>> at this point we have only pci64 hole size (machine property),
>>> so I'd include that in equation to make firmware assign
>>> pci64 aperture above HT range.
>>>
>>> as for checking maxphysaddr, we can only check 'default' PCI hole
>>> range at this stage (i.e. 1Gb aligned hole size after all possible RAM)
>>> and hard error on it.
>>>   
>>
>> Igor, in the context of your comment above, I'll be introducing another
>> preparatory patch that adds pci_hole64_size to pc_memory_init() such
>> that all used/max physaddr space checks are consolidated in pc_memory_init().
>>
>> To that end, the changes mainly involve moving the pcihost qdev creation
>> to be before pc_memory_init(). Q35 just needs a 2-line order change. i440fx
>> needs slightly more of a dance to extract that from i440fx_init() and also
>> because most i440fx state is private (hence the new helper for size). But
>> the actual initialization of the i440fx/q35 pci host is still after pc_memory_init();
>> this is just to extract pci_hole64_size from the object + user passed args (-global etc).
> 
> Shuffling init order looks too intrusive and in practice
> quite risky.

Yeah, it is an intrusive change, sadly. But why would you consider it
risky (just curious)? We are "only" moving this:

	qdev_new(host_type);

... located at the very top of i440fx_init() and called at the top of q35_host
initialization, to instead be before pc_memory_init(). That means that an instance of the
object gets made and its properties initialized, i.e. @instance_init of q35 and i440fx and
their properties. I don't see a particular dependency in the PC code suggesting that this
would affect the surrounding parts.

The actual pcihost-related initialization is still kept entirely unchanged.
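
To make the ordering argument concrete, here is a toy model of a two-phase
object. It is not QEMU code and every name in it is hypothetical: creation only
fills in property defaults, realization is where the side effects happen, and
reading a property between the two steps leaves the later step untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a PCI host device; not a QEMU type. */
typedef struct ToyPciHost {
    uint64_t pci_hole64_size; /* property, valid right after creation */
    bool realized;            /* set only by toy_host_realize() */
} ToyPciHost;

/* Phase 1: comparable to instance_init, only sets property defaults. */
static ToyPciHost toy_host_new(uint64_t default_hole64_size)
{
    ToyPciHost h = {
        .pci_hole64_size = default_hole64_size,
        .realized = false,
    };
    return h;
}

/* Phase 2: the actual initialization, which stays where it was. */
static void toy_host_realize(ToyPciHost *h)
{
    assert(!h->realized); /* realize is expected to run exactly once */
    h->realized = true;
}

/* Reading the property early depends on phase 1 only. */
static uint64_t toy_host_hole64_size(const ToyPciHost *h)
{
    return h->pci_hole64_size;
}
```

In this model, calling toy_host_hole64_size() before toy_host_realize() is the
analogue of reading pci_hole64_size ahead of pc_memory_init().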

> How about moving maxphysaddr check to pc_machine_done() instead?
> (this way you won't have to move pcihost around)
> 
I can move it. But there will be a slight disconnect between what pc_memory_init()
checks against the "max used address" (dictating whether the 4G mem start should change
to 1T or not) and when the phys-bits check, which includes the pci hole64, is actually made.

For example, we create a guest with maxram 1009G (for which the 4G mem start would be left
unchanged) and then the pci_hole64 likely gets assigned across the rest until 1023G (i.e.
across the HT region). Here it would need an extra check and fail if pci_hole64 crosses
the HT region. Whereas if it was added in pc_memory_init() then we could just relocate to
1T and the guest creation would still proceed.
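
A minimal sketch of that arithmetic, assuming the HT range starts at 1012G and
that relocated memory starts at 1024G (neither value is stated in this message),
and that the hole starts 1G-aligned after RAM and device memory. The helper
names are hypothetical and this is not the patch itself:

```c
#include <assert.h>
#include <stdint.h>

#define GiB (1ULL << 30)

/* Assumed boundaries for illustration; not taken from this message. */
#define HT_START_ASSUMED (1012 * GiB) /* start of the reserved HT range */
#define ABOVE_1T_START   (1024 * GiB) /* relocated start of "above 4G" RAM */

/*
 * End of everything placed above 4G: RAM, device memory, then a
 * 64-bit PCI hole of the given size starting on a 1 GiB boundary.
 */
static uint64_t used_end(uint64_t above_4g_start, uint64_t above_4g_size,
                         uint64_t device_mem_size, uint64_t pci_hole64_size)
{
    uint64_t end = above_4g_start + above_4g_size + device_mem_size;

    assert(end >= above_4g_start);      /* no wrap-around in this sketch */
    end = (end + GiB - 1) & ~(GiB - 1); /* align the hole start to 1 GiB */
    return end + pci_hole64_size;
}

/* Keep the 4G start if everything fits below the HT range, else use 1T. */
static uint64_t pick_above_4g_start(uint64_t above_4g_size,
                                    uint64_t device_mem_size,
                                    uint64_t pci_hole64_size)
{
    if (used_end(4 * GiB, above_4g_size, device_mem_size,
                 pci_hole64_size) < HT_START_ASSUMED) {
        return 4 * GiB;
    }
    return ABOVE_1T_START;
}
```

With a hypothetical 1006G of RAM above 4G, pick_above_4g_start() keeps the 4G
start when the hole size is passed as 0, and relocates to 1T once a 32G hole is
included in the sum.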

> 
>> Raw staging changes below the scissors mark so far.
>>
>> -->8--  
>>
>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>> index b2e43eba1106..902977081350 100644
>> --- a/hw/i386/pc.c
>> +++ b/hw/i386/pc.c
>> @@ -875,7 +875,8 @@ static void x86_update_above_4g_mem_start(PCMachineState *pcms)
>>  void pc_memory_init(PCMachineState *pcms,
>>                      MemoryRegion *system_memory,
>>                      MemoryRegion *rom_memory,
>> -                    MemoryRegion **ram_memory)
>> +                    MemoryRegion **ram_memory,
>> +                    uint64_t pci_hole64_size)
>>  {
>>      int linux_boot, i;
>>      MemoryRegion *option_rom_mr;
>> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
>> index d9b344248dac..5a608e30e28f 100644
>> --- a/hw/i386/pc_piix.c
>> +++ b/hw/i386/pc_piix.c
>> @@ -91,6 +91,8 @@ static void pc_init1(MachineState *machine,
>>      MemoryRegion *pci_memory;
>>      MemoryRegion *rom_memory;
>>      ram_addr_t lowmem;
>> +    uint64_t hole64_size;
>> +    DeviceState *i440fx_dev;
>>
>>      /*
>>       * Calculate ram split, for memory below and above 4G.  It's a bit
>> @@ -164,9 +166,13 @@ static void pc_init1(MachineState *machine,
>>          pci_memory = g_new(MemoryRegion, 1);
>>          memory_region_init(pci_memory, NULL, "pci", UINT64_MAX);
>>          rom_memory = pci_memory;
>> +        i440fx_dev = qdev_new(host_type);
>> +        hole64_size = i440fx_pci_hole64_size(i440fx_dev);
>>      } else {
>>          pci_memory = NULL;
>>          rom_memory = system_memory;
>> +        i440fx_dev = NULL;
>> +        hole64_size = 0;
>>      }
>>
>>      pc_guest_info_init(pcms);
>> @@ -183,7 +189,7 @@ static void pc_init1(MachineState *machine,
>>      /* allocate ram and load rom/bios */
>>      if (!xen_enabled()) {
>>          pc_memory_init(pcms, system_memory,
>> -                       rom_memory, &ram_memory);
>> +                       rom_memory, &ram_memory, hole64_size);
>>      } else {
>>          pc_system_flash_cleanup_unused(pcms);
>>          if (machine->kernel_filename != NULL) {
>> @@ -199,7 +205,7 @@ static void pc_init1(MachineState *machine,
>>
>>          pci_bus = i440fx_init(host_type,
>>                                pci_type,
>> -                              &i440fx_state,
>> +                              i440fx_dev, &i440fx_state,
>>                                system_memory, system_io, machine->ram_size,
>>                                x86ms->below_4g_mem_size,
>>                                x86ms->above_4g_mem_size,
>> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
>> index 1780f79bc127..b7cf44d4755e 100644
>> --- a/hw/i386/pc_q35.c
>> +++ b/hw/i386/pc_q35.c
>> @@ -203,12 +203,13 @@ static void pc_q35_init(MachineState *machine)
>>                              pcms->smbios_entry_point_type);
>>      }
>>
>> -    /* allocate ram and load rom/bios */
>> -    pc_memory_init(pcms, get_system_memory(), rom_memory, &ram_memory);
>> -
>>      /* create pci host bus */
>>      q35_host = Q35_HOST_DEVICE(qdev_new(TYPE_Q35_HOST_DEVICE));
>>
>> +    /* allocate ram and load rom/bios */
>> +    pc_memory_init(pcms, get_system_memory(), rom_memory, &ram_memory,
>> +                   q35_host->mch.pci_hole64_size);
>> +
>>      object_property_add_child(qdev_get_machine(), "q35", OBJECT(q35_host));
>>      object_property_set_link(OBJECT(q35_host), MCH_HOST_PROP_RAM_MEM,
>>                               OBJECT(ram_memory), NULL);
>> diff --git a/hw/pci-host/i440fx.c b/hw/pci-host/i440fx.c
>> index e08716142b6e..c5cc28250d5c 100644
>> --- a/hw/pci-host/i440fx.c
>> +++ b/hw/pci-host/i440fx.c
>> @@ -237,7 +237,15 @@ static void i440fx_realize(PCIDevice *dev, Error **errp)
>>      }
>>  }
>>
>> +uint64_t i440fx_pci_hole64_size(DeviceState *i440fx_dev)
>> +{
>> +        I440FXState *i440fx = I440FX_PCI_HOST_BRIDGE(i440fx_dev);
>> +
>> +        return i440fx->pci_hole64_size;
>> +}
>> +
>>  PCIBus *i440fx_init(const char *host_type, const char *pci_type,
>> +                    DeviceState *dev,
>>                      PCII440FXState **pi440fx_state,
>>                      MemoryRegion *address_space_mem,
>>                      MemoryRegion *address_space_io,
>> @@ -247,7 +255,6 @@ PCIBus *i440fx_init(const char *host_type, const char *pci_type,
>>                      MemoryRegion *pci_address_space,
>>                      MemoryRegion *ram_memory)
>>  {
>> -    DeviceState *dev;
>>      PCIBus *b;
>>      PCIDevice *d;
>>      PCIHostState *s;
>> @@ -255,7 +262,6 @@ PCIBus *i440fx_init(const char *host_type, const char *pci_type,
>>      unsigned i;
>>      I440FXState *i440fx;
>>
>> -    dev = qdev_new(host_type);
>>      s = PCI_HOST_BRIDGE(dev);
>>      b = pci_root_bus_new(dev, NULL, pci_address_space,
>>                           address_space_io, 0, TYPE_PCI_BUS);
>> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
>> index 9c9f4ac74810..d8b9c4ebd748 100644
>> --- a/include/hw/i386/pc.h
>> +++ b/include/hw/i386/pc.h
>> @@ -158,7 +158,8 @@ void xen_load_linux(PCMachineState *pcms);
>>  void pc_memory_init(PCMachineState *pcms,
>>                      MemoryRegion *system_memory,
>>                      MemoryRegion *rom_memory,
>> -                    MemoryRegion **ram_memory);
>> +                    MemoryRegion **ram_memory,
>> +                    uint64_t pci_hole64_size);
>>  uint64_t pc_pci_hole64_start(void);
>>  DeviceState *pc_vga_init(ISABus *isa_bus, PCIBus *pci_bus);
>>  void pc_basic_device_init(struct PCMachineState *pcms,
>> diff --git a/include/hw/pci-host/i440fx.h b/include/hw/pci-host/i440fx.h
>> index f068aaba8fda..1299d6a2b0e4 100644
>> --- a/include/hw/pci-host/i440fx.h
>> +++ b/include/hw/pci-host/i440fx.h
>> @@ -36,7 +36,7 @@ struct PCII440FXState {
>>  #define TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE "igd-passthrough-i440FX"
>>
>>  PCIBus *i440fx_init(const char *host_type, const char *pci_type,
>> -                    PCII440FXState **pi440fx_state,
>> +                    DeviceState *dev, PCII440FXState **pi440fx_state,
>>                      MemoryRegion *address_space_mem,
>>                      MemoryRegion *address_space_io,
>>                      ram_addr_t ram_size,
>> @@ -45,5 +45,6 @@ PCIBus *i440fx_init(const char *host_type, const char *pci_type,
>>                      MemoryRegion *pci_memory,
>>                      MemoryRegion *ram_memory);
>>
>> +uint64_t i440fx_pci_hole64_size(DeviceState *i440fx_dev);
>>
>>  #endif
>>
> 

