All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marcel Apfelbaum <marcel@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>,
	Igor Mammedov <imammedo@redhat.com>
Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>,
	ehabkost@redhat.com, Marcel Apfelbaum <marcel.a@redhat.com>,
	ghammer@redhat.com, qemu-devel@nongnu.org,
	lcapitulino@redhat.com, lersek@redhat.com
Subject: Re: [Qemu-devel] [PATCH v19 3/9] pc: add a Virtual Machine Generation ID device
Date: Thu, 11 Feb 2016 19:34:52 +0200	[thread overview]
Message-ID: <56BCC63C.4060702@redhat.com> (raw)
In-Reply-To: <20160211180836-mutt-send-email-mst@redhat.com>

On 02/11/2016 06:30 PM, Michael S. Tsirkin wrote:
> On Thu, Feb 11, 2016 at 04:16:05PM +0100, Igor Mammedov wrote:
>> On Tue, 9 Feb 2016 14:17:44 +0200
>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>
>>> On Tue, Feb 09, 2016 at 11:46:08AM +0100, Igor Mammedov wrote:
>>>>> So the linker interface solves this rather neatly:
>>>>> bios allocates memory, bios passes memory map to guest.
>>>>> Served us well for several years without need for extensions,
>>>>> and it does solve the VM GEN ID problem, even though
>>>>> 1. it was never designed for huge areas like nvdimm seems to want to use
>>>>> 2. we might want to add a new 64 bit flag to avoid touching low memory
>>>> linker interface is fine for some readonly data, like ACPI tables
>>>> especially fixed tables not so for AML ones is one wants to patch it.
>>>>
>>>> However now when you want to use it for other purposes you start
>>>> adding extensions and other guest->QEMU channels to communicate
>>>> patching info back.
>>>> It steals guest's memory which is also not nice and doesn't scale well.
>>>
>>> This is an argument I don't get. memory is memory. call it guest memory
>>> or RAM backed PCI BAR - same thing. MMIO is cheaper of course
>>> but much slower.
>>>
>>> ...
>> It however matters for user, he pays for guest with XXX RAM but gets less
>> than that. And that will be getting worse as a number of such devices
>> increases.
>>
>>>>> OK fine, but returning PCI BAR address to guest is wrong.
>>>>> How about reading it from ACPI then? Is it really
>>>>> broken unless there's *also* a driver?
>>>> I don't get question, MS Spec requires address (ADDR method),
>>>> and it's read by ACPI (AML).
>>>
>>> You were unhappy about DMA into guest memory.
>>> As a replacement for DMA, we could have AML read from
>>> e.g. PCI and write into RAM.
>>> This way we don't need to pass address to QEMU.
>> That sounds better as it saves us from allocation of IO port
>> and QEMU don't need to write into guest memory, the only question is
>> if PCI_Config opregion would work with driver-less PCI device.
>
> Or PCI BAR for that reason. I don't know for sure.
>
>>
>> And it's still pretty much not test-able since it would require
>> fully running OSPM to execute AML side.
>
> AML is not testable, but that's nothing new.
> You can test reading from PCI.
>
>>>
>>>> As for working PCI_Config OpRegion without driver, I haven't tried,
>>>> but I wouldn't be surprised if it doesn't, taking in account that
>>>> MS introduced _DSM doesn't.
>>>>
>>>>>
>>>>>
>>>>>>>>     Just compare with a graphics card design, where on device memory
>>>>>>>>     is mapped directly at some GPA not wasting RAM that guest could
>>>>>>>>     use for other tasks.
>>>>>>>
>>>>>>> This might have been true 20 years ago.  Most modern cards do DMA.
>>>>>>
>>>>>> Modern cards, with it's own RAM, map its VRAM in address space directly
>>>>>> and allow users use it (GEM API). So they do not waste conventional RAM.
>>>>>> For example NVIDIA VRAM is mapped as PCI BARs the same way like in this
>>>>>> series (even PCI class id is the same)
>>>>>
>>>>> Don't know enough about graphics really, I'm not sure how these are
>>>>> relevant.  NICs and disks certainly do DMA.  And virtio gl seems to
>>>>> mostly use guest RAM, not on card RAM.
>>>>>
>>>>>>>>     VMGENID and NVDIMM use-cases look to me exactly the same, i.e.
>>>>>>>>     instead of consuming guest's RAM they should be mapped at
>>>>>>>>     some GPA and their memory accessed directly.
>>>>>>>
>>>>>>> VMGENID is tied to a spec that rather arbitrarily asks for a fixed
>>>>>>> address. This breaks the straight-forward approach of using a
>>>>>>> rebalanceable PCI BAR.
>>>>>>
>>>>>> For PCI rebalance to work on Windows, one has to provide working PCI driver
>>>>>> otherwise OS will ignore it when rebalancing happens and
>>>>>> might map something else over ignored BAR.
>>>>>
>>>>> Does it disable the BAR then? Or just move it elsewhere?
>>>> it doesn't, it just blindly ignores BARs existence and maps BAR of
>>>> another device with driver over it.
>>>
>>> Interesting. On classical PCI this is a forbidden configuration.
>>> Maybe we do something that confuses windows?
>>> Could you tell me how to reproduce this behaviour?
>> #cat > t << EOF
>> pci_update_mappings_del
>> pci_update_mappings_add
>> EOF
>>
>> #./x86_64-softmmu/qemu-system-x86_64 -snapshot -enable-kvm -snapshot \
>>   -monitor unix:/tmp/m,server,nowait -device pci-bridge,chassis_nr=1 \
>>   -boot menu=on -m 4G -trace events=t ws2012r2x64dc.img \
>>   -device ivshmem,id=foo,size=2M,shm,bus=pci.1,addr=01
>>
>> wait till OS boots, note BARs programmed for ivshmem
>>   in my case it was
>>     01:01.0 0,0xfe800000+0x100
>> then execute script and watch pci_update_mappings* trace events
>>
>> # for i in $(seq 3 18); do printf -- "device_add e1000,bus=pci.1,addr=%x\n" $i | nc -U /tmp/m; sleep 5; done;
>>
>> hotplugging e1000,bus=pci.1,addr=12 triggers rebalancing where
>> Windows unmaps all BARs of nics on bridge but doesn't touch ivshmem
>> and then programs new BARs, where:
>>    pci_update_mappings_add d=0x7fa02ff0cf90 01:11.0 0,0xfe800000+0x20000
>> creates overlapping BAR with ivshmem
>

Hi,

Let me see if I understand.
You say that in Windows, if a device does not have a driver installed,
its BARS ranges can be used after re-balancing by other devices, right?

If yes, in Windows we cannot use the device anyway, so we shouldn't care, right?

Our only problem remains the overlapping memory regions for
the old device and the new one and we need to ensure only the new device
will use these region?


Thanks,
Marcel

>
> Thanks!
> We need to figure this out because currently this does not
> work properly (or maybe it works, but merely by chance).
> Me and Marcel will play with this.
>
[...]

  reply	other threads:[~2016-02-11 17:35 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-28 10:54 [Qemu-devel] [PATCH v19 0/9] Virtual Machine Generation ID Igor Mammedov
2016-01-28 10:54 ` [Qemu-devel] [PATCH v19 1/9] acpi: extend ACPI interface to provide access to ACPI registers and SCI irq Igor Mammedov
2016-01-28 10:54 ` [Qemu-devel] [PATCH v19 2/9] docs: vm generation id device's description Igor Mammedov
2016-01-28 10:54 ` [Qemu-devel] [PATCH v19 3/9] pc: add a Virtual Machine Generation ID device Igor Mammedov
2016-01-28 11:13   ` Michael S. Tsirkin
2016-01-28 12:03     ` Igor Mammedov
2016-01-28 12:59       ` Michael S. Tsirkin
2016-01-29 11:13         ` Igor Mammedov
2016-01-31 16:22           ` Michael S. Tsirkin
2016-02-02  9:59             ` Igor Mammedov
2016-02-02 11:16               ` Michael S. Tsirkin
2016-02-09 10:46                 ` Igor Mammedov
2016-02-09 12:17                   ` Michael S. Tsirkin
2016-02-11 15:16                     ` Igor Mammedov
2016-02-11 16:30                       ` Michael S. Tsirkin
2016-02-11 17:34                         ` Marcel Apfelbaum [this message]
2016-02-12  6:15                           ` Michael S. Tsirkin
2016-02-15 10:30                         ` Igor Mammedov
2016-02-15 11:26                           ` Michael S. Tsirkin
2016-02-15 13:56                             ` Igor Mammedov
2016-02-16 10:05                         ` Marcel Apfelbaum
2016-02-16 12:17                           ` Igor Mammedov
2016-02-16 12:36                             ` Marcel Apfelbaum
2016-02-16 13:51                               ` Igor Mammedov
2016-02-16 14:53                                 ` Michael S. Tsirkin
2016-02-16 15:10                               ` Michael S. Tsirkin
2016-02-10  8:51                   ` Michael S. Tsirkin
2016-02-10  9:28                     ` Michael S. Tsirkin
2016-02-10 10:00                       ` Laszlo Ersek
2016-01-28 13:48     ` Laszlo Ersek
2016-01-28 10:54 ` [Qemu-devel] [PATCH v19 4/9] tests: add a unit test for the vmgenid device Igor Mammedov
2016-01-28 10:54 ` [Qemu-devel] [PATCH v19 5/9] qmp/hmp: add query-vm-generation-id and 'info vm-generation-id' commands Igor Mammedov
2016-02-09 17:31   ` Eric Blake
2016-01-28 10:54 ` [Qemu-devel] [PATCH v19 6/9] qmp/hmp: add set-vm-generation-id commands Igor Mammedov
2016-02-09 17:33   ` Eric Blake
2016-01-28 10:54 ` [Qemu-devel] [PATCH v19 8/9] pc: put PIIX3 in slot 1 explicitly and cleanup functions assignment Igor Mammedov
2016-01-28 10:54 ` [Qemu-devel] [PATCH v19 9/9] pc/q53: by default put vmgenid device as an function of ISA bridge Igor Mammedov
2016-01-28 10:58 ` [Qemu-devel] [PATCH v19 7/9] machine: add properties to compat_props incrementaly Igor Mammedov
2016-01-28 14:02   ` Eduardo Habkost
2016-01-28 17:00     ` Igor Mammedov
2016-02-03 17:55       ` [Qemu-devel] qdev & hw/core owner? (was Re: [PATCH v19 7/9] machine: add properties to compat_props incrementaly) Eduardo Habkost
2016-02-03 18:46         ` Laszlo Ersek
2016-02-03 19:06         ` Michael S. Tsirkin
2016-02-04 11:31           ` Paolo Bonzini
2016-02-04 11:41             ` Andreas Färber
2016-02-04 11:55               ` Paolo Bonzini
2016-02-04 12:06                 ` Michael S. Tsirkin
2016-02-05  7:49                   ` Markus Armbruster
2016-02-05  7:51                     ` Marcel Apfelbaum
2016-02-11 19:41                       ` Eduardo Habkost
2016-02-12  9:17                         ` Marcel Apfelbaum
2016-02-12 11:22                           ` Andreas Färber
2016-02-12 18:17                             ` Eduardo Habkost
2016-02-12 22:30                               ` Paolo Bonzini
2016-02-12 18:09                           ` Eduardo Habkost
2016-02-05  7:52                 ` Markus Armbruster
2016-02-04 12:03               ` Michael S. Tsirkin
2016-02-04 12:12               ` Marcel Apfelbaum
2016-01-29 12:51   ` [Qemu-devel] [PATCH v19 7/9] machine: add properties to compat_props incrementaly Cornelia Huck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56BCC63C.4060702@redhat.com \
    --to=marcel@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=ghammer@redhat.com \
    --cc=guangrong.xiao@linux.intel.com \
    --cc=imammedo@redhat.com \
    --cc=lcapitulino@redhat.com \
    --cc=lersek@redhat.com \
    --cc=marcel.a@redhat.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.