From mboxrd@z Thu Jan  1 00:00:00 1970
From: Alexander Graf <agraf@suse.de>
Subject: Re: [PATCH 04/14] KVM: PPC: e500: MMU API
Date: Thu, 10 Nov 2011 15:20:28 +0100
Message-ID: <4EBBDDAC.1060505@suse.de>
References: <1320047596-20577-1-git-send-email-agraf@suse.de> <1320047596-20577-5-git-send-email-agraf@suse.de> <4EAEA184.4050807@redhat.com> <4EAF013C.7050206@freescale.com> <4EAFB4B9.2040806@redhat.com> <4EB01B4B.8090209@freescale.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Avi Kivity <avi@redhat.com>, kvm-ppc@vger.kernel.org,
	kvm list <kvm@vger.kernel.org>,
	Marcelo Tosatti <mtosatti@redhat.com>
To: Scott Wood <scottwood@freescale.com>
Return-path: <kvm-ppc-owner@vger.kernel.org>
In-Reply-To: <4EB01B4B.8090209@freescale.com>
Sender: kvm-ppc-owner@vger.kernel.org
List-Id: kvm.vger.kernel.org

On 11/01/2011 05:16 PM, Scott Wood wrote:
> On 11/01/2011 03:58 AM, Avi Kivity wrote:
>> On 10/31/2011 10:12 PM, Scott Wood wrote:
>>>>> +4.59 KVM_DIRTY_TLB
>>>>> +
>>>>> +Capability: KVM_CAP_SW_TLB
>>>>> +Architectures: ppc
>>>>> +Type: vcpu ioctl
>>>>> +Parameters: struct kvm_dirty_tlb (in)
>>>>> +Returns: 0 on success, -1 on error
>>>>> +
>>>>> +struct kvm_dirty_tlb {
>>>>> +	__u64 bitmap;
>>>>> +	__u32 num_dirty;
>>>>> +};
>>>> This is not 32/64 bit safe.  e500 is 32-bit only, yes?
>>> e5500 is 64-bit -- we don't support it with KVM yet, but it's planned.
>>>
>>>> but what if someone wants to emulate an e500 on a ppc64?  maybe it's better to add
>>>> padding here.
>>> What is unsafe about it?  Are you picturing TLBs with more than 4
>>> billion entries?
>> sizeof(struct kvm_tlb_dirty) == 12 for 32-bit userspace, but ==  16 for
>> 64-bit userspace and the kernel.  ABI structures must have the same
>> alignment and size for 32/64 bit userspace, or they need compat handling.
> The size is 16 on 32-bit ppc -- the alignment of __u64 forces this.  It
> looks like this is different in the 32x86 ABI.
>
> We can pad explicitly if you prefer.

I would prefer if we keep this stable :). There's no good reason to pad 
it - ppc64 creates the same struct definition.

>>> There shouldn't be any alignment issues.
>>>
>>>> Another alternative is to drop the num_dirty field (and let the kernel
>>>> compute it instead, shouldn't take long?), and have the third argument
>>>> to ioctl() reference the bitmap directly.
>>> The idea was to make it possible for the kernel to apply a threshold
>>> above which it would be better to ignore the bitmap entirely and flush
>>> everything:
>>>
>>> http://www.spinics.net/lists/kvm/msg50079.html
>>>
>>> Currently we always just flush everything, and QEMU always says
>>> everything is dirty when it makes a change, but the API is there if needed.
>> Right, but you don't need num_dirty for it.  There are typically only a
>> few dozen entries, yes?  It should take a trivial amount of time to
>> calculate its weight.
> There are over 500 entries currently, and QEMU could make it much larger
> if it wants to decrease guest-visible faults on certain workloads.
>
> It's not the most important feature, indeed we currently ignore the
> bitmap entirely.  But it could be useful depending on how the API is
> used in the future, and I don't think we gain much by dropping it at
> this point.  Alex, any thoughts?

The kernel can always opt in to ignore the field if it chooses to, so I 
don't see the point in dropping it. There shouldn't be an alignment 
problem in the first place :).


Alex

From mboxrd@z Thu Jan  1 00:00:00 1970
From: Alexander Graf <agraf@suse.de>
Date: Thu, 10 Nov 2011 14:20:28 +0000
Subject: Re: [PATCH 04/14] KVM: PPC: e500: MMU API
Message-Id: <4EBBDDAC.1060505@suse.de>
List-Id: <kvm-ppc.vger.kernel.org>
References: <1320047596-20577-1-git-send-email-agraf@suse.de> <1320047596-20577-5-git-send-email-agraf@suse.de> <4EAEA184.4050807@redhat.com> <4EAF013C.7050206@freescale.com> <4EAFB4B9.2040806@redhat.com> <4EB01B4B.8090209@freescale.com>
In-Reply-To: <4EB01B4B.8090209@freescale.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: Scott Wood <scottwood@freescale.com>
Cc: Avi Kivity <avi@redhat.com>, kvm-ppc@vger.kernel.org, kvm list <kvm@vger.kernel.org>, Marcelo Tosatti <mtosatti@redhat.com>

On 11/01/2011 05:16 PM, Scott Wood wrote:
> On 11/01/2011 03:58 AM, Avi Kivity wrote:
>> On 10/31/2011 10:12 PM, Scott Wood wrote:
>>>>> +4.59 KVM_DIRTY_TLB
>>>>> +
>>>>> +Capability: KVM_CAP_SW_TLB
>>>>> +Architectures: ppc
>>>>> +Type: vcpu ioctl
>>>>> +Parameters: struct kvm_dirty_tlb (in)
>>>>> +Returns: 0 on success, -1 on error
>>>>> +
>>>>> +struct kvm_dirty_tlb {
>>>>> +	__u64 bitmap;
>>>>> +	__u32 num_dirty;
>>>>> +};
>>>> This is not 32/64 bit safe.  e500 is 32-bit only, yes?
>>> e5500 is 64-bit -- we don't support it with KVM yet, but it's planned.
>>>
>>>> but what if someone wants to emulate an e500 on a ppc64?  maybe it's better to add
>>>> padding here.
>>> What is unsafe about it?  Are you picturing TLBs with more than 4
>>> billion entries?
>> sizeof(struct kvm_tlb_dirty) = 12 for 32-bit userspace, but =  16 for
>> 64-bit userspace and the kernel.  ABI structures must have the same
>> alignment and size for 32/64 bit userspace, or they need compat handling.
> The size is 16 on 32-bit ppc -- the alignment of __u64 forces this.  It
> looks like this is different in the 32x86 ABI.
>
> We can pad explicitly if you prefer.

I would prefer if we keep this stable :). There's no good reason to pad 
it - ppc64 creates the same struct definition.

>>> There shouldn't be any alignment issues.
>>>
>>>> Another alternative is to drop the num_dirty field (and let the kernel
>>>> compute it instead, shouldn't take long?), and have the third argument
>>>> to ioctl() reference the bitmap directly.
>>> The idea was to make it possible for the kernel to apply a threshold
>>> above which it would be better to ignore the bitmap entirely and flush
>>> everything:
>>>
>>> http://www.spinics.net/lists/kvm/msg50079.html
>>>
>>> Currently we always just flush everything, and QEMU always says
>>> everything is dirty when it makes a change, but the API is there if needed.
>> Right, but you don't need num_dirty for it.  There are typically only a
>> few dozen entries, yes?  It should take a trivial amount of time to
>> calculate its weight.
> There are over 500 entries currently, and QEMU could make it much larger
> if it wants to decrease guest-visible faults on certain workloads.
>
> It's not the most important feature, indeed we currently ignore the
> bitmap entirely.  But it could be useful depending on how the API is
> used in the future, and I don't think we gain much by dropping it at
> this point.  Alex, any thoughts?

The kernel can always opt in to ignore the field if it chooses to, so I 
don't see the point in dropping it. There shouldn't be an alignment 
problem in the first place :).


Alex