All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock
@ 2011-02-04  9:49 Jan Kiszka
  2011-02-04 21:03 ` Zachary Amsden
  2011-02-10 10:40 ` Avi Kivity
  0 siblings, 2 replies; 17+ messages in thread
From: Jan Kiszka @ 2011-02-04  9:49 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: kvm, Linux Kernel Mailing List, Zachary Amsden

Code under this lock requires non-preemptibility. Ensure this also over
-rt by converting it to raw spinlock.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 arch/x86/include/asm/kvm_host.h |    2 +-
 arch/x86/kvm/x86.c              |    6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index ffd7f8d..25e7734 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -448,7 +448,7 @@ struct kvm_arch {
 
 	unsigned long irq_sources_bitmap;
 	s64 kvmclock_offset;
-	spinlock_t tsc_write_lock;
+	raw_spinlock_t tsc_write_lock;
 	u64 last_tsc_nsec;
 	u64 last_tsc_offset;
 	u64 last_tsc_write;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a7f65aa..d813919 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1017,7 +1017,7 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, u64 data)
 	unsigned long flags;
 	s64 sdiff;
 
-	spin_lock_irqsave(&kvm->arch.tsc_write_lock, flags);
+	raw_spin_lock_irqsave(&kvm->arch.tsc_write_lock, flags);
 	offset = data - native_read_tsc();
 	ns = get_kernel_ns();
 	elapsed = ns - kvm->arch.last_tsc_nsec;
@@ -1050,7 +1050,7 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, u64 data)
 	kvm->arch.last_tsc_write = data;
 	kvm->arch.last_tsc_offset = offset;
 	kvm_x86_ops->write_tsc_offset(vcpu, offset);
-	spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags);
+	raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags);
 
 	/* Reset of TSC must disable overshoot protection below */
 	vcpu->arch.hv_clock.tsc_timestamp = 0;
@@ -6009,7 +6009,7 @@ int kvm_arch_init_vm(struct kvm *kvm)
 	/* Reserve bit 0 of irq_sources_bitmap for userspace irq source */
 	set_bit(KVM_USERSPACE_IRQ_SOURCE_ID, &kvm->arch.irq_sources_bitmap);
 
-	spin_lock_init(&kvm->arch.tsc_write_lock);
+	raw_spin_lock_init(&kvm->arch.tsc_write_lock);
 
 	return 0;
 }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock
  2011-02-04  9:49 [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock Jan Kiszka
@ 2011-02-04 21:03 ` Zachary Amsden
  2011-02-07 11:35   ` Jan Kiszka
  2011-02-10 10:40 ` Avi Kivity
  1 sibling, 1 reply; 17+ messages in thread
From: Zachary Amsden @ 2011-02-04 21:03 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, Marcelo Tosatti, kvm, Linux Kernel Mailing List

On 02/04/2011 04:49 AM, Jan Kiszka wrote:
> Code under this lock requires non-preemptibility. Ensure this also over
> -rt by converting it to raw spinlock.
>    

Oh dear, I had forgotten about that.  I believe kvm_lock might have the 
same assumption in a few places regarding clock.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock
  2011-02-04 21:03 ` Zachary Amsden
@ 2011-02-07 11:35   ` Jan Kiszka
  2011-02-07 14:11     ` Zachary Amsden
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Kiszka @ 2011-02-07 11:35 UTC (permalink / raw)
  To: Zachary Amsden
  Cc: Avi Kivity, Marcelo Tosatti, kvm, Linux Kernel Mailing List

On 2011-02-04 22:03, Zachary Amsden wrote:
> On 02/04/2011 04:49 AM, Jan Kiszka wrote:
>> Code under this lock requires non-preemptibility. Ensure this also over
>> -rt by converting it to raw spinlock.
>>    
> 
> Oh dear, I had forgotten about that.  I believe kvm_lock might have the 
> same assumption in a few places regarding clock.

I only found a problematic section in kvmclock_cpufreq_notifier. Didn't
see this during my tests as I have CPUFREQ disabled in my .config.

We may need something like this as converting kvm_lock would likely be
overkill:

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 36f54fb..971ee0d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4530,7 +4530,7 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va
 	struct cpufreq_freqs *freq = data;
 	struct kvm *kvm;
 	struct kvm_vcpu *vcpu;
-	int i, send_ipi = 0;
+	int i, me, send_ipi = 0;
 
 	/*
 	 * We allow guests to temporarily run on slowing clocks,
@@ -4583,9 +4583,11 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va
 		kvm_for_each_vcpu(i, vcpu, kvm) {
 			if (vcpu->cpu != freq->cpu)
 				continue;
+			me = get_cpu();
 			kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
-			if (vcpu->cpu != smp_processor_id())
+			if (vcpu->cpu != me)
 				send_ipi = 1;
+			put_cpu();
 		}
 	}
 	spin_unlock(&kvm_lock);

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock
  2011-02-07 11:35   ` Jan Kiszka
@ 2011-02-07 14:11     ` Zachary Amsden
  2011-02-07 15:00       ` Jan Kiszka
  0 siblings, 1 reply; 17+ messages in thread
From: Zachary Amsden @ 2011-02-07 14:11 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, Marcelo Tosatti, kvm, Linux Kernel Mailing List

On 02/07/2011 06:35 AM, Jan Kiszka wrote:
> On 2011-02-04 22:03, Zachary Amsden wrote:
>    
>> On 02/04/2011 04:49 AM, Jan Kiszka wrote:
>>      
>>> Code under this lock requires non-preemptibility. Ensure this also over
>>> -rt by converting it to raw spinlock.
>>>
>>>        
>> Oh dear, I had forgotten about that.  I believe kvm_lock might have the
>> same assumption in a few places regarding clock.
>>      
> I only found a problematic section in kvmclock_cpufreq_notifier. Didn't
> see this during my tests as I have CPUFREQ disabled in my .config.
>
> We may need something like this as converting kvm_lock would likely be
> overkill:
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 36f54fb..971ee0d 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4530,7 +4530,7 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va
>   	struct cpufreq_freqs *freq = data;
>   	struct kvm *kvm;
>   	struct kvm_vcpu *vcpu;
> -	int i, send_ipi = 0;
> +	int i, me, send_ipi = 0;
>
>   	/*
>   	 * We allow guests to temporarily run on slowing clocks,
> @@ -4583,9 +4583,11 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va
>   		kvm_for_each_vcpu(i, vcpu, kvm) {
>   			if (vcpu->cpu != freq->cpu)
>   				continue;
> +			me = get_cpu();
>   			kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
> -			if (vcpu->cpu != smp_processor_id())
> +			if (vcpu->cpu != me)
>   				send_ipi = 1;
> +			put_cpu();
>   		}
>   	}
>   	spin_unlock(&kvm_lock);
>
> Jan
>
>    

That looks like a good solution, and I do believe that is the only place 
the lock is used in that fashion - please add a comment though in the 
giant comment block above that preemption protection is needed for RT.  
Also, gcc should catch this, but moving the me variable into the 
kvm_for_each_vcpu loop should allow for better register allocation.

The only other thing I can think of is that RT lock preemption may break 
some of the CPU initialization semantics enforced by kvm_lock if you 
happen to get a hotplug event just as the module is loading.  That 
should be rare, but if it is indeed a bug, it would be nice to fix, it 
would be a panic for sure not to initialize VMX.

Cheers,

Zach

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock
  2011-02-07 14:11     ` Zachary Amsden
@ 2011-02-07 15:00       ` Jan Kiszka
  2011-02-07 15:15         ` Zachary Amsden
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Kiszka @ 2011-02-07 15:00 UTC (permalink / raw)
  To: Zachary Amsden
  Cc: Avi Kivity, Marcelo Tosatti, kvm, Linux Kernel Mailing List

On 2011-02-07 15:11, Zachary Amsden wrote:
> On 02/07/2011 06:35 AM, Jan Kiszka wrote:
>> On 2011-02-04 22:03, Zachary Amsden wrote:
>>    
>>> On 02/04/2011 04:49 AM, Jan Kiszka wrote:
>>>      
>>>> Code under this lock requires non-preemptibility. Ensure this also over
>>>> -rt by converting it to raw spinlock.
>>>>
>>>>        
>>> Oh dear, I had forgotten about that.  I believe kvm_lock might have the
>>> same assumption in a few places regarding clock.
>>>      
>> I only found a problematic section in kvmclock_cpufreq_notifier. Didn't
>> see this during my tests as I have CPUFREQ disabled in my .config.
>>
>> We may need something like this as converting kvm_lock would likely be
>> overkill:
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 36f54fb..971ee0d 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -4530,7 +4530,7 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va
>>   	struct cpufreq_freqs *freq = data;
>>   	struct kvm *kvm;
>>   	struct kvm_vcpu *vcpu;
>> -	int i, send_ipi = 0;
>> +	int i, me, send_ipi = 0;
>>
>>   	/*
>>   	 * We allow guests to temporarily run on slowing clocks,
>> @@ -4583,9 +4583,11 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va
>>   		kvm_for_each_vcpu(i, vcpu, kvm) {
>>   			if (vcpu->cpu != freq->cpu)
>>   				continue;
>> +			me = get_cpu();
>>   			kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
>> -			if (vcpu->cpu != smp_processor_id())
>> +			if (vcpu->cpu != me)
>>   				send_ipi = 1;
>> +			put_cpu();
>>   		}
>>   	}
>>   	spin_unlock(&kvm_lock);
>>
>> Jan
>>
>>    
> 
> That looks like a good solution, and I do believe that is the only place 
> the lock is used in that fashion - please add a comment though in the 
> giant comment block above that preemption protection is needed for RT.  
> Also, gcc should catch this, but moving the me variable into the 
> kvm_for_each_vcpu loop should allow for better register allocation.
> 
> The only other thing I can think of is that RT lock preemption may break 
> some of the CPU initialization semantics enforced by kvm_lock if you 
> happen to get a hotplug event just as the module is loading.  That 
> should be rare, but if it is indeed a bug, it would be nice to fix, it 
> would be a panic for sure not to initialize VMX.

Hmm, is a cpu hotplug notifier allowed to run sleepy code? Can't
imagine. So we already have a strong reason to convert kvm_lock to a
raw_spinlock which obsoletes the above workaround.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock
  2011-02-07 15:00       ` Jan Kiszka
@ 2011-02-07 15:15         ` Zachary Amsden
  2011-02-07 15:38           ` Jan Kiszka
  0 siblings, 1 reply; 17+ messages in thread
From: Zachary Amsden @ 2011-02-07 15:15 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, Marcelo Tosatti, kvm, Linux Kernel Mailing List

On 02/07/2011 10:00 AM, Jan Kiszka wrote:
> On 2011-02-07 15:11, Zachary Amsden wrote:
>    
>> On 02/07/2011 06:35 AM, Jan Kiszka wrote:
>>      
>>> On 2011-02-04 22:03, Zachary Amsden wrote:
>>>
>>>        
>>>> On 02/04/2011 04:49 AM, Jan Kiszka wrote:
>>>>
>>>>          
>>>>> Code under this lock requires non-preemptibility. Ensure this also over
>>>>> -rt by converting it to raw spinlock.
>>>>>
>>>>>
>>>>>            
>>>> Oh dear, I had forgotten about that.  I believe kvm_lock might have the
>>>> same assumption in a few places regarding clock.
>>>>
>>>>          
>>> I only found a problematic section in kvmclock_cpufreq_notifier. Didn't
>>> see this during my tests as I have CPUFREQ disabled in my .config.
>>>
>>> We may need something like this as converting kvm_lock would likely be
>>> overkill:
>>>
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 36f54fb..971ee0d 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -4530,7 +4530,7 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va
>>>    	struct cpufreq_freqs *freq = data;
>>>    	struct kvm *kvm;
>>>    	struct kvm_vcpu *vcpu;
>>> -	int i, send_ipi = 0;
>>> +	int i, me, send_ipi = 0;
>>>
>>>    	/*
>>>    	 * We allow guests to temporarily run on slowing clocks,
>>> @@ -4583,9 +4583,11 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va
>>>    		kvm_for_each_vcpu(i, vcpu, kvm) {
>>>    			if (vcpu->cpu != freq->cpu)
>>>    				continue;
>>> +			me = get_cpu();
>>>    			kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
>>> -			if (vcpu->cpu != smp_processor_id())
>>> +			if (vcpu->cpu != me)
>>>    				send_ipi = 1;
>>> +			put_cpu();
>>>    		}
>>>    	}
>>>    	spin_unlock(&kvm_lock);
>>>
>>> Jan
>>>
>>>
>>>        
>> That looks like a good solution, and I do believe that is the only place
>> the lock is used in that fashion - please add a comment though in the
>> giant comment block above that preemption protection is needed for RT.
>> Also, gcc should catch this, but moving the me variable into the
>> kvm_for_each_vcpu loop should allow for better register allocation.
>>
>> The only other thing I can think of is that RT lock preemption may break
>> some of the CPU initialization semantics enforced by kvm_lock if you
>> happen to get a hotplug event just as the module is loading.  That
>> should be rare, but if it is indeed a bug, it would be nice to fix, it
>> would be a panic for sure not to initialize VMX.
>>      
> Hmm, is a cpu hotplug notifier allowed to run sleepy code? Can't
> imagine. So we already have a strong reason to convert kvm_lock to a
> raw_spinlock which obsoletes the above workaround.
>    

I don't know as it is allowed to sleep, it doesn't call any sleeping 
functions to my knowledge.  What worries me in the RT case is that the 
spinlock acquired for hardware_enable might be preempted and run on 
another CPU, which obviously isn't what you want.

Zach

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock
  2011-02-07 15:15         ` Zachary Amsden
@ 2011-02-07 15:38           ` Jan Kiszka
  2011-02-07 15:52             ` Avi Kivity
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Kiszka @ 2011-02-07 15:38 UTC (permalink / raw)
  To: Zachary Amsden
  Cc: Avi Kivity, Marcelo Tosatti, kvm, Linux Kernel Mailing List

On 2011-02-07 16:15, Zachary Amsden wrote:
> On 02/07/2011 10:00 AM, Jan Kiszka wrote:
>> On 2011-02-07 15:11, Zachary Amsden wrote:
>>    
>>> On 02/07/2011 06:35 AM, Jan Kiszka wrote:
>>>      
>>>> On 2011-02-04 22:03, Zachary Amsden wrote:
>>>>
>>>>        
>>>>> On 02/04/2011 04:49 AM, Jan Kiszka wrote:
>>>>>
>>>>>          
>>>>>> Code under this lock requires non-preemptibility. Ensure this also over
>>>>>> -rt by converting it to raw spinlock.
>>>>>>
>>>>>>
>>>>>>            
>>>>> Oh dear, I had forgotten about that.  I believe kvm_lock might have the
>>>>> same assumption in a few places regarding clock.
>>>>>
>>>>>          
>>>> I only found a problematic section in kvmclock_cpufreq_notifier. Didn't
>>>> see this during my tests as I have CPUFREQ disabled in my .config.
>>>>
>>>> We may need something like this as converting kvm_lock would likely be
>>>> overkill:
>>>>
>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>> index 36f54fb..971ee0d 100644
>>>> --- a/arch/x86/kvm/x86.c
>>>> +++ b/arch/x86/kvm/x86.c
>>>> @@ -4530,7 +4530,7 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va
>>>>    	struct cpufreq_freqs *freq = data;
>>>>    	struct kvm *kvm;
>>>>    	struct kvm_vcpu *vcpu;
>>>> -	int i, send_ipi = 0;
>>>> +	int i, me, send_ipi = 0;
>>>>
>>>>    	/*
>>>>    	 * We allow guests to temporarily run on slowing clocks,
>>>> @@ -4583,9 +4583,11 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va
>>>>    		kvm_for_each_vcpu(i, vcpu, kvm) {
>>>>    			if (vcpu->cpu != freq->cpu)
>>>>    				continue;
>>>> +			me = get_cpu();
>>>>    			kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
>>>> -			if (vcpu->cpu != smp_processor_id())
>>>> +			if (vcpu->cpu != me)
>>>>    				send_ipi = 1;
>>>> +			put_cpu();
>>>>    		}
>>>>    	}
>>>>    	spin_unlock(&kvm_lock);
>>>>
>>>> Jan
>>>>
>>>>
>>>>        
>>> That looks like a good solution, and I do believe that is the only place
>>> the lock is used in that fashion - please add a comment though in the
>>> giant comment block above that preemption protection is needed for RT.
>>> Also, gcc should catch this, but moving the me variable into the
>>> kvm_for_each_vcpu loop should allow for better register allocation.
>>>
>>> The only other thing I can think of is that RT lock preemption may break
>>> some of the CPU initialization semantics enforced by kvm_lock if you
>>> happen to get a hotplug event just as the module is loading.  That
>>> should be rare, but if it is indeed a bug, it would be nice to fix, it
>>> would be a panic for sure not to initialize VMX.
>>>      
>> Hmm, is a cpu hotplug notifier allowed to run sleepy code? Can't
>> imagine. So we already have a strong reason to convert kvm_lock to a
>> raw_spinlock which obsoletes the above workaround.
>>    
> 
> I don't know as it is allowed to sleep, it doesn't call any sleeping 
> functions to my knowledge.  What worries me in the RT case is that the 
> spinlock acquired for hardware_enable might be preempted and run on 
> another CPU, which obviously isn't what you want.

I see now, there are calls to raw_smp_processor_id.

I think it's best to make this a raw lock. At this chance, some
read-only users of vm_list should be rcu'ified. Will have a look.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock
  2011-02-07 15:38           ` Jan Kiszka
@ 2011-02-07 15:52             ` Avi Kivity
  2011-02-07 15:58               ` Jan Kiszka
  0 siblings, 1 reply; 17+ messages in thread
From: Avi Kivity @ 2011-02-07 15:52 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Zachary Amsden, Marcelo Tosatti, kvm, Linux Kernel Mailing List

On 02/07/2011 05:38 PM, Jan Kiszka wrote:
> >
> >  I don't know as it is allowed to sleep, it doesn't call any sleeping
> >  functions to my knowledge.  What worries me in the RT case is that the
> >  spinlock acquired for hardware_enable might be preempted and run on
> >  another CPU, which obviously isn't what you want.
>
> I see now, there are calls to raw_smp_processor_id.
>
> I think it's best to make this a raw lock. At this chance, some
> read-only users of vm_list should be rcu'ified. Will have a look.

vm_list is rarely used, for either read or write.  I don't see the need 
to rcu it.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock
  2011-02-07 15:52             ` Avi Kivity
@ 2011-02-07 15:58               ` Jan Kiszka
  2011-02-07 16:26                 ` Avi Kivity
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Kiszka @ 2011-02-07 15:58 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Zachary Amsden, Marcelo Tosatti, kvm, Linux Kernel Mailing List

On 2011-02-07 16:52, Avi Kivity wrote:
> On 02/07/2011 05:38 PM, Jan Kiszka wrote:
>>>
>>>  I don't know as it is allowed to sleep, it doesn't call any sleeping
>>>  functions to my knowledge.  What worries me in the RT case is that the
>>>  spinlock acquired for hardware_enable might be preempted and run on
>>>  another CPU, which obviously isn't what you want.
>>
>> I see now, there are calls to raw_smp_processor_id.
>>
>> I think it's best to make this a raw lock. At this chance, some
>> read-only users of vm_list should be rcu'ified. Will have a look.
> 
> vm_list is rarely used, for either read or write.  I don't see the need 
> to rcu it.

Avoid that code under this lock expands the preempt-disabled period,
specifically under -rt, and specifically as the number of objects over
which we loop is user-defined.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock
  2011-02-07 15:58               ` Jan Kiszka
@ 2011-02-07 16:26                 ` Avi Kivity
  2011-02-07 16:59                   ` Jan Kiszka
  0 siblings, 1 reply; 17+ messages in thread
From: Avi Kivity @ 2011-02-07 16:26 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Zachary Amsden, Marcelo Tosatti, kvm, Linux Kernel Mailing List

On 02/07/2011 05:58 PM, Jan Kiszka wrote:
> On 2011-02-07 16:52, Avi Kivity wrote:
> >  On 02/07/2011 05:38 PM, Jan Kiszka wrote:
> >>>
> >>>   I don't know as it is allowed to sleep, it doesn't call any sleeping
> >>>   functions to my knowledge.  What worries me in the RT case is that the
> >>>   spinlock acquired for hardware_enable might be preempted and run on
> >>>   another CPU, which obviously isn't what you want.
> >>
> >>  I see now, there are calls to raw_smp_processor_id.
> >>
> >>  I think it's best to make this a raw lock. At this chance, some
> >>  read-only users of vm_list should be rcu'ified. Will have a look.
> >
> >  vm_list is rarely used, for either read or write.  I don't see the need
> >  to rcu it.
>
> Avoid that code under this lock expands the preempt-disabled period,
> specifically under -rt, and specifically as the number of objects over
> which we loop is user-defined.

Good point; even under non-rt.

(well, actually, cpufreq_notifier and kvm_arch_hardware_enable are 
already non preemptible, and the stats code should just go away?)

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock
  2011-02-07 16:26                 ` Avi Kivity
@ 2011-02-07 16:59                   ` Jan Kiszka
  2011-02-07 17:10                     ` Avi Kivity
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Kiszka @ 2011-02-07 16:59 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Zachary Amsden, Marcelo Tosatti, kvm, Linux Kernel Mailing List

On 2011-02-07 17:26, Avi Kivity wrote:
> On 02/07/2011 05:58 PM, Jan Kiszka wrote:
>> On 2011-02-07 16:52, Avi Kivity wrote:
>>>  On 02/07/2011 05:38 PM, Jan Kiszka wrote:
>>>>>
>>>>>   I don't know as it is allowed to sleep, it doesn't call any sleeping
>>>>>   functions to my knowledge.  What worries me in the RT case is that the
>>>>>   spinlock acquired for hardware_enable might be preempted and run on
>>>>>   another CPU, which obviously isn't what you want.
>>>>
>>>>  I see now, there are calls to raw_smp_processor_id.
>>>>
>>>>  I think it's best to make this a raw lock. At this chance, some
>>>>  read-only users of vm_list should be rcu'ified. Will have a look.
>>>
>>>  vm_list is rarely used, for either read or write.  I don't see the need
>>>  to rcu it.
>>
>> Avoid that code under this lock expands the preempt-disabled period,
>> specifically under -rt, and specifically as the number of objects over
>> which we loop is user-defined.
> 
> Good point; even under non-rt.
> 
> (well, actually, cpufreq_notifier and kvm_arch_hardware_enable are 
> already non preemptible, and the stats code should just go away?)

The stats code is trivial to convert, so it doesn't matter.

But what about mmu_shrink and its list_move_tail? How is this
synchronized against kvm_destroy_vm - already today?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock
  2011-02-07 16:59                   ` Jan Kiszka
@ 2011-02-07 17:10                     ` Avi Kivity
  2011-02-07 17:23                       ` Jan Kiszka
  0 siblings, 1 reply; 17+ messages in thread
From: Avi Kivity @ 2011-02-07 17:10 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Zachary Amsden, Marcelo Tosatti, kvm, Linux Kernel Mailing List

On 02/07/2011 06:59 PM, Jan Kiszka wrote:
> >
> >  (well, actually, cpufreq_notifier and kvm_arch_hardware_enable are
> >  already non preemptible, and the stats code should just go away?)
>
> The stats code is trivial to convert, so it doesn't matter.

Removal is easier.

> But what about mmu_shrink and its list_move_tail? How is this
> synchronized against kvm_destroy_vm - already today?

kvm_destroy_vm() takes kvm_lock.  If a vm is destroyed before 
mmu_shrink(), mmu_shrink() will never see it.  If we reach mmu_shrink() 
before kvm_destroy_vm(), the latter will wait until mmu_shrink() is done.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock
  2011-02-07 17:10                     ` Avi Kivity
@ 2011-02-07 17:23                       ` Jan Kiszka
  2011-02-08  9:15                         ` Avi Kivity
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Kiszka @ 2011-02-07 17:23 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Zachary Amsden, Marcelo Tosatti, kvm, Linux Kernel Mailing List

On 2011-02-07 18:10, Avi Kivity wrote:
> On 02/07/2011 06:59 PM, Jan Kiszka wrote:
>>>
>>>  (well, actually, cpufreq_notifier and kvm_arch_hardware_enable are
>>>  already non preemptible, and the stats code should just go away?)
>>
>> The stats code is trivial to convert, so it doesn't matter.
> 
> Removal is easier.

Is that stat interface no longer used?

> 
>> But what about mmu_shrink and its list_move_tail? How is this
>> synchronized against kvm_destroy_vm - already today?
> 
> kvm_destroy_vm() takes kvm_lock.  If a vm is destroyed before 
> mmu_shrink(), mmu_shrink() will never see it.  If we reach mmu_shrink() 
> before kvm_destroy_vm(), the latter will wait until mmu_shrink() is done.
> 

Ah, I was confused. Would require some more logic if we wanted to make
the loop lock-less, though.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock
  2011-02-07 17:23                       ` Jan Kiszka
@ 2011-02-08  9:15                         ` Avi Kivity
  2011-02-08  9:55                           ` Jan Kiszka
  0 siblings, 1 reply; 17+ messages in thread
From: Avi Kivity @ 2011-02-08  9:15 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Zachary Amsden, Marcelo Tosatti, kvm, Linux Kernel Mailing List

On 02/07/2011 07:23 PM, Jan Kiszka wrote:
> On 2011-02-07 18:10, Avi Kivity wrote:
> >  On 02/07/2011 06:59 PM, Jan Kiszka wrote:
> >>>
> >>>   (well, actually, cpufreq_notifier and kvm_arch_hardware_enable are
> >>>   already non preemptible, and the stats code should just go away?)
> >>
> >>  The stats code is trivial to convert, so it doesn't matter.
> >
> >  Removal is easier.
>
> Is that stat interface no longer used?

It's there for compatibility.  I'm itching to remove it.  See 
qemu-kvm.git/kvm/kvm_stat for the only known user, and for the 
replacement via tracepoints.

Tracepoints have marginally lower overhead when disabled, and somewhat 
higher overhead when enabled.  A disadvantage of tracepoints is that it 
is harder to associate an event with a vm when that event is triggered 
by a workqueue, but I don't think it matters in practice (kvm_stat 
doesn't even provide a per-vm breakdown).

> >
> >>  But what about mmu_shrink and its list_move_tail? How is this
> >>  synchronized against kvm_destroy_vm - already today?
> >
> >  kvm_destroy_vm() takes kvm_lock.  If a vm is destroyed before
> >  mmu_shrink(), mmu_shrink() will never see it.  If we reach mmu_shrink()
> >  before kvm_destroy_vm(), the latter will wait until mmu_shrink() is done.
> >
>
> Ah, I was confused. Would require some more logic if we wanted to make
> the loop lock-less, though.

Yes, the usual rcu_read_lock() / grab reference / rcu_read_unlock().

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock
  2011-02-08  9:15                         ` Avi Kivity
@ 2011-02-08  9:55                           ` Jan Kiszka
  2011-02-08  9:58                             ` Avi Kivity
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Kiszka @ 2011-02-08  9:55 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Zachary Amsden, Marcelo Tosatti, kvm, Linux Kernel Mailing List

On 2011-02-08 10:15, Avi Kivity wrote:
> On 02/07/2011 07:23 PM, Jan Kiszka wrote:
>> On 2011-02-07 18:10, Avi Kivity wrote:
>>>  On 02/07/2011 06:59 PM, Jan Kiszka wrote:
>>>>>
>>>>>   (well, actually, cpufreq_notifier and kvm_arch_hardware_enable are
>>>>>   already non preemptible, and the stats code should just go away?)
>>>>
>>>>  The stats code is trivial to convert, so it doesn't matter.
>>>
>>>  Removal is easier.
>>
>> Is that stat interface no longer used?
> 
> It's there for compatibility.  I'm itching to remove it.  See 
> qemu-kvm.git/kvm/kvm_stat for the only known user, and for the 
> replacement via tracepoints.

OK, but that will first take a grace period.

> 
> Tracepoints have marginally lower overhead when disabled, and somewhat 
> higher overhead when enabled.  A disadvantage of tracepoints is that it 
> is harder to associate an event with a vm when that event is triggered 
> by a workqueue, but I don't think it matters in practice (kvm_stat 
> doesn't even provide a per-vm breakdown).

What about using the perf infrastructure for this? Besides that perf can
reuse tracepoints, maybe there is even a more efficient way of added new
stat sources.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock
  2011-02-08  9:55                           ` Jan Kiszka
@ 2011-02-08  9:58                             ` Avi Kivity
  0 siblings, 0 replies; 17+ messages in thread
From: Avi Kivity @ 2011-02-08  9:58 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Zachary Amsden, Marcelo Tosatti, kvm, Linux Kernel Mailing List

On 02/08/2011 11:55 AM, Jan Kiszka wrote:
> >
> >  Tracepoints have marginally lower overhead when disabled, and somewhat
> >  higher overhead when enabled.  A disadvantage of tracepoints is that it
> >  is harder to associate an event with a vm when that event is triggered
> >  by a workqueue, but I don't think it matters in practice (kvm_stat
> >  doesn't even provide a per-vm breakdown).
>
> What about using the perf infrastructure for this? Besides that perf can
> reuse tracepoints, maybe there is even a more efficient way of added new
> stat sources.

We are using the perf infrastructure for this (using tracepoints).

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock
  2011-02-04  9:49 [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock Jan Kiszka
  2011-02-04 21:03 ` Zachary Amsden
@ 2011-02-10 10:40 ` Avi Kivity
  1 sibling, 0 replies; 17+ messages in thread
From: Avi Kivity @ 2011-02-10 10:40 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Marcelo Tosatti, kvm, Linux Kernel Mailing List, Zachary Amsden

On 02/04/2011 11:49 AM, Jan Kiszka wrote:
> Code under this lock requires non-preemptibility. Ensure this also over
> -rt by converting it to raw spinlock.

Applied, thanks.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2011-02-10 10:40 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-02-04  9:49 [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock Jan Kiszka
2011-02-04 21:03 ` Zachary Amsden
2011-02-07 11:35   ` Jan Kiszka
2011-02-07 14:11     ` Zachary Amsden
2011-02-07 15:00       ` Jan Kiszka
2011-02-07 15:15         ` Zachary Amsden
2011-02-07 15:38           ` Jan Kiszka
2011-02-07 15:52             ` Avi Kivity
2011-02-07 15:58               ` Jan Kiszka
2011-02-07 16:26                 ` Avi Kivity
2011-02-07 16:59                   ` Jan Kiszka
2011-02-07 17:10                     ` Avi Kivity
2011-02-07 17:23                       ` Jan Kiszka
2011-02-08  9:15                         ` Avi Kivity
2011-02-08  9:55                           ` Jan Kiszka
2011-02-08  9:58                             ` Avi Kivity
2011-02-10 10:40 ` Avi Kivity

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.