* Re: [RFC][PATCH 1/5] [PATCH 1/5] kvm: register in task_struct
@ 2018-09-03 14:10 ` Nikita Leshenko
0 siblings, 0 replies; 13+ messages in thread
From: Nikita Leshenko @ 2018-09-03 14:10 UTC (permalink / raw)
To: fengguang.wu
Cc: akpm, linux-mm, fengguang.wu, dongx.peng, jingqi.liu, eddie.dong,
dave.hansen, ying.huang, bgregg, kvm, linux-kernel
On September 2, 2018 5:21:15 AM, fengguang.wu@intel.com wrote:
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 8b47507faab5..0c483720de8d 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3892,6 +3892,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
> if (type == KVM_EVENT_CREATE_VM) {
> add_uevent_var(env, "EVENT=create");
> kvm->userspace_pid = task_pid_nr(current);
> + current->kvm = kvm;
Is it OK to store `kvm` on the task_struct? What if the thread that
originally created the VM exits? From the documentation it seems
like a VM is associated with an address space and not a specific
thread, so maybe it should be stored on mm_struct?
From Documentation/virtual/kvm/api.txt:
Only run VM ioctls from the same process (address space) that was used
to create the VM.
-Nikita
> } else if (type == KVM_EVENT_DESTROY_VM) {
> add_uevent_var(env, "EVENT=destroy");
> }
> --
> 2.15.0
>
>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC][PATCH 1/5] [PATCH 1/5] kvm: register in task_struct
@ 2018-09-03 14:10 ` Nikita Leshenko
0 siblings, 0 replies; 13+ messages in thread
From: Nikita Leshenko @ 2018-09-03 14:10 UTC (permalink / raw)
Cc: akpm, linux-mm, fengguang.wu, dongx.peng, jingqi.liu, eddie.dong,
dave.hansen, ying.huang, bgregg, kvm, linux-kernel
On September 2, 2018 5:21:15 AM, fengguang.wu@intel.com wrote:
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 8b47507faab5..0c483720de8d 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3892,6 +3892,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
> if (type == KVM_EVENT_CREATE_VM) {
> add_uevent_var(env, "EVENT=create");
> kvm->userspace_pid = task_pid_nr(current);
> + current->kvm = kvm;
Is it OK to store `kvm` on the task_struct? What if the thread that
originally created the VM exits? From the documentation it seems
like a VM is associated with an address space and not a specific
thread, so maybe it should be stored on mm_struct?
From Documentation/virtual/kvm/api.txt:
Only run VM ioctls from the same process (address space) that was used
to create the VM.
-Nikita
> } else if (type == KVM_EVENT_DESTROY_VM) {
> add_uevent_var(env, "EVENT=destroy");
> }
> --
> 2.15.0
>
>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC][PATCH 1/5] [PATCH 1/5] kvm: register in task_struct
2018-09-03 14:10 ` Nikita Leshenko
(?)
@ 2018-09-03 16:03 ` Christian Borntraeger
2018-09-04 0:28 ` Fengguang Wu
-1 siblings, 1 reply; 13+ messages in thread
From: Christian Borntraeger @ 2018-09-03 16:03 UTC (permalink / raw)
To: Nikita Leshenko, fengguang.wu
Cc: akpm, linux-mm, dongx.peng, jingqi.liu, eddie.dong, dave.hansen,
ying.huang, bgregg, kvm, linux-kernel
On 09/03/2018 04:10 PM, Nikita Leshenko wrote:
> On September 2, 2018 5:21:15 AM, fengguang.wu@intel.com wrote:
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index 8b47507faab5..0c483720de8d 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -3892,6 +3892,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
>> if (type == KVM_EVENT_CREATE_VM) {
>> add_uevent_var(env, "EVENT=create");
>> kvm->userspace_pid = task_pid_nr(current);
>> + current->kvm = kvm;
>
> Is it OK to store `kvm` on the task_struct? What if the thread that
> originally created the VM exits? From the documentation it seems
> like a VM is associated with an address space and not a specific
> thread, so maybe it should be stored on mm_struct?
Yes, ioctls accessing the kvm can happen from all threads.
>
> From Documentation/virtual/kvm/api.txt:
> Only run VM ioctls from the same process (address space) that was used
> to create the VM.
>
> -Nikita
>> } else if (type == KVM_EVENT_DESTROY_VM) {
>> add_uevent_var(env, "EVENT=destroy");
>> }
>> --
>> 2.15.0
>>
>>
>>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC][PATCH 1/5] [PATCH 1/5] kvm: register in task_struct
2018-09-03 16:03 ` Christian Borntraeger
@ 2018-09-04 0:28 ` Fengguang Wu
2018-09-04 0:46 ` Fengguang Wu
0 siblings, 1 reply; 13+ messages in thread
From: Fengguang Wu @ 2018-09-04 0:28 UTC (permalink / raw)
To: Christian Borntraeger
Cc: Nikita Leshenko, akpm, linux-mm, dongx.peng, jingqi.liu,
eddie.dong, dave.hansen, ying.huang, bgregg, kvm, linux-kernel
Hi Christian and Nikita,
On Mon, Sep 03, 2018 at 06:03:49PM +0200, Christian Borntraeger wrote:
>
>
>On 09/03/2018 04:10 PM, Nikita Leshenko wrote:
>> On September 2, 2018 5:21:15 AM, fengguang.wu@intel.com wrote:
>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>> index 8b47507faab5..0c483720de8d 100644
>>> --- a/virt/kvm/kvm_main.c
>>> +++ b/virt/kvm/kvm_main.c
>>> @@ -3892,6 +3892,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
>>> if (type == KVM_EVENT_CREATE_VM) {
>>> add_uevent_var(env, "EVENT=create");
>>> kvm->userspace_pid = task_pid_nr(current);
>>> + current->kvm = kvm;
>>
>> Is it OK to store `kvm` on the task_struct? What if the thread that
>> originally created the VM exits? From the documentation it seems
>> like a VM is associated with an address space and not a specific
>> thread, so maybe it should be stored on mm_struct?
>
>Yes, ioctls accessing the kvm can happen from all threads.
Good point, thank you for the tips! I'll move kvm pointer to mm_struct.
>> From Documentation/virtual/kvm/api.txt:
>> Only run VM ioctls from the same process (address space) that was used
>> to create the VM.
>>
>> -Nikita
Regards,
Fengguang
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC][PATCH 1/5] [PATCH 1/5] kvm: register in task_struct
2018-09-04 0:28 ` Fengguang Wu
@ 2018-09-04 0:46 ` Fengguang Wu
2018-09-04 6:37 ` Nikita Leshenko
0 siblings, 1 reply; 13+ messages in thread
From: Fengguang Wu @ 2018-09-04 0:46 UTC (permalink / raw)
To: Christian Borntraeger
Cc: Nikita Leshenko, akpm, linux-mm, dongx.peng, jingqi.liu,
eddie.dong, dave.hansen, ying.huang, bgregg, kvm, linux-kernel
On Tue, Sep 04, 2018 at 08:28:18AM +0800, Fengguang Wu wrote:
>Hi Christian and Nikita,
>
>On Mon, Sep 03, 2018 at 06:03:49PM +0200, Christian Borntraeger wrote:
>>
>>
>>On 09/03/2018 04:10 PM, Nikita Leshenko wrote:
>>> On September 2, 2018 5:21:15 AM, fengguang.wu@intel.com wrote:
>>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>>> index 8b47507faab5..0c483720de8d 100644
>>>> --- a/virt/kvm/kvm_main.c
>>>> +++ b/virt/kvm/kvm_main.c
>>>> @@ -3892,6 +3892,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
>>>> if (type == KVM_EVENT_CREATE_VM) {
>>>> add_uevent_var(env, "EVENT=create");
>>>> kvm->userspace_pid = task_pid_nr(current);
>>>> + current->kvm = kvm;
>>>
>>> Is it OK to store `kvm` on the task_struct? What if the thread that
>>> originally created the VM exits? From the documentation it seems
>>> like a VM is associated with an address space and not a specific
>>> thread, so maybe it should be stored on mm_struct?
>>
>>Yes, ioctls accessing the kvm can happen from all threads.
>
>Good point, thank you for the tips! I'll move kvm pointer to mm_struct.
>
>>> From Documentation/virtual/kvm/api.txt:
>>> Only run VM ioctls from the same process (address space) that was used
>>> to create the VM.
>>>
>>> -Nikita
Here it goes:
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 99ce070e7dcb..27c5446f3deb 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -27,6 +27,7 @@ typedef int vm_fault_t;
struct address_space;
struct mem_cgroup;
struct hmm;
+struct kvm;
/*
* Each physical page in the system has a struct page associated with
@@ -489,10 +490,19 @@ struct mm_struct {
/* HMM needs to track a few things per mm */
struct hmm *hmm;
#endif
+#if IS_ENABLED(CONFIG_KVM)
+ struct kvm *kvm;
+#endif
} __randomize_layout;
extern struct mm_struct init_mm;
+#if IS_ENABLED(CONFIG_KVM)
+static inline struct kvm *mm_kvm(struct mm_struct *mm) { return mm->kvm; }
+#else
+static inline struct kvm *mm_kvm(struct mm_struct *mm) { return NULL; }
+#endif
+
static inline void mm_init_cpumask(struct mm_struct *mm)
{
#ifdef CONFIG_CPUMASK_OFFSTACK
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 0c483720de8d..dca6156a7b35 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3892,7 +3892,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
if (type == KVM_EVENT_CREATE_VM) {
add_uevent_var(env, "EVENT=create");
kvm->userspace_pid = task_pid_nr(current);
- current->kvm = kvm;
+ current->mm->kvm = kvm;
} else if (type == KVM_EVENT_DESTROY_VM) {
add_uevent_var(env, "EVENT=destroy");
}
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [RFC][PATCH 1/5] [PATCH 1/5] kvm: register in task_struct
2018-09-04 0:46 ` Fengguang Wu
@ 2018-09-04 6:37 ` Nikita Leshenko
2018-09-04 7:15 ` Fengguang Wu
0 siblings, 1 reply; 13+ messages in thread
From: Nikita Leshenko @ 2018-09-04 6:37 UTC (permalink / raw)
To: Fengguang Wu
Cc: Christian Borntraeger, akpm, linux-mm, dongx.peng, jingqi.liu,
eddie.dong, dave.hansen, ying.huang, bgregg, kvm, linux-kernel
On 4 Sep 2018, at 2:46, Fengguang Wu <fengguang.wu@intel.com> wrote:
>
> Here it goes:
>
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 99ce070e7dcb..27c5446f3deb 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -27,6 +27,7 @@ typedef int vm_fault_t;
> struct address_space;
> struct mem_cgroup;
> struct hmm;
> +struct kvm;
> /*
> * Each physical page in the system has a struct page associated with
> @@ -489,10 +490,19 @@ struct mm_struct {
> /* HMM needs to track a few things per mm */
> struct hmm *hmm;
> #endif
> +#if IS_ENABLED(CONFIG_KVM)
> + struct kvm *kvm;
> +#endif
> } __randomize_layout;
> extern struct mm_struct init_mm;
> +#if IS_ENABLED(CONFIG_KVM)
> +static inline struct kvm *mm_kvm(struct mm_struct *mm) { return mm->kvm; }
> +#else
> +static inline struct kvm *mm_kvm(struct mm_struct *mm) { return NULL; }
> +#endif
> +
> static inline void mm_init_cpumask(struct mm_struct *mm)
> {
> #ifdef CONFIG_CPUMASK_OFFSTACK
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 0c483720de8d..dca6156a7b35 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3892,7 +3892,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
> if (type == KVM_EVENT_CREATE_VM) {
> add_uevent_var(env, "EVENT=create");
> kvm->userspace_pid = task_pid_nr(current);
> - current->kvm = kvm;
> + current->mm->kvm = kvm;
I think you also need to reset kvm to NULL once the VM is
destroyed, otherwise it would point to dangling memory.
-Nikita
> } else if (type == KVM_EVENT_DESTROY_VM) {
> add_uevent_var(env, "EVENT=destroy");
> }
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC][PATCH 1/5] [PATCH 1/5] kvm: register in task_struct
2018-09-04 6:37 ` Nikita Leshenko
@ 2018-09-04 7:15 ` Fengguang Wu
2018-09-04 7:43 ` Christian Borntraeger
0 siblings, 1 reply; 13+ messages in thread
From: Fengguang Wu @ 2018-09-04 7:15 UTC (permalink / raw)
To: Nikita Leshenko
Cc: Christian Borntraeger, akpm, linux-mm, dongx.peng, jingqi.liu,
eddie.dong, dave.hansen, ying.huang, bgregg, kvm, linux-kernel
On Tue, Sep 04, 2018 at 08:37:03AM +0200, Nikita Leshenko wrote:
>On 4 Sep 2018, at 2:46, Fengguang Wu <fengguang.wu@intel.com> wrote:
>>
>> Here it goes:
>>
>> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
>> index 99ce070e7dcb..27c5446f3deb 100644
>> --- a/include/linux/mm_types.h
>> +++ b/include/linux/mm_types.h
>> @@ -27,6 +27,7 @@ typedef int vm_fault_t;
>> struct address_space;
>> struct mem_cgroup;
>> struct hmm;
>> +struct kvm;
>> /*
>> * Each physical page in the system has a struct page associated with
>> @@ -489,10 +490,19 @@ struct mm_struct {
>> /* HMM needs to track a few things per mm */
>> struct hmm *hmm;
>> #endif
>> +#if IS_ENABLED(CONFIG_KVM)
>> + struct kvm *kvm;
>> +#endif
>> } __randomize_layout;
>> extern struct mm_struct init_mm;
>> +#if IS_ENABLED(CONFIG_KVM)
>> +static inline struct kvm *mm_kvm(struct mm_struct *mm) { return mm->kvm; }
>> +#else
>> +static inline struct kvm *mm_kvm(struct mm_struct *mm) { return NULL; }
>> +#endif
>> +
>> static inline void mm_init_cpumask(struct mm_struct *mm)
>> {
>> #ifdef CONFIG_CPUMASK_OFFSTACK
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index 0c483720de8d..dca6156a7b35 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -3892,7 +3892,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
>> if (type == KVM_EVENT_CREATE_VM) {
>> add_uevent_var(env, "EVENT=create");
>> kvm->userspace_pid = task_pid_nr(current);
>> - current->kvm = kvm;
>> + current->mm->kvm = kvm;
>I think you also need to reset kvm to NULL once the VM is
>destroyed, otherwise it would point to dangling memory.
Good point! Here is the incremental patch:
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3894,6 +3894,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
kvm->userspace_pid = task_pid_nr(current);
current->mm->kvm = kvm;
} else if (type == KVM_EVENT_DESTROY_VM) {
+ current->mm->kvm = NULL;
add_uevent_var(env, "EVENT=destroy");
}
add_uevent_var(env, "PID=%d", kvm->userspace_pid);
Thanks,
Fengguang
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC][PATCH 1/5] [PATCH 1/5] kvm: register in task_struct
2018-09-04 7:15 ` Fengguang Wu
@ 2018-09-04 7:43 ` Christian Borntraeger
0 siblings, 0 replies; 13+ messages in thread
From: Christian Borntraeger @ 2018-09-04 7:43 UTC (permalink / raw)
To: Fengguang Wu, Nikita Leshenko
Cc: akpm, linux-mm, dongx.peng, jingqi.liu, eddie.dong, dave.hansen,
ying.huang, bgregg, kvm, linux-kernel
On 09/04/2018 09:15 AM, Fengguang Wu wrote:
> On Tue, Sep 04, 2018 at 08:37:03AM +0200, Nikita Leshenko wrote:
>> On 4 Sep 2018, at 2:46, Fengguang Wu <fengguang.wu@intel.com> wrote:
>>>
>>> Here it goes:
>>>
>>> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
>>> index 99ce070e7dcb..27c5446f3deb 100644
>>> --- a/include/linux/mm_types.h
>>> +++ b/include/linux/mm_types.h
>>> @@ -27,6 +27,7 @@ typedef int vm_fault_t;
>>> struct address_space;
>>> struct mem_cgroup;
>>> struct hmm;
>>> +struct kvm;
>>> /*
>>> * Each physical page in the system has a struct page associated with
>>> @@ -489,10 +490,19 @@ struct mm_struct {
>>> /* HMM needs to track a few things per mm */
>>> struct hmm *hmm;
>>> #endif
>>> +#if IS_ENABLED(CONFIG_KVM)
>>> + struct kvm *kvm;
>>> +#endif
>>> } __randomize_layout;
>>> extern struct mm_struct init_mm;
>>> +#if IS_ENABLED(CONFIG_KVM)
>>> +static inline struct kvm *mm_kvm(struct mm_struct *mm) { return mm->kvm; }
>>> +#else
>>> +static inline struct kvm *mm_kvm(struct mm_struct *mm) { return NULL; }
>>> +#endif
>>> +
>>> static inline void mm_init_cpumask(struct mm_struct *mm)
>>> {
>>> #ifdef CONFIG_CPUMASK_OFFSTACK
>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>> index 0c483720de8d..dca6156a7b35 100644
>>> --- a/virt/kvm/kvm_main.c
>>> +++ b/virt/kvm/kvm_main.c
>>> @@ -3892,7 +3892,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
>>> if (type == KVM_EVENT_CREATE_VM) {
>>> add_uevent_var(env, "EVENT=create");
>>> kvm->userspace_pid = task_pid_nr(current);
>>> - current->kvm = kvm;
>>> + current->mm->kvm = kvm;
>> I think you also need to reset kvm to NULL once the VM is
>> destroyed, otherwise it would point to dangling memory.
>
> Good point! Here is the incremental patch:
>
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3894,6 +3894,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
> kvm->userspace_pid = task_pid_nr(current);
> current->mm->kvm = kvm;
> } else if (type == KVM_EVENT_DESTROY_VM) {
> + current->mm->kvm = NULL;
> add_uevent_var(env, "EVENT=destroy");
> }
> add_uevent_var(env, "PID=%d", kvm->userspace_pid);
I think you should put both code snippets somewhere else. This has probably nothing to do
with the uevent. Instead this should go into kvm_destroy_vm and kvm_create_vm. Make sure
to take care of the error handling.
Can you point us to the original discussion about the why and what you are
trying to achieve?
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC][PATCH 1/5] [PATCH 1/5] kvm: register in task_struct
@ 2018-09-04 7:43 ` Christian Borntraeger
0 siblings, 0 replies; 13+ messages in thread
From: Christian Borntraeger @ 2018-09-04 7:43 UTC (permalink / raw)
To: Fengguang Wu, Nikita Leshenko
Cc: akpm, linux-mm, dongx.peng, jingqi.liu, eddie.dong, dave.hansen,
ying.huang, bgregg, kvm, linux-kernel
On 09/04/2018 09:15 AM, Fengguang Wu wrote:
> On Tue, Sep 04, 2018 at 08:37:03AM +0200, Nikita Leshenko wrote:
>> On 4 Sep 2018, at 2:46, Fengguang Wu <fengguang.wu@intel.com> wrote:
>>>
>>> Here it goes:
>>>
>>> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
>>> index 99ce070e7dcb..27c5446f3deb 100644
>>> --- a/include/linux/mm_types.h
>>> +++ b/include/linux/mm_types.h
>>> @@ -27,6 +27,7 @@ typedef int vm_fault_t;
>>> struct address_space;
>>> struct mem_cgroup;
>>> struct hmm;
>>> +struct kvm;
>>> /*
>>> * Each physical page in the system has a struct page associated with
>>> @@ -489,10 +490,19 @@ struct mm_struct {
>>> A A A A /* HMM needs to track a few things per mm */
>>> A A A A struct hmm *hmm;
>>> #endif
>>> +#if IS_ENABLED(CONFIG_KVM)
>>> +A A A struct kvm *kvm;
>>> +#endif
>>> } __randomize_layout;
>>> extern struct mm_struct init_mm;
>>> +#if IS_ENABLED(CONFIG_KVM)
>>> +static inline struct kvm *mm_kvm(struct mm_struct *mm) { return mm->kvm; }
>>> +#else
>>> +static inline struct kvm *mm_kvm(struct mm_struct *mm) { return NULL; }
>>> +#endif
>>> +
>>> static inline void mm_init_cpumask(struct mm_struct *mm)
>>> {
>>> #ifdef CONFIG_CPUMASK_OFFSTACK
>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>> index 0c483720de8d..dca6156a7b35 100644
>>> --- a/virt/kvm/kvm_main.c
>>> +++ b/virt/kvm/kvm_main.c
>>> @@ -3892,7 +3892,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
>>> A A A A if (type == KVM_EVENT_CREATE_VM) {
>>> A A A A A A A add_uevent_var(env, "EVENT=create");
>>> A A A A A A A kvm->userspace_pid = task_pid_nr(current);
>>> -A A A A A A A current->kvm = kvm;
>>> +A A A A A A A current->mm->kvm = kvm;
>> I think you also need to reset kvm to NULL once the VM is
>> destroyed, otherwise it would point to dangling memory.
>
> Good point! Here is the incremental patch:
>
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3894,6 +3894,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
> A A A A A A A A A A A A A A kvm->userspace_pid = task_pid_nr(current);
> A A A A A A A A A A A A A A current->mm->kvm = kvm;
> A A A A A A } else if (type == KVM_EVENT_DESTROY_VM) {
> +A A A A A A A A A A A A A A current->mm->kvm = NULL;
> A A A A A A A A A A A A A A add_uevent_var(env, "EVENT=destroy");
> A A A A A A }
> A A A A A A add_uevent_var(env, "PID=%d", kvm->userspace_pid);
I think you should put both code snippets somewhere else. This has probably nothing to do
with the uevent. Instead this should go into kvm_destroy_vm and kvm_create_vm. Make sure
to take care of the error handling.
Can you point us to the original discussion about the why and what you are
trying to achieve?
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC][PATCH 1/5] [PATCH 1/5] kvm: register in task_struct
2018-09-04 7:43 ` Christian Borntraeger
@ 2018-09-04 8:31 ` Fengguang Wu
-1 siblings, 0 replies; 13+ messages in thread
From: Fengguang Wu @ 2018-09-04 8:31 UTC (permalink / raw)
To: Christian Borntraeger
Cc: Nikita Leshenko, akpm, linux-mm, dongx.peng, jingqi.liu,
eddie.dong, dave.hansen, ying.huang, bgregg, kvm, linux-kernel
On Tue, Sep 04, 2018 at 09:43:50AM +0200, Christian Borntraeger wrote:
>
>
>On 09/04/2018 09:15 AM, Fengguang Wu wrote:
>> On Tue, Sep 04, 2018 at 08:37:03AM +0200, Nikita Leshenko wrote:
>>> On 4 Sep 2018, at 2:46, Fengguang Wu <fengguang.wu@intel.com> wrote:
>>>>
>>>> Here it goes:
>>>>
>>>> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
>>>> index 99ce070e7dcb..27c5446f3deb 100644
>>>> --- a/include/linux/mm_types.h
>>>> +++ b/include/linux/mm_types.h
>>>> @@ -27,6 +27,7 @@ typedef int vm_fault_t;
>>>> struct address_space;
>>>> struct mem_cgroup;
>>>> struct hmm;
>>>> +struct kvm;
>>>> /*
>>>> * Each physical page in the system has a struct page associated with
>>>> @@ -489,10 +490,19 @@ struct mm_struct {
>>>> /* HMM needs to track a few things per mm */
>>>> struct hmm *hmm;
>>>> #endif
>>>> +#if IS_ENABLED(CONFIG_KVM)
>>>> + struct kvm *kvm;
>>>> +#endif
>>>> } __randomize_layout;
>>>> extern struct mm_struct init_mm;
>>>> +#if IS_ENABLED(CONFIG_KVM)
>>>> +static inline struct kvm *mm_kvm(struct mm_struct *mm) { return mm->kvm; }
>>>> +#else
>>>> +static inline struct kvm *mm_kvm(struct mm_struct *mm) { return NULL; }
>>>> +#endif
>>>> +
>>>> static inline void mm_init_cpumask(struct mm_struct *mm)
>>>> {
>>>> #ifdef CONFIG_CPUMASK_OFFSTACK
>>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>>> index 0c483720de8d..dca6156a7b35 100644
>>>> --- a/virt/kvm/kvm_main.c
>>>> +++ b/virt/kvm/kvm_main.c
>>>> @@ -3892,7 +3892,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
>>>> if (type == KVM_EVENT_CREATE_VM) {
>>>> add_uevent_var(env, "EVENT=create");
>>>> kvm->userspace_pid = task_pid_nr(current);
>>>> - current->kvm = kvm;
>>>> + current->mm->kvm = kvm;
>>> I think you also need to reset kvm to NULL once the VM is
>>> destroyed, otherwise it would point to dangling memory.
>>
>> Good point! Here is the incremental patch:
>>
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -3894,6 +3894,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
>> kvm->userspace_pid = task_pid_nr(current);
>> current->mm->kvm = kvm;
>> } else if (type == KVM_EVENT_DESTROY_VM) {
>> + current->mm->kvm = NULL;
>> add_uevent_var(env, "EVENT=destroy");
>> }
>> add_uevent_var(env, "PID=%d", kvm->userspace_pid);
>
>I think you should put both code snippets somewhere else. This has probably nothing to do
>with the uevent. Instead this should go into kvm_destroy_vm and kvm_create_vm. Make sure
>to take care of the error handling.
OK. Will set the pointer late and reset it early like this. Since
there are several error conditions after kvm_create_vm(), it may be
more convenient to set it in kvm_dev_ioctl_create_vm(), when there are
no more errors to handle:
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -724,6 +724,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
struct mm_struct *mm = kvm->mm;
kvm_uevent_notify_change(KVM_EVENT_DESTROY_VM, kvm);
+ current->mm->kvm = NULL;
kvm_destroy_vm_debugfs(kvm);
kvm_arch_sync_events(kvm);
spin_lock(&kvm_lock);
@@ -3206,6 +3207,7 @@ static int kvm_dev_ioctl_create_vm(unsigned long type)
fput(file);
return -ENOMEM;
}
+ current->mm->kvm = kvm;
kvm_uevent_notify_change(KVM_EVENT_CREATE_VM, kvm);
fd_install(r, file);
>Can you point us to the original discussion about the why and what you are
>trying to achieve?
It's the initial RFC post. [PATCH 0] describes some background info.
Basically we're implementing /proc/PID/idle_bitmap for user space to
walk page tables and get "accessed" bits. Since VM's "accessed" bits
will be reflected in EPT (or AMD NPT), we'll need to walk EPT when
detected it is QEMU main process.
Thanks,
Fengguang
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC][PATCH 1/5] [PATCH 1/5] kvm: register in task_struct
@ 2018-09-04 8:31 ` Fengguang Wu
0 siblings, 0 replies; 13+ messages in thread
From: Fengguang Wu @ 2018-09-04 8:31 UTC (permalink / raw)
To: Christian Borntraeger
Cc: Nikita Leshenko, akpm, linux-mm, dongx.peng, jingqi.liu,
eddie.dong, dave.hansen, ying.huang, bgregg, kvm, linux-kernel
On Tue, Sep 04, 2018 at 09:43:50AM +0200, Christian Borntraeger wrote:
>
>
>On 09/04/2018 09:15 AM, Fengguang Wu wrote:
>> On Tue, Sep 04, 2018 at 08:37:03AM +0200, Nikita Leshenko wrote:
>>> On 4 Sep 2018, at 2:46, Fengguang Wu <fengguang.wu@intel.com> wrote:
>>>>
>>>> Here it goes:
>>>>
>>>> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
>>>> index 99ce070e7dcb..27c5446f3deb 100644
>>>> --- a/include/linux/mm_types.h
>>>> +++ b/include/linux/mm_types.h
>>>> @@ -27,6 +27,7 @@ typedef int vm_fault_t;
>>>> struct address_space;
>>>> struct mem_cgroup;
>>>> struct hmm;
>>>> +struct kvm;
>>>> /*
>>>> * Each physical page in the system has a struct page associated with
>>>> @@ -489,10 +490,19 @@ struct mm_struct {
>>>> A A A A /* HMM needs to track a few things per mm */
>>>> A A A A struct hmm *hmm;
>>>> #endif
>>>> +#if IS_ENABLED(CONFIG_KVM)
>>>> +A A A struct kvm *kvm;
>>>> +#endif
>>>> } __randomize_layout;
>>>> extern struct mm_struct init_mm;
>>>> +#if IS_ENABLED(CONFIG_KVM)
>>>> +static inline struct kvm *mm_kvm(struct mm_struct *mm) { return mm->kvm; }
>>>> +#else
>>>> +static inline struct kvm *mm_kvm(struct mm_struct *mm) { return NULL; }
>>>> +#endif
>>>> +
>>>> static inline void mm_init_cpumask(struct mm_struct *mm)
>>>> {
>>>> #ifdef CONFIG_CPUMASK_OFFSTACK
>>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>>> index 0c483720de8d..dca6156a7b35 100644
>>>> --- a/virt/kvm/kvm_main.c
>>>> +++ b/virt/kvm/kvm_main.c
>>>> @@ -3892,7 +3892,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
>>>> A A A A if (type == KVM_EVENT_CREATE_VM) {
>>>> A A A A A A A add_uevent_var(env, "EVENT=create");
>>>> A A A A A A A kvm->userspace_pid = task_pid_nr(current);
>>>> -A A A A A A A current->kvm = kvm;
>>>> +A A A A A A A current->mm->kvm = kvm;
>>> I think you also need to reset kvm to NULL once the VM is
>>> destroyed, otherwise it would point to dangling memory.
>>
>> Good point! Here is the incremental patch:
>>
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -3894,6 +3894,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
>> A A A A A A A A A A A A A A kvm->userspace_pid = task_pid_nr(current);
>> A A A A A A A A A A A A A A current->mm->kvm = kvm;
>> A A A A A A } else if (type == KVM_EVENT_DESTROY_VM) {
>> +A A A A A A A A A A A A A A current->mm->kvm = NULL;
>> A A A A A A A A A A A A A A add_uevent_var(env, "EVENT=destroy");
>> A A A A A A }
>> A A A A A A add_uevent_var(env, "PID=%d", kvm->userspace_pid);
>
>I think you should put both code snippets somewhere else. This has probably nothing to do
>with the uevent. Instead this should go into kvm_destroy_vm and kvm_create_vm. Make sure
>to take care of the error handling.
OK. Will set the pointer late and reset it early like this. Since
there are several error conditions after kvm_create_vm(), it may be
more convenient to set it in kvm_dev_ioctl_create_vm(), when there are
no more errors to handle:
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -724,6 +724,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
struct mm_struct *mm = kvm->mm;
kvm_uevent_notify_change(KVM_EVENT_DESTROY_VM, kvm);
+ current->mm->kvm = NULL;
kvm_destroy_vm_debugfs(kvm);
kvm_arch_sync_events(kvm);
spin_lock(&kvm_lock);
@@ -3206,6 +3207,7 @@ static int kvm_dev_ioctl_create_vm(unsigned long type)
fput(file);
return -ENOMEM;
}
+ current->mm->kvm = kvm;
kvm_uevent_notify_change(KVM_EVENT_CREATE_VM, kvm);
fd_install(r, file);
>Can you point us to the original discussion about the why and what you are
>trying to achieve?
It's the initial RFC post. [PATCH 0] describes some background info.
Basically we're implementing /proc/PID/idle_bitmap for user space to
walk page tables and get "accessed" bits. Since VM's "accessed" bits
will be reflected in EPT (or AMD NPT), we'll need to walk EPT when
detected it is QEMU main process.
Thanks,
Fengguang
^ permalink raw reply [flat|nested] 13+ messages in thread
* [RFC][PATCH 0/5] introduce /proc/PID/idle_bitmap
@ 2018-09-01 11:28 Fengguang Wu
2018-09-01 11:28 ` Fengguang Wu
0 siblings, 1 reply; 13+ messages in thread
From: Fengguang Wu @ 2018-09-01 11:28 UTC (permalink / raw)
To: Andrew Morton
Cc: Linux Memory Management List, kvm, Peng DongX, Liu Jingqi,
Dong Eddie, Dave Hansen, Huang Ying, Brendan Gregg, Fengguang Wu,
LKML
This new /proc/PID/idle_bitmap interface aims to complement the current global
/sys/kernel/mm/page_idle/bitmap. To enable efficient user space driven migrations.
The pros and cons will be discussed in changelog of "[PATCH] proc: introduce
/proc/PID/idle_bitmap". The driving force is to improve efficiency by 10+
times, so that hot/cold page tracking can be done in some regular intervals in
user space w/o too much overheads. Making it possible for some user space
daemon to do regular page migration between NUMA nodes of different speeds.
Note it's not about NUMA migration between local and remote nodes -- we already
have NUMA balancing for that. This interface and user space migration daemon
targets for NUMA nodes made of different mediums -- ie. DIMM and NVDIMM(*) --
with larger performance gaps. Basic policy will be "move hot pages to DIMM;
cold pages to NVDIMM".
Since NVDIMMs size can easily reach several Terabytes, working set tracking
efficiency will matter and be challeging.
(*) Here we use persistent memory (PMEM) w/o using its persistence.
Persistence is good to have, however it requires modifying applications.
Upcoming NVDIMM products like Intel Apache Pass (AEP) will be more cost and energy
effective than DRAM, but slower. Merely using it in form of NUMA memory node
could immediately benefit many workloads. For example, warm but not hot apps,
workloads with sharp hot/cold page distribution (good for migration), or relies
more on memory size than latency and bandwidth, and do more reads than writes.
This is an early RFC version to collect feedbacks. It's complete enough to demo
the basic ideas and performance, however not usable yet.
Regards,
Fengguang
^ permalink raw reply [flat|nested] 13+ messages in thread
* [RFC][PATCH 1/5] [PATCH 1/5] kvm: register in task_struct
2018-09-01 11:28 [RFC][PATCH 0/5] introduce /proc/PID/idle_bitmap Fengguang Wu
@ 2018-09-01 11:28 ` Fengguang Wu
0 siblings, 0 replies; 13+ messages in thread
From: Fengguang Wu @ 2018-09-01 11:28 UTC (permalink / raw)
To: Andrew Morton
Cc: Linux Memory Management List, Fengguang Wu, Peng DongX,
Liu Jingqi, Dong Eddie, Dave Hansen, Huang Ying, Brendan Gregg,
kvm, LKML
[-- Attachment #1: 0001-kvm-register-in-task_struct.patch --]
[-- Type: text/plain, Size: 1976 bytes --]
The added pointer will be used by the /proc/PID/idle_bitmap code to
quickly identify QEMU task and walk EPT/NPT accordingly. For virtual
machines, the A bits will be set in guest page tables and EPT/NPT,
rather than the QEMU task page table.
This costs 8 bytes in task_struct which could be wasteful for the
majority normal tasks. The alternative is to add a flag only, and
let it find the corresponding VM in kvm vm_list.
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---
include/linux/sched.h | 10 ++++++++++
virt/kvm/kvm_main.c | 1 +
2 files changed, 11 insertions(+)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 43731fe51c97..26c8549bbc28 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -38,6 +38,7 @@ struct cfs_rq;
struct fs_struct;
struct futex_pi_state;
struct io_context;
+struct kvm;
struct mempolicy;
struct nameidata;
struct nsproxy;
@@ -1179,6 +1180,9 @@ struct task_struct {
/* Used by LSM modules for access restriction: */
void *security;
#endif
+#if IS_ENABLED(CONFIG_KVM)
+ struct kvm *kvm;
+#endif
/*
* New fields for task_struct should be added above here, so that
@@ -1898,4 +1902,10 @@ static inline void rseq_syscall(struct pt_regs *regs)
#endif
+#if IS_ENABLED(CONFIG_KVM)
+static inline struct kvm *task_kvm(struct task_struct *t) { return t->kvm; }
+#else
+static inline struct kvm *task_kvm(struct task_struct *t) { return NULL; }
+#endif
+
#endif
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 8b47507faab5..0c483720de8d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3892,6 +3892,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
if (type == KVM_EVENT_CREATE_VM) {
add_uevent_var(env, "EVENT=create");
kvm->userspace_pid = task_pid_nr(current);
+ current->kvm = kvm;
} else if (type == KVM_EVENT_DESTROY_VM) {
add_uevent_var(env, "EVENT=destroy");
}
--
2.15.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [RFC][PATCH 1/5] [PATCH 1/5] kvm: register in task_struct
@ 2018-09-01 11:28 ` Fengguang Wu
0 siblings, 0 replies; 13+ messages in thread
From: Fengguang Wu @ 2018-09-01 11:28 UTC (permalink / raw)
To: Andrew Morton
Cc: Linux Memory Management List, Fengguang Wu, Peng DongX,
Liu Jingqi, Dong Eddie, Dave Hansen, Huang Ying, Brendan Gregg,
kvm, LKML
[-- Attachment #1: 0001-kvm-register-in-task_struct.patch --]
[-- Type: text/plain, Size: 1973 bytes --]
The added pointer will be used by the /proc/PID/idle_bitmap code to
quickly identify QEMU task and walk EPT/NPT accordingly. For virtual
machines, the A bits will be set in guest page tables and EPT/NPT,
rather than the QEMU task page table.
This costs 8 bytes in task_struct which could be wasteful for the
majority normal tasks. The alternative is to add a flag only, and
let it find the corresponding VM in kvm vm_list.
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---
include/linux/sched.h | 10 ++++++++++
virt/kvm/kvm_main.c | 1 +
2 files changed, 11 insertions(+)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 43731fe51c97..26c8549bbc28 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -38,6 +38,7 @@ struct cfs_rq;
struct fs_struct;
struct futex_pi_state;
struct io_context;
+struct kvm;
struct mempolicy;
struct nameidata;
struct nsproxy;
@@ -1179,6 +1180,9 @@ struct task_struct {
/* Used by LSM modules for access restriction: */
void *security;
#endif
+#if IS_ENABLED(CONFIG_KVM)
+ struct kvm *kvm;
+#endif
/*
* New fields for task_struct should be added above here, so that
@@ -1898,4 +1902,10 @@ static inline void rseq_syscall(struct pt_regs *regs)
#endif
+#if IS_ENABLED(CONFIG_KVM)
+static inline struct kvm *task_kvm(struct task_struct *t) { return t->kvm; }
+#else
+static inline struct kvm *task_kvm(struct task_struct *t) { return NULL; }
+#endif
+
#endif
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 8b47507faab5..0c483720de8d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3892,6 +3892,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm)
if (type == KVM_EVENT_CREATE_VM) {
add_uevent_var(env, "EVENT=create");
kvm->userspace_pid = task_pid_nr(current);
+ current->kvm = kvm;
} else if (type == KVM_EVENT_DESTROY_VM) {
add_uevent_var(env, "EVENT=destroy");
}
--
2.15.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
end of thread, other threads:[~2018-09-04 8:31 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-03 14:10 [RFC][PATCH 1/5] [PATCH 1/5] kvm: register in task_struct Nikita Leshenko
2018-09-03 14:10 ` Nikita Leshenko
2018-09-03 16:03 ` Christian Borntraeger
2018-09-04 0:28 ` Fengguang Wu
2018-09-04 0:46 ` Fengguang Wu
2018-09-04 6:37 ` Nikita Leshenko
2018-09-04 7:15 ` Fengguang Wu
2018-09-04 7:43 ` Christian Borntraeger
2018-09-04 7:43 ` Christian Borntraeger
2018-09-04 8:31 ` Fengguang Wu
2018-09-04 8:31 ` Fengguang Wu
-- strict thread matches above, loose matches on Subject: below --
2018-09-01 11:28 [RFC][PATCH 0/5] introduce /proc/PID/idle_bitmap Fengguang Wu
2018-09-01 11:28 ` [RFC][PATCH 1/5] [PATCH 1/5] kvm: register in task_struct Fengguang Wu
2018-09-01 11:28 ` Fengguang Wu
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.