From: Anton Ivanov <anton.ivanov@kot-begemot.co.uk>
To: linux-um@lists.infradead.org, Richard Weinberger <richard@nod.at>
Subject: Re: [PATCH v3] Optimise TLB flush for kernel mm in UML
Date: Sat, 6 Oct 2018 22:04:08 +0100	[thread overview]
Message-ID: <83f296bd-feb1-5177-228e-f294aa22fa5f@kot-begemot.co.uk> (raw)
In-Reply-To: <1615399.BP76ARYnqk@blindfold>

On 06/10/2018 21:38, Richard Weinberger wrote:
> Anton,
>
> Am Donnerstag, 4. Oktober 2018, 19:25:10 CEST schrieb anton.ivanov@cambridgegreys.com:
>> From: Anton Ivanov <anton.ivanov@cambridgegreys.com>
>>
>> This patch introduces bulking of memory ranges passed to
>> mmap/munmap/mprotect instead of doing everything one page at a time.
>>
>> This is already being done for the userspace UML portion, this
>> adds a simplified version of it for the kernel mm.
>>
>> This results in a speed-up of 10% or more in some areas (sequential
>> disk read measured with dd, etc.).
> Nice!
> Do you have also data on how much less memory mappings get installed?


No proper statistics, I'm afraid. I had some debug printks early on, and
instead of single pages I was seeing a few hundred kilobytes at a time
being mapped in places. I can do a few trial runs with debug printks
re-enabled to collect proper stats.

>
>> Add a further speed-up by removing the mandatory TLB force flush
>> for a swapless kernel.
> It is also not entirely clear to me why swap is a problem here,
> can you please elaborate?

I asked this question on the list a while back.

One of the main remaining performance bugbears in UML, the one which
accounts for most of its "fame" for being slow, is the fact that a full
TLB flush is performed every time a fork happens in UML userspace. It is
also executed with force = 1.

You pointed me to an old commit, from the days when svn was still in use,
which fixed exactly that by introducing the force parameter.

I tested with force on and off, and the condition that commit was trying
to cure still stands. If swap is enabled, the TLB flush on fork/exec
needs force = 1. If, however, there is no swap in the system, the force
is not needed; everything works happily without it.

Why? I don't know. I do not fully understand some of that code yet.

>   
>> Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
>> ---
>>   arch/um/kernel/tlb.c | 201 +++++++++++++++++++++++++++++++++++++--------------
>>   1 file changed, 148 insertions(+), 53 deletions(-)
>>
>> diff --git a/arch/um/kernel/tlb.c b/arch/um/kernel/tlb.c
>> index 37508b190106..f754d43105fc 100644
>> --- a/arch/um/kernel/tlb.c
>> +++ b/arch/um/kernel/tlb.c
>> @@ -15,9 +15,15 @@
>>   #include <skas.h>
>>   #include <kern_util.h>
>>   
>> +#ifdef CONFIG_SWAP
>> +#define FORK_MM_FORCE 1
>> +#else
>> +#define FORK_MM_FORCE 0
>> +#endif
>> +
> Hmm, can't we detect at runtime whether a swap device is enabled or not?
> Depending on CONFIG_SWAP is a huge dependency.

I can try to hook into swapon/swapoff. I need to read that part of the code first.
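
Something like this might work as a runtime check in place of the
compile-time define (rough sketch; I still need to verify that
total_swap_pages is the right thing to look at here):

#include <linux/swap.h>

static inline int fork_mm_force(void)
{
#ifdef CONFIG_SWAP
	/* non-zero while at least one swap device is active;
	 * maintained by swapon/swapoff */
	return atomic_long_read(&total_swap_pages) > 0;
#else
	return 0;
#endif
}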

>
>>   struct host_vm_change {
>>   	struct host_vm_op {
>> -		enum { NONE, MMAP, MUNMAP, MPROTECT } type;
>> +		enum { HOST_NONE, HOST_MMAP, HOST_MUNMAP, HOST_MPROTECT } type;
>>   		union {
>>   			struct {
>>   				unsigned long addr;
>> @@ -43,14 +49,34 @@ struct host_vm_change {
>>   	int force;
>>   };
>>   
>> +struct kernel_vm_change {
> Maybe an union would fit better here?

I do not think so. I am trying to leverage the fact that there is only
one user of this function, and that the sequence is always

munmap, mmap, mprotect - in this order, with everything strictly
sequential.

If I did a union implementation similar to the one used by the userspace
mm (the hvc structure), each mprotect would result in a flush of
everything queued before it, as well as a drop in granularity to lots of
single-page operations.

If I accumulate changes like this, I can delay past at least one
mprotect. That ends up with larger contiguous areas being unmapped and
then mmapped in single operations.

IMHO, for the kernel TLB flush this optimization should work.
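
To illustrate the bulking, here is a simplified sketch of the munmap
case: an adjacent range just grows the pending entry, and only a
non-adjacent one forces a flush of what is already queued:

static void add_kern_munmap(unsigned long addr, unsigned long len,
			    struct kernel_vm_change *kvc)
{
	/* adjacent to the pending range - just extend it */
	if (kvc->munmap.active &&
	    kvc->munmap.addr + kvc->munmap.len == addr) {
		kvc->munmap.len += len;
		return;
	}
	/* non-adjacent - flush whatever is queued, start a new range */
	if (kvc->munmap.active)
		do_kern_ops(kvc);
	kvc->munmap.addr = addr;
	kvc->munmap.len = len;
	kvc->munmap.active = 1;
}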

>
>> +	struct {
>> +		unsigned long phys;
>> +		unsigned long virt;
>> +		unsigned long len;
>> +		unsigned int active;
>> +	} mmap;
>> +	struct {
>> +		unsigned long addr;
>> +		unsigned long len;
>> +		unsigned int active;
>> +	} munmap;
>> +	struct {
>> +		unsigned long addr;
>> +		unsigned long len;
>> +		unsigned int active;
>> +	} mprotect;
>> +};
>> +
>>   #define INIT_HVC(mm, force) \
>>   	((struct host_vm_change) \
>> -	 { .ops		= { { .type = NONE } },	\
>> +	 { .ops		= { { .type = HOST_NONE } },	\
>>   	   .id		= &mm->context.id, \
>>          	   .data	= NULL, \
>>   	   .index	= 0, \
>>   	   .force	= force })
>>   
>> +
>>   static void report_enomem(void)
>>   {
>>   	printk(KERN_ERR "UML ran out of memory on the host side! "
>> @@ -58,7 +84,7 @@ static void report_enomem(void)
>>   			"vm.max_map_count has been reached.\n");
>>   }
>>   
>> -static int do_ops(struct host_vm_change *hvc, int end,
>> +static int do_host_ops(struct host_vm_change *hvc, int end,
>>   		  int finished)
>>   {
>>   	struct host_vm_op *op;
>> @@ -67,22 +93,22 @@ static int do_ops(struct host_vm_change *hvc, int end,
>>   	for (i = 0; i < end && !ret; i++) {
>>   		op = &hvc->ops[i];
>>   		switch (op->type) {
>> -		case MMAP:
>> +		case HOST_MMAP:
>>   			ret = map(hvc->id, op->u.mmap.addr, op->u.mmap.len,
>>   				  op->u.mmap.prot, op->u.mmap.fd,
>>   				  op->u.mmap.offset, finished, &hvc->data);
>>   			break;
>> -		case MUNMAP:
>> +		case HOST_MUNMAP:
>>   			ret = unmap(hvc->id, op->u.munmap.addr,
>>   				    op->u.munmap.len, finished, &hvc->data);
>>   			break;
>> -		case MPROTECT:
>> +		case HOST_MPROTECT:
>>   			ret = protect(hvc->id, op->u.mprotect.addr,
>>   				      op->u.mprotect.len, op->u.mprotect.prot,
>>   				      finished, &hvc->data);
>>   			break;
>>   		default:
>> -			printk(KERN_ERR "Unknown op type %d in do_ops\n",
>> +			printk(KERN_ERR "Unknown op type %d in do_host_ops\n",
>>   			       op->type);
>>   			BUG();
>>   			break;
>> @@ -95,8 +121,32 @@ static int do_ops(struct host_vm_change *hvc, int end,
>>   	return ret;
>>   }
>>   
>> -static int add_mmap(unsigned long virt, unsigned long phys, unsigned long len,
>> -		    unsigned int prot, struct host_vm_change *hvc)
>> +static void do_kern_ops(struct kernel_vm_change *kvc)
>> +{
>> +	int err = 0;
>> +
>> +	if (kvc->munmap.active) {
>> +		err = os_unmap_memory((void *) kvc->munmap.addr,
>> +			kvc->munmap.len);
>> +		kvc->munmap.active = 0;
>> +		if (err < 0)
>> +			panic("munmap failed, errno = %d\n", -err);
>> +	}
>> +	if (kvc->mmap.active) {
>> +		map_memory(kvc->mmap.virt,
>> +			kvc->mmap.phys, kvc->mmap.len, 1, 1, 1);
> mmap can fail, please check for the return code.

The original code was not checking the return value in the kernel case
either. I agree a check should be added.
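
Something along the lines of the munmap branch above should do (sketch;
assuming map_memory() propagates a negative errno, which I will
double-check):

	if (kvc->mmap.active) {
		err = map_memory(kvc->mmap.virt, kvc->mmap.phys,
				 kvc->mmap.len, 1, 1, 1);
		kvc->mmap.active = 0;
		if (err < 0)
			panic("mmap failed, errno = %d\n", -err);
	}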

>
>> +		kvc->mmap.active = 0;
>> +	}
>> +	if (kvc->mprotect.active) {
>> +		os_protect_memory((void *) kvc->mprotect.addr,
>> +			kvc->mprotect.len, 1, 1, 1);
> Same.
>
> Thanks,
> //richard


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um
