* ARM cacheflush syscall with range that spans multiple vma
@ 2013-06-10  0:05 John Reiser
  2013-06-10  8:59 ` Russell King - ARM Linux
  0 siblings, 1 reply; 6+ messages in thread
From: John Reiser @ 2013-06-10  0:05 UTC (permalink / raw)
  To: linux-arm-kernel

Why does the ARM cacheflush syscall stop after the lowest vma
which intersects the user-requested range?  The range could
span more than one vma having contiguous addresses, such as
two files MAP_SHARED into adjacent pages; or even a region
that contains holes (pages not present.)

The code path in arch/arm/kernel/traps.c is:

arm_syscall():
        case NR(cacheflush):
                return do_cache_op(regs->ARM_r0, regs->ARM_r1, regs->ARM_r2);

do_cache_op() contains no loop for more than one vma:
        vma = find_vma(mm, start);
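
To illustrate the difference between a single lookup and a loop, here is a small userspace model. It is a sketch, not the kernel code: struct range, find_range(), cover_single() and cover_all() are hypothetical names, and the only kernel behaviour assumed is that find_vma() returns the first vma whose end lies above the given address.

```c
#include <stddef.h>

/* Hypothetical stand-in for a vma: the half-open range [start, end). */
struct range {
	unsigned long start;
	unsigned long end;
};

/*
 * Model of find_vma(): return the first range whose end is above addr,
 * or NULL. 'map' must be sorted by address and non-overlapping.
 */
const struct range *find_range(const struct range *map, size_t n,
			       unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr < map[i].end)
			return &map[i];
	return NULL;
}

/*
 * One lookup only: clip [start, end) to the first intersecting range
 * and return the number of bytes covered (0 if nothing intersects).
 */
unsigned long cover_single(const struct range *map, size_t n,
			   unsigned long start, unsigned long end)
{
	const struct range *r = find_range(map, n, start);

	if (!r || r->start >= end)
		return 0;
	if (start < r->start)
		start = r->start;
	if (end > r->end)
		end = r->end;
	return end - start;
}

/*
 * Looping variant: repeat the lookup from the end of each range, so
 * adjacent ranges and ranges separated by holes are all covered.
 */
unsigned long cover_all(const struct range *map, size_t n,
			unsigned long start, unsigned long end)
{
	unsigned long covered = 0;

	while (start < end) {
		const struct range *r = find_range(map, n, start);

		if (!r || r->start >= end)
			break;
		covered += cover_single(map, n, start, end);
		start = r->end;
	}
	return covered;
}
```

With ranges 0x1000-0x2000, 0x2000-0x3000 and 0x5000-0x6000, a request for 0x1800-0x5800 is cut short at 0x2000 by cover_single(), while cover_all() also reaches the adjacent range and the one beyond the hole.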

-- 

* ARM cacheflush syscall with range that spans multiple vma
  2013-06-10  0:05 ARM cacheflush syscall with range that spans multiple vma John Reiser
@ 2013-06-10  8:59 ` Russell King - ARM Linux
  2013-06-10  9:09   ` Will Deacon
  2013-06-11 10:11   ` Will Deacon
  0 siblings, 2 replies; 6+ messages in thread
From: Russell King - ARM Linux @ 2013-06-10  8:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Jun 09, 2013 at 05:05:24PM -0700, John Reiser wrote:
> Why does the ARM cacheflush syscall stop after the lowest vma
> which intersects the user-requested range?  The range could
> span more than one vma having contiguous addresses, such as
> two files MAP_SHARED into adjacent pages; or even a region
> that contains holes (pages not present.)

Because you're not supposed to use it on large ranges because it's
an expensive operation.

* ARM cacheflush syscall with range that spans multiple vma
  2013-06-10  8:59 ` Russell King - ARM Linux
@ 2013-06-10  9:09   ` Will Deacon
  2013-06-10 20:16     ` John Reiser
  2013-06-11 10:11   ` Will Deacon
  1 sibling, 1 reply; 6+ messages in thread
From: Will Deacon @ 2013-06-10  9:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jun 10, 2013 at 09:59:48AM +0100, Russell King - ARM Linux wrote:
> On Sun, Jun 09, 2013 at 05:05:24PM -0700, John Reiser wrote:
> > Why does the ARM cacheflush syscall stop after the lowest vma
> > which intersects the user-requested range?  The range could
> > span more than one vma having contiguous addresses, such as
> > two files MAP_SHARED into adjacent pages; or even a region
> > that contains holes (pages not present.)
> 
> Because you're not supposed to use it on large ranges because it's
> an expensive operation.

I posted some patches to address this recently. Obviously it's still
expensive, but it makes the syscall restartable so that you can't DoS the
system.

  git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git cacheflush

There's WIP code there for a new iovec-based syscall too.

Will

* ARM cacheflush syscall with range that spans multiple vma
  2013-06-10  9:09   ` Will Deacon
@ 2013-06-10 20:16     ` John Reiser
  2013-06-10 23:42       ` Russell King - ARM Linux
  0 siblings, 1 reply; 6+ messages in thread
From: John Reiser @ 2013-06-10 20:16 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/10/2013 02:09 AM -0700, Will Deacon wrote:
> On Mon, Jun 10, 2013 at 09:59:48AM +0100, Russell King - ARM Linux wrote:
>> On Sun, Jun 09, 2013 at 05:05:24PM -0700, John Reiser wrote:
>>> Why does the ARM cacheflush syscall stop after the lowest vma
>>> which intersects the user-requested range?  The range could
>>> span more than one vma having contiguous addresses, such as
>>> two files MAP_SHARED into adjacent pages; or even a region
>>> that contains holes (pages not present.)
>>
>> Because you're not supposed to use it on large ranges because it's
>> an expensive operation.
> 
> I posted some patches to address this recently. Obviously it's still
> expensive, but it makes the syscall restartable so that you can't DoS the
> system.
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git cacheflush
> 
> There's WIP code there for a new iovec-based syscall too.

Please merge those patches soon.

My "app" is user-mode execve() of a compressed ET_EXEC, so UPX must flush
all of the re-generated .text, which can be a megabyte or more.  Thus I flush
one page per syscall, or write all of .text to a temporary file
(achieves cache flush because DMA accesses only memory, not cache),
or heuristically flush by "sweeping" 1/2 MB of consecutive words (thus
generating deliberate collisions and evictions.)  Each of those sucks.

It is *EXTREMELY* discouraging that cacheflush() misbehaves so badly.
*PLEASE* return an error status when you decide not to honor the API!
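
The first workaround above (one page per syscall) might look roughly like the following. This is a minimal sketch under stated assumptions: flush_fn, flush_by_page(), count_calls() and arm_flush() are hypothetical names, the page size is passed in by the caller, and the ARM wrapper reuses the syscall number 0xf0002 from the test case in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: flush [start, end), return 0 on success. */
typedef int (*flush_fn)(uintptr_t start, uintptr_t end, void *ctx);

/*
 * Split [start, end) at page boundaries and call 'flush' once per piece.
 * Stops at the first failure and returns its status, so a refused flush
 * is reported instead of being ignored. page_size must be a power of two.
 */
int flush_by_page(uintptr_t start, uintptr_t end, uintptr_t page_size,
		  flush_fn flush, void *ctx)
{
	while (start < end) {
		uintptr_t next = (start | (page_size - 1)) + 1;
		int ret;

		/* Clip the last piece; also guards against address wraparound. */
		if (next > end || next < start)
			next = end;
		ret = flush(start, next, ctx);
		if (ret)
			return ret;
		start = next;
	}
	return 0;
}

/* Example callback that only counts how many pieces were requested. */
int count_calls(uintptr_t start, uintptr_t end, void *ctx)
{
	(void)start;
	(void)end;
	++*(size_t *)ctx;
	return 0;
}

#if defined(__arm__) && defined(__linux__)
#include <unistd.h>

#define NR_cacheflush	0xf0002	/* value taken from the test case in this thread */

/* Callback that issues the real syscall for one piece (32-bit ARM Linux only). */
int arm_flush(uintptr_t start, uintptr_t end, void *ctx)
{
	(void)ctx;
	return (int)syscall(NR_cacheflush, start, end, 0);
}
#endif
```

On 32-bit ARM Linux the call would be flush_by_page(start, end, page_size, arm_flush, NULL). The callback indirection is only there so that the splitting logic can be exercised on any machine.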

-- 

* ARM cacheflush syscall with range that spans multiple vma
  2013-06-10 20:16     ` John Reiser
@ 2013-06-10 23:42       ` Russell King - ARM Linux
  0 siblings, 0 replies; 6+ messages in thread
From: Russell King - ARM Linux @ 2013-06-10 23:42 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jun 10, 2013 at 01:16:57PM -0700, John Reiser wrote:
> On 06/10/2013 02:09 AM -0700, Will Deacon wrote:
> > On Mon, Jun 10, 2013 at 09:59:48AM +0100, Russell King - ARM Linux wrote:
> >> On Sun, Jun 09, 2013 at 05:05:24PM -0700, John Reiser wrote:
> >>> Why does the ARM cacheflush syscall stop after the lowest vma
> >>> which intersects the user-requested range?  The range could
> >>> span more than one vma having contiguous addresses, such as
> >>> two files MAP_SHARED into adjacent pages; or even a region
> >>> that contains holes (pages not present.)
> >>
> >> Because you're not supposed to use it on large ranges because it's
> >> an expensive operation.
> > 
> > I posted some patches to address this recently. Obviously it's still
> > expensive, but it makes the syscall restartable so that you can't DoS the
> > system.
> > 
> >   git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git cacheflush
> > 
> > There's WIP code there for a new iovec-based syscall too.
> 
> Please merge those patches soon.
> 
> My "app" is user-mode execve() of a compressed ET_EXEC, so UPX must flush
> all of the re-generated .text, which can be a megabyte or more.  Thus I flush
> one page per syscall, or write all of .text to a temporary file
> (achieves cache flush because DMA accesses only memory, not cache),
> or heuristically flush by "sweeping" 1/2 MB of consecutive words (thus
> generating deliberate collisions and evictions.)  Each of those sucks.
> 
> It is *EXTREMELY* discouraging that cacheflush() misbehaves so badly.
> *PLEASE* return an error status when you decide not to honor the API!

So what, you're arranging for your memory to exist as a set of
contiguous but separate mappings of one page each?  Surely not.

You should be able to cover a complete mapping in one go.  If the
function stops because a page is not present and _can_ be populated,
that is a bug.  If it stops because a page is not present and that
page can't be populated, then it's working as it should.

* ARM cacheflush syscall with range that spans multiple vma
  2013-06-10  8:59 ` Russell King - ARM Linux
  2013-06-10  9:09   ` Will Deacon
@ 2013-06-11 10:11   ` Will Deacon
  1 sibling, 0 replies; 6+ messages in thread
From: Will Deacon @ 2013-06-11 10:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jun 10, 2013 at 09:59:48AM +0100, Russell King - ARM Linux wrote:
> On Sun, Jun 09, 2013 at 05:05:24PM -0700, John Reiser wrote:
> > Why does the ARM cacheflush syscall stop after the lowest vma
> > which intersects the user-requested range?  The range could
> > span more than one vma having contiguous addresses, such as
> > two files MAP_SHARED into adjacent pages; or even a region
> > that contains holes (pages not present.)
> 
> Because you're not supposed to use it on large ranges because it's
> an expensive operation.

FWIW: here's a simple test case which can really affect responsiveness on my
TC2 (2GB of memory). It just creates a single VMA, doesn't bother faulting it
in, then tries to cacheflush the whole range. On a kernel with
CONFIG_PREEMPT_NONE=y, this effectively stalls the system (interrupts are
still taken) until the flush has completed.

Will

--->8

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BUFSIZE		0x70000000U	/* unsigned, to match the %u below */
#define NR_cacheflush	0xf0002

int main(void)
{
	int ret;
	/* One large VMA; the pages are never touched, so none are faulted in. */
	char *region = malloc(BUFSIZE);

	if (!region) {
		fprintf(stderr, "Failed to allocate %u-byte buffer\n", BUFSIZE);
		return EXIT_FAILURE;
	}

	/* Ask the kernel to flush the whole range in a single call. */
	ret = syscall(NR_cacheflush, region, region + BUFSIZE, 0);
	if (ret)
		printf("syscall returned %d\n", ret);

	return ret;
}
