* arm64 cache maintenance on read only address loops forever
@ 2014-02-26  4:59 Laura Abbott
  2014-02-26 13:55 ` Will Deacon
  2014-02-26 14:03 ` Catalin Marinas
  0 siblings, 2 replies; 6+ messages in thread
From: Laura Abbott @ 2014-02-26  4:59 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On arm64, set_pte_at currently write protects user ptes that are not 
dirty. The expected behavior is that the fault handler will fix this up 
on a write to the address. do_page_fault will not mark the fault as a 
write though if ESR has the CM (cache maintenance) bit set. This has the 
unfortunate side effect that if cache maintenance is performed on a user 
address that has not yet been marked as dirty, handle_mm_fault may 
return without actually adjusting the pte or returning an error. This 
means that the fault will be infinitely retried.
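
A minimal sketch of the check described above, written as a standalone C
model rather than the kernel's actual code. The bit positions follow the
ARMv8 data-abort syndrome encoding (WnR is bit 6, CM is bit 8); the macro
and function names are illustrative assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Data-abort syndrome bits in ESR (ARMv8 ISS encoding). */
#define ESR_WNR	(UINT32_C(1) << 6)	/* write-not-read */
#define ESR_CM	(UINT32_C(1) << 8)	/* fault raised by cache maintenance */

/*
 * Model of the decision: the fault is only handled as a write when WnR is
 * set and CM is clear. A cache maintenance fault is therefore handled as a
 * read, so a write-protected pte is left untouched.
 */
static bool fault_treated_as_write(uint32_t esr)
{
	return (esr & ESR_WNR) && !(esr & ESR_CM);
}
```

With this model, an ESR value with both WnR and CM set is not treated as a
write, which is the case that leaves the pte unchanged.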

Calling cache maintenance on an address that hasn't actually been 
written to isn't all that useful but looping forever seems like a poor 
result. It seems like the check in do_page_fault is too restrictive and 
we need to be able to fault in pages via cache maintenance.

Thoughts?

Thanks,
Laura

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

* arm64 cache maintenance on read only address loops forever
  2014-02-26  4:59 arm64 cache maintenance on read only address loops forever Laura Abbott
@ 2014-02-26 13:55 ` Will Deacon
  2014-02-26 21:40   ` Laura Abbott
  2014-02-26 14:03 ` Catalin Marinas
  1 sibling, 1 reply; 6+ messages in thread
From: Will Deacon @ 2014-02-26 13:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Feb 26, 2014 at 04:59:46AM +0000, Laura Abbott wrote:
> Hi,

Hi Laura,

> On arm64, set_pte_at currently write protects user ptes that are not 
> dirty. The expected behavior is that the fault handler will fix this up 
> on a write to the address. do_page_fault will not mark the fault as a 
> write though if ESR has the CM (cache maintenance) bit set. This has the 
> unfortunate side effect that if cache maintenance is performed on a user 
> address that has not yet been marked as dirty, handle_mm_fault may 
> return without actually adjusting the pte or returning an error. This 
> means that the fault will be infinitely retried.
> 
> Calling cache maintenance on an address that hasn't actually been 
> written to isn't all that useful but looping forever seems like a poor 
> result. It seems like the check in do_page_fault is too restrictive and 
> we need to be able to fault in pages via cache maintenance.

My understanding is that the EL0 cache maintenance instructions only require
read permission (note that DC ZVA is treated like a store and doesn't set
ESR.CM), so I'm failing to appreciate the problem here.
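
For reference, a hedged sketch of what an EL0 cache maintenance operation
looks like from user space. The helper name is hypothetical, and it assumes
the kernel has enabled these instructions for EL0. The pointer is const
because, per the understanding above, only read permission should be needed.

```c
/*
 * Clean one data cache line to the point of unification by virtual address.
 * On AArch64 this issues "dc cvau" followed by a barrier. On any other
 * architecture it compiles to a no-op so the sketch still builds.
 */
static inline void clean_dcache_line_cvau(const void *addr)
{
#if defined(__aarch64__)
	__asm__ volatile("dc cvau, %0\n\tdsb ish" : : "r"(addr) : "memory");
#else
	(void)addr;
#endif
}
```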

Do you have a small testcase I can play with?

Will

* arm64 cache maintenance on read only address loops forever
  2014-02-26  4:59 arm64 cache maintenance on read only address loops forever Laura Abbott
  2014-02-26 13:55 ` Will Deacon
@ 2014-02-26 14:03 ` Catalin Marinas
  2014-02-26 22:00   ` Laura Abbott
  1 sibling, 1 reply; 6+ messages in thread
From: Catalin Marinas @ 2014-02-26 14:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Feb 25, 2014 at 08:59:46PM -0800, Laura Abbott wrote:
> On arm64, set_pte_at currently write protects user ptes that are not
> dirty. The expected behavior is that the fault handler will fix this
> up on a write to the address. do_page_fault will not mark the fault
> as a write though if ESR has the CM (cache maintenance) bit set.
> This has the unfortunate side effect that if cache maintenance is
> performed on a user address that has not yet been marked as dirty,
> handle_mm_fault may return without actually adjusting the pte or
> returning an error. This means that the fault will be infinitely
> retried.
> 
> Calling cache maintenance on an address that hasn't actually been
> written to isn't all that useful but looping forever seems like a
> poor result. It seems like the check in do_page_fault is too
> restrictive and we need to be able to fault in pages via cache
> maintenance.

Which kernel are you using? We had a fix in this area, commit
db6f41063cbdb58b14846e600e6bc3f4e4c2e888 (arm64: mm: don't treat user
cache maintenance faults as writes).

-- 
Catalin

* arm64 cache maintenance on read only address loops forever
  2014-02-26 13:55 ` Will Deacon
@ 2014-02-26 21:40   ` Laura Abbott
  2014-02-27 18:15     ` Will Deacon
  0 siblings, 1 reply; 6+ messages in thread
From: Laura Abbott @ 2014-02-26 21:40 UTC (permalink / raw)
  To: linux-arm-kernel

On 2/26/2014 5:55 AM, Will Deacon wrote:
> On Wed, Feb 26, 2014 at 04:59:46AM +0000, Laura Abbott wrote:
>> Hi,
>
> Hi Laura,
>
>> On arm64, set_pte_at currently write protects user ptes that are not
>> dirty. The expected behavior is that the fault handler will fix this up
>> on a write to the address. do_page_fault will not mark the fault as a
>> write though if ESR has the CM (cache maintenance) bit set. This has the
>> unfortunate side effect that if cache maintenance is performed on a user
>> address that has not yet been marked as dirty, handle_mm_fault may
>> return without actually adjusting the pte or returning an error. This
>> means that the fault will be infinitely retried.
>>
>> Calling cache maintenance on an address that hasn't actually been
>> written to isn't all that useful but looping forever seems like a poor
>> result. It seems like the check in do_page_fault is too restrictive and
>> we need to be able to fault in pages via cache maintenance.
>
> My understanding is that the EL0 cache maintenance instructions only require
> read permission (note that DC ZVA is treated like a store and doesn't set
> ESR.CM), so I'm failing to appreciate the problem here.
>
> Do you have a small testcase I can play with?
>


You probably won't like the test case because it's breaking assumptions 
pretty badly. This uses 96f083d416c0d01687ed5b37074831f461838455 from
Catalin's devel branch to call __dma_inv_range on an mmaped user space
address. I see three possible outcomes:

1) The test while questionable may have some merit and we will be able 
to flush user space addresses using this API without causing a problem.

2) The test is bad. Instead of looping forever we will instead die with 
a fault.

3) The test is really really bad. Looping forever is your punishment.

Thanks,
Laura


Kernel Module
---

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/debugfs.h>
#include <linux/fs.h>
#include <linux/mm.h>
#include <asm/cacheflush.h>
#include <linux/miscdevice.h>
#include <linux/slab.h>

#define PAGE_NUM	256

struct these_pages {
	struct page *pages[PAGE_NUM];
	unsigned long addr;
};

static int cache_debug_open(struct inode *inode, struct file *file)
{
	struct these_pages *data;
	int i;

	data = kmalloc(sizeof(*data), GFP_KERNEL);
	if (!data)
		return -ENOMEM;

	for (i = 0; i < PAGE_NUM; i++) {
		data->pages[i] = alloc_pages(GFP_KERNEL, 0);
		if (!data->pages[i]) {
			/* Undo the partial allocation instead of leaking it. */
			while (i--)
				__free_pages(data->pages[i], 0);
			kfree(data);
			return -ENOMEM;
		}
	}

	file->private_data = data;

	return 0;
}


static int cache_debug_mmap_internal(struct file *file,
				     struct vm_area_struct *vma)
{
	struct these_pages *data;
	unsigned long addr = vma->vm_start;
	int i;
	int ret;

	data = file->private_data;
	vma->vm_flags |= VM_IO | VM_DONTEXPAND;

	for (i = 0; i < PAGE_NUM; i++, addr += PAGE_SIZE) {
		ret = vm_insert_page(vma, addr, data->pages[i]);
		if (ret) {
			pr_err(">>> fail %lx\n", addr);
		}
	}
	data->addr = vma->vm_start;

	return 0;
}

static long cache_debug_ioctl(struct file *filp, unsigned int cmd,
			      unsigned long arg)
{
	struct these_pages *data = filp->private_data;

	pr_err(">>> start %lx\n", data->addr);
	/* Deliberately breaks assumptions: the range is a user space address. */
	__dma_inv_range((void *)data->addr,
			(void *)data->addr + PAGE_NUM * PAGE_SIZE);
	pr_err(">>> end\n");

	return 0;
}

static const struct file_operations cache_debug_fops = {
	.owner = THIS_MODULE,
	.open = cache_debug_open,
	.mmap = cache_debug_mmap_internal,
	.unlocked_ioctl = cache_debug_ioctl,
};

static struct miscdevice test_misc = {
	.minor = MISC_DYNAMIC_MINOR,
	.name = "test_dev",
	.fops = &cache_debug_fops,
};

static int cache_debug_init(void)
{
	/* Propagate the real error code rather than flattening it to -EINVAL. */
	return misc_register(&test_misc);
}
module_init(cache_debug_init);

----
Userspace test
----

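#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

void test_test(void)
{
	int fd;
	void *base;

	fd = open("/dev/test_dev", O_RDWR);
	if (fd < 0) {
		printf("nope\n");
		exit(1);
	}

	/* 1 MiB covers the module's 256 pages, assuming 4 KiB pages. */
	base = mmap(0, 1024*1024, PROT_READ|PROT_WRITE,
		    MAP_SHARED, fd, 0);
	if (base == MAP_FAILED) {
		printf("mmap failed\n");
		exit(1);
	}

	/* The module ignores cmd and arg; any ioctl triggers the invalidate. */
	ioctl(fd, 2345, 234566);
}

int main(void)
{
	printf("start\n");
	test_test();
	printf("done\n");

	return 0;
}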

* arm64 cache maintenance on read only address loops forever
  2014-02-26 14:03 ` Catalin Marinas
@ 2014-02-26 22:00   ` Laura Abbott
  0 siblings, 0 replies; 6+ messages in thread
From: Laura Abbott @ 2014-02-26 22:00 UTC (permalink / raw)
  To: linux-arm-kernel

On 2/26/2014 6:03 AM, Catalin Marinas wrote:
> On Tue, Feb 25, 2014 at 08:59:46PM -0800, Laura Abbott wrote:
>> On arm64, set_pte_at currently write protects user ptes that are not
>> dirty. The expected behavior is that the fault handler will fix this
>> up on a write to the address. do_page_fault will not mark the fault
>> as a write though if ESR has the CM (cache maintenance) bit set.
>> This has the unfortunate side effect that if cache maintenance is
>> performed on a user address that has not yet been marked as dirty,
>> handle_mm_fault may return without actually adjusting the pte or
>> returning an error. This means that the fault will be infinitely
>> retried.
>>
>> Calling cache maintenance on an address that hasn't actually been
>> written to isn't all that useful but looping forever seems like a
>> poor result. It seems like the check in do_page_fault is too
>> restrictive and we need to be able to fault in pages via cache
>> maintenance.
>
> Which kernel are you using? We had a fix in this area, commit
> db6f41063cbdb58b14846e600e6bc3f4e4c2e888 (arm64: mm: don't treat user
> cache maintenance faults as writes).
>

I'm using a 3.10-based kernel with stable fixes pulled in.
db6f41063cbdb58b14846e600e6bc3f4e4c2e888 is present in the tree and
reverting it does not make a difference.

Thanks,
Laura


* arm64 cache maintenance on read only address loops forever
  2014-02-26 21:40   ` Laura Abbott
@ 2014-02-27 18:15     ` Will Deacon
  0 siblings, 0 replies; 6+ messages in thread
From: Will Deacon @ 2014-02-27 18:15 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Laura,

On Wed, Feb 26, 2014 at 09:40:48PM +0000, Laura Abbott wrote:
> On 2/26/2014 5:55 AM, Will Deacon wrote:
> > On Wed, Feb 26, 2014 at 04:59:46AM +0000, Laura Abbott wrote:
> >> Calling cache maintenance on an address that hasn't actually been
> >> written to isn't all that useful but looping forever seems like a poor
> >> result. It seems like the check in do_page_fault is too restrictive and
> >> we need to be able to fault in pages via cache maintenance.
> >
> > My understanding is that the EL0 cache maintenance instructions only require
> > read permission (note that DC ZVA is treated like a store and doesn't set
> > ESR.CM), so I'm failing to appreciate the problem here.
> >
> > Do you have a small testcase I can play with?
> >
> 
> 
> You probably won't like the test case because it's breaking assumptions 
> pretty badly. This uses 96f083d416c0d01687ed5b37074831f461838455 from
> Catalin's devel branch to call __dma_inv_range on an mmaped user space
> address. I see three possible outcomes:
> 
> 1) The test while questionable may have some merit and we will be able 
> to flush user space addresses using this API without causing a problem.
> 
> 2) The test is bad. Instead of looping forever we will instead die with 
> a fault.
> 
> 3) The test is really really bad. Looping forever is your punishment.

Ok, I spent some time getting your test working and, after managing to run
it, it promptly destroyed the flash on my FPGA. Once I managed to recover
that, the issue seems to be that __dma_inv_range doesn't have an exception
table entry for the dc ivac, mainly because it's written to be used only on
the kernel linear mapping. As such, __do_page_fault won't actually resolve
anything. If you enable CONFIG_DEBUG_VM, you'll get a SEGV at the cost of
searching the tables at fault time.
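
To illustrate the exception table point, here is a minimal standalone model
of a fixup lookup. It is a sketch under assumptions, not the kernel's
implementation: the struct and function names are hypothetical, and the
table is assumed to be sorted by instruction address.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry: a potentially faulting instruction and where to resume. */
struct ex_entry {
	uintptr_t insn;
	uintptr_t fixup;
};

/*
 * Binary search over a table sorted by insn. Returns the fixup address, or
 * 0 when the faulting PC has no entry. The dc ivac in __dma_inv_range is
 * in the "no entry" case, so there is nothing to branch to.
 */
static uintptr_t find_fixup(const struct ex_entry *tbl, size_t n, uintptr_t pc)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tbl[mid].insn == pc)
			return tbl[mid].fixup;
		if (tbl[mid].insn < pc)
			lo = mid + 1;
		else
			hi = mid;
	}
	return 0;
}
```

A lookup that returns 0 corresponds to the case where the fault cannot be
fixed up and has to be reported instead.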

So yeah, I'd say (2) or (3) depending on your .config ;)

Will
