All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: cache aliasing in dup_mmap
       [not found] <e06498070903061426o5875ad13hc6328aa0d3f08ed7@mail.gmail.com>
@ 2009-03-06 22:55 ` Russell King - ARM Linux
       [not found]   ` <e06498070903062020s22222594h7be318fb9f3fdaee@mail.gmail.com>
  0 siblings, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2009-03-06 22:55 UTC (permalink / raw)
  To: Steven Walter; +Cc: linux-arm-kernel, linux-kernel

On Fri, Mar 06, 2009 at 05:26:24PM -0500, Steven Walter wrote:
> I've been tracking down an instance of userspace data corruption, and I
> believe I have found a window during fork where data can be lost.  The
> corruption is occurring on an ARMv5 system with VIVT caches.  Here's the
> scenario in question.  Thread A is forking, Thread B is running in
> userspace:

With VIVT caches, you're missing a few things here:

> Thread A: flush_cache_mm (dup_mmap)

-- cache written back and invalidated

> Thread B: writes to a page in the above mm

-- cache written back and invalidated

> Thread A: pte_wrprotect the above page (copy_one_pte)

-- cache written back and invalidated

> Thread B: writes to the same page again
> 
> During thread B's second write, he'll take a fault and enter the do_wp_page
> case.  We'll end up calling copy_page, which notably uses the kernel virtual
> addresses for the old and new pages.  This means that the new page does not
> necessarily have the data from the first write.

Given the additional flushing I've mentioned above, where could the
problem be?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* cache aliasing in dup_mmap
       [not found]           ` <e06498070909290925q55777041q6f3ecfaf9847babb@mail.gmail.com>
@ 2009-09-29 16:30             ` Steven Walter
  2009-09-29 19:11               ` Russell King - ARM Linux
  0 siblings, 1 reply; 19+ messages in thread
From: Steven Walter @ 2009-09-29 16:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Mar 8, 2009 at 10:36 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Sun, Mar 08, 2009 at 12:10:29PM +0000, Russell King - ARM Linux wrote:
>> I don't think moving flush_cache_mm solves anything - the problem remains,
>> but with a different timing.
>
> On the other hand, this should solve the problem. ?I'm not 100% happy
> with this because we end up using the cache flush ops on the same memory
> several times.

Russell

It appears that this patch is neither in Linus' tree nor the ARM
master branch. ?Was a different fix committed to address the original
memory corruption issue?
--
-Steven Walter <stevenrwalter@gmail.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* cache aliasing in dup_mmap
  2009-09-29 16:30             ` Steven Walter
@ 2009-09-29 19:11               ` Russell King - ARM Linux
  2009-10-05 14:46                 ` Russell King - ARM Linux
  0 siblings, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2009-09-29 19:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Sep 29, 2009 at 12:30:27PM -0400, Steven Walter wrote:
> On Sun, Mar 8, 2009 at 10:36 AM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Sun, Mar 08, 2009 at 12:10:29PM +0000, Russell King - ARM Linux wrote:
> >> I don't think moving flush_cache_mm solves anything - the problem remains,
> >> but with a different timing.
> >
> > On the other hand, this should solve the problem. ?I'm not 100% happy
> > with this because we end up using the cache flush ops on the same memory
> > several times.
> 
> Russell
> 
> It appears that this patch is neither in Linus' tree nor the ARM
> master branch. ?Was a different fix committed to address the original
> memory corruption issue?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* cache aliasing in dup_mmap
  2009-09-29 19:11               ` Russell King - ARM Linux
@ 2009-10-05 14:46                 ` Russell King - ARM Linux
  2009-11-05 14:48                   ` Russell King - ARM Linux
  0 siblings, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2009-10-05 14:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Sep 29, 2009 at 08:11:08PM +0100, Russell King - ARM Linux wrote:
> On Tue, Sep 29, 2009 at 12:30:27PM -0400, Steven Walter wrote:
> > On Sun, Mar 8, 2009 at 10:36 AM, Russell King - ARM Linux
> > <linux@arm.linux.org.uk> wrote:
> > > On Sun, Mar 08, 2009 at 12:10:29PM +0000, Russell King - ARM Linux wrote:
> > >> I don't think moving flush_cache_mm solves anything - the problem remains,
> > >> but with a different timing.
> > >
> > > On the other hand, this should solve the problem. ?I'm not 100% happy
> > > with this because we end up using the cache flush ops on the same memory
> > > several times.
> > 
> > Russell
> > 
> > It appears that this patch is neither in Linus' tree nor the ARM
> > master branch. ?Was a different fix committed to address the original
> > memory corruption issue?
> 
> >From what I can see, there was no positive response to the actual patch
> (and no testing of the actual patch), so it doesn't get merged.
> 
> Changes don't get merged until they themselves have been tested.

I've not heard anything further.  I've thrown the patches into the git
tree under a separate branch now (top two):

http://ftp.arm.linux.org.uk/git/gitweb.cgi?p=linux-2.6-arm.git;a=shortlog;h=refs/heads/cup

I do feel that more work needs to be done to the 2nd patch though.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* cache aliasing in dup_mmap
  2009-10-05 14:46                 ` Russell King - ARM Linux
@ 2009-11-05 14:48                   ` Russell King - ARM Linux
  2009-11-13 14:49                     ` Steven Walter
  0 siblings, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2009-11-05 14:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Oct 05, 2009 at 03:46:23PM +0100, Russell King - ARM Linux wrote:
> On Tue, Sep 29, 2009 at 08:11:08PM +0100, Russell King - ARM Linux wrote:
> > On Tue, Sep 29, 2009 at 12:30:27PM -0400, Steven Walter wrote:
> > > On Sun, Mar 8, 2009 at 10:36 AM, Russell King - ARM Linux
> > > <linux@arm.linux.org.uk> wrote:
> > > > On Sun, Mar 08, 2009 at 12:10:29PM +0000, Russell King - ARM Linux wrote:
> > > >> I don't think moving flush_cache_mm solves anything - the problem remains,
> > > >> but with a different timing.
> > > >
> > > > On the other hand, this should solve the problem. ?I'm not 100% happy
> > > > with this because we end up using the cache flush ops on the same memory
> > > > several times.
> > > 
> > > Russell
> > > 
> > > It appears that this patch is neither in Linus' tree nor the ARM
> > > master branch. ?Was a different fix committed to address the original
> > > memory corruption issue?
> > 
> > >From what I can see, there was no positive response to the actual patch
> > (and no testing of the actual patch), so it doesn't get merged.
> > 
> > Changes don't get merged until they themselves have been tested.
> 
> I've not heard anything further.  I've thrown the patches into the git
> tree under a separate branch now (top two):
> 
> http://ftp.arm.linux.org.uk/git/gitweb.cgi?p=linux-2.6-arm.git;a=shortlog;h=refs/heads/cup
> 
> I do feel that more work needs to be done to the 2nd patch though.

Okay, enough.  This issue is NOT going to get fixed unless you (or
someone else who can reproduce your problem) responds.

No response, no fix.  Simple, really.  I give you two weeks to respond
on this issue so it can be progressed before the above branch is trashed.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* cache aliasing in dup_mmap
  2009-11-05 14:48                   ` Russell King - ARM Linux
@ 2009-11-13 14:49                     ` Steven Walter
  2009-11-16 16:30                       ` Russell King - ARM Linux
  0 siblings, 1 reply; 19+ messages in thread
From: Steven Walter @ 2009-11-13 14:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Nov 5, 2009 at 9:48 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
>> I've not heard anything further. ?I've thrown the patches into the git
>> tree under a separate branch now (top two):
>>
>> http://ftp.arm.linux.org.uk/git/gitweb.cgi?p=linux-2.6-arm.git;a=shortlog;h=refs/heads/cup
>>
>> I do feel that more work needs to be done to the 2nd patch though.
>
> Okay, enough. ?This issue is NOT going to get fixed unless you (or
> someone else who can reproduce your problem) responds.
>
> No response, no fix. ?Simple, really. ?I give you two weeks to respond
> on this issue so it can be progressed before the above branch is trashed.
>

I apologize for the late reply.  I have a build with your patches
running in a reboot loop (the easiest way for us to reproduce the
crash).  I should have a good indication of whether the patches fix
the issue by Monday.

Just by examining the changes, your patches seem to be a superset of
the change I made locally, which did fix the issue.  My fix was simply
to add flush_cache_page before cow_user_page in mm/memory.c.
-- 
-Steven Walter <stevenrwalter@gmail.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* cache aliasing in dup_mmap
  2009-11-13 14:49                     ` Steven Walter
@ 2009-11-16 16:30                       ` Russell King - ARM Linux
  2009-11-16 17:23                         ` Steven Walter
  0 siblings, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2009-11-16 16:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Nov 13, 2009 at 09:49:42AM -0500, Steven Walter wrote:
> On Thu, Nov 5, 2009 at 9:48 AM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> >> I've not heard anything further. ?I've thrown the patches into the git
> >> tree under a separate branch now (top two):
> >>
> >> http://ftp.arm.linux.org.uk/git/gitweb.cgi?p=linux-2.6-arm.git;a=shortlog;h=refs/heads/cup
> >>
> >> I do feel that more work needs to be done to the 2nd patch though.
> >
> > Okay, enough. ?This issue is NOT going to get fixed unless you (or
> > someone else who can reproduce your problem) responds.
> >
> > No response, no fix. ?Simple, really. ?I give you two weeks to respond
> > on this issue so it can be progressed before the above branch is trashed.
> >
> 
> I apologize for the late reply.  I have a build with your patches
> running in a reboot loop (the easiest way for us to reproduce the
> crash).  I should have a good indication of whether the patches fix
> the issue by Monday.

Any results?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* cache aliasing in dup_mmap
  2009-11-16 16:30                       ` Russell King - ARM Linux
@ 2009-11-16 17:23                         ` Steven Walter
  2009-11-16 17:28                           ` Russell King - ARM Linux
  2009-11-16 22:20                           ` Russell King - ARM Linux
  0 siblings, 2 replies; 19+ messages in thread
From: Steven Walter @ 2009-11-16 17:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Nov 16, 2009 at 11:30 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
>> I apologize for the late reply. ?I have a build with your patches
>> running in a reboot loop (the easiest way for us to reproduce the
>> crash). ?I should have a good indication of whether the patches fix
>> the issue by Monday.
>
> Any results?

Yes, SIGSEGV after 88 boots, with a crash that we previously
identified as caused by this cache corruption issue.  Attached are the
patches I used, as manually backported to 2.6.18.5.  Stock 2.6.18.5
would reliably crash in under 1000 reboots, whereas  with my ad-hoc
fix (also attached) we were able to run 8000 reboots before we ended
the test.
-- 
-Steven Walter <stevenrwalter@gmail.com>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: adhoc.patch
Type: text/x-patch
Size: 1087 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20091116/97483bb4/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-pass-vma-to-copy_user_highpage.patch
Type: text/x-patch
Size: 3658 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20091116/97483bb4/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-call-flush_cache_page-before-copies.patch
Type: text/x-patch
Size: 2078 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20091116/97483bb4/attachment-0002.bin>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* cache aliasing in dup_mmap
  2009-11-16 17:23                         ` Steven Walter
@ 2009-11-16 17:28                           ` Russell King - ARM Linux
  2009-11-16 21:50                             ` Steven Walter
  2009-11-16 22:20                           ` Russell King - ARM Linux
  1 sibling, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2009-11-16 17:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Nov 16, 2009 at 12:23:36PM -0500, Steven Walter wrote:
> On Mon, Nov 16, 2009 at 11:30 AM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> >> I apologize for the late reply. ?I have a build with your patches
> >> running in a reboot loop (the easiest way for us to reproduce the
> >> crash). ?I should have a good indication of whether the patches fix
> >> the issue by Monday.
> >
> > Any results?
> 
> Yes, SIGSEGV after 88 boots, with a crash that we previously
> identified as caused by this cache corruption issue.  Attached are the
> patches I used, as manually backported to 2.6.18.5.  Stock 2.6.18.5
> would reliably crash in under 1000 reboots, whereas  with my ad-hoc
> fix (also attached) we were able to run 8000 reboots before we ended
> the test.

What is copypage-swl.S ?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* cache aliasing in dup_mmap
  2009-11-16 17:28                           ` Russell King - ARM Linux
@ 2009-11-16 21:50                             ` Steven Walter
  0 siblings, 0 replies; 19+ messages in thread
From: Steven Walter @ 2009-11-16 21:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Nov 16, 2009 at 12:28 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
>> Yes, SIGSEGV after 88 boots, with a crash that we previously
>> identified as caused by this cache corruption issue. ?Attached are the
>> patches I used, as manually backported to 2.6.18.5. ?Stock 2.6.18.5
>> would reliably crash in under 1000 reboots, whereas ?with my ad-hoc
>> fix (also attached) we were able to run 8000 reboots before we ended
>> the test.
>
> What is copypage-swl.S ?

copypage-swl.S contains optimized copy_user / clear_user routines for
the custom ASIC we use.  Preimage attached.
-- 
-Steven Walter <stevenrwalter@gmail.com>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: copypage-swl.S
Type: application/octet-stream
Size: 2592 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20091116/24fb096c/attachment.obj>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* cache aliasing in dup_mmap
  2009-11-16 17:23                         ` Steven Walter
  2009-11-16 17:28                           ` Russell King - ARM Linux
@ 2009-11-16 22:20                           ` Russell King - ARM Linux
  2009-11-19  8:57                             ` Russell King - ARM Linux
  1 sibling, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2009-11-16 22:20 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Nov 16, 2009 at 12:23:36PM -0500, Steven Walter wrote:
> Yes, SIGSEGV after 88 boots, with a crash that we previously
> identified as caused by this cache corruption issue.  Attached are the
> patches I used, as manually backported to 2.6.18.5.  Stock 2.6.18.5
> would reliably crash in under 1000 reboots, whereas  with my ad-hoc
> fix (also attached) we were able to run 8000 reboots before we ended
> the test.

I can see no reason for the difference between the two.  The backport
looks fine.

The only difference I can see is:

- version which works

> diff --git a/src/mm/memory.c b/src/mm/memory.c
> index 0b7a668..d0bc8c1 100644

- version which doesn't

> diff --git a/src/mm/memory.c b/src/mm/memory.c
> index f508c60..69d8d32 100644

That first index number gives the sha1 hash of the git object storing
this file.  Since they're different, these two patches weren't applied
to the same file.

Could there be other changes in your tree which could be affecting the
behaviour in this area?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* cache aliasing in dup_mmap
  2009-11-16 22:20                           ` Russell King - ARM Linux
@ 2009-11-19  8:57                             ` Russell King - ARM Linux
  2009-11-19 18:08                               ` Steven Walter
  0 siblings, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2009-11-19  8:57 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Nov 16, 2009 at 10:20:16PM +0000, Russell King - ARM Linux wrote:
> On Mon, Nov 16, 2009 at 12:23:36PM -0500, Steven Walter wrote:
> > Yes, SIGSEGV after 88 boots, with a crash that we previously
> > identified as caused by this cache corruption issue.  Attached are the
> > patches I used, as manually backported to 2.6.18.5.  Stock 2.6.18.5
> > would reliably crash in under 1000 reboots, whereas  with my ad-hoc
> > fix (also attached) we were able to run 8000 reboots before we ended
> > the test.
> 
> I can see no reason for the difference between the two.  The backport
> looks fine.
> 
> The only difference I can see is:
> 
> - version which works
> 
> > diff --git a/src/mm/memory.c b/src/mm/memory.c
> > index 0b7a668..d0bc8c1 100644
> 
> - version which doesn't
> 
> > diff --git a/src/mm/memory.c b/src/mm/memory.c
> > index f508c60..69d8d32 100644
> 
> That first index number gives the sha1 hash of the git object storing
> this file.  Since they're different, these two patches weren't applied
> to the same file.
> 
> Could there be other changes in your tree which could be affecting the
> behaviour in this area?

I'm afraid that I have nothing further to suggest, and so I'm shelving
the patches.  That means this problem will remain unresolved in future
kernels.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* cache aliasing in dup_mmap
  2009-11-19  8:57                             ` Russell King - ARM Linux
@ 2009-11-19 18:08                               ` Steven Walter
  2009-11-23 20:31                                 ` Russell King - ARM Linux
  0 siblings, 1 reply; 19+ messages in thread
From: Steven Walter @ 2009-11-19 18:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Nov 19, 2009 at 3:57 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> I'm afraid that I have nothing further to suggest, and so I'm shelving
> the patches. ?That means this problem will remain unresolved in future
> kernels.

I had the tester restart the reboot-loop, and so far it's done ~1200
runs with no problem.  I'm thinking now that either I hosed something
up with the build initially, or else the tester didn't flash it
correctly.  1200 runs isn't enough to be conclusive, but it's a lot
better than 88.  I'll let you know when we have something more
definitive.
-- 
-Steven Walter <stevenrwalter@gmail.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* cache aliasing in dup_mmap
  2009-11-19 18:08                               ` Steven Walter
@ 2009-11-23 20:31                                 ` Russell King - ARM Linux
  2009-11-30  9:20                                   ` Russell King - ARM Linux
  0 siblings, 1 reply; 19+ messages in thread
From: Russell King - ARM Linux @ 2009-11-23 20:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Nov 19, 2009 at 01:08:01PM -0500, Steven Walter wrote:
> On Thu, Nov 19, 2009 at 3:57 AM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > I'm afraid that I have nothing further to suggest, and so I'm shelving
> > the patches. ?That means this problem will remain unresolved in future
> > kernels.
> 
> I had the tester restart the reboot-loop, and so far it's done ~1200
> runs with no problem.  I'm thinking now that either I hosed something
> up with the build initially, or else the tester didn't flash it
> correctly.  1200 runs isn't enough to be conclusive, but it's a lot
> better than 88.  I'll let you know when we have something more
> definitive.

I'd just like to keep this issue hot so it doesn't get forgotten about.
Are we at a conclusive point on this?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* cache aliasing in dup_mmap
  2009-11-23 20:31                                 ` Russell King - ARM Linux
@ 2009-11-30  9:20                                   ` Russell King - ARM Linux
  2009-11-30 19:09                                     ` Jamie Lokier
                                                       ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Russell King - ARM Linux @ 2009-11-30  9:20 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Nov 23, 2009 at 08:31:10PM +0000, Russell King - ARM Linux wrote:
> On Thu, Nov 19, 2009 at 01:08:01PM -0500, Steven Walter wrote:
> > On Thu, Nov 19, 2009 at 3:57 AM, Russell King - ARM Linux
> > <linux@arm.linux.org.uk> wrote:
> > > I'm afraid that I have nothing further to suggest, and so I'm shelving
> > > the patches. ?That means this problem will remain unresolved in future
> > > kernels.
> > 
> > I had the tester restart the reboot-loop, and so far it's done ~1200
> > runs with no problem.  I'm thinking now that either I hosed something
> > up with the build initially, or else the tester didn't flash it
> > correctly.  1200 runs isn't enough to be conclusive, but it's a lot
> > better than 88.  I'll let you know when we have something more
> > definitive.
> 
> I'd just like to keep this issue hot so it doesn't get forgotten about.
> Are we at a conclusive point on this?

Today is decision day.  Do we go with these two patches or not?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* cache aliasing in dup_mmap
  2009-11-30  9:20                                   ` Russell King - ARM Linux
@ 2009-11-30 19:09                                     ` Jamie Lokier
  2009-11-30 19:23                                       ` Russell King - ARM Linux
  2009-12-01 18:08                                     ` Russell King - ARM Linux
  2009-12-08 22:17                                     ` Steven Walter
  2 siblings, 1 reply; 19+ messages in thread
From: Jamie Lokier @ 2009-11-30 19:09 UTC (permalink / raw)
  To: linux-arm-kernel

Russell King - ARM Linux wrote:
> On Mon, Nov 23, 2009 at 08:31:10PM +0000, Russell King - ARM Linux wrote:
> > On Thu, Nov 19, 2009 at 01:08:01PM -0500, Steven Walter wrote:
> > > On Thu, Nov 19, 2009 at 3:57 AM, Russell King - ARM Linux
> > > <linux@arm.linux.org.uk> wrote:
> > > > I'm afraid that I have nothing further to suggest, and so I'm shelving
> > > > the patches. ?That means this problem will remain unresolved in future
> > > > kernels.
> > > 
> > > I had the tester restart the reboot-loop, and so far it's done ~1200
> > > runs with no problem.  I'm thinking now that either I hosed something
> > > up with the build initially, or else the tester didn't flash it
> > > correctly.  1200 runs isn't enough to be conclusive, but it's a lot
> > > better than 88.  I'll let you know when we have something more
> > > definitive.
> > 
> > I'd just like to keep this issue hot so it doesn't get forgotten about.
> > Are we at a conclusive point on this?
> 
> Today is decision day.  Do we go with these two patches or not?

Way back when this thread started, I had a think about the proposed
fix and thought it was insufficient in some cases.  That is, the race
could still occur but not as often.  But it was hard to explain, and I
don't think I did so adequately.

The patches are an improvement.  That's a good enough reason to apply
them, but I'm grateful this thread keeps popping to the top of my
inbox to remind me to have a think about the corner cases again,
because I suspect the issue is not fully closed.

-- Jamie

^ permalink raw reply	[flat|nested] 19+ messages in thread

* cache aliasing in dup_mmap
  2009-11-30 19:09                                     ` Jamie Lokier
@ 2009-11-30 19:23                                       ` Russell King - ARM Linux
  0 siblings, 0 replies; 19+ messages in thread
From: Russell King - ARM Linux @ 2009-11-30 19:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Nov 30, 2009 at 07:09:50PM +0000, Jamie Lokier wrote:
> Russell King - ARM Linux wrote:
> > On Mon, Nov 23, 2009 at 08:31:10PM +0000, Russell King - ARM Linux wrote:
> > > On Thu, Nov 19, 2009 at 01:08:01PM -0500, Steven Walter wrote:
> > > > On Thu, Nov 19, 2009 at 3:57 AM, Russell King - ARM Linux
> > > > <linux@arm.linux.org.uk> wrote:
> > > > > I'm afraid that I have nothing further to suggest, and so I'm shelving
> > > > > the patches. ?That means this problem will remain unresolved in future
> > > > > kernels.
> > > > 
> > > > I had the tester restart the reboot-loop, and so far it's done ~1200
> > > > runs with no problem.  I'm thinking now that either I hosed something
> > > > up with the build initially, or else the tester didn't flash it
> > > > correctly.  1200 runs isn't enough to be conclusive, but it's a lot
> > > > better than 88.  I'll let you know when we have something more
> > > > definitive.
> > > 
> > > I'd just like to keep this issue hot so it doesn't get forgotten about.
> > > Are we at a conclusive point on this?
> > 
> > Today is decision day.  Do we go with these two patches or not?
> 
> Way back when this thread started, I had a think about the proposed
> fix and thought it was insufficient in some cases.  That is, the race
> could still occur but not as often.  But it was hard to explain, and I
> don't think I did so adequately.

The only hole I can see would be if we context switched:

        kto = kmap_atomic(to, KM_USER0);
        kfrom = kmap_atomic(from, KM_USER1);
+       flush_cache_page(vma, vaddr, page_to_pfn(from));

	/* HERE */

        xsc3_mc_copy_user_page(kto, kfrom);
        kunmap_atomic(kfrom, KM_USER1);
        kunmap_atomic(kto, KM_USER0);

so that we were no longer reading data from userspace.  Luckily,
kmap_atomic() prevents preemptions occuring, so the above code should
be regarded as being entirely single-threaded - we won't get any context
switching all the time that 'to' is kmapped.

And it doesn't apply to SMP systems because there are no public SMP
systems before ARMv6, and SMP systems are required to be PIPT.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* cache aliasing in dup_mmap
  2009-11-30  9:20                                   ` Russell King - ARM Linux
  2009-11-30 19:09                                     ` Jamie Lokier
@ 2009-12-01 18:08                                     ` Russell King - ARM Linux
  2009-12-08 22:17                                     ` Steven Walter
  2 siblings, 0 replies; 19+ messages in thread
From: Russell King - ARM Linux @ 2009-12-01 18:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Nov 30, 2009 at 09:20:22AM +0000, Russell King - ARM Linux wrote:
> On Mon, Nov 23, 2009 at 08:31:10PM +0000, Russell King - ARM Linux wrote:
> > On Thu, Nov 19, 2009 at 01:08:01PM -0500, Steven Walter wrote:
> > > On Thu, Nov 19, 2009 at 3:57 AM, Russell King - ARM Linux
> > > <linux@arm.linux.org.uk> wrote:
> > > > I'm afraid that I have nothing further to suggest, and so I'm shelving
> > > > the patches. ?That means this problem will remain unresolved in future
> > > > kernels.
> > > 
> > > I had the tester restart the reboot-loop, and so far it's done ~1200
> > > runs with no problem.  I'm thinking now that either I hosed something
> > > up with the build initially, or else the tester didn't flash it
> > > correctly.  1200 runs isn't enough to be conclusive, but it's a lot
> > > better than 88.  I'll let you know when we have something more
> > > definitive.
> > 
> > I'd just like to keep this issue hot so it doesn't get forgotten about.
> > Are we at a conclusive point on this?
> 
> Today is decision day.  Do we go with these two patches or not?

Right, I've had enough of being patient and trying to get responses.  I
will not be merging these fixes for 2.6.33.  What's another three months
or so when it's already taken nine months to get this far already?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* cache aliasing in dup_mmap
  2009-11-30  9:20                                   ` Russell King - ARM Linux
  2009-11-30 19:09                                     ` Jamie Lokier
  2009-12-01 18:08                                     ` Russell King - ARM Linux
@ 2009-12-08 22:17                                     ` Steven Walter
  2 siblings, 0 replies; 19+ messages in thread
From: Steven Walter @ 2009-12-08 22:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Nov 30, 2009 at 4:20 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Mon, Nov 23, 2009 at 08:31:10PM +0000, Russell King - ARM Linux wrote:
>> I'd just like to keep this issue hot so it doesn't get forgotten about.
>> Are we at a conclusive point on this?
>
> Today is decision day. ?Do we go with these two patches or not?

Okay, results are in, albeit later than advertised.  We had ~2600 runs
on the patched kernel and it appears that the cache issue is fixed.
-- 
-Steven Walter <stevenrwalter@gmail.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2009-12-08 22:17 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <e06498070903061426o5875ad13hc6328aa0d3f08ed7@mail.gmail.com>
2009-03-06 22:55 ` cache aliasing in dup_mmap Russell King - ARM Linux
     [not found]   ` <e06498070903062020s22222594h7be318fb9f3fdaee@mail.gmail.com>
     [not found]     ` <ca992f110903080105ye955070t260e6a4334186986@mail.gmail.com>
     [not found]       ` <20090308121029.GA21277@n2100.arm.linux.org.uk>
     [not found]         ` <20090308143607.GA27349@n2100.arm.linux.org.uk>
     [not found]           ` <e06498070909290925q55777041q6f3ecfaf9847babb@mail.gmail.com>
2009-09-29 16:30             ` Steven Walter
2009-09-29 19:11               ` Russell King - ARM Linux
2009-10-05 14:46                 ` Russell King - ARM Linux
2009-11-05 14:48                   ` Russell King - ARM Linux
2009-11-13 14:49                     ` Steven Walter
2009-11-16 16:30                       ` Russell King - ARM Linux
2009-11-16 17:23                         ` Steven Walter
2009-11-16 17:28                           ` Russell King - ARM Linux
2009-11-16 21:50                             ` Steven Walter
2009-11-16 22:20                           ` Russell King - ARM Linux
2009-11-19  8:57                             ` Russell King - ARM Linux
2009-11-19 18:08                               ` Steven Walter
2009-11-23 20:31                                 ` Russell King - ARM Linux
2009-11-30  9:20                                   ` Russell King - ARM Linux
2009-11-30 19:09                                     ` Jamie Lokier
2009-11-30 19:23                                       ` Russell King - ARM Linux
2009-12-01 18:08                                     ` Russell King - ARM Linux
2009-12-08 22:17                                     ` Steven Walter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.