* threads and fork on machine with VIPT-WB cache
       [not found]     ` <4BB53D26.60601@fsij.org>
@ 2010-04-02  2:41       ` NIIBE Yutaka
  2010-04-02  3:30         ` James Bottomley
From: NIIBE Yutaka @ 2010-04-02  2:41 UTC (permalink / raw)
  To: linux-parisc; +Cc: pkg-gauche-devel, 561203

Hi there,

I think I have caught a bug involving threads and fork.  I found it
while debugging an FTBFS of Gauche, a Scheme interpreter.  As I think
that Debian bug #561203 has the same cause, I am CC:-ing the BTS too.
Please Cc: me on replies, as I am not on the linux-parisc list.

Here, I am talking about the uniprocessor case.
I assume that PA-RISC has a virtually indexed, physically tagged,
write-back cache, which I call VIPT-WB.

I am reading the source in Debian:
	linux-source-2.6.32/kernel/fork.c
	linux-source-2.6.32/mm/memory.c
	linux-source-2.6.32/arch/parisc/include/asm/pgtable.h

To have the same semantics as other archs, I think that a VIPT-WB
cache machine should flush the cache at ptep_set_wrprotect, so that
the memory of the page has up-to-date data.  Yes, it will have a huge
performance impact on fork.  But I haven't found any good solution
other than this yet.  Well, I will need to write to linux-arch.

Let me explain our case.  As I could only observe the result, not the
scene itself, this includes some imagination and interpretation of
mine.  Correct me if I'm wrong.

(1) We have process A with multiple threads.  One of the threads calls
    fork(2) (in fact, clone(2) without CLONE_VM) while the other
    threads are still alive.  Let's call this thread A-1.

(2) As a result of clone(2), we have process B.

(3) The memory of process A is copied to process B by dup_mmap
    (fork.c), by A-1 in the context of process A.  There,
    flush_cache_dup_mm is called.

    In the single-threaded case, flush_cache_dup_mm is enough.  All
    data in the cache go to memory.  But here we have other threads.

(4) From dup_mmap, copy_page_range (memory.c) is called.

    Note that copy_page_range may sleep.  Page allocation in
    pud_alloc, pmd_alloc, or pte_alloc_map_lock may require the A-1
    thread to be scheduled off (waking up the swapper or other
    processes).

(5) Suppose the A-1 thread sleeps in copy_page_range, and another
    thread A-2 of process A is woken up and touches memory.  Then we
    have the data only in the cache; memory has stale data.

(6) The A-2 thread sleeps, and the A-1 thread is woken up to continue
    copy_page_range -> copy_*_range -> copy_one_pte.

(7) From copy_one_pte, the A-1 thread calls ptep_set_wrprotect, as
    this is a COW mapping. (*)

(8) The A-1 thread sleeps again in copy_page_range and process B is woken up.

(9) Process B does a read access on the memory, which gets the *NEW*
    data from the cache (if the space identifier color is the same).
    Process B then does a write access on the memory, which causes a
    memory fault, as it's COW memory.

    Note: Process B sees the *NEW* data because it's a VIPT-WB cache
    and both processes share the same memory in this situation.

(10) A new page is allocated and the memory contents are copied,
     with stale data.

     I assume that the kernel accesses the memory through a different
     cache line and does not see the cached data of A-2.

(11) After the fault, process B gets *OLD* data from memory.


(*) When we make the COW memory mapping between process A and process
    B, we assume memory is up-to-date.  As this assumption can be
    incorrect, I think that we need to flush the cache data to memory
    here.
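
For reference, this is roughly how the kernel-side copy of step (10)
is done (do_wp_page -> cow_user_page -> copy_user_highpage).  When an
arch does not provide its own copy_user_highpage, the generic fallback
(approximately as in include/linux/highmem.h of 2.6.32; details may
differ) copies through kernel mappings:

	/* Generic fallback, used when the arch does not define
	 * __HAVE_ARCH_COPY_USER_HIGHPAGE.  vfrom/vto are kernel
	 * addresses from kmap_atomic, so on a VIPT cache they need not
	 * be congruent with the user address vaddr; dirty user-space
	 * lines for 'from' are not necessarily seen by this copy.
	 */
	static inline void copy_user_highpage(struct page *to, struct page *from,
		unsigned long vaddr, struct vm_area_struct *vma)
	{
		char *vfrom, *vto;

		vfrom = kmap_atomic(from, KM_USER0);
		vto = kmap_atomic(to, KM_USER1);
		copy_user_page(vto, vfrom, vaddr, to);
		kunmap_atomic(vfrom, KM_USER0);
		kunmap_atomic(vto, KM_USER1);
	}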


If you have more interest or like ASCII art, please keep reading.

In our Gauche case, we saw this problem in the linked-list handling of
the pthread implementation (NPTL).  We have two linked-list heads,
<used> and <cache>.

Initially, the situation in process A is like this:

      +-------------------------------------+
      |                                     |
used  v     ELEM                            |
+-----+     +-----+     +-----+     +-----+ |
|   ------->|   ------->|   ------->|   ----+
+-----+     +-----+     +-----+     +-----+
            |     |     |     |     |     |
            +-----+     +-----+     +-----+

      +-------------+
      |             |
cache v             |
+-----+     +-----+ |
|   ------->|   ----+
+-----+     +-----+                       This is in memory
            |     |
            +-----+

The A-2 thread removes ELEM during fork.
This is process A's final situation, and what process B sees initially.

      +-------------------------------------+
      |                                     |
used  v                                     |
+-----+                 +-----+     +-----+ |
|   ------------------->|   ------->|   ----+
+-----+                 +-----+     +-----+
                        |     |     |     |
                        +-----+     +-----+

      +---------------------------+
      |     ELEM                  |
      |     +-----+               |
      | +-->|   -----+            |
      | |   +-----+  |            |
      | |   |     |  |            |
cache v |   +-----+  |            |        This is in cache
+-----+ |            |   +-----+  |
|   ----+            +-->|   -----+
+-----+                  +-----+
                         |     |
                         +-----+


Process B scans through the linked list starting from <cache> and
updates data in the list.  After process B touches ELEM, it sees the
*OLD* data of ELEM.


      +-------------------------------------+
      |                                     |
used  v                                     |
+-----+                 +-----+     +-----+ |
|   -----------------+->|   ------->|   ----+
+-----+              |  +-----+     +-----+
                     |  |     |     |     |
            ELEM     |  +-----+     +-----+
            +-----+  |
        +-->|   -----+ Wow!
        |   +-----+
        |   |*****|
cache   |   +-----+
+-----+ |                +-----+
|   ----+                |   ----> ... to cache
+-----+                  +-----+
                         |     |
                         +-----+

Process B follows the link, goes to different places,
and accesses them wrongly.

      +-------------------------------------+
      |                                     |
used  v                                     |
+-----+                 +-----+     +-----+ |
|   -----------------+->|   ------->|   ----+
+-----+              |  +-----+     +-----+
                     |  |*****|     |     |
            ELEM     |  +-----+     +-----+
            +-----+  |
        +-->|   -----+
        |   +-----+
        |   |*****|
cache   |   +-----+
+-----+ |                +-----+
|   ----+                |   ----> ... to cache
+-----+                  +-----+
                         |     |
                         +-----+

      +-------------------------------------+
      |                                     |
used  v                                     |
+-----+                 +-----+     +-----+ |
|   -----------------+->|   ------->|   ----+
+-----+              |  +-----+     +-----+
                     |  |*****|     |*****|
            ELEM     |  +-----+     +-----+
            +-----+  |
        +-->|   -----+
        |   +-----+
        |   |*****|
cache   |   +-----+
+-----+ |                +-----+
|   ----+                |   ----> ... to cache
+-----+                  +-----+
                         |     |
                         +-----+

Process B scans on and reaches the linked-list head <used> as if it
were an element of the list.  Process B cannot stop, because its loop
condition is a comparison against the head <cache>.  Process B touches
the memory, and then it sees the *OLD* data of <used>.  Besides, since
<cache> is on the same page as <used>, its contents from the
viewpoint of process B also change to *OLD*.

      +-------------------------------------+
      |                                     |
used  v                                     |
+-----+ Wow!            +-----+     +-----+ |
|   -----+           +->|   ------->|   ----+
+-----+  |           |  +-----+     +-----+
 *****   |           |  |*****|     |*****|
         |  ELEM     |  +-----+     +-----+
         |  +-----+  |
         +->|   -----+
            +-----+
            |*****|
cache       +-----+
+-----+ Wow!             +-----+
|   -------------------->|   ----> ... to cache
+-----+                  +-----+
                         |     |
                         +-----+

Process B continues scanning this linked list forever.
It entered the loop from <cache>, but <cache>
no longer points to ELEM.
-- 


* Re: threads and fork on machine with VIPT-WB cache
  2010-04-02  2:41       ` threads and fork on machine with VIPT-WB cache NIIBE Yutaka
@ 2010-04-02  3:30         ` James Bottomley
  2010-04-02  3:48           ` NIIBE Yutaka
From: James Bottomley @ 2010-04-02  3:30 UTC (permalink / raw)
  To: NIIBE Yutaka; +Cc: linux-parisc, pkg-gauche-devel, 561203

On Fri, 2010-04-02 at 11:41 +0900, NIIBE Yutaka wrote:
> (9) Process B does read-access on memory, which gets *NEW* data in
>     cache (if process space identifier color is same).
>     Process B does write-access on memory which causes memory fault,
>     as it's COW memory.
> 
>     Note: Process B sees *NEW* data because it's VIPT-WB cache.
>     It shares same memory in this situation.

So I think the bug here is that you're confusing aliasing with SMP cache
coherence.  In an alias situation, the same physical line is mapped to
multiple lines in a processor's cache (at different virtual addresses),
which means you can get a different answer depending on which alias you
read.

In COW breaking, the page table entry is copied, so A and B no longer
have page table entries at the same physical location.  If the COW is
intact, A and B have the same physical page, but it's also accessed by
the same virtual address, hence no aliasing.

In an SMP incoherent system, A and B could get different results (if on
different CPUs) because the write protect is in the cache of A but not
B.  However, PA is SMP coherent, so the act of B reading a line which is
dirty in A's cache causes a flush before the read completes via the
cache chequerboard logic and B ends up reading the same value A would
have read.

James




* Re: threads and fork on machine with VIPT-WB cache
  2010-04-02  3:30         ` James Bottomley
@ 2010-04-02  3:48           ` NIIBE Yutaka
  2010-04-02  8:05             ` NIIBE Yutaka
  2010-04-02 12:22             ` James Bottomley
From: NIIBE Yutaka @ 2010-04-02  3:48 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-parisc, pkg-gauche-devel, 561203

Thanks for your quick reply.

James Bottomley wrote:
> In COW breaking, the page table entry is copied, so A and B no longer
> have page table entries at the same physical location.  If the COW is
> intact, A and B have the same physical page, but it's also accessed by
> the same virtual address, hence no aliasing.

Let me explain more.

In the scenario, I assume:

	No aliasing between A and B.
	We have aliasing between kernel access and user access.

Before COW breaking, A and B share the same data (no aliasing, same
space identifier color), and B sees the data in the cache, while
memory has stale data.

At COW breaking, the kernel copies the memory; it doesn't see the new
data in the cache because of aliasing.

Isn't that possible?
-- 


* Re: threads and fork on machine with VIPT-WB cache
  2010-04-02  3:48           ` NIIBE Yutaka
@ 2010-04-02  8:05             ` NIIBE Yutaka
  2010-04-02 19:35               ` John David Anglin
  2010-04-02 12:22             ` James Bottomley
From: NIIBE Yutaka @ 2010-04-02  8:05 UTC (permalink / raw)
  To: linux-parisc; +Cc: pkg-gauche-devel, 561203

NIIBE Yutaka wrote:
> To have same semantics as other archs, I think that VIPT-WB cache
> machine should have cache flush at ptep_set_wrprotect, so that memory
> of the page has up-to-date data.  Yes, it will be huge performance
> impact for fork.  But I don't find any good solution other than this
> yet.

I think we could do something like this (only for VIPT-WB cache machines):

-	static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long address, pte_t *ptep)
+	static inline void ptep_set_wrprotect(struct vm_area_struct *vma, struct mm_struct *mm, unsigned long addr, pte_t *ptep)
	{
		pte_t old_pte = *ptep;
+		if (atomic_read(&mm->mm_users) > 1)
+			flush_cache_page(vma, addr, pte_pfn(old_pte));
		set_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
	}

Here, we can add a condition on the call to flush_cache_page to avoid
a big performance impact in the non-threaded case.
-- 


* Re: threads and fork on machine with VIPT-WB cache
  2010-04-02  3:48           ` NIIBE Yutaka
  2010-04-02  8:05             ` NIIBE Yutaka
@ 2010-04-02 12:22             ` James Bottomley
  2010-04-05  0:39               ` NIIBE Yutaka
From: James Bottomley @ 2010-04-02 12:22 UTC (permalink / raw)
  To: NIIBE Yutaka; +Cc: linux-parisc, pkg-gauche-devel, 561203

On Fri, 2010-04-02 at 12:48 +0900, NIIBE Yutaka wrote:
> Thanks for your quick reply.
> 
> James Bottomley wrote:
> > In COW breaking, the page table entry is copied, so A and B no longer
> > have page table entries at the same physical location.  If the COW is
> > intact, A and B have the same physical page, but it's also accessed by
> > the same virtual address, hence no aliasing.
> 
> Let me explain more.
> 
> In the scenario, I assume:
> 
> 	No aliasing between A and B.
> 	We have aliasing between kernel access and user access.
> 
> Before COW breaking A and B share same data (with no aliasing same
> space identifier color), and B sees data in cache, while memory has
> stale data.
> 
> At COW breaking, kernel copies the memory, it doesn't see new data
> in cache because of aliasing.
> 
> Isn't it possible?

So your theory is that the data the kernel sees doing the page copy can
be stale because of dirty cache lines in userspace (which is certainly
possible in the ordinary way)?  By design that shouldn't happen: the
idea behind COW breaking is that before it breaks, the page is read
only ... this means that processes can have clean cache copies of it,
but never dirty cache copies (because writes are forbidden).  As soon as
one or other process tries to write to the page, it gets a memory
protection trap long before the data it's trying to write goes into the
cache.  By the time the write is allowed to complete (and the cache
becomes dirty), the process will have the new copy of the page which
belongs exclusively to it.
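
Concretely, the ordering in mm/memory.c is roughly this (2.6.32-ish,
heavily abbreviated, so treat it as a sketch rather than the exact
code):

	/* do_wp_page(), abbreviated: the copy is made while the pte is
	 * still write-protected; a writable pte for the private copy is
	 * installed only afterwards, so the writer cannot have dirtied
	 * any cache line for the shared page before the copy.
	 */
	new_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, address);
	cow_user_page(new_page, old_page, address, vma);	/* copy the old page */
	entry = maybe_mkwrite(pte_mkdirty(mk_pte(new_page, vma->vm_page_prot)), vma);
	ptep_clear_flush(vma, address, page_table);
	set_pte_at(mm, address, page_table, entry);		/* write allowed only now */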

James




* Re: threads and fork on machine with VIPT-WB cache
  2010-04-02  8:05             ` NIIBE Yutaka
@ 2010-04-02 19:35               ` John David Anglin
  2010-04-08 21:11                 ` Helge Deller
From: John David Anglin @ 2010-04-02 19:35 UTC (permalink / raw)
  To: NIIBE Yutaka; +Cc: linux-parisc, pkg-gauche-devel, 561203

On Fri, 02 Apr 2010, NIIBE Yutaka wrote:

> NIIBE Yutaka wrote:
>> To have same semantics as other archs, I think that VIPT-WB cache
>> machine should have cache flush at ptep_set_wrprotect, so that memory
>> of the page has up-to-date data.  Yes, it will be huge performance
>> impact for fork.  But I don't find any good solution other than this
>> yet.
>
> I think we could do something like (only for VIPT-WB cache machine):
>
> -	static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long 
> address, pte_t *ptep)
>
> +	static inline void ptep_set_wrprotect(struct vm_area_struct *vma, struct 
> mm_struct *mm, unsigned long addr, pte_t *ptep)
> 	{
> 		pte_t old_pte = *ptep;
> +		if (atomic_read(&mm->mm_users) > 1)
> +			flush_cache_page(vma, addr, pte_pfn(old_pte));
> 		set_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
> 	}

I tested the hack below on two machines currently running 2.6.33.2
UP kernels.  The change seems to fix Debian #561203 (minifail bug)!
Thus, I definitely think you are on the right track.  I'll continue
to test.

I suspect the same issue is present for SMP kernels.

Thanks,
Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h
index a27d2e2..a5d730f 100644
--- a/arch/parisc/include/asm/pgtable.h
+++ b/arch/parisc/include/asm/pgtable.h
@@ -14,6 +14,7 @@
 #include <linux/bitops.h>
 #include <asm/processor.h>
 #include <asm/cache.h>
+extern void flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr, unsigned long pfn);
 
 /*
  * kern_addr_valid(ADDR) tests if ADDR is pointing to valid kernel
@@ -456,7 +457,7 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
 	return old_pte;
 }
 
-static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
+static inline void ptep_set_wrprotect(struct vm_area_struct *vma, struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 {
 #ifdef CONFIG_SMP
 	unsigned long new, old;
@@ -467,6 +468,8 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
 	} while (cmpxchg((unsigned long *) ptep, old, new) != old);
 #else
 	pte_t old_pte = *ptep;
+	if (atomic_read(&mm->mm_users) > 1)
+		flush_cache_page(vma, addr, pte_pfn(old_pte));
 	set_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
 #endif
 }
diff --git a/mm/memory.c b/mm/memory.c
index 09e4b1b..21c2916 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -616,7 +616,7 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	 * in the parent and the child
 	 */
 	if (is_cow_mapping(vm_flags)) {
-		ptep_set_wrprotect(src_mm, addr, src_pte);
+		ptep_set_wrprotect(vma, src_mm, addr, src_pte);
 		pte = pte_wrprotect(pte);
 	}
 


* Re: threads and fork on machine with VIPT-WB cache
  2010-04-02 12:22             ` James Bottomley
@ 2010-04-05  0:39               ` NIIBE Yutaka
  2010-04-05  2:51                 ` John David Anglin
From: NIIBE Yutaka @ 2010-04-05  0:39 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-parisc, pkg-gauche-devel, 561203

Thanks a lot for the discussion.

James Bottomley wrote:
> So your theory is that the data the kernel sees doing the page copy can
> be stale because of dirty cache lines in userspace (which is certainly
> possible in the ordinary way)?

Yes.

> By design that shouldn't happen: the idea behind COW breaking is
> that before it breaks, the page is read only ... this means that
> processes can have clean cache copies of it, but never dirty cache
> copies (because writes are forbidden).

That must be the design, I agree.

To keep this condition (no dirty cache lines for a COW page), we need
to flush the cache before ptep_set_wrprotect.  That's my point.

Please look at the code path:
   (kernel/fork.c)
   do_fork -> copy_process -> copy_mm -> dup_mm -> dup_mmap ->
   (mm/memory.c)
   copy_page_range -> copy_p*d_range -> copy_one_pte -> ptep_set_wrprotect

The function flush_cache_dup_mm is called from dup_mmap; that's enough
for a single-threaded process.

I think that we need to flush the cache before ptep_set_wrprotect for
a process with multiple threads.  Other threads may change memory
after one thread invokes do_fork and before it calls
ptep_set_wrprotect.  Specifically, the forking thread may sleep in the
pte_alloc functions to get a page.
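
A condensed sketch of dup_mmap (kernel/fork.c, 2.6.32; most details
elided) shows why the single flush at the top is not enough while
other threads keep running:

	static int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm)
	{
		struct vm_area_struct *mpnt;
		int retval = 0;

		down_write(&oldmm->mmap_sem);
		flush_cache_dup_mm(oldmm);	/* the only whole-mm flush, done once here */
		/* ... */
		for (mpnt = oldmm->mmap; mpnt; mpnt = mpnt->vm_next) {
			/* ... copy the vma ... */
			retval = copy_page_range(mm, oldmm, mpnt);
			/* copy_page_range may sleep in pud/pmd/pte_alloc; other
			 * threads of the parent keep running and can dirty the
			 * cache for pages whose ptes are not yet wrprotected. */
			if (retval)
				break;
		}
		/* ... */
		up_write(&oldmm->mmap_sem);
		return retval;
	}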
-- 


* Re: threads and fork on machine with VIPT-WB cache
  2010-04-05  0:39               ` NIIBE Yutaka
@ 2010-04-05  2:51                 ` John David Anglin
  2010-04-05  2:58                   ` John David Anglin
  2010-04-05 16:18                   ` James Bottomley
From: John David Anglin @ 2010-04-05  2:51 UTC (permalink / raw)
  To: NIIBE Yutaka; +Cc: James.Bottomley, linux-parisc, pkg-gauche-devel, 561203

> Thanks a lot for the discussion.
> 
> James Bottomley wrote:
> > So your theory is that the data the kernel sees doing the page copy can
> > be stale because of dirty cache lines in userspace (which is certainly
> > possible in the ordinary way)?
> 
> Yes.
> 
> > By design that shouldn't happen: the idea behind COW breaking is
> > that before it breaks, the page is read only ... this means that
> > processes can have clean cache copies of it, but never dirty cache
> > copies (because writes are forbidden).
> 
> That must be design, I agree.
> 
> To keep this condition (no dirty cache for COW page), we need to flush
> cache before ptep_set_wrprotect.  That's my point.
> 
> Please look at the code path:
>    (kernel/fork.c)
>    do_fork -> copy_process -> copy_mm -> dup_mm -> dup_mmap ->
>    (mm/memory.c)
>    copy_page_range -> copy_p*d_range -> copy_one_pte -> ptep_set_wrprotect
> 
> The function flush_cache_dup_mm is called from dup_mmap, that's enough
> for a case of a process with single thread.
> I think that:
> We need to flush cache before ptep_set_wrprotect for a process with
> multiple threads.  Other threads may change memory after a thread
> invokes do_fork and before calling ptep_set_wrprotect.  Specifically,
> a process may sleep at pte_alloc function to get a page.

I agree.  It is interesting that in the case of the Debian bug, a
thread of the parent process causes the COW break and thereby corrupts
its own memory.  As far as I can tell, the fork'd child never writes
to the memory that causes the fault.

My testing indicates that your suggested change fixes the Debian
bug.  I've attached below my latest test version.  This seems to fix
the bug on both SMP and UP kernels.

However, it doesn't fix all page/cache related issues on parisc
SMP kernels that I commonly see.

My first inclination, even before reading your analysis, was to assume
that copy_user_page was broken (i.e., that even if a processor cache
was dirty when the COW page was write protected, it should be possible
to do the flush before the page is copied).  However, this didn't seem
to work...  Possibly, there are issues with aliased addresses.

I note that sparc flushes the entire cache and purges the entire
tlb in kmap_atomic/kunmap_atomic for highmem.  Although the breakage
that I see is not limited to PA8800/PA8900, I'm not convinced that we
maintain the coherency required for these processors in copy_user_page
when we have multiple threads.

As a side note, kmap_atomic/kunmap_atomic seem to lack calls to
pagefault_disable()/pagefault_enable() on PA8800.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h
index a27d2e2..b140d5c 100644
--- a/arch/parisc/include/asm/pgtable.h
+++ b/arch/parisc/include/asm/pgtable.h
@@ -14,6 +14,7 @@
 #include <linux/bitops.h>
 #include <asm/processor.h>
 #include <asm/cache.h>
+extern void flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr, unsigned long pfn);
 
 /*
  * kern_addr_valid(ADDR) tests if ADDR is pointing to valid kernel
@@ -456,17 +457,22 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
 	return old_pte;
 }
 
-static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
+static inline void ptep_set_wrprotect(struct vm_area_struct *vma, struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 {
 #ifdef CONFIG_SMP
 	unsigned long new, old;
+#endif
+	pte_t old_pte = *ptep;
+
+	if (atomic_read(&mm->mm_users) > 1)
+		flush_cache_page(vma, addr, pte_pfn(old_pte));
 
+#ifdef CONFIG_SMP
 	do {
 		old = pte_val(*ptep);
 		new = pte_val(pte_wrprotect(__pte (old)));
 	} while (cmpxchg((unsigned long *) ptep, old, new) != old);
 #else
-	pte_t old_pte = *ptep;
 	set_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
 #endif
 }
diff --git a/mm/memory.c b/mm/memory.c
index 09e4b1b..21c2916 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -616,7 +616,7 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	 * in the parent and the child
 	 */
 	if (is_cow_mapping(vm_flags)) {
-		ptep_set_wrprotect(src_mm, addr, src_pte);
+		ptep_set_wrprotect(vma, src_mm, addr, src_pte);
 		pte = pte_wrprotect(pte);
 	}
 


* Re: threads and fork on machine with VIPT-WB cache
  2010-04-05  2:51                 ` John David Anglin
@ 2010-04-05  2:58                   ` John David Anglin
  2010-04-05 16:18                   ` James Bottomley
From: John David Anglin @ 2010-04-05  2:58 UTC (permalink / raw)
  To: John David Anglin
  Cc: gniibe, James.Bottomley, linux-parisc, pkg-gauche-devel, 561203

> > > By design that shouldn't happen: the idea behind COW breaking is
> > > that before it breaks, the page is read only ... this means that
> > > processes can have clean cache copies of it, but never dirty cache
> > > copies (because writes are forbidden).
> > 
> > That must be design, I agree.
> > 
> > To keep this condition (no dirty cache for COW page), we need to flush
> > cache before ptep_set_wrprotect.  That's my point.

Is it possible that a sleep/reschedule could cause the cache to become
dirty again before it is write protected?

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)


* Re: threads and fork on machine with VIPT-WB cache
  2010-04-05  2:51                 ` John David Anglin
  2010-04-05  2:58                   ` John David Anglin
@ 2010-04-05 16:18                   ` James Bottomley
  2010-04-06  4:57                     ` NIIBE Yutaka
From: James Bottomley @ 2010-04-05 16:18 UTC (permalink / raw)
  To: John David Anglin; +Cc: NIIBE Yutaka, linux-parisc, pkg-gauche-devel, 561203

On Sun, 2010-04-04 at 22:51 -0400, John David Anglin wrote:
> > Thanks a lot for the discussion.
> > 
> > James Bottomley wrote:
> > > So your theory is that the data the kernel sees doing the page copy can
> > > be stale because of dirty cache lines in userspace (which is certainly
> > > possible in the ordinary way)?
> > 
> > Yes.
> > 
> > > By design that shouldn't happen: the idea behind COW breaking is
> > > that before it breaks, the page is read only ... this means that
> > > processes can have clean cache copies of it, but never dirty cache
> > > copies (because writes are forbidden).
> > 
> > That must be design, I agree.
> > 
> > To keep this condition (no dirty cache for COW page), we need to flush
> > cache before ptep_set_wrprotect.  That's my point.
> > 
> > Please look at the code path:
> >    (kernel/fork.c)
> >    do_fork -> copy_process -> copy_mm -> dup_mm -> dup_mmap ->
> >    (mm/memory.c)
> >    copy_page_range -> copy_p*d_range -> copy_one_pte -> ptep_set_wrprotect
> > 
> > The function flush_cache_dup_mm is called from dup_mmap, that's enough
> > for a case of a process with single thread.
> > I think that:
> > We need to flush cache before ptep_set_wrprotect for a process with
> > multiple threads.  Other threads may change memory after a thread
> > invokes do_fork and before calling ptep_set_wrprotect.  Specifically,
> > a process may sleep at pte_alloc function to get a page.
> 
> I agree.  It is interesting that in the case of the Debian bug that
> a thread of the parent process causes the COW break and thereby corrupts
> its own memory.  As far as I can tell, the fork'd child never writes
> to the memory that causes the fault.
> 
> My testing indicates that your suggested change fixes the Debian
> bug.  I've attached below my latest test version.  This seems to fix
> the bug on both SMP and UP kernels.
> 
> However, it doesn't fix all page/cache related issues on parisc
> SMP kernels that I commonly see.
> 
> My first inclination after even before reading your analysis was
> to assume that copy_user_page was broken (i.e, that even if a
> processor cache was dirty when the COW page was write protected,
> it should be possible to do the flush before the page is copied).
> However, this didn't seem to work...  Possibly, there are issues
> with aliased addresses.
> 
> I note that sparc flushes the entire cache and purges the entire
> tlb in kmap_atomic/kunmap_atomic for highmem.  Although the breakage
> that I see is not limited to PA8800/PA8900, I'm not convinced
> that we maintain coherency that is required for these processors
> in copy_user_page when we have multiple threads.
> 
> As a side note, kmap_atomic/kunmap_atomic seem to lack calls to
> pagefault_disable()/pagefault_enable() on PA8800.
> 
> Dave
> -- 
> J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
> National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)
> 
> diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h
> index a27d2e2..b140d5c 100644
> --- a/arch/parisc/include/asm/pgtable.h
> +++ b/arch/parisc/include/asm/pgtable.h
> @@ -14,6 +14,7 @@
>  #include <linux/bitops.h>
>  #include <asm/processor.h>
>  #include <asm/cache.h>
> +extern void flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr, unsigned long pfn);
>  
>  /*
>   * kern_addr_valid(ADDR) tests if ADDR is pointing to valid kernel
> @@ -456,17 +457,22 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
>  	return old_pte;
>  }
>  
> -static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
> +static inline void ptep_set_wrprotect(struct vm_area_struct *vma, struct mm_struct *mm, unsigned long addr, pte_t *ptep)
>  {
>  #ifdef CONFIG_SMP
>  	unsigned long new, old;
> +#endif
> +	pte_t old_pte = *ptep;
> +
> +	if (atomic_read(&mm->mm_users) > 1)

Just to verify there's nothing this is hiding, can you make this 

	if (pte_dirty(old_pte))

and reverify?  The if clause should only trip on the case where the
parent has dirtied the line between flush and now.

> +		flush_cache_page(vma, addr, pte_pfn(old_pte));
>  
> +#ifdef CONFIG_SMP
>  	do {
>  		old = pte_val(*ptep);
>  		new = pte_val(pte_wrprotect(__pte (old)));
>  	} while (cmpxchg((unsigned long *) ptep, old, new) != old);
>  #else
> -	pte_t old_pte = *ptep;
>  	set_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
>  #endif
>  }
> diff --git a/mm/memory.c b/mm/memory.c
> index 09e4b1b..21c2916 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -616,7 +616,7 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
>  	 * in the parent and the child
>  	 */
>  	if (is_cow_mapping(vm_flags)) {
> -		ptep_set_wrprotect(src_mm, addr, src_pte);
> +		ptep_set_wrprotect(vma, src_mm, addr, src_pte);

So this is going to be a hard sell because of the arch churn. There are,
however, three ways to do it with the original signature.

     1. implement copy_user_highpage ... this allows us to copy through
        the child's page cache (which is coherent with the parent's
        before the cow) and thus pick up any cache changes without a
        flush
     2. use the mm identically to flush_user_cache_page_noncurrent.  The
        only reason that needs the vma is for the icache check ... but
        that shouldn't happen here (if the parent is actually doing a
        self modifying exec region, it needs to manage coherency
        itself).
     3. Flush in kmap ... this is something that's been worrying me
        since the flamewars over kmap for pio.
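
Just to sketch the shape of option 1 (not tested; the body below is a
naive stand-in that still flushes the parent's user alias and copies
through the kernel mapping, whereas a real option-1 implementation
would map 'from'/'to' congruently with 'vaddr' and need no flush):

	/* Defining __HAVE_ARCH_COPY_USER_HIGHPAGE makes mm use the
	 * arch's copy_user_highpage for the COW copy instead of the
	 * generic kmap_atomic-based fallback.
	 */
	#define __HAVE_ARCH_COPY_USER_HIGHPAGE

	static inline void copy_user_highpage(struct page *to, struct page *from,
		unsigned long vaddr, struct vm_area_struct *vma)
	{
		void *kfrom, *kto;

		flush_cache_page(vma, vaddr, page_to_pfn(from));
		kfrom = kmap_atomic(from, KM_USER0);
		kto = kmap_atomic(to, KM_USER1);
		copy_page(kto, kfrom);
		kunmap_atomic(kfrom, KM_USER0);
		kunmap_atomic(kto, KM_USER1);
		flush_kernel_dcache_page(to);
	}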

James




* Re: threads and fork on machine with VIPT-WB cache
  2010-04-05 16:18                   ` James Bottomley
@ 2010-04-06  4:57                     ` NIIBE Yutaka
  2010-04-06 13:37                       ` James Bottomley
From: NIIBE Yutaka @ 2010-04-06  4:57 UTC (permalink / raw)
  To: James Bottomley; +Cc: John David Anglin, linux-parisc, pkg-gauche-devel, 561203

John David Anglin wrote:
> It is interesting that in the case of the Debian bug that
> a thread of the parent process causes the COW break and thereby corrupts
> its own memory.  As far as I can tell, the fork'd child never writes
> to the memory that causes the fault.

Thanks for writing and testing a patch.

The case of #561203 is the second scenario.  I think that this case is
relevant to VIVT-WB machines too (provided the kernel does the copy
through a kernel address).

James Bottomley wrote:
> So this is going to be a hard sell because of the arch churn. There are,
> however, three ways to do it with the original signature.

Currently, I think that a signature change would be inevitable for
ptep_set_wrprotect.

>      1. implement copy_user_highpage ... this allows us to copy through
>         the child's page cache (which is coherent with the parent's
>         before the cow) and thus pick up any cache changes without a
>         flush

Let me think about this approach.

Well, this would improve both the first scenario of mine and the
second scenario.

But... I think that even if we had a copy_user_highpage which does the
copy through the user address, we would still need to flush at
ptep_set_wrprotect.  I think that we need to keep the condition: no
dirty cache lines for a COW page.

Think about a third scenario of threads and fork:

(1) In process A, there are multiple threads, and a thread A-1 invokes
    fork.  We have process B, with a different space identifier color.

(2) Another thread A-2 in process A runs while A-1 copies memory by
    dup_mmap.  A-2 writes to the address <x> in a page.  Let's call
    this page <oldpage>.

(3) We have a dirty cache line for <x>, written by A-2, at the time
    thread A-1 does ptep_set_wrprotect.  Suppose that we don't flush
    here.

(4) A-1 finishes the copy and sleeps.

(5) Child process B is woken up and sees the old value at <x> in
    <oldpage>, through a different cache line.  B sleeps.

(6) A-2 is woken up.  A-2 touches the memory again and breaks the COW.
    A-2 copies the data from <oldpage> to <newpage>.  OK, <newpage> is
    consistent with a copy_user_highpage that copies through the user
    address.

    Note that during this copy, the cache line for <x> written by A-2
    is flushed out to <oldpage>.  It invokes another memory fault and
    COW break.  (I think that this memory fault is unhealthy.)
    Then, the new value goes to <x> on <oldpage> (as it's a physically
    tagged cache).

    A-2 sleeps.

(7) Child process B is woken up.  When it accesses <x>, it suddenly
    sees the new value.


If we flush the cache to <oldpage> at ptep_set_wrprotect, this cannot
occur.


			*	*	*


I know that we should not mix "threads and fork".  It is difficult to
define clean semantics.  Because another thread may touch memory while
one thread does the memory copy for fork, the memory the child process
will see may be inconsistent.  For the child, one page might be new,
while another page might be old.

For VIVT-WB cache machines, I am considering the possibility of the
child process having inconsistent memory even within a single page
(when we have no flush at ptep_set_wrprotect).

I will need to talk to linux-arch sooner or later.
-- 


* Re: threads and fork on machine with VIPT-WB cache
  2010-04-06  4:57                     ` NIIBE Yutaka
@ 2010-04-06 13:37                       ` James Bottomley
  2010-04-06 13:44                         ` James Bottomley
From: James Bottomley @ 2010-04-06 13:37 UTC (permalink / raw)
  To: NIIBE Yutaka; +Cc: John David Anglin, linux-parisc, pkg-gauche-devel, 561203

On Tue, 2010-04-06 at 13:57 +0900, NIIBE Yutaka wrote:
> John David Anglin wrote:
> > It is interesting that in the case of the Debian bug that
> > a thread of the parent process causes the COW break and thereby corrupts
> > its own memory.  As far as I can tell, the fork'd child never writes
> > to the memory that causes the fault.
> 
> Thanks for writing and testing a patch.
> 
> The case of #561203 is second scenario.  I think that this case is
> relevant to VIVT-WB machine too (provided kernel does copy by kernel
> address).
> 
> James Bottomley wrote:
> > So this is going to be a hard sell because of the arch churn. There are,
> > however, three ways to do it with the original signature.
> 
> Currently, I think that signature change would be inevitable for
> ptep_set_wrprotect.

Well we can't do it by claiming several architectures are wrong in their
implementation.  We might do it by claiming to need vma knowledge ...
however, even if you want the flush, as I said, you don't need to change
the signature.

> >      1. implement copy_user_highpage ... this allows us to copy through
> >         the child's page cache (which is coherent with the parent's
> >         before the cow) and thus pick up any cache changes without a
> >         flush
> 
> Let me think about this way.
> 
> Well, this would improve both cases of the first scenario of mine and
> the second scenario.
> 
> But... I think that even if we would have copy_user_highpage which
> does copy by user address, we need to flush at ptep_set_wrprotect.  I
> think that we need to keep the condition: no dirty cache for COW page.
> 
> Think about third scenario of threads and fork:
> 
> (1) In process A, there are multiple threads, and a thread A-1 invokes
>     fork.  We have process B, with a different space identifier color.

I don't understand what you mean by space colour ... there's cache
colour, which refers to the line in the cache to which the physical
memory maps.  The way PA is set up, space ID doesn't factor into cache
colour.

> (2) Another thread A-2 in process A runs while A-1 copies memory by
>     dup_mmap.  A-2 writes to the address <x> in a page.  Let's call
>     this page <oldpage>.
> 
> (3) We have dirty cache for <x> by A-2 at the time of
>     ptep_set_wrprotect of thread A-1.  Suppose that we don't flush
>     here.
> 
> (4) A-1 finishes copy, and sleeps.
> 
> (5) Child process B is waken up and sees old value at <x> in <oldpage>,
>     through different cache line.  B sleeps.

This isn't possible.  At this point, A and B have the same virtual
address and mapping for <oldpage>; this means they are the same cache
colour, so they both see the cached value.

James

> (6) A-2 is waken up.  A-2 touches the memory again, breaks COW.  A-2
>     copies data on <oldpage> to <newpage>.  OK, <newpage> is
>     consistent with copy_user_highpage by user address.
> 
>     Note that during this copy, the cache line of <x> by A-2 is
>     flushed out to <oldpage>.  It invokes another memory fault and COW
>     break.  (I think that this memory fault is unhealthy.)
>     Then, new value goes to <x> on <oldpage> (when it's physically
>     tagged cache).
> 
>     A-2 sleeps.
> 
> (7) Child process B is waken up.  When it accesses at <x>, it sees new
>     value suddenly.
> 
> 
> If we flush cache to <oldpage> at ptep_set_wrprotect, this couldn't
> occur.
> 
> 
> 			*	*	*
> 
> 
> I know that we should not do "threads and fork".  It is difficult to
> define clean semantics.  Because another thread may touch memory while
> a thread which does memory copy for fork, the memory what the child
> process will see may be inconsistent.  For the child, a page might be
> new, while another page might be old.
> 
> For VIVT-WB cache machine, I am considering a possibility for the
> child process to have inconsistent memory even within a single page
> (when we have no flush at ptep_set_wrprotect).
> 
> It will be needed for me to talk to linux-arch soon or later.




* Re: threads and fork on machine with VIPT-WB cache
  2010-04-06 13:37                       ` James Bottomley
@ 2010-04-06 13:44                         ` James Bottomley
From: James Bottomley @ 2010-04-06 13:44 UTC (permalink / raw)
  To: NIIBE Yutaka; +Cc: John David Anglin, linux-parisc, pkg-gauche-devel, 561203

On Tue, 2010-04-06 at 08:37 -0500, James Bottomley wrote:
> > (5) Child process B is waken up and sees old value at <x> in
> <oldpage>,
> >     through different cache line.  B sleeps.
> 
> This isn't possible.  at this point, A and B have the same virtual
> address and mapping for <oldpage> this means they are the same cache
> colour, so they both see the cached value.

Perhaps to add more detail to this.  In spite of what the arch manual
says (it says the congruence stride is 16MB), the congruence stride on
all manufactured parisc processors is 4MB.  This means that any virtual
addresses, regardless of space id, that are equal modulo 4MB have the
same cache colour.
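
In code terms (illustrative only, not anything in the tree):

	/* With a 4MB congruence stride, two virtual addresses that are
	 * equal modulo 4MB have the same cache colour, whatever space id
	 * they are accessed under.
	 */
	#define CONGRUENCE_STRIDE	(4UL * 1024 * 1024)

	static inline int same_cache_colour(unsigned long va1, unsigned long va2)
	{
		return (va1 & (CONGRUENCE_STRIDE - 1)) ==
		       (va2 & (CONGRUENCE_STRIDE - 1));
	}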

James
 



* Re: threads and fork on machine with VIPT-WB cache
  2010-04-02 19:35               ` John David Anglin
@ 2010-04-08 21:11                 ` Helge Deller
  2010-04-08 21:54                   ` John David Anglin
From: Helge Deller @ 2010-04-08 21:11 UTC (permalink / raw)
  To: John David Anglin
  Cc: John David Anglin, NIIBE Yutaka, linux-parisc, pkg-gauche-devel, 561203

On 04/02/2010 09:35 PM, John David Anglin wrote:
> On Fri, 02 Apr 2010, NIIBE Yutaka wrote:
> 
>> NIIBE Yutaka wrote:
>>> To have same semantics as other archs, I think that VIPT-WB cache
>>> machine should have cache flush at ptep_set_wrprotect, so that memory
>>> of the page has up-to-date data.  Yes, it will be huge performance
>>> impact for fork.  But I don't find any good solution other than this
>>> yet.
>>
>> I think we could do something like (only for VIPT-WB cache machine):
>>
>> -	static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long 
>> address, pte_t *ptep)
>>
>> +	static inline void ptep_set_wrprotect(struct vm_area_struct *vma, struct 
>> mm_struct *mm, unsigned long addr, pte_t *ptep)
>> 	{
>> 		pte_t old_pte = *ptep;
>> +		if (atomic_read(&mm->mm_users) > 1)
>> +			flush_cache_page(vma, addr, pte_pfn(old_pte));
>> 		set_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
>> 	}
> 
> I tested the hack below on two machines currently running 2.6.33.2
> UP kernels.  The change seems to fix Debian #561203 (minifail bug)!
> Thus, I definitely think you are on the right track.  I'll continue
> to test.
> 
> I suspect the same issue is present for SMP kernels.

Hi Dave,

I tested your patch today on one of my machines with a plain 2.6.33
kernel (32bit, SMP, B2000 I think).
Sadly, I still saw the minifail bug.

Are you sure that the patch fixed this bug for you?

Helge

do_page_fault() pid=21470 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=7986 command='minifail3' type=6 address=0x00000003                                                                                 
do_page_fault() pid=19952 command='minifail3' type=6 address=0x00000003                                                                                
do_page_fault() pid=13549 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=21862 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=4615 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=17336 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=21986 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=2157 command='minifail3' type=15 address=0x000000dc
do_page_fault() pid=23886 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=2681 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=3229 command='minifail3' type=15 address=0x000000ec
do_page_fault() pid=26095 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=20722 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=19912 command='minifail3' type=15 address=0x000000ec
...
pagealloc: memory corruption
7db0c780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
7db0c790: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
7db0c7a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
7db0c7b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Backtrace:
 [<1011ec14>] show_stack+0x18/0x28
 [<10117ba0>] dump_stack+0x1c/0x2c
 [<101c6594>] kernel_map_pages+0x2a0/0x2b8
 [<1019e6c8>] get_page_from_freelist+0x3d4/0x614
 [<1019ea3c>] __alloc_pages_nodemask+0x134/0x610
 [<101b1d20>] do_wp_page+0x268/0xac0
 [<101b3b34>] handle_mm_fault+0x4d4/0x7c4
 [<1011d854>] do_page_fault+0x1f8/0x2fc
 [<1011f450>] handle_interruption+0xec/0x730
 [<10103078>] intr_check_sig+0x0/0x34
...
do_page_fault() pid=13414 command='minifail3' type=15 address=0x000000dc
do_page_fault() pid=22776 command='minifail3' type=15 address=0x00000000
do_page_fault() pid=26290 command='minifail3' type=15 address=0x000000ec
do_page_fault() pid=1399 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=16130 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=26401 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=3383 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=3400 command='minifail3' type=15 address=0x00000004
do_page_fault() pid=18659 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=3730 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=28828 command='minifail3' type=6 address=0x00000003


* Re: threads and fork on machine with VIPT-WB cache
  2010-04-08 21:11                 ` Helge Deller
@ 2010-04-08 21:54                   ` John David Anglin
  2010-04-08 22:44                     ` John David Anglin
From: John David Anglin @ 2010-04-08 21:54 UTC (permalink / raw)
  To: Helge Deller
  Cc: John David Anglin, NIIBE Yutaka, linux-parisc, pkg-gauche-devel, 561203

On Thu, 08 Apr 2010, Helge Deller wrote:

> On 04/02/2010 09:35 PM, John David Anglin wrote:
> > On Fri, 02 Apr 2010, NIIBE Yutaka wrote:
> > 
> >> NIIBE Yutaka wrote:
> >>> To have same semantics as other archs, I think that VIPT-WB cache
> >>> machine should have cache flush at ptep_set_wrprotect, so that memory
> >>> of the page has up-to-date data.  Yes, it will be huge performance
> >>> impact for fork.  But I don't find any good solution other than this
> >>> yet.
> >>
> >> I think we could do something like (only for VIPT-WB cache machine):
> >>
> >> -	static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long 
> >> address, pte_t *ptep)
> >>
> >> +	static inline void ptep_set_wrprotect(struct vm_area_struct *vma, struct 
> >> mm_struct *mm, unsigned long addr, pte_t *ptep)
> >> 	{
> >> 		pte_t old_pte = *ptep;
> >> +		if (atomic_read(&mm->mm_users) > 1)
> >> +			flush_cache_page(vma, addr, pte_pfn(old_pte));
> >> 		set_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
> >> 	}
> > 
> > I tested the hack below on two machines currently running 2.6.33.2
> > UP kernels.  The change seems to fix Debian #561203 (minifail bug)!
> > Thus, I definitely think you are on the right track.  I'll continue
> > to test.
> > 
> > I suspect the same issue is present for SMP kernels.
> 
> Hi Dave,
> 
> I tested your patch today on one of my machines with plain kernel 2.6.33 (32bit, SMP, B2000 I think).
> Sadly I still did see the minifail bug.
> 
> Are you sure, that the patch fixed this bug for you?

Seemed to, but I have a bunch of other changes installed.  Possibly,
the change to cacheflush.h is important.  It affects all PA8000.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

diff --git a/arch/parisc/hpux/wrappers.S b/arch/parisc/hpux/wrappers.S
index 58c53c8..bdcea33 100644
--- a/arch/parisc/hpux/wrappers.S
+++ b/arch/parisc/hpux/wrappers.S
@@ -88,7 +88,7 @@ ENTRY(hpux_fork_wrapper)
 
 	STREG	%r2,-20(%r30)
 	ldo	64(%r30),%r30
-	STREG	%r2,PT_GR19(%r1)	;! save for child
+	STREG	%r2,PT_SYSCALL_RP(%r1)	;! save for child
 	STREG	%r30,PT_GR21(%r1)	;! save for child
 
 	LDREG	PT_GR30(%r1),%r25
@@ -132,7 +132,7 @@ ENTRY(hpux_child_return)
 	bl,n	schedule_tail, %r2
 #endif
 
-	LDREG	TASK_PT_GR19-TASK_SZ_ALGN-128(%r30),%r2
+	LDREG	TASK_PT_SYSCALL_RP-TASK_SZ_ALGN-128(%r30),%r2
 	b fork_return
 	copy %r0,%r28
 ENDPROC(hpux_child_return)
diff --git a/arch/parisc/include/asm/atomic.h b/arch/parisc/include/asm/atomic.h
index 716634d..d7fabc4 100644
--- a/arch/parisc/include/asm/atomic.h
+++ b/arch/parisc/include/asm/atomic.h
@@ -24,29 +24,46 @@
  * Hash function to index into a different SPINLOCK.
  * Since "a" is usually an address, use one spinlock per cacheline.
  */
-#  define ATOMIC_HASH_SIZE 4
-#  define ATOMIC_HASH(a) (&(__atomic_hash[ (((unsigned long) (a))/L1_CACHE_BYTES) & (ATOMIC_HASH_SIZE-1) ]))
+#  define ATOMIC_HASH_SIZE (4096/L1_CACHE_BYTES)  /* 4 */
+#  define ATOMIC_HASH(a)      (&(__atomic_hash[ (((unsigned long) (a))/L1_CACHE_BYTES) & (ATOMIC_HASH_SIZE-1) ]))
+#  define ATOMIC_USER_HASH(a) (&(__atomic_user_hash[ (((unsigned long) (a))/L1_CACHE_BYTES) & (ATOMIC_HASH_SIZE-1) ]))
 
 extern arch_spinlock_t __atomic_hash[ATOMIC_HASH_SIZE] __lock_aligned;
+extern arch_spinlock_t __atomic_user_hash[ATOMIC_HASH_SIZE] __lock_aligned;
 
 /* Can't use raw_spin_lock_irq because of #include problems, so
  * this is the substitute */
-#define _atomic_spin_lock_irqsave(l,f) do {	\
-	arch_spinlock_t *s = ATOMIC_HASH(l);		\
+#define _atomic_spin_lock_irqsave_template(l,f,hash_func) do {	\
+	arch_spinlock_t *s = hash_func;		\
 	local_irq_save(f);			\
 	arch_spin_lock(s);			\
 } while(0)
 
-#define _atomic_spin_unlock_irqrestore(l,f) do {	\
-	arch_spinlock_t *s = ATOMIC_HASH(l);			\
+#define _atomic_spin_unlock_irqrestore_template(l,f,hash_func) do {	\
+	arch_spinlock_t *s = hash_func;			\
 	arch_spin_unlock(s);				\
 	local_irq_restore(f);				\
 } while(0)
 
+/* kernel memory locks */
+#define _atomic_spin_lock_irqsave(l,f)	\
+	_atomic_spin_lock_irqsave_template(l,f,ATOMIC_HASH(l))
+
+#define _atomic_spin_unlock_irqrestore(l,f)	\
+	_atomic_spin_unlock_irqrestore_template(l,f,ATOMIC_HASH(l))
+
+/* userspace memory locks */
+#define _atomic_spin_lock_irqsave_user(l,f)	\
+	_atomic_spin_lock_irqsave_template(l,f,ATOMIC_USER_HASH(l))
+
+#define _atomic_spin_unlock_irqrestore_user(l,f)	\
+	_atomic_spin_unlock_irqrestore_template(l,f,ATOMIC_USER_HASH(l))
 
 #else
 #  define _atomic_spin_lock_irqsave(l,f) do { local_irq_save(f); } while (0)
 #  define _atomic_spin_unlock_irqrestore(l,f) do { local_irq_restore(f); } while (0)
+#  define _atomic_spin_lock_irqsave_user(l,f) _atomic_spin_lock_irqsave(l,f)
+#  define _atomic_spin_unlock_irqrestore_user(l,f) _atomic_spin_lock_irqsave_user(l,f)
 #endif
 
 /* This should get optimized out since it's never called.
diff --git a/arch/parisc/include/asm/cacheflush.h b/arch/parisc/include/asm/cacheflush.h
index 7a73b61..ab87176 100644
--- a/arch/parisc/include/asm/cacheflush.h
+++ b/arch/parisc/include/asm/cacheflush.h
@@ -2,6 +2,7 @@
 #define _PARISC_CACHEFLUSH_H
 
 #include <linux/mm.h>
+#include <linux/uaccess.h>
 
 /* The usual comment is "Caches aren't brain-dead on the <architecture>".
  * Unfortunately, that doesn't apply to PA-RISC. */
@@ -113,11 +114,20 @@ static inline void *kmap(struct page *page)
 
 #define kunmap(page)			kunmap_parisc(page_address(page))
 
-#define kmap_atomic(page, idx)		page_address(page)
+static inline void *kmap_atomic(struct page *page, enum km_type idx)
+{
+	pagefault_disable();
+	return page_address(page);
+}
 
-#define kunmap_atomic(addr, idx)	kunmap_parisc(addr)
+static inline void kunmap_atomic(void *addr, enum km_type idx)
+{
+	kunmap_parisc(addr);
+	pagefault_enable();
+}
 
-#define kmap_atomic_pfn(pfn, idx)	page_address(pfn_to_page(pfn))
+#define kmap_atomic_prot(page, idx, prot)	kmap_atomic(page, idx)
+#define kmap_atomic_pfn(pfn, idx)	kmap_atomic(pfn_to_page(pfn), (idx))
 #define kmap_atomic_to_page(ptr)	virt_to_page(ptr)
 #endif
 
diff --git a/arch/parisc/include/asm/futex.h b/arch/parisc/include/asm/futex.h
index 0c705c3..7bc963e 100644
--- a/arch/parisc/include/asm/futex.h
+++ b/arch/parisc/include/asm/futex.h
@@ -55,6 +55,7 @@ futex_atomic_cmpxchg_inatomic(int __user *uaddr, int oldval, int newval)
 {
 	int err = 0;
 	int uval;
+	unsigned long flags;
 
 	/* futex.c wants to do a cmpxchg_inatomic on kernel NULL, which is
 	 * our gateway page, and causes no end of trouble...
@@ -65,10 +66,15 @@ futex_atomic_cmpxchg_inatomic(int __user *uaddr, int oldval, int newval)
 	if (!access_ok(VERIFY_WRITE, uaddr, sizeof(int)))
 		return -EFAULT;
 
+	_atomic_spin_lock_irqsave_user(uaddr, flags);
+
 	err = get_user(uval, uaddr);
-	if (err) return -EFAULT;
-	if (uval == oldval)
-		err = put_user(newval, uaddr);
+	if (!err)
+		if (uval == oldval)
+			err = put_user(newval, uaddr);
+
+	_atomic_spin_unlock_irqrestore_user(uaddr, flags);
+
 	if (err) return -EFAULT;
 	return uval;
 }
diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h
index a27d2e2..53ba987 100644
--- a/arch/parisc/include/asm/pgtable.h
+++ b/arch/parisc/include/asm/pgtable.h
@@ -14,6 +14,7 @@
 #include <linux/bitops.h>
 #include <asm/processor.h>
 #include <asm/cache.h>
+extern void flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr, unsigned long pfn);
 
 /*
  * kern_addr_valid(ADDR) tests if ADDR is pointing to valid kernel
@@ -456,17 +457,22 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
 	return old_pte;
 }
 
-static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
+static inline void ptep_set_wrprotect(struct vm_area_struct *vma, struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 {
 #ifdef CONFIG_SMP
 	unsigned long new, old;
+#endif
+	pte_t old_pte = *ptep;
+
+	if (pte_dirty(old_pte))
+		flush_cache_page(vma, addr, pte_pfn(old_pte));
 
+#ifdef CONFIG_SMP
 	do {
 		old = pte_val(*ptep);
 		new = pte_val(pte_wrprotect(__pte (old)));
 	} while (cmpxchg((unsigned long *) ptep, old, new) != old);
 #else
-	pte_t old_pte = *ptep;
 	set_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
 #endif
 }
diff --git a/arch/parisc/include/asm/system.h b/arch/parisc/include/asm/system.h
index d91357b..4653c77 100644
--- a/arch/parisc/include/asm/system.h
+++ b/arch/parisc/include/asm/system.h
@@ -160,7 +160,7 @@ static inline void set_eiem(unsigned long val)
    ldcd). */
 
 #define __PA_LDCW_ALIGNMENT	4
-#define __ldcw_align(a) ((volatile unsigned int *)a)
+#define __ldcw_align(a) (&(a)->slock)
 #define __LDCW	"ldcw,co"
 
 #endif /*!CONFIG_PA20*/
diff --git a/arch/parisc/kernel/asm-offsets.c b/arch/parisc/kernel/asm-offsets.c
index ec787b4..b2f35b2 100644
--- a/arch/parisc/kernel/asm-offsets.c
+++ b/arch/parisc/kernel/asm-offsets.c
@@ -137,6 +137,7 @@ int main(void)
 	DEFINE(TASK_PT_IAOQ0, offsetof(struct task_struct, thread.regs.iaoq[0]));
 	DEFINE(TASK_PT_IAOQ1, offsetof(struct task_struct, thread.regs.iaoq[1]));
 	DEFINE(TASK_PT_CR27, offsetof(struct task_struct, thread.regs.cr27));
+	DEFINE(TASK_PT_SYSCALL_RP, offsetof(struct task_struct, thread.regs.pad0));
 	DEFINE(TASK_PT_ORIG_R28, offsetof(struct task_struct, thread.regs.orig_r28));
 	DEFINE(TASK_PT_KSP, offsetof(struct task_struct, thread.regs.ksp));
 	DEFINE(TASK_PT_KPC, offsetof(struct task_struct, thread.regs.kpc));
@@ -225,6 +226,7 @@ int main(void)
 	DEFINE(PT_IAOQ0, offsetof(struct pt_regs, iaoq[0]));
 	DEFINE(PT_IAOQ1, offsetof(struct pt_regs, iaoq[1]));
 	DEFINE(PT_CR27, offsetof(struct pt_regs, cr27));
+	DEFINE(PT_SYSCALL_RP, offsetof(struct pt_regs, pad0));
 	DEFINE(PT_ORIG_R28, offsetof(struct pt_regs, orig_r28));
 	DEFINE(PT_KSP, offsetof(struct pt_regs, ksp));
 	DEFINE(PT_KPC, offsetof(struct pt_regs, kpc));
@@ -290,5 +292,11 @@ int main(void)
 	BLANK();
 	DEFINE(ASM_PDC_RESULT_SIZE, NUM_PDC_RESULT * sizeof(unsigned long));
 	BLANK();
+
+#ifdef CONFIG_SMP
+	DEFINE(ASM_ATOMIC_HASH_SIZE_SHIFT, __builtin_ffs(ATOMIC_HASH_SIZE)-1);
+	DEFINE(ASM_ATOMIC_HASH_ENTRY_SHIFT, __builtin_ffs(sizeof(__atomic_hash[0]))-1);
+#endif
+
 	return 0;
 }
diff --git a/arch/parisc/kernel/entry.S b/arch/parisc/kernel/entry.S
index 3a44f7f..a7e9472 100644
--- a/arch/parisc/kernel/entry.S
+++ b/arch/parisc/kernel/entry.S
@@ -364,32 +364,6 @@
 	.align		32
 	.endm
 
-	/* The following are simple 32 vs 64 bit instruction
-	 * abstractions for the macros */
-	.macro		EXTR	reg1,start,length,reg2
-#ifdef CONFIG_64BIT
-	extrd,u		\reg1,32+(\start),\length,\reg2
-#else
-	extrw,u		\reg1,\start,\length,\reg2
-#endif
-	.endm
-
-	.macro		DEP	reg1,start,length,reg2
-#ifdef CONFIG_64BIT
-	depd		\reg1,32+(\start),\length,\reg2
-#else
-	depw		\reg1,\start,\length,\reg2
-#endif
-	.endm
-
-	.macro		DEPI	val,start,length,reg
-#ifdef CONFIG_64BIT
-	depdi		\val,32+(\start),\length,\reg
-#else
-	depwi		\val,\start,\length,\reg
-#endif
-	.endm
-
 	/* In LP64, the space contains part of the upper 32 bits of the
 	 * fault.  We have to extract this and place it in the va,
 	 * zeroing the corresponding bits in the space register */
@@ -442,19 +416,19 @@
 	 */
 	.macro		L2_ptep	pmd,pte,index,va,fault
 #if PT_NLEVELS == 3
-	EXTR		\va,31-ASM_PMD_SHIFT,ASM_BITS_PER_PMD,\index
+	extru		\va,31-ASM_PMD_SHIFT,ASM_BITS_PER_PMD,\index
 #else
-	EXTR		\va,31-ASM_PGDIR_SHIFT,ASM_BITS_PER_PGD,\index
+	extru		\va,31-ASM_PGDIR_SHIFT,ASM_BITS_PER_PGD,\index
 #endif
-	DEP             %r0,31,PAGE_SHIFT,\pmd  /* clear offset */
+	dep             %r0,31,PAGE_SHIFT,\pmd  /* clear offset */
 	copy		%r0,\pte
 	ldw,s		\index(\pmd),\pmd
 	bb,>=,n		\pmd,_PxD_PRESENT_BIT,\fault
-	DEP		%r0,31,PxD_FLAG_SHIFT,\pmd /* clear flags */
+	dep		%r0,31,PxD_FLAG_SHIFT,\pmd /* clear flags */
 	copy		\pmd,%r9
 	SHLREG		%r9,PxD_VALUE_SHIFT,\pmd
-	EXTR		\va,31-PAGE_SHIFT,ASM_BITS_PER_PTE,\index
-	DEP		%r0,31,PAGE_SHIFT,\pmd  /* clear offset */
+	extru		\va,31-PAGE_SHIFT,ASM_BITS_PER_PTE,\index
+	dep		%r0,31,PAGE_SHIFT,\pmd  /* clear offset */
 	shladd		\index,BITS_PER_PTE_ENTRY,\pmd,\pmd
 	LDREG		%r0(\pmd),\pte		/* pmd is now pte */
 	bb,>=,n		\pte,_PAGE_PRESENT_BIT,\fault
@@ -605,7 +579,7 @@
 	depdi		0,31,32,\tmp
 #endif
 	copy		\va,\tmp1
-	DEPI		0,31,23,\tmp1
+	depi		0,31,23,\tmp1
 	cmpb,COND(<>),n	\tmp,\tmp1,\fault
 	ldi		(_PAGE_DIRTY|_PAGE_WRITE|_PAGE_READ),\prot
 	depd,z		\prot,8,7,\prot
@@ -758,6 +732,10 @@ ENTRY(__kernel_thread)
 
 	STREG	%r22, PT_GR22(%r1)	/* save r22 (arg5) */
 	copy	%r0, %r22		/* user_tid */
+	copy	%r0, %r21		/* child_tid */
+#else
+	stw	%r0, -52(%r30)	     	/* user_tid */
+	stw	%r0, -56(%r30)	     	/* child_tid */
 #endif
 	STREG	%r26, PT_GR26(%r1)  /* Store function & argument for child */
 	STREG	%r25, PT_GR25(%r1)
@@ -765,7 +743,7 @@ ENTRY(__kernel_thread)
 	ldo	CLONE_VM(%r26), %r26   /* Force CLONE_VM since only init_mm */
 	or	%r26, %r24, %r26      /* will have kernel mappings.	 */
 	ldi	1, %r25			/* stack_start, signals kernel thread */
-	stw	%r0, -52(%r30)	     	/* user_tid */
+	ldi	0, %r23			/* child_stack_size */
 #ifdef CONFIG_64BIT
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
@@ -972,7 +950,10 @@ intr_check_sig:
 	BL	do_notify_resume,%r2
 	copy	%r16, %r26			/* struct pt_regs *regs */
 
-	b,n	intr_check_sig
+	mfctl   %cr30,%r16		/* Reload */
+	LDREG	TI_TASK(%r16), %r16	/* thread_info -> task_struct */
+	b	intr_check_sig
+	ldo	TASK_REGS(%r16),%r16
 
 intr_restore:
 	copy            %r16,%r29
@@ -997,13 +978,6 @@ intr_restore:
 
 	rfi
 	nop
-	nop
-	nop
-	nop
-	nop
-	nop
-	nop
-	nop
 
 #ifndef CONFIG_PREEMPT
 # define intr_do_preempt	intr_restore
@@ -1026,14 +1000,12 @@ intr_do_resched:
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
 
-	ldil	L%intr_check_sig, %r2
-#ifndef CONFIG_64BIT
-	b	schedule
-#else
-	load32	schedule, %r20
-	bv	%r0(%r20)
-#endif
-	ldo	R%intr_check_sig(%r2), %r2
+	BL	schedule,%r2
+	nop
+	mfctl   %cr30,%r16		/* Reload */
+	LDREG	TI_TASK(%r16), %r16	/* thread_info -> task_struct */
+	b	intr_check_sig
+	ldo	TASK_REGS(%r16),%r16
 
 	/* preempt the current task on returning to kernel
 	 * mode from an interrupt, iff need_resched is set,
@@ -1772,9 +1744,9 @@ ENTRY(sys_fork_wrapper)
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
 
-	/* These are call-clobbered registers and therefore
-	   also syscall-clobbered (we hope). */
-	STREG	%r2,PT_GR19(%r1)	/* save for child */
+	STREG	%r2,PT_SYSCALL_RP(%r1)
+
+	/* WARNING - Clobbers r21, userspace must save! */
 	STREG	%r30,PT_GR21(%r1)
 
 	LDREG	PT_GR30(%r1),%r25
@@ -1804,7 +1776,7 @@ ENTRY(child_return)
 	nop
 
 	LDREG	TI_TASK-THREAD_SZ_ALGN-FRAME_SIZE-FRAME_SIZE(%r30), %r1
-	LDREG	TASK_PT_GR19(%r1),%r2
+	LDREG	TASK_PT_SYSCALL_RP(%r1),%r2
 	b	wrapper_exit
 	copy	%r0,%r28
 ENDPROC(child_return)
@@ -1823,8 +1795,9 @@ ENTRY(sys_clone_wrapper)
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
 
-	/* WARNING - Clobbers r19 and r21, userspace must save these! */
-	STREG	%r2,PT_GR19(%r1)	/* save for child */
+	STREG	%r2,PT_SYSCALL_RP(%r1)
+
+	/* WARNING - Clobbers r21, userspace must save! */
 	STREG	%r30,PT_GR21(%r1)
 	BL	sys_clone,%r2
 	copy	%r1,%r24
@@ -1847,7 +1820,9 @@ ENTRY(sys_vfork_wrapper)
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
 
-	STREG	%r2,PT_GR19(%r1)	/* save for child */
+	STREG	%r2,PT_SYSCALL_RP(%r1)
+
+	/* WARNING - Clobbers r21, userspace must save! */
 	STREG	%r30,PT_GR21(%r1)
 
 	BL	sys_vfork,%r2
@@ -2076,9 +2051,10 @@ syscall_restore:
 	LDREG	TASK_PT_GR31(%r1),%r31	   /* restore syscall rp */
 
 	/* NOTE: We use rsm/ssm pair to make this operation atomic */
+	LDREG   TASK_PT_GR30(%r1),%r1              /* Get user sp */
 	rsm     PSW_SM_I, %r0
-	LDREG   TASK_PT_GR30(%r1),%r30             /* restore user sp */
-	mfsp	%sr3,%r1			   /* Get users space id */
+	copy    %r1,%r30                           /* Restore user sp */
+	mfsp    %sr3,%r1                           /* Get user space id */
 	mtsp    %r1,%sr7                           /* Restore sr7 */
 	ssm     PSW_SM_I, %r0
 
diff --git a/arch/parisc/kernel/setup.c b/arch/parisc/kernel/setup.c
index cb71f3d..84b3239 100644
--- a/arch/parisc/kernel/setup.c
+++ b/arch/parisc/kernel/setup.c
@@ -128,6 +128,14 @@ void __init setup_arch(char **cmdline_p)
 	printk(KERN_INFO "The 32-bit Kernel has started...\n");
 #endif
 
+	/* Consistency check on the size and alignments of our spinlocks */
+#ifdef CONFIG_SMP
+	BUILD_BUG_ON(sizeof(arch_spinlock_t) != __PA_LDCW_ALIGNMENT);
+	BUG_ON((unsigned long)&__atomic_hash[0] & (__PA_LDCW_ALIGNMENT-1));
+	BUG_ON((unsigned long)&__atomic_hash[1] & (__PA_LDCW_ALIGNMENT-1));
+#endif
+	BUILD_BUG_ON((1<<L1_CACHE_SHIFT) != L1_CACHE_BYTES);
+
 	pdc_console_init();
 
 #ifdef CONFIG_64BIT
diff --git a/arch/parisc/kernel/syscall.S b/arch/parisc/kernel/syscall.S
index f5f9602..68e75ce 100644
--- a/arch/parisc/kernel/syscall.S
+++ b/arch/parisc/kernel/syscall.S
@@ -47,18 +47,17 @@ ENTRY(linux_gateway_page)
 	KILL_INSN
 	.endr
 
-	/* ADDRESS 0xb0 to 0xb4, lws uses 1 insns for entry */
+	/* ADDRESS 0xb0 to 0xb8, lws uses two insns for entry */
 	/* Light-weight-syscall entry must always be located at 0xb0 */
 	/* WARNING: Keep this number updated with table size changes */
 #define __NR_lws_entries (2)
 
 lws_entry:
-	/* Unconditional branch to lws_start, located on the 
-	   same gateway page */
-	b,n	lws_start
+	gate	lws_start, %r0		/* increase privilege */
+	depi	3, 31, 2, %r31		/* Ensure we return into user mode. */
 
-	/* Fill from 0xb4 to 0xe0 */
-	.rept 11
+	/* Fill from 0xb8 to 0xe0 */
+	.rept 10
 	KILL_INSN
 	.endr
 
@@ -423,9 +422,6 @@ tracesys_sigexit:
 
 	*********************************************************/
 lws_start:
-	/* Gate and ensure we return to userspace */
-	gate	.+8, %r0
-	depi	3, 31, 2, %r31	/* Ensure we return to userspace */
 
 #ifdef CONFIG_64BIT
 	/* FIXME: If we are a 64-bit kernel just
@@ -442,7 +438,7 @@ lws_start:
 #endif	
 
         /* Is the lws entry number valid? */
-	comiclr,>>=	__NR_lws_entries, %r20, %r0
+	comiclr,>>	__NR_lws_entries, %r20, %r0
 	b,n	lws_exit_nosys
 
 	/* WARNING: Trashing sr2 and sr3 */
@@ -473,7 +469,7 @@ lws_exit:
 	/* now reset the lowest bit of sp if it was set */
 	xor	%r30,%r1,%r30
 #endif
-	be,n	0(%sr3, %r31)
+	be,n	0(%sr7, %r31)
 
 
 	
@@ -529,7 +525,6 @@ lws_compare_and_swap32:
 #endif
 
 lws_compare_and_swap:
-#ifdef CONFIG_SMP
 	/* Load start of lock table */
 	ldil	L%lws_lock_start, %r20
 	ldo	R%lws_lock_start(%r20), %r28
@@ -572,8 +567,6 @@ cas_wouldblock:
 	ldo	2(%r0), %r28				/* 2nd case */
 	b	lws_exit				/* Contended... */
 	ldo	-EAGAIN(%r0), %r21			/* Spin in userspace */
-#endif
-/* CONFIG_SMP */
 
 	/*
 		prev = *addr;
@@ -601,13 +594,11 @@ cas_action:
 1:	ldw	0(%sr3,%r26), %r28
 	sub,<>	%r28, %r25, %r0
 2:	stw	%r24, 0(%sr3,%r26)
-#ifdef CONFIG_SMP
 	/* Free lock */
 	stw	%r20, 0(%sr2,%r20)
-# if ENABLE_LWS_DEBUG
+#if ENABLE_LWS_DEBUG
 	/* Clear thread register indicator */
 	stw	%r0, 4(%sr2,%r20)
-# endif
 #endif
 	/* Return to userspace, set no error */
 	b	lws_exit
@@ -615,12 +606,10 @@ cas_action:
 
 3:		
 	/* Error occured on load or store */
-#ifdef CONFIG_SMP
 	/* Free lock */
 	stw	%r20, 0(%sr2,%r20)
-# if ENABLE_LWS_DEBUG
+#if ENABLE_LWS_DEBUG
 	stw	%r0, 4(%sr2,%r20)
-# endif
 #endif
 	b	lws_exit
 	ldo	-EFAULT(%r0),%r21	/* set errno */
@@ -672,7 +661,6 @@ ENTRY(sys_call_table64)
 END(sys_call_table64)
 #endif
 
-#ifdef CONFIG_SMP
 	/*
 		All light-weight-syscall atomic operations 
 		will use this set of locks 
@@ -694,8 +682,6 @@ ENTRY(lws_lock_start)
 	.endr
 END(lws_lock_start)
 	.previous
-#endif
-/* CONFIG_SMP for lws_lock_start */
 
 .end
 
diff --git a/arch/parisc/lib/bitops.c b/arch/parisc/lib/bitops.c
index 353963d..bae6a86 100644
--- a/arch/parisc/lib/bitops.c
+++ b/arch/parisc/lib/bitops.c
@@ -15,6 +15,9 @@
 arch_spinlock_t __atomic_hash[ATOMIC_HASH_SIZE] __lock_aligned = {
 	[0 ... (ATOMIC_HASH_SIZE-1)]  = __ARCH_SPIN_LOCK_UNLOCKED
 };
+arch_spinlock_t __atomic_user_hash[ATOMIC_HASH_SIZE] __lock_aligned = {
+	[0 ... (ATOMIC_HASH_SIZE-1)]  = __ARCH_SPIN_LOCK_UNLOCKED
+};
 #endif
 
 #ifdef CONFIG_64BIT
diff --git a/kernel/fork.c b/kernel/fork.c
index f88bd98..108b1ed 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -608,7 +608,10 @@ void mm_release(struct task_struct *tsk, struct mm_struct *mm)
 			 * We don't check the error code - if userspace has
 			 * not set up a proper pointer then tough luck.
 			 */
+			unsigned long flags;
+			_atomic_spin_lock_irqsave_user(tsk->clear_child_tid, flags);
 			put_user(0, tsk->clear_child_tid);
+			_atomic_spin_unlock_irqrestore_user(tsk->clear_child_tid, flags);
 			sys_futex(tsk->clear_child_tid, FUTEX_WAKE,
 					1, NULL, NULL, 0);
 		}
@@ -1432,8 +1435,12 @@ long do_fork(unsigned long clone_flags,
 
 		nr = task_pid_vnr(p);
 
-		if (clone_flags & CLONE_PARENT_SETTID)
+		if (clone_flags & CLONE_PARENT_SETTID) {
+			unsigned long flags;
+			_atomic_spin_lock_irqsave_user(parent_tidptr, flags);
 			put_user(nr, parent_tidptr);
+			_atomic_spin_unlock_irqrestore_user(parent_tidptr, flags);
+		}
 
 		if (clone_flags & CLONE_VFORK) {
 			p->vfork_done = &vfork;
diff --git a/mm/memory.c b/mm/memory.c
index 09e4b1b..21c2916 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -616,7 +616,7 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	 * in the parent and the child
 	 */
 	if (is_cow_mapping(vm_flags)) {
-		ptep_set_wrprotect(src_mm, addr, src_pte);
+		ptep_set_wrprotect(vma, src_mm, addr, src_pte);
 		pte = pte_wrprotect(pte);
 	}
 

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-04-08 21:54                   ` John David Anglin
@ 2010-04-08 22:44                     ` John David Anglin
  2010-04-09 14:14                       ` Carlos O'Donell
  2010-06-02 15:33                       ` Bug#561203: threads and fork on machine with VIPT-WB cache Modestas Vainius
  0 siblings, 2 replies; 74+ messages in thread
From: John David Anglin @ 2010-04-08 22:44 UTC (permalink / raw)
  To: dave.anglin; +Cc: deller, gniibe, linux-parisc, pkg-gauche-devel, 561203

> On Thu, 08 Apr 2010, Helge Deller wrote:

> > I tested your patch today on one of my machines with plain kernel 2.6.33 (32bit, SMP, B2000 I think).
> > Sadly I still did see the minifail bug.
> > 
> > Are you sure, that the patch fixed this bug for you?
> 
> Seemed to, but I have a bunch of other changes installed.  Possibly,
> the change to cacheflush.h is important.  It affects all PA8000.

I also think the change suggested by James

+       if (pte_dirty(old_pte))

is important for SMP.  With the patch set that I sent, my rp3440 and
gsyprf11 seem reasonably stable running 2.6.33.2 SMP.  I doubt all
problems are solved but things are a lot better than before.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-04-08 22:44                     ` John David Anglin
@ 2010-04-09 14:14                       ` Carlos O'Donell
  2010-04-09 15:13                         ` John David Anglin
                                           ` (4 more replies)
  2010-06-02 15:33                       ` Bug#561203: threads and fork on machine with VIPT-WB cache Modestas Vainius
  1 sibling, 5 replies; 74+ messages in thread
From: Carlos O'Donell @ 2010-04-09 14:14 UTC (permalink / raw)
  To: John David Anglin; +Cc: dave.anglin, deller, gniibe, linux-parisc

On Thu, Apr 8, 2010 at 6:44 PM, John David Anglin
<dave@hiauly1.hia.nrc.ca> wrote:
>> On Thu, 08 Apr 2010, Helge Deller wrote:
>
>> > I tested your patch today on one of my machines with plain kernel 2.6.33 (32bit, SMP, B2000 I think).
>> > Sadly I still did see the minifail bug.
>> >
>> > Are you sure, that the patch fixed this bug for you?
>>
>> Seemed to, but I have a bunch of other changes installed.  Possibly,
>> the change to cacheflush.h is important.  It affects all PA8000.
>
> I also think the change suggested by James
>
> +       if (pte_dirty(old_pte))
>
> is important for SMP.  With the patch set that I sent, my rp3440 and
> gsyprf11 seem reasonably stable running 2.6.33.2 SMP.  I doubt all
> problems are solved but things are a lot better than before.

I have trimmed the CC a bit.

We need to start splitting up your giant "stability" patch into
manageable chunks.

For example, are the futex fixes anywhere for Kyle to pickup?

I could test those independently and submit to Kyle after testing.

Cheers,
Carlos.
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-04-09 14:14                       ` Carlos O'Donell
@ 2010-04-09 15:13                         ` John David Anglin
  2010-04-09 15:48                           ` James Bottomley
                                             ` (2 more replies)
  2010-04-11 17:03                         ` [PATCH] Remove unnecessary macros from entry.S John David Anglin
                                           ` (3 subsequent siblings)
  4 siblings, 3 replies; 74+ messages in thread
From: John David Anglin @ 2010-04-09 15:13 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: dave.anglin, deller, gniibe, linux-parisc

On Fri, 09 Apr 2010, Carlos O'Donell wrote:

> We need to start splitting up your giant "stability" patch into
> manageable chunks.

I agree.  What I posted was not intended as a submission.  Some
of the changes aren't mine, some are cleanups, some are "obvious",
some are not obvious and may well be wrong, or inefficient.

I posted the change to not clobber r19 on fork/clone syscalls,
but there has been no response to it.

I will try to split up the change this weekend.

> For example, are the futex fixes anywhere for Kyle to pickup?

The futex fixes are Helge's and were posted to the list on Wed,
03 Feb 2010 23:03:49 +0100 along with Helge's minifail3.c.

The change to syscall.S in the above post included two hunks from me.
I removed Helge's portion so that I could enable LWS locking on UP
builds because of the concern that a page fault on COW memory could
cause user code to be scheduled in the locking code.

I think there is merit in using __atomic_user_hash, etc, but it
will take a bit of work to make it available in UP kernels.
I think Helge's change to fork.c is probably not necessary.
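
For what it's worth, the __atomic_user_hash idea is just the existing
hashed-lock scheme keyed by the user address being updated.  A rough
sketch of what the SMP variants would presumably look like (illustration
only, not one of the posted patches; it uses the __atomic_user_hash array
added in bitops.c earlier in the thread):

#define ATOMIC_HASH_USER(a) \
	(&(__atomic_user_hash[(((unsigned long)(a)) / L1_CACHE_BYTES) & (ATOMIC_HASH_SIZE - 1)]))

/* Take the per-address user lock with interrupts off, mirroring
   _atomic_spin_lock_irqsave(). */
#define _atomic_spin_lock_irqsave_user(l, f) do {	\
	arch_spinlock_t *s = ATOMIC_HASH_USER(l);	\
	local_irq_save(f);				\
	arch_spin_lock(s);				\
} while (0)

#define _atomic_spin_unlock_irqrestore_user(l, f) do {	\
	arch_spinlock_t *s = ATOMIC_HASH_USER(l);	\
	arch_spin_unlock(s);				\
	local_irq_restore(f);				\
} while (0)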

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-04-09 15:13                         ` John David Anglin
@ 2010-04-09 15:48                           ` James Bottomley
  2010-04-09 16:22                             ` John David Anglin
  2010-04-10 20:46                           ` Helge Deller
  2010-04-11 16:36                           ` [PATCH] Call pagefault_disable/pagefault_enable in kmap_atomic/kunmap_atomic John David Anglin
  2 siblings, 1 reply; 74+ messages in thread
From: James Bottomley @ 2010-04-09 15:48 UTC (permalink / raw)
  To: John David Anglin; +Cc: Carlos O'Donell, deller, gniibe, linux-parisc

On Fri, 2010-04-09 at 11:13 -0400, John David Anglin wrote:
> On Fri, 09 Apr 2010, Carlos O'Donell wrote:
> 
> > We need to start splitting up your giant "stability" patch into
> > manageable chunks.
> 
> I agree.  What I posted was not intended as a submission.  Some
> of the changes aren't mine, some are cleanups, some are "obvious",
> some are not obvious and may well be wrong, or inefficient.
> 
> I posted the change to not clobber r19 on fork/clone syscalls,
> but there has been no response to it.

Theory looks fine to me ... although Helge sees no difference in
behaviour, I'd be happy to apply on the principle of no harm and
theoretically necessary.

> I will try to split up the change this weekend.
> 
> > For example, are the futex fixes anywhere for Kyle to pickup?
> 
> The futex fixes are Helge's and were posted to the list on Wed,
> 03 Feb 2010 23:03:49 +0100 along with Helge's minifail3.c.
> 
> The change to syscall.S in the above post included two hunks from me.
> I removed Helge's portion so that I could enable LWS locking on UP
> builds because of the concern that a page fault on COW memory could
> cause user code to be scheduled in the locking code.
> 
> I think there is merit in using __atomic_user_hash, etc, but it
> will take a bit of work to make it available in UP kernels.
> I think Helge's change to fork.c is probably not necessary.

I'm fairly convinced we need the flush on kmap(_atomic) as well if the
user had dirtied the page.  Right at the moment if the aliases are
inequivalent (which they are about 99.9% of the time) we're in danger of
using stale data in the kernel.  We also need a flush on kunmap if the
kernel has dirtied the page.
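
To make that concrete, a minimal sketch of what the kmap side could look
like (an illustration of the idea only, not a posted patch; it leans on
the existing flush_dcache_page() helper, and the kunmap side already goes
through kunmap_parisc()):

static inline void *kmap(struct page *page)
{
	/* user space may have dirtied this page through its own mapping;
	   clean that alias before the kernel reads the page through its
	   kernel address */
	flush_dcache_page(page);
	return page_address(page);
}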

James



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-04-09 15:48                           ` James Bottomley
@ 2010-04-09 16:22                             ` John David Anglin
  2010-04-09 16:31                               ` James Bottomley
  0 siblings, 1 reply; 74+ messages in thread
From: John David Anglin @ 2010-04-09 16:22 UTC (permalink / raw)
  To: James Bottomley
  Cc: John David Anglin, Carlos O'Donell, deller, gniibe, linux-parisc

On Fri, 09 Apr 2010, James Bottomley wrote:

> > I posted the change to not clobber r19 on fork/clone syscalls,
> > but there has been no response to it.
> 
> Theory looks fine to me ... although Helge sees no difference in
> behaviour, I'd be happy to apply on the principle of no harm and
> theoretically necessary.

Testcase is below.  Compile with `-static' to link with libc.a.
Testcase prints incorrect parent pid.

Carlos recently updated glibc so it saves/restores r19 across syscalls
in non-PIC code.  I thought Helge tested this update, so he wouldn't
see the problem unless he tested with an "old" version of glibc.

While there isn't a lot of code linked with -static (I mainly
use it for debugging), there's no real reason that the kernel needs
to clobber r19 across fork/clone.

If we want to preserve pad0 for some future use, I think that it
should be possible to use another syscall clobbered register to
save the return pointer for the child.

> I'm fairly convinced we need the flush on kmap(_atomic) as well if the
> user had dirtied the page.  Right at the moment if the aliases are
> inequivalent (which they are about 99.9% of the time) we're in danger of
> using stale data in the kernel.  We also need a flush on kunmap if the
> kernel has dirtied the page.

This is presumably for PA8800/PA8900 processors?  I'm a bit surprised
that HP would change the cache coherency requirement for these processors
alone given the performance hit in maintaining coherency.  Is this an
issue for all systems with more than two processors?

I've wondered if checking the dirty bit in kunmap would help performance.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

#define CALL_EXIT 0

int main (void)
{
  pid_t child;
  pid_t parent;
  char *cmd[] = { "bash", "-c", "echo In child $$;", (char *)0 };
  char *env[] = { "HOME=/tmp", (char *)0 };
  int ret;

  child = vfork();

  if (child == 0)
    {
      ret = execve("/bin/bash", cmd, env);
      // printf ("ret = %d\n", ret);
      _exit(1);
    }
  else
    {
      // printf("child != 0\n");
    }

  parent = getpid();
  printf("parent is %d\n", (unsigned int)parent);
  printf("child is %d\n", (unsigned int)child);

  return 0;
}

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-04-09 16:22                             ` John David Anglin
@ 2010-04-09 16:31                               ` James Bottomley
  0 siblings, 0 replies; 74+ messages in thread
From: James Bottomley @ 2010-04-09 16:31 UTC (permalink / raw)
  To: John David Anglin; +Cc: Carlos O'Donell, deller, gniibe, linux-parisc

On Fri, 2010-04-09 at 12:22 -0400, John David Anglin wrote:
> On Fri, 09 Apr 2010, James Bottomley wrote:
> 
> > > I posted the change to not clobber r19 on fork/clone syscalls,
> > > but there has been no response to it.
> > 
> > Theory looks fine to me ... although Helge sees no difference in
> > behaviour, I'd be happy to apply on the principle of no harm and
> > theoretically necessary.
> 
> Testcase is below.  Compile with `-static' to link with libc.a.
> Testcase prints incorrect parent pid.
> 
> Carlos recently updated glibc so it saves/restores r19 across syscalls
> in non-PIC code.  I thought Helge tested this update, so he wouldn't
> see the problem unless he tested with an "old" version of glibc.
> 
> While there isn't a lot of code linked with -static (I mainly
> use for debugging), there's no real reason that the kernel needs
> to clobber r19 across fork/clone.
> 
> If we want to preserve pad0 for some future use, I think that it
> should be possible to use another syscall clobbered register to
> save the return pointer for the child.
> 
> > I'm fairly convinced we need the flush on kmap(_atomic) as well if the
> > user had dirtied the page.  Right at the moment if the aliases are
> > inequivalent (which they are about 99.9% of the time) we're in danger of
> > using stale data in the kernel.  We also need a flush on kunmap if the
> > kernel has dirtied the page.
> 
> This is presumably for PA8800/PA8900 processors?

No, the pa88/89 problem was a clean alias resolution issue, which doesn't
exist on earlier processors that don't have the L3 PIPT cache.  The
problem I think we see infrequently (very infrequently) is that we kmap
a user page that has dirty cache lines and access it through the kernel
alias before the lines get cleaned.  This only occurs in a very few
areas of the kernel (like copy_user_highpage), but it's a potential
source of stale data through inequivalent aliasing.

>   I'm a bit surprised
> that HP would change the cache coherency requirement for these processors
> alone given the performance hit in maintaining coherency.  Is this an
> issue for all systems with more than two processors?

The requirement has always been to flush the dirty line before accessing
via another alias (appendix F says that); I just think we've been playing
it a bit loose in the kmap area.

> I've wondered if checking the dirty bit in kunmap would help performance.

It would ... enormously ...  and also on kmap.  The latter is much
harder to do since we need the vaddr of the page and we don't get it fed
in via the API.

James



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-04-09 15:13                         ` John David Anglin
  2010-04-09 15:48                           ` James Bottomley
@ 2010-04-10 20:46                           ` Helge Deller
  2010-04-10 21:56                             ` John David Anglin
  2010-04-10 22:53                             ` John David Anglin
  2010-04-11 16:36                           ` [PATCH] Call pagefault_disable/pagefault_enable in kmap_atomic/kunmap_atomic John David Anglin
  2 siblings, 2 replies; 74+ messages in thread
From: Helge Deller @ 2010-04-10 20:46 UTC (permalink / raw)
  To: John David Anglin
  Cc: John David Anglin, Carlos O'Donell, gniibe, linux-parisc

On 04/09/2010 05:13 PM, John David Anglin wrote:
> On Fri, 09 Apr 2010, Carlos O'Donell wrote:
>> For example, are the futex fixes anywhere for Kyle to pickup?
> 
> The futex fixes are Helge's and were posted to the list on Wed,
> 03 Feb 2010 23:03:49 +0100 along with Helge's minifail3.c.

Yes, but there is still a bug in the patch I sent, and it's still in Dave's big
patchset. See below...

--- a/arch/parisc/include/asm/atomic.h
+++ b/arch/parisc/include/asm/atomic.h
 #else
 #  define _atomic_spin_lock_irqsave(l,f) do { local_irq_save(f); } while (0)
 #  define _atomic_spin_unlock_irqrestore(l,f) do { local_irq_restore(f); } while (0)
+#  define _atomic_spin_lock_irqsave_user(l,f) _atomic_spin_lock_irqsave(l,f)
+#  define _atomic_spin_unlock_irqrestore_user(l,f) _atomic_spin_lock_irqsave_user(l,f)

atomic_spin_lock_irqsave_user() is wrong.
It needs to be:
->  atomic_spin_lock_irqsave_user(l,f) _atomic_spin_lock_irqsave_user(l,f)

In addition, my patch doesn't touch all needed atomic locks.

Nevertheless, on my B2000 (32bit, SMP, 2.6.32.2 kernel) I still do see the minifail bug.
The only difference seems to be, that the minifail3 program doesn't get stuck any
more. It still crashes though from time to time...
So, at least a little improvement :-)

Helge

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-04-10 20:46                           ` Helge Deller
@ 2010-04-10 21:56                             ` John David Anglin
  2010-04-10 22:53                             ` John David Anglin
  1 sibling, 0 replies; 74+ messages in thread
From: John David Anglin @ 2010-04-10 21:56 UTC (permalink / raw)
  To: Helge Deller; +Cc: dave.anglin, carlos, gniibe, linux-parisc

> On 04/09/2010 05:13 PM, John David Anglin wrote:
> > On Fri, 09 Apr 2010, Carlos O'Donell wrote:
> >> For example, are the futex fixes anywhere for Kyle to pickup?
> > 
> > The futex fixes are Helge's and were posted to the list on Wed,
> > 03 Feb 2010 23:03:49 +0100 along with Helge's minifail3.c.
> 
> Yes, but there is still a bug in the patch I sent, and it's still in Dave's big
> patchset. See below...
> 
> --- a/arch/parisc/include/asm/atomic.h
> +++ b/arch/parisc/include/asm/atomic.h
>  #else
>  #  define _atomic_spin_lock_irqsave(l,f) do { local_irq_save(f); } while (0)
>  #  define _atomic_spin_unlock_irqrestore(l,f) do { local_irq_restore(f); } while (0)
> +#  define _atomic_spin_lock_irqsave_user(l,f) _atomic_spin_lock_irqsave(l,f)
> +#  define _atomic_spin_unlock_irqrestore_user(l,f) _atomic_spin_lock_irqsave_user(l,f)
> 
> atomic_spin_lock_irqsave_user() is wrong.
> It needs to be:
> ->  atomic_spin_lock_irqsave_user(l,f) _atomic_spin_lock_irqsave_user(l,f)

Huh?  I see the following line is wrong...
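
That is, it's the unlock define that expands to a lock/save; presumably
the UP fallback was meant to read (sketch only):

#  define _atomic_spin_lock_irqsave_user(l,f)      _atomic_spin_lock_irqsave(l,f)
#  define _atomic_spin_unlock_irqrestore_user(l,f) _atomic_spin_unlock_irqrestore(l,f)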

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-04-10 20:46                           ` Helge Deller
  2010-04-10 21:56                             ` John David Anglin
@ 2010-04-10 22:53                             ` John David Anglin
  2010-04-11 18:50                               ` Helge Deller
  1 sibling, 1 reply; 74+ messages in thread
From: John David Anglin @ 2010-04-10 22:53 UTC (permalink / raw)
  To: Helge Deller; +Cc: John David Anglin, Carlos O'Donell, gniibe, linux-parisc

On Sat, 10 Apr 2010, Helge Deller wrote:

> Nevertheless, on my B2000 (32bit, SMP, 2.6.32.2 kernel) I still do see the minifail bug.
> The only difference seems to be, that the minifail3 program doesn't get stuck any
> more. It still crashes though from time to time...

There are some issues with your minifail3.c testcase.  The fork'd child
shouldn't do any I/O and it should exit using _exit(0).  Otherwise, it
can corrupt the I/O structures of the parent.  I'm not sure that this
is the issue on your B2000, but it's worth a try.
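
For example, the child branch would look something like this (an
illustrative fragment, not quoted from the testcase; raw write(2)
instead of stdio, and _exit() rather than exit()):

	switch (fork()) {
		case 0:
			/* child: raw write(2) only, and _exit(), so the
			   parent's stdio buffers aren't touched */
			write(1, "Child OK.\n", 10);
			_exit(0);
		case -1:
			perror("fork() failed");
			break;
		default:
			break;
	}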

The testcase when modified as above doesn't crash on my c3750 (32bit, UP,
2.6.32.2 kernel).

I found in debugging this testcase that the crash was always associated
with the stack region for thread_run.  I put a big loop in thread_run.
The index for the loop when compiled at -O0 is constantly being saved
and restored on the stack.  I found that crashes occurred after many
iterations of the loop.  Nothing else was going on.

The COW discussion convinced me that cache flushing was the problem.
The fork (clone) syscall causes the stack region used by thread_run
to become COW'd.  When thread_run is scheduled, the loop caused an
instant COW break and stack corruption.  The state of the stack region
generally returned to its state before the fork.

If the above doesn't fix the testcase on your B2000, there must be
some difference between it and other PA8000 machines.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH] Call pagefault_disable/pagefault_enable in kmap_atomic/kunmap_atomic
  2010-04-09 15:13                         ` John David Anglin
  2010-04-09 15:48                           ` James Bottomley
  2010-04-10 20:46                           ` Helge Deller
@ 2010-04-11 16:36                           ` John David Anglin
  2 siblings, 0 replies; 74+ messages in thread
From: John David Anglin @ 2010-04-11 16:36 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: dave.anglin, deller, gniibe, linux-parisc

[-- Attachment #1: Type: text/plain, Size: 822 bytes --]

On Fri, 09 Apr 2010, John David Anglin wrote:

> On Fri, 09 Apr 2010, Carlos O'Donell wrote:
> 
> > We need to start splitting up your giant "stability" patch into
> > manageable chunks.

Here's the first chunk.

Based on the generic implementation of kmap_atomic and kunmap_atomic,
we should call pagefault_disable and pagefault_enable in our PA8000
implementation.

The define for kmap_atomic_prot was also missing, and I updated
kmap_atomic_pfn to use the generic implementation because of the
change to kmap_atomic.

I believe that this change is needed to fix the fork copy-on-write
bug.

Signed-off-by: John David Anglin <dave.anglin@nrc-cnrc.gc.ca>

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

[-- Attachment #2: cacheflush.h.d --]
[-- Type: text/plain, Size: 1181 bytes --]

diff --git a/arch/parisc/include/asm/cacheflush.h b/arch/parisc/include/asm/cacheflush.h
index 7a73b61..ab87176 100644
--- a/arch/parisc/include/asm/cacheflush.h
+++ b/arch/parisc/include/asm/cacheflush.h
@@ -2,6 +2,7 @@
 #define _PARISC_CACHEFLUSH_H
 
 #include <linux/mm.h>
+#include <linux/uaccess.h>
 
 /* The usual comment is "Caches aren't brain-dead on the <architecture>".
  * Unfortunately, that doesn't apply to PA-RISC. */
@@ -113,11 +114,20 @@ static inline void *kmap(struct page *page)
 
 #define kunmap(page)			kunmap_parisc(page_address(page))
 
-#define kmap_atomic(page, idx)		page_address(page)
+static inline void *kmap_atomic(struct page *page, enum km_type idx)
+{
+	pagefault_disable();
+	return page_address(page);
+}
 
-#define kunmap_atomic(addr, idx)	kunmap_parisc(addr)
+static inline void kunmap_atomic(void *addr, enum km_type idx)
+{
+	kunmap_parisc(addr);
+	pagefault_enable();
+}
 
-#define kmap_atomic_pfn(pfn, idx)	page_address(pfn_to_page(pfn))
+#define kmap_atomic_prot(page, idx, prot)	kmap_atomic(page, idx)
+#define kmap_atomic_pfn(pfn, idx)	kmap_atomic(pfn_to_page(pfn), (idx))
 #define kmap_atomic_to_page(ptr)	virt_to_page(ptr)
 #endif
 

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH] Remove unnecessary macros from entry.S
  2010-04-09 14:14                       ` Carlos O'Donell
  2010-04-09 15:13                         ` John David Anglin
@ 2010-04-11 17:03                         ` John David Anglin
  2010-04-11 17:08                         ` [PATCH] Delete unnecessary nop's in entry.S John David Anglin
                                           ` (2 subsequent siblings)
  4 siblings, 0 replies; 74+ messages in thread
From: John David Anglin @ 2010-04-11 17:03 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: dave.anglin, deller, gniibe, linux-parisc

[-- Attachment #1: Type: text/plain, Size: 631 bytes --]

On Fri, 09 Apr 2010, Carlos O'Donell wrote:

> We need to start splitting up your giant "stability" patch into
> manageable chunks.

Here's the second chunk.  It's a cleanup.

The EXTR, DEP and DEPI macros are unnecessary.  There are PA 1.X
mnemonics available with the same functionality, and the DEP and DEPI
macros conflict with assembler mnemonics.

Tested on a variety of 32 and 64-bit systems.

Signed-off-by: John David Anglin <dave.anglin@nrc-cnrc.gc.ca>

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

[-- Attachment #2: entry.S.d --]
[-- Type: text/plain, Size: 2209 bytes --]

diff --git a/arch/parisc/kernel/entry.S b/arch/parisc/kernel/entry.S
index 3a44f7f..a7e9472 100644
--- a/arch/parisc/kernel/entry.S
+++ b/arch/parisc/kernel/entry.S
@@ -364,32 +364,6 @@
 	.align		32
 	.endm
 
-	/* The following are simple 32 vs 64 bit instruction
-	 * abstractions for the macros */
-	.macro		EXTR	reg1,start,length,reg2
-#ifdef CONFIG_64BIT
-	extrd,u		\reg1,32+(\start),\length,\reg2
-#else
-	extrw,u		\reg1,\start,\length,\reg2
-#endif
-	.endm
-
-	.macro		DEP	reg1,start,length,reg2
-#ifdef CONFIG_64BIT
-	depd		\reg1,32+(\start),\length,\reg2
-#else
-	depw		\reg1,\start,\length,\reg2
-#endif
-	.endm
-
-	.macro		DEPI	val,start,length,reg
-#ifdef CONFIG_64BIT
-	depdi		\val,32+(\start),\length,\reg
-#else
-	depwi		\val,\start,\length,\reg
-#endif
-	.endm
-
 	/* In LP64, the space contains part of the upper 32 bits of the
 	 * fault.  We have to extract this and place it in the va,
 	 * zeroing the corresponding bits in the space register */
@@ -442,19 +416,19 @@
 	 */
 	.macro		L2_ptep	pmd,pte,index,va,fault
 #if PT_NLEVELS == 3
-	EXTR		\va,31-ASM_PMD_SHIFT,ASM_BITS_PER_PMD,\index
+	extru		\va,31-ASM_PMD_SHIFT,ASM_BITS_PER_PMD,\index
 #else
-	EXTR		\va,31-ASM_PGDIR_SHIFT,ASM_BITS_PER_PGD,\index
+	extru		\va,31-ASM_PGDIR_SHIFT,ASM_BITS_PER_PGD,\index
 #endif
-	DEP             %r0,31,PAGE_SHIFT,\pmd  /* clear offset */
+	dep             %r0,31,PAGE_SHIFT,\pmd  /* clear offset */
 	copy		%r0,\pte
 	ldw,s		\index(\pmd),\pmd
 	bb,>=,n		\pmd,_PxD_PRESENT_BIT,\fault
-	DEP		%r0,31,PxD_FLAG_SHIFT,\pmd /* clear flags */
+	dep		%r0,31,PxD_FLAG_SHIFT,\pmd /* clear flags */
 	copy		\pmd,%r9
 	SHLREG		%r9,PxD_VALUE_SHIFT,\pmd
-	EXTR		\va,31-PAGE_SHIFT,ASM_BITS_PER_PTE,\index
-	DEP		%r0,31,PAGE_SHIFT,\pmd  /* clear offset */
+	extru		\va,31-PAGE_SHIFT,ASM_BITS_PER_PTE,\index
+	dep		%r0,31,PAGE_SHIFT,\pmd  /* clear offset */
 	shladd		\index,BITS_PER_PTE_ENTRY,\pmd,\pmd
 	LDREG		%r0(\pmd),\pte		/* pmd is now pte */
 	bb,>=,n		\pte,_PAGE_PRESENT_BIT,\fault
@@ -605,7 +579,7 @@
 	depdi		0,31,32,\tmp
 #endif
 	copy		\va,\tmp1
-	DEPI		0,31,23,\tmp1
+	depi		0,31,23,\tmp1
 	cmpb,COND(<>),n	\tmp,\tmp1,\fault
 	ldi		(_PAGE_DIRTY|_PAGE_WRITE|_PAGE_READ),\prot
 	depd,z		\prot,8,7,\prot

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH] Delete unnecessary nop's in entry.S
  2010-04-09 14:14                       ` Carlos O'Donell
  2010-04-09 15:13                         ` John David Anglin
  2010-04-11 17:03                         ` [PATCH] Remove unnecessary macros from entry.S John David Anglin
@ 2010-04-11 17:08                         ` John David Anglin
  2010-04-11 17:12                         ` [PATCH] Avoid interruption in critical region " John David Anglin
  2010-04-11 17:26                         ` [PATCH] LWS fixes for syscall.S John David Anglin
  4 siblings, 0 replies; 74+ messages in thread
From: John David Anglin @ 2010-04-11 17:08 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: dave.anglin, deller, gniibe, linux-parisc

[-- Attachment #1: Type: text/plain, Size: 422 bytes --]

On Fri, 09 Apr 2010, Carlos O'Donell wrote:

> We need to start splitting up your giant "stability" patch into
> manageable chunks.

Here's the third chunk.  It removes some unnecessary nop's.

Signed-off-by: John David Anglin <dave.anglin@nrc-cnrc.gc.ca>

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

[-- Attachment #2: entry.S.d.1 --]
[-- Type: text/plain, Size: 321 bytes --]

diff --git a/arch/parisc/kernel/entry.S b/arch/parisc/kernel/entry.S
index 3a44f7f..a7e9472 100644
--- a/arch/parisc/kernel/entry.S
+++ b/arch/parisc/kernel/entry.S
@@ -997,13 +978,6 @@ intr_restore:
 
 	rfi
 	nop
-	nop
-	nop
-	nop
-	nop
-	nop
-	nop
-	nop
 
 #ifndef CONFIG_PREEMPT
 # define intr_do_preempt	intr_restore

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH] Avoid interruption in critical region in entry.S
  2010-04-09 14:14                       ` Carlos O'Donell
                                           ` (2 preceding siblings ...)
  2010-04-11 17:08                         ` [PATCH] Delete unnecessary nop's in entry.S John David Anglin
@ 2010-04-11 17:12                         ` John David Anglin
  2010-04-11 18:24                           ` James Bottomley
  2010-04-11 17:26                         ` [PATCH] LWS fixes for syscall.S John David Anglin
  4 siblings, 1 reply; 74+ messages in thread
From: John David Anglin @ 2010-04-11 17:12 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: dave.anglin, deller, gniibe, linux-parisc

[-- Attachment #1: Type: text/plain, Size: 409 bytes --]

On Fri, 09 Apr 2010, Carlos O'Donell wrote:

> We need to start splitting up your giant "stability" patch into
> manageable chunks.

Here's the fourth chunk.  Am I being paranoid?

Signed-off-by: John David Anglin <dave.anglin@nrc-cnrc.gc.ca>

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

[-- Attachment #2: entry.S.d.2 --]
[-- Type: text/plain, Size: 755 bytes --]

diff --git a/arch/parisc/kernel/entry.S b/arch/parisc/kernel/entry.S
index 3a44f7f..a7e9472 100644
--- a/arch/parisc/kernel/entry.S
+++ b/arch/parisc/kernel/entry.S
@@ -2076,9 +2051,10 @@ syscall_restore:
 	LDREG	TASK_PT_GR31(%r1),%r31	   /* restore syscall rp */
 
 	/* NOTE: We use rsm/ssm pair to make this operation atomic */
+	LDREG   TASK_PT_GR30(%r1),%r1              /* Get user sp */
 	rsm     PSW_SM_I, %r0
-	LDREG   TASK_PT_GR30(%r1),%r30             /* restore user sp */
-	mfsp	%sr3,%r1			   /* Get users space id */
+	copy    %r1,%r30                           /* Restore user sp */
+	mfsp    %sr3,%r1                           /* Get user space id */
 	mtsp    %r1,%sr7                           /* Restore sr7 */
 	ssm     PSW_SM_I, %r0
 

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH] LWS fixes for syscall.S
  2010-04-09 14:14                       ` Carlos O'Donell
                                           ` (3 preceding siblings ...)
  2010-04-11 17:12                         ` [PATCH] Avoid interruption in critical region " John David Anglin
@ 2010-04-11 17:26                         ` John David Anglin
  4 siblings, 0 replies; 74+ messages in thread
From: John David Anglin @ 2010-04-11 17:26 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: dave.anglin, deller, gniibe, linux-parisc

[-- Attachment #1: Type: text/plain, Size: 837 bytes --]

On Fri, 09 Apr 2010, Carlos O'Donell wrote:

> We need to start splitting up your giant "stability" patch into
> manageable chunks.

Here's the fifth chunk.  It contains a variety of fixes to the LWS
code in syscall.S.

1) Gate immediately and save a branch.
2) Fix off by one error in checking entry number.
3) Use sr7 instead of sr3 in the error return path as sr3 might not
   contain the correct value.
4) Enable locking on UP systems to prevent incorrect operation of
   the cas_action critical region on page faults.

Fixes 2 and 4 are new.

Tested on several systems, including UP c3750 with 2.6.33.2 kernel.

Signed-off-by: John David Anglin <dave.anglin@nrc-cnrc.gc.ca>

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

[-- Attachment #2: syscall.S.d --]
[-- Type: text/plain, Size: 2899 bytes --]

diff --git a/arch/parisc/kernel/syscall.S b/arch/parisc/kernel/syscall.S
index f5f9602..68e75ce 100644
--- a/arch/parisc/kernel/syscall.S
+++ b/arch/parisc/kernel/syscall.S
@@ -47,18 +47,17 @@ ENTRY(linux_gateway_page)
 	KILL_INSN
 	.endr
 
-	/* ADDRESS 0xb0 to 0xb4, lws uses 1 insns for entry */
+	/* ADDRESS 0xb0 to 0xb8, lws uses two insns for entry */
 	/* Light-weight-syscall entry must always be located at 0xb0 */
 	/* WARNING: Keep this number updated with table size changes */
 #define __NR_lws_entries (2)
 
 lws_entry:
-	/* Unconditional branch to lws_start, located on the 
-	   same gateway page */
-	b,n	lws_start
+	gate	lws_start, %r0		/* increase privilege */
+	depi	3, 31, 2, %r31		/* Ensure we return into user mode. */
 
-	/* Fill from 0xb4 to 0xe0 */
-	.rept 11
+	/* Fill from 0xb8 to 0xe0 */
+	.rept 10
 	KILL_INSN
 	.endr
 
@@ -423,9 +422,6 @@ tracesys_sigexit:
 
 	*********************************************************/
 lws_start:
-	/* Gate and ensure we return to userspace */
-	gate	.+8, %r0
-	depi	3, 31, 2, %r31	/* Ensure we return to userspace */
 
 #ifdef CONFIG_64BIT
 	/* FIXME: If we are a 64-bit kernel just
@@ -442,7 +438,7 @@ lws_start:
 #endif	
 
         /* Is the lws entry number valid? */
-	comiclr,>>=	__NR_lws_entries, %r20, %r0
+	comiclr,>>	__NR_lws_entries, %r20, %r0
 	b,n	lws_exit_nosys
 
 	/* WARNING: Trashing sr2 and sr3 */
@@ -473,7 +469,7 @@ lws_exit:
 	/* now reset the lowest bit of sp if it was set */
 	xor	%r30,%r1,%r30
 #endif
-	be,n	0(%sr3, %r31)
+	be,n	0(%sr7, %r31)
 
 
 	
@@ -529,7 +525,6 @@ lws_compare_and_swap32:
 #endif
 
 lws_compare_and_swap:
-#ifdef CONFIG_SMP
 	/* Load start of lock table */
 	ldil	L%lws_lock_start, %r20
 	ldo	R%lws_lock_start(%r20), %r28
@@ -572,8 +567,6 @@ cas_wouldblock:
 	ldo	2(%r0), %r28				/* 2nd case */
 	b	lws_exit				/* Contended... */
 	ldo	-EAGAIN(%r0), %r21			/* Spin in userspace */
-#endif
-/* CONFIG_SMP */
 
 	/*
 		prev = *addr;
@@ -601,13 +594,11 @@ cas_action:
 1:	ldw	0(%sr3,%r26), %r28
 	sub,<>	%r28, %r25, %r0
 2:	stw	%r24, 0(%sr3,%r26)
-#ifdef CONFIG_SMP
 	/* Free lock */
 	stw	%r20, 0(%sr2,%r20)
-# if ENABLE_LWS_DEBUG
+#if ENABLE_LWS_DEBUG
 	/* Clear thread register indicator */
 	stw	%r0, 4(%sr2,%r20)
-# endif
 #endif
 	/* Return to userspace, set no error */
 	b	lws_exit
@@ -615,12 +606,10 @@ cas_action:
 
 3:		
 	/* Error occured on load or store */
-#ifdef CONFIG_SMP
 	/* Free lock */
 	stw	%r20, 0(%sr2,%r20)
-# if ENABLE_LWS_DEBUG
+#if ENABLE_LWS_DEBUG
 	stw	%r0, 4(%sr2,%r20)
-# endif
 #endif
 	b	lws_exit
 	ldo	-EFAULT(%r0),%r21	/* set errno */
@@ -672,7 +661,6 @@ ENTRY(sys_call_table64)
 END(sys_call_table64)
 #endif
 
-#ifdef CONFIG_SMP
 	/*
 		All light-weight-syscall atomic operations 
 		will use this set of locks 
@@ -694,8 +682,6 @@ ENTRY(lws_lock_start)
 	.endr
 END(lws_lock_start)
 	.previous
-#endif
-/* CONFIG_SMP for lws_lock_start */
 
 .end
 

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [PATCH] Avoid interruption in critical region in entry.S
  2010-04-11 17:12                         ` [PATCH] Avoid interruption in critical region " John David Anglin
@ 2010-04-11 18:24                           ` James Bottomley
  2010-04-11 18:45                             ` John David Anglin
  0 siblings, 1 reply; 74+ messages in thread
From: James Bottomley @ 2010-04-11 18:24 UTC (permalink / raw)
  To: John David Anglin; +Cc: Carlos O'Donell, deller, gniibe, linux-parisc

On Sun, 2010-04-11 at 13:12 -0400, John David Anglin wrote:
> On Fri, 09 Apr 2010, Carlos O'Donell wrote:
> 
> > We need to start splitting up your giant "stability" patch into
> > manageable chunks.
> 
> Here's the fourth chunk.  Am I being paranoid?

Could you explain what difference you think it makes ... because I can't
really see one.

All the patch seems to be doing is setting r1 to the stack pointer with
interrupts enabled and then copying the value with interrupts disabled,
which is fine, but I don't see how it's different from setting r30
directly from the task entry within the interrupt disabled region.

James



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH] Avoid interruption in critical region in entry.S
  2010-04-11 18:24                           ` James Bottomley
@ 2010-04-11 18:45                             ` John David Anglin
  2010-04-11 18:53                               ` James Bottomley
  0 siblings, 1 reply; 74+ messages in thread
From: John David Anglin @ 2010-04-11 18:45 UTC (permalink / raw)
  To: James Bottomley
  Cc: John David Anglin, Carlos O'Donell, deller, gniibe, linux-parisc

On Sun, 11 Apr 2010, James Bottomley wrote:

> On Sun, 2010-04-11 at 13:12 -0400, John David Anglin wrote:
> > On Fri, 09 Apr 2010, Carlos O'Donell wrote:
> > 
> > > We need to start splitting up your giant "stability" patch into
> > > manageable chunks.
> > 
> > Here's the fourth chunk.  Am I being paranoid?
> 
> Could you explain what difference you think it makes ... because I can't
> really see one.
> 
> All the patch seems to be doing is setting r1 to the stack pointer with
> interrupts enabled and then copying the value with interrupts disabled,
> which is fine, but I don't see how it's different from setting r30
> directly from the task entry within the interrupt disabled region.

If it is possible for an interruption such as a data TLB miss to occur
in the instruction that loads the stack pointer, then the period while
interrupts are disabled will be extended while the TLB miss is handled.
So, placing the load outside the critical region keeps the period where
interrupts are disabled as short as possible.

It may not be a big deal here but in time critical code issues like this
are important.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-04-10 22:53                             ` John David Anglin
@ 2010-04-11 18:50                               ` Helge Deller
  2010-04-11 22:25                                 ` John David Anglin
  0 siblings, 1 reply; 74+ messages in thread
From: Helge Deller @ 2010-04-11 18:50 UTC (permalink / raw)
  To: John David Anglin
  Cc: John David Anglin, Carlos O'Donell, gniibe, linux-parisc

[-- Attachment #1: Type: text/plain, Size: 2623 bytes --]

On 04/11/2010 12:53 AM, John David Anglin wrote:
> On Sat, 10 Apr 2010, Helge Deller wrote:
> 
>> Nevertheless, on my B2000 (32bit, SMP, 2.6.32.2 kernel) I still do see the minifail bug.
>> The only difference seems to be, that the minifail3 program doesn't get stuck any
>> more. It still crashes though from time to time...
> 
> There are some issues with your minifail3.c testcase.  The fork'd child
> shouldn't do any I/O and it should exit using _exit(0).  Otherwise, it
> can corrupt the I/O structures of the parent.  I'm not sure that this
> is the issue on your B2000, but it's worth a try.
> 
> The testcase when modified as above doesn't crash on my c3750 (32bit, UP,
> 2.6.32.2 kernel).
> 
> I found in debugging this testcase that the crash was always associated
> with the stack region for thread_run.  I put a big loop in thread_run.
> The index for the loop when compiled at -O0 is constantly being saved
> and restored on the stack.  I found that crashes occurred after many
> iterations of the loop.  Nothing else was going on.
> 
> The COW discussion convinced me that cache flushing was the problem.
> The fork (clone) syscall causes the stack region used by thread_run
> to become COW'd.  When thread_run is scheduled, the loop caused an
> instant COW break and stack corruption.  The state of the stack region
> generally returned to its state before the fork.
> 
> If the above doesn't fix the testcase on your B2000, there must be
> some difference between it and other PA8000 machines.

Hi Dave,

I did test the attached testcase. I think this is the version you sent last
time, which has the _exit(0).

Nevertheless, I still see the crashes with all kernel patches applied.

What I usually do is to start up more than 8 screen sessions. In each of the
sessions I start the bash loop:
-> i=0; while true; do i=$(($i+1)); echo Run $i; ./minifail; done;
and detach from the screen sessions.
After some time, the load goes up to 8-16 and a few crashes fill the syslog.
I'm sure the crashes are related to how much load the machine is under, and how
often process switches will happen.
How many minifail testcases do you run in parallel?


ls3017:/scratch/linux-git# uname -a
Linux ls3017 2.6.33.2-32bit #31 SMP Fri Apr 9 12:36:49 CEST 2010 parisc GNU/Linux

ls3017:/scratch/linux-git# cat /proc/cpuinfo 
cpu family      : PA-RISC 2.0
cpu             : PA8500 (PCX-W)
cpu MHz         : 440.000000
model           : 9000/785/J5000
model name      : Forte W 2-way
I-cache         : 512 KB
D-cache         : 1024 KB (WB, direct mapped)
ITLB entries    : 160
DTLB entries    : 160 - shared with ITLB

Helge

[-- Attachment #2: minifail_dave.cpp --]
[-- Type: text/plain, Size: 1062 bytes --]

#include <pthread.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

/*
  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=561203

  clone(child_stack=0x4088d040, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x4108c4e8, tls=0x4108c900, child_tidptr=0x4108c4e8) = 14819
[pid 14819] set_robust_list(0x4108c4f0, 0xc) = 0
[pid 14818] clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x40002028) = 14820

 g++  minifail.cpp -o minifail -O0 -pthread -g

 i=0; while true; do i=$(($i+1)); echo Run $i; ./minifail; done;

 */
void* thread_run(void* arg) {
	write(1,"Thread OK.\n",11);
	return NULL;	/* pthread start routines must return a value */
}

int pure_test() {
	pthread_t thread;
	pthread_create(&thread, NULL, thread_run, NULL);

	switch (fork()) {
		case -1:
			perror("fork() failed");
		case 0:
			write(1,"Child OK.\n",10);
			_exit(0);
		default:
			break;
		
	}
	
	pthread_join(thread, NULL);
	return 0;
}

int main(int argc, char** argv) {
	return pure_test();
}


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH] Avoid interruption in critical region in entry.S
  2010-04-11 18:45                             ` John David Anglin
@ 2010-04-11 18:53                               ` James Bottomley
  0 siblings, 0 replies; 74+ messages in thread
From: James Bottomley @ 2010-04-11 18:53 UTC (permalink / raw)
  To: John David Anglin; +Cc: Carlos O'Donell, deller, gniibe, linux-parisc

On Sun, 2010-04-11 at 14:45 -0400, John David Anglin wrote:
> On Sun, 11 Apr 2010, James Bottomley wrote:
> 
> > On Sun, 2010-04-11 at 13:12 -0400, John David Anglin wrote:
> > > On Fri, 09 Apr 2010, Carlos O'Donell wrote:
> > > 
> > > > We need to start splitting up your giant "stability" patch into
> > > > manageable chunks.
> > > 
> > > Here's the fourth chunk.  Am I being paranoid?
> > 
> > Could you explain what difference you think it makes ... because I can't
> > really see one.
> > 
> > All the patch seems to be doing is setting r1 to the stack pointer with
> > interrupts enabled and then copying the value with interrupts disabled,
> > which is fine, but I don't see how it's different from setting r30
> > directly from the task entry within the interrupt disabled region.
> 
> If it is possible for an interruption such as a data TLB miss to occur
> in the instruction that loads the stack pointer, then the period while
> interrupts are disabled will be extended while the TLB miss is handled.
> So, placing the load outside the critical region keeps the period where
> interrupts are disabled as short as possible.

Agreed, it's possible

> It may not be a big deal here but in time critical code issues like this
> are important.

I suppose it can't hurt ... TLB fault interruptions in interrupts are a
fact of life on PA ... even if it can be avoided in this case, there's
still hundreds of others where it can't.

James



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-04-11 18:50                               ` Helge Deller
@ 2010-04-11 22:25                                 ` John David Anglin
  2010-04-12 21:02                                   ` Helge Deller
  0 siblings, 1 reply; 74+ messages in thread
From: John David Anglin @ 2010-04-11 22:25 UTC (permalink / raw)
  To: Helge Deller; +Cc: John David Anglin, Carlos O'Donell, gniibe, linux-parisc

On Sun, 11 Apr 2010, Helge Deller wrote:

> Nevertheless, I still see the crashes with all kernel patches applied.
> 
> What I usually do is to start up more than 8 screen sessions. In each of the
> sessions I start the bash loop:
> -> i=0; while true; do i=$(($i+1)); echo Run $i; ./minifail; done;
> and detach from the screen sessions.
> After some time, the load goes up to 8-16 and a few crashes fill the syslog.
> I'm sure the crashes are related to how much load the machine is under, and how
> often process switches will happen.
> How many minifail testcases do you run in parallel?

Sigh, never more than one...

That said, I did realize last night that the cache flush in ptep_set_wrprotect
based on pte_dirty was flawed.  In a SMP kernel with a user on a different
cpu pounding on the page to be write protected, there was a race between
the pte_dirty check and the write protect.

Further, I don't believe the dirty bit is reliable.  Our cmpxchg is not
atomic with respect to changes in the dirty bit.  Thus, there is a small
window where a change in the dirty bit could get lost.

So for now, I think it safest to move the flush after the setting of the
write protect bit, and do it unconditionally.  This should be ok since
page faults are disabled.  I recognize that this will hurt performance.

I'm going to test the following on my rp3440.  The flushing has greatly
improved SMP userspace stability.  However, I have still seen a few issues
in the GCC testsuite.

Maybe it will help your B2000.  However, let's just go one step at a time.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h
index a27d2e2..e85f43c 100644
--- a/arch/parisc/include/asm/pgtable.h
+++ b/arch/parisc/include/asm/pgtable.h
@@ -14,6 +14,7 @@
 #include <linux/bitops.h>
 #include <asm/processor.h>
 #include <asm/cache.h>
+extern void flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr, unsigned long pfn);
 
 /*
  * kern_addr_valid(ADDR) tests if ADDR is pointing to valid kernel
@@ -456,7 +457,7 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
 	return old_pte;
 }
 
-static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
+static inline void ptep_set_wrprotect(struct vm_area_struct *vma, struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 {
 #ifdef CONFIG_SMP
 	unsigned long new, old;
@@ -469,6 +470,8 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
 	pte_t old_pte = *ptep;
 	set_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
 #endif
+
+	flush_cache_page(vma, addr, pte_pfn(*ptep));
 }
 
 #define pte_same(A,B)	(pte_val(A) == pte_val(B))

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-04-11 22:25                                 ` John David Anglin
@ 2010-04-12 21:02                                   ` Helge Deller
  2010-04-12 21:41                                     ` John David Anglin
  0 siblings, 1 reply; 74+ messages in thread
From: Helge Deller @ 2010-04-12 21:02 UTC (permalink / raw)
  To: John David Anglin
  Cc: John David Anglin, Carlos O'Donell, gniibe, linux-parisc

On 04/12/2010 12:25 AM, John David Anglin wrote:
> On Sun, 11 Apr 2010, Helge Deller wrote:
> 
>> Nevertheless, I still see the crashes with all kernel patches applied.
>>
>> What I usually do is to start up more than 8 screen sessions. In each of the
>> sessions I start the bash loop:
>> -> i=0; while true; do i=$(($i+1)); echo Run $i; ./minifail; done;
>> and detach from the screen sessions.
>> After some time, the load goes up to 8-16 and a few crashes fill the syslog.
>> I'm sure the crashes are related to how much load the machine is under, and
>> how often process switches happen.
>> How many minifail testcases do you run in parallel?
> 
> Sigh, never more than one...
> 
> That said, I did realize last night that the cache flush in ptep_set_wrprotect
> based on pte_dirty was flawed.  In a SMP kernel with a user on a different
> cpu pounding on the page to be write protected, there was a race between
> the pte_dirty check and the write protect.
> 
> Further, I don't believe the dirty bit is reliable.  Our cmpxchg is not
> atomic with respect to changes in the dirty bit.  Thus, there is a small
> window where a change in the dirty bit could get lost.
> 
> So for now, I think it safest to move the flush after the setting of the
> write protect bit, and do it unconditionally.  This should be ok since
> page faults are disabled.  I recognize that this will hurt performance.
> 
> I'm going to test the following on my rp3440.  The flushing has greatly
> improved SMP userspace stability.  However, I have still seen a few issues
> in the GCC testsuite.
> 
> Maybe it will help your B2000.  However, let's just go one step at a time.

Sadly no luck :-(
minifail still crashes...

Helge

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-04-12 21:02                                   ` Helge Deller
@ 2010-04-12 21:41                                     ` John David Anglin
  2010-04-13 11:55                                       ` Helge Deller
  0 siblings, 1 reply; 74+ messages in thread
From: John David Anglin @ 2010-04-12 21:41 UTC (permalink / raw)
  To: Helge Deller; +Cc: dave.anglin, carlos, gniibe, linux-parisc

> > Maybe it will help your B2000.  However, let's just go one step at a time.
> 
> Sadly no luck :-(
> minifail still crashes...

I wonder if it would help to enable the kunmap_atomic support used
for the PA8800/PA8900 on the B2000.  It might be that non-equivalent
aliasing causes the corruption.

Are you using standard 4K pages?

I assume that it's always the thread created by pthread_create that's causing
the segv.  What does the stack region for the thread look like when it drops
core?  Possibly, we have two separate issues.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-04-12 21:41                                     ` John David Anglin
@ 2010-04-13 11:55                                       ` Helge Deller
  2010-04-13 14:03                                         ` John David Anglin
                                                           ` (2 more replies)
  0 siblings, 3 replies; 74+ messages in thread
From: Helge Deller @ 2010-04-13 11:55 UTC (permalink / raw)
  To: John David Anglin; +Cc: linux-parisc, gniibe, carlos, dave.anglin

> I wonder if it would help to enable the kunmap_atomic support used
> for the PA8800/PA8900 on the B2000.  It might be that non-equivalent
> aliasing causes the corruption.

I did change asm/processor.h to include my CPU:
static inline int parisc_requires_coherency(void)
{
#ifdef CONFIG_PA8X00
        return (boot_cpu_data.cpu_type >= pcxu);
#endif
}

CONFIG_PA8X00 is defined.

Still crashes.

> Are you using standard 4K pages?

Yes: CONFIG_PARISC_PAGE_SIZE_4KB=y


> I assume that it's always the thread created by pthread_create that's
> causing the segv.

Yes, all my tests up to now indicated that too.

> What does the stack region for the thread look
> like when it drops core?  Possibly, we have two separate issues.

do_page_fault() pid=3890 command='minifail_dave' type=6 address=0x00000003

     YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00000000000001001111111100001111 Not tainted
r00-03  0004ff0f 10561000 401190d7 c046e3c0
r04-07  4012b5f4 00000007 4012bdf4 00000000
r08-11  4012be64 00000000 c046e3ca 0000001c
r12-15  4012be60 4012c7f8 00000000 c046e448
r16-19  4012c0b0 c046e448 40129270 00000000
r20-23  00000000 00000000 00000000 00000000
r24-27  fffffff5 ffffffd3 4012c0b0 00011dac
r28-31  00000000 4012c0b0 c046e4c0 401190d7
sr00-03  00008dd2 00000000 00000000 00008dd2
sr04-07  00008dd2 00008dd2 00008dd2 00008dd2

IASQ: 00008dd2 00008dd2 IAOQ: 00000003 00000007
 IIR: 43ffff80    ISR: 00008dd2  IOR: 40000bd0
 CPU:        0   CR30: 87d24000 CR31: ffffffff
 ORIG_R28: 00000000
 IAOQ[0]: 00000003
 IAOQ[1]: 00000007
 RP(r2): 401190d7


or

do_page_fault() pid=28779 command='minifail_dave' type=6 address=0x00000003

     YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00000000000001001111111100001111 Not tainted
r00-03  0004ff0f 10561000 401190d7 bff943c0
r04-07  4012b5f4 00000007 4012bdf4 00000000
r08-11  4012be64 00000000 bff943ca 0000001c
r12-15  4012be60 4012c7f8 00000000 bff94448
r16-19  4012c0b0 bff94448 40129270 00000000
r20-23  00000000 00000000 00000000 00000000
r24-27  fffffff5 ffffffd3 4012c0b0 00011dac
r28-31  00000000 4012c0b0 bff944c0 401190d7
sr00-03  000070bc 00000755 00000000 000070bc
sr04-07  000070bc 000070bc 000070bc 000070bc
IASQ: 000070bc 000070bc IAOQ: 00000003 00000007
 IIR: 43ffff80    ISR: 000070bc  IOR: 40000bd0
 CPU:        1   CR30: 8cfe4000 CR31: ffffffff
 ORIG_R28: 00000000
 IAOQ[0]: 00000003
 IAOQ[1]: 00000007
 RP(r2): 401190d7


or



do_page_fault() pid=9898 command='minifail_dave' type=15 address=0x000000fc

     YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00000000000001101111111100001111 Not tainted
r00-03  0006ff0f 0029608d 40114127 00000000
r04-07  4012b5f4 40001768 00000007 00000000
r08-11  f4385cbf 40129270 4012bdf4 40d52348
r12-15  07a1c2e5 0000001f 40d52448 4012b5f4
r16-19  4012bdf4 40d52358 00000000 4012b5f4
r20-23  0001014c 00000000 0f843530 4031bdc4
r24-27  40d52348 00000003 000000f0 00011dac
r28-31  000000f0 40d52448 40d52580 4011410b
sr00-03  0000a53c 00000000 00000000 0000a53c
sr04-07  0000a53c 0000a53c 0000a53c 0000a53c

IASQ: 0000a53c 0000a53c IAOQ: 40113c23 40113c27
 IIR: 0f58101c    ISR: 0000a53c  IOR: 000000fc
 CPU:        0   CR30: 87c78000 CR31: ffffffff
 ORIG_R28: 00000000
 IAOQ[0]: 40113c23
 IAOQ[1]: 40113c27
 RP(r2): 40114127

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-04-13 11:55                                       ` Helge Deller
@ 2010-04-13 14:03                                         ` John David Anglin
  2010-04-15 22:35                                         ` John David Anglin
  2010-04-19 16:26                                         ` John David Anglin
  2 siblings, 0 replies; 74+ messages in thread
From: John David Anglin @ 2010-04-13 14:03 UTC (permalink / raw)
  To: Helge Deller; +Cc: linux-parisc, gniibe, carlos, dave.anglin

> > I assume that it's always the thread created by pthread_create that's
> > causing the segv.  
> 
> Yes, all my tests up to now indicated that too.

info thread tells you which thread is running.

The stack region for the thread is allocated by the mmap syscall prior
to the clone syscall.  You can see where it is allocated with strace.
On my c3750, it was allocated at 0x40000000, but I have seen it allocated
in other locations on 64-bit systems.

So, in gdb, you can display the bottom bit with 'x/128x 0x40000000'.

If you run minifail under gdb and set a break at the start of
thread_run, you can see what the stack should look like when
thread_run is entered.

The COW break typically causes most of the stack that is dirty to revert to
nearly all zeros.  Since the return pointer, rp, is saved on the stack,
a function return causes the thread to branch to location 0 and
fault.  This is the most common failure.

In the minifail versions that I made with a big loop in thread_run,
it's possible to detect the COW break mid loop and generate a core
dump.   As a result, the application state is consistent.
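
For reference, a minimal sketch of this kind of test looks roughly like
the following (it is not the actual minifail source; the thread_run name
and the loop bounds are only illustrative).  A thread keeps writing and
checking data on its own stack while the main thread forks:

/* build with: gcc -O2 -pthread minifail_sketch.c */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static void *thread_run(void *arg)
{
	volatile char stack_data[256];	/* lives on the thread stack */
	int i;
	size_t j;

	for (i = 0; i < 1000000; i++) {
		for (j = 0; j < sizeof(stack_data); j++)
			stack_data[j] = 0xaa;	/* keep the stack dirty */
		for (j = 0; j < sizeof(stack_data); j++) {
			if (stack_data[j] != (char)0xaa) {
				fprintf(stderr, "stack corrupted at %zu\n", j);
				abort();
			}
		}
	}
	return NULL;
}

int main(void)
{
	pthread_t th;
	pid_t pid;

	if (pthread_create(&th, NULL, thread_run, NULL))
		return 1;

	pid = fork();		/* COW-protects the thread stack as well */
	if (pid == 0)
		_exit(0);
	waitpid(pid, NULL, 0);

	pthread_join(th, NULL);
	return 0;
}

If the dirty stack words revert after the COW break, either the check
above aborts mid loop, or a saved rp is lost and the thread branches to
location 0 as described above.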

The dumps below aren't that useful since they don't say much about the
cause of the fault.

> > What does the stack region for the thread look
> > like when it drops core?  Possibly, we have two separate issues.
> 
> do_page_fault() pid=3890 command='minifail_dave' type=6 address=0x00000003
> 
>      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
> PSW: 00000000000001001111111100001111 Not tainted
> r00-03  0004ff0f 10561000 401190d7 c046e3c0
> r04-07  4012b5f4 00000007 4012bdf4 00000000
> r08-11  4012be64 00000000 c046e3ca 0000001c
> r12-15  4012be60 4012c7f8 00000000 c046e448
> r16-19  4012c0b0 c046e448 40129270 00000000
> r20-23  00000000 00000000 00000000 00000000
> r24-27  fffffff5 ffffffd3 4012c0b0 00011dac
> r28-31  00000000 4012c0b0 c046e4c0 401190d7

The stack pointer in this one seems to indicate the parent was running.
So, I think this failure has a different cause.  It might be useful to
debug the core dump for a failure similar to this with gdb.

> sr00-03  00008dd2 00000000 00000000 00008dd2
> sr04-07  00008dd2 00008dd2 00008dd2 00008dd2
> 
> IASQ: 00008dd2 00008dd2 IAOQ: 00000003 00000007
>  IIR: 43ffff80    ISR: 00008dd2  IOR: 40000bd0
>  CPU:        0   CR30: 87d24000 CR31: ffffffff
>  ORIG_R28: 00000000
>  IAOQ[0]: 00000003
>  IAOQ[1]: 00000007
>  RP(r2): 401190d7
> 
> 
> or 
> 
> do_page_fault() pid=28779 command='minifail_dave' type=6 address=0x00000003
> 
>      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
> PSW: 00000000000001001111111100001111 Not tainted
> r00-03  0004ff0f 10561000 401190d7 bff943c0
> r04-07  4012b5f4 00000007 4012bdf4 00000000
> r08-11  4012be64 00000000 bff943ca 0000001c
> r12-15  4012be60 4012c7f8 00000000 bff94448
> r16-19  4012c0b0 bff94448 40129270 00000000
> r20-23  00000000 00000000 00000000 00000000
> r24-27  fffffff5 ffffffd3 4012c0b0 00011dac
> r28-31  00000000 4012c0b0 bff944c0 401190d7

Stack pointer in this one is weird.  It was probably corrupted by the
fault.

> sr00-03  000070bc 00000755 00000000 000070bc
> sr04-07  000070bc 000070bc 000070bc 000070bc
> IASQ: 000070bc 000070bc IAOQ: 00000003 00000007
>  IIR: 43ffff80    ISR: 000070bc  IOR: 40000bd0
>  CPU:        1   CR30: 8cfe4000 CR31: ffffffff
>  ORIG_R28: 00000000
>  IAOQ[0]: 00000003
>  IAOQ[1]: 00000007
>  RP(r2): 401190d7

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-04-13 11:55                                       ` Helge Deller
  2010-04-13 14:03                                         ` John David Anglin
@ 2010-04-15 22:35                                         ` John David Anglin
  2010-04-19 16:26                                         ` John David Anglin
  2 siblings, 0 replies; 74+ messages in thread
From: John David Anglin @ 2010-04-15 22:35 UTC (permalink / raw)
  To: Helge Deller; +Cc: linux-parisc, gniibe, carlos, dave.anglin

On Tue, 13 Apr 2010, Helge Deller wrote:

> Still crashes.

After thinking about this some more, I believe our updating of ptes
is broken.  We have a lock, pa_dbit_lock, but this is only used in
the dbit traps.  Even there, the implementation is flawed on SMP
machines.

The fundamental issue is a pte can't be updated in one instruction.
The old pte has to be loaded, modified, and then written back to
memory.  If this isn't made atomic with a lock, we occasionally drop the
write protect, dirty, and accessed bits.
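
As a plain userspace illustration (not kernel code; the bit values are
made up), the window is just two unlocked load/modify/store sequences
racing on the same word:

#include <pthread.h>
#include <stdio.h>

#define _PAGE_WRITE  0x1
#define _PAGE_DIRTY  0x2

static unsigned long pte = _PAGE_WRITE;		/* writable, clean */

static void *wrprotect(void *arg)
{
	unsigned long old = pte;	/* load */
	pte = old & ~_PAGE_WRITE;	/* store: can wipe out a dirty bit set in between */
	return NULL;
}

static void *set_dirty(void *arg)
{
	unsigned long old = pte;	/* load */
	pte = old | _PAGE_DIRTY;	/* store: can put _PAGE_WRITE back */
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, wrprotect, NULL);
	pthread_create(&b, NULL, set_dirty, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);

	/* Depending on the interleaving, either the dirty bit or the
	 * write protect is lost.  Taking one lock around each
	 * load/modify/store sequence closes the window. */
	printf("final pte = %#lx\n", pte);
	return 0;
}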

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-04-13 11:55                                       ` Helge Deller
  2010-04-13 14:03                                         ` John David Anglin
  2010-04-15 22:35                                         ` John David Anglin
@ 2010-04-19 16:26                                         ` John David Anglin
  2010-04-20 17:59                                           ` Helge Deller
  2010-05-01 18:34                                           ` Thibaut VARENE
  2 siblings, 2 replies; 74+ messages in thread
From: John David Anglin @ 2010-04-19 16:26 UTC (permalink / raw)
  To: Helge Deller; +Cc: linux-parisc, gniibe, carlos, dave.anglin

[-- Attachment #1: Type: text/plain, Size: 1191 bytes --]

Hi Helge,

On Tue, 13 Apr 2010, Helge Deller wrote:

> Still crashes.

Can you try the patch below?  The change to cacheflush.h is the same
as before.

I have lightly tested the attached change on rp3440 with SMP 2.6.33.2
kernel.  It got through a GCC build at -j8, which is something of a
record.  However, I did see one issue this morning in the ada testsuite:

malloc: ../bash/make_cmd.c:100: assertion botched
malloc: block on free list clobbered
Aborting.../home/dave/gnu/gcc/gcc/gcc/testsuite/ada/acats/run_all.sh: line 67: 29176 Aborted                 (core dumped) ls ${i}.adb >> ${i}.lst 2> /dev/null

I have seen this before.

The change reworks all code that manipulates ptes to use the pa_dbit_lock
to ensure that we don't lose state information during updates.  I also
added code to purge the tlb associated with the pte as it wasn't obvious
to me how for example the write protect bit got set in the tlb.

Someone had clearly tried to fix the dirty bit handling in the past,
but the change was incomplete.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

[-- Attachment #2: pte.d.2 --]
[-- Type: text/plain, Size: 10433 bytes --]

diff --git a/arch/parisc/include/asm/cacheflush.h b/arch/parisc/include/asm/cacheflush.h
index 7a73b61..ab87176 100644
--- a/arch/parisc/include/asm/cacheflush.h
+++ b/arch/parisc/include/asm/cacheflush.h
@@ -2,6 +2,7 @@
 #define _PARISC_CACHEFLUSH_H
 
 #include <linux/mm.h>
+#include <linux/uaccess.h>
 
 /* The usual comment is "Caches aren't brain-dead on the <architecture>".
  * Unfortunately, that doesn't apply to PA-RISC. */
@@ -113,11 +114,20 @@ static inline void *kmap(struct page *page)
 
 #define kunmap(page)			kunmap_parisc(page_address(page))
 
-#define kmap_atomic(page, idx)		page_address(page)
+static inline void *kmap_atomic(struct page *page, enum km_type idx)
+{
+	pagefault_disable();
+	return page_address(page);
+}
 
-#define kunmap_atomic(addr, idx)	kunmap_parisc(addr)
+static inline void kunmap_atomic(void *addr, enum km_type idx)
+{
+	kunmap_parisc(addr);
+	pagefault_enable();
+}
 
-#define kmap_atomic_pfn(pfn, idx)	page_address(pfn_to_page(pfn))
+#define kmap_atomic_prot(page, idx, prot)	kmap_atomic(page, idx)
+#define kmap_atomic_pfn(pfn, idx)	kmap_atomic(pfn_to_page(pfn), (idx))
 #define kmap_atomic_to_page(ptr)	virt_to_page(ptr)
 #endif
 
diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h
index a27d2e2..6a221af 100644
--- a/arch/parisc/include/asm/pgtable.h
+++ b/arch/parisc/include/asm/pgtable.h
@@ -38,7 +38,8 @@
         do{                                                     \
                 *(pteptr) = (pteval);                           \
         } while(0)
-#define set_pte_at(mm,addr,ptep,pteval) set_pte(ptep,pteval)
+#define set_pte_at(mm,addr,ptep,pteval)				\
+	do { set_pte(ptep,pteval); purge_tlb_page(mm, addr); } while(0) 
 
 #endif /* !__ASSEMBLY__ */
 
@@ -410,6 +411,8 @@ extern void paging_init (void);
 
 #define PG_dcache_dirty         PG_arch_1
 
+extern void flush_cache_page(struct vm_area_struct *, unsigned long, unsigned long);
+extern void purge_tlb_page(struct mm_struct *, unsigned long);
 extern void update_mmu_cache(struct vm_area_struct *, unsigned long, pte_t);
 
 /* Encode and de-code a swap entry */
@@ -423,22 +426,39 @@ extern void update_mmu_cache(struct vm_area_struct *, unsigned long, pte_t);
 #define __pte_to_swp_entry(pte)		((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)		((pte_t) { (x).val })
 
-static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep)
+extern spinlock_t pa_dbit_lock;
+
+static inline void pte_update_lock (void)
 {
 #ifdef CONFIG_SMP
-	if (!pte_young(*ptep))
-		return 0;
-	return test_and_clear_bit(xlate_pabit(_PAGE_ACCESSED_BIT), &pte_val(*ptep));
-#else
-	pte_t pte = *ptep;
-	if (!pte_young(pte))
-		return 0;
-	set_pte_at(vma->vm_mm, addr, ptep, pte_mkold(pte));
-	return 1;
+	preempt_disable();
+	spin_lock(&pa_dbit_lock);
+#endif
+}
+static inline void pte_update_unlock (void)
+{
+#ifdef CONFIG_SMP
+	spin_unlock(&pa_dbit_lock);
+	preempt_enable();
 #endif
 }
 
-extern spinlock_t pa_dbit_lock;
+static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep)
+{
+	pte_t pte;
+
+	pte_update_lock();
+	pte = *ptep;
+	if (!pte_young(pte)) {
+		pte_update_unlock();
+		return 0;
+	}
+	set_pte(ptep, pte_mkold(pte));
+	pte_update_unlock();
+	purge_tlb_page(vma->vm_mm, addr);
+
+	return 1;
+}
 
 struct mm_struct;
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
@@ -446,29 +466,29 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
 	pte_t old_pte;
 	pte_t pte;
 
-	spin_lock(&pa_dbit_lock);
+	pte_update_lock();
 	pte = old_pte = *ptep;
 	pte_val(pte) &= ~_PAGE_PRESENT;
 	pte_val(pte) |= _PAGE_FLUSH;
-	set_pte_at(mm,addr,ptep,pte);
-	spin_unlock(&pa_dbit_lock);
+	set_pte(ptep,pte);
+	pte_update_unlock();
+	purge_tlb_page(mm, addr);
 
 	return old_pte;
 }
 
-static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
+static inline void ptep_set_wrprotect(struct vm_area_struct *vma, struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 {
-#ifdef CONFIG_SMP
-	unsigned long new, old;
+	pte_t old_pte;
 
-	do {
-		old = pte_val(*ptep);
-		new = pte_val(pte_wrprotect(__pte (old)));
-	} while (cmpxchg((unsigned long *) ptep, old, new) != old);
-#else
-	pte_t old_pte = *ptep;
-	set_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
-#endif
+	pte_update_lock();
+	old_pte = *ptep;
+	set_pte(ptep, pte_wrprotect(old_pte));
+	pte_update_unlock();
+
+	if (pte_present(old_pte) && pte_dirty(old_pte))
+		flush_cache_page(vma, addr, pte_pfn(*ptep));
+	purge_tlb_page(mm, addr);
 }
 
 #define pte_same(A,B)	(pte_val(A) == pte_val(B))
diff --git a/arch/parisc/kernel/cache.c b/arch/parisc/kernel/cache.c
index b6ed34d..cd64e38 100644
--- a/arch/parisc/kernel/cache.c
+++ b/arch/parisc/kernel/cache.c
@@ -577,3 +577,17 @@ flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr, unsigned long
 		__flush_cache_page(vma, vmaddr);
 
 }
+
+void purge_tlb_page(struct mm_struct *mm, unsigned long addr)
+{
+        unsigned long flags;
+
+        /* For one page, it's not worth testing the split_tlb variable */
+
+        mb();
+        mtsp(mm->context,1);
+        purge_tlb_start(flags);
+        pdtlb(addr);
+        pitlb(addr);
+        purge_tlb_end(flags);
+}
diff --git a/arch/parisc/kernel/entry.S b/arch/parisc/kernel/entry.S
index 3a44f7f..12ebb8a 100644
--- a/arch/parisc/kernel/entry.S
+++ b/arch/parisc/kernel/entry.S
@@ -490,19 +464,57 @@
 
 	/* Set the _PAGE_ACCESSED bit of the PTE.  Be clever and
 	 * don't needlessly dirty the cache line if it was already set */
-	.macro		update_ptep	ptep,pte,tmp,tmp1
+	.macro		update_ptep	ptep,pte,spc,tmp,tmp1
+#ifdef CONFIG_SMP
+	bb,<,n		\pte,_PAGE_ACCESSED_BIT,3f
+	cmpib,COND(=),n        0,\spc,2f
+	load32		PA(pa_dbit_lock),\tmp
+1:
+	LDCW		0(\tmp),\tmp1
+	cmpib,COND(=)         0,\tmp1,1b
+	nop
+	LDREG		0(\ptep),\pte
+2:
+	ldi		_PAGE_ACCESSED,\tmp1
+	or		\tmp1,\pte,\pte
+	STREG		\pte,0(\ptep)
+
+	cmpib,COND(=),n        0,\spc,3f
+	ldi             1,\tmp1
+	stw             \tmp1,0(\tmp)
+3:
+#else
 	ldi		_PAGE_ACCESSED,\tmp1
 	or		\tmp1,\pte,\tmp
 	and,COND(<>)	\tmp1,\pte,%r0
 	STREG		\tmp,0(\ptep)
+#endif
 	.endm
 
 	/* Set the dirty bit (and accessed bit).  No need to be
 	 * clever, this is only used from the dirty fault */
-	.macro		update_dirty	ptep,pte,tmp
-	ldi		_PAGE_ACCESSED|_PAGE_DIRTY,\tmp
-	or		\tmp,\pte,\pte
+	.macro		update_dirty	ptep,pte,spc,tmp,tmp1
+#ifdef CONFIG_SMP
+	cmpib,COND(=),n        0,\spc,2f
+	load32		PA(pa_dbit_lock),\tmp
+1:
+	LDCW		0(\tmp),\tmp1
+	cmpib,COND(=)         0,\tmp1,1b
+	nop
+	LDREG		0(\ptep),\pte
+2:
+#endif
+
+	ldi		_PAGE_ACCESSED|_PAGE_DIRTY,\tmp1
+	or		\tmp1,\pte,\pte
 	STREG		\pte,0(\ptep)
+
+#ifdef CONFIG_SMP
+	cmpib,COND(=),n        0,\spc,3f
+	ldi             1,\tmp1
+	stw             \tmp1,0(\tmp)
+3:
+#endif
 	.endm
 
 	/* bitshift difference between a PFN (based on kernel's PAGE_SIZE)
@@ -1214,7 +1224,7 @@ dtlb_miss_20w:
 
 	L3_ptep		ptp,pte,t0,va,dtlb_check_alias_20w
 
-	update_ptep	ptp,pte,t0,t1
+	update_ptep	ptp,pte,spc,t0,t1
 
 	make_insert_tlb	spc,pte,prot
 	
@@ -1238,7 +1248,7 @@ nadtlb_miss_20w:
 
 	L3_ptep		ptp,pte,t0,va,nadtlb_check_flush_20w
 
-	update_ptep	ptp,pte,t0,t1
+	update_ptep	ptp,pte,spc,t0,t1
 
 	make_insert_tlb	spc,pte,prot
 
@@ -1272,7 +1282,7 @@ dtlb_miss_11:
 
 	L2_ptep		ptp,pte,t0,va,dtlb_check_alias_11
 
-	update_ptep	ptp,pte,t0,t1
+	update_ptep	ptp,pte,spc,t0,t1
 
 	make_insert_tlb_11	spc,pte,prot
 
@@ -1321,7 +1331,7 @@ nadtlb_miss_11:
 
 	L2_ptep		ptp,pte,t0,va,nadtlb_check_flush_11
 
-	update_ptep	ptp,pte,t0,t1
+	update_ptep	ptp,pte,spc,t0,t1
 
 	make_insert_tlb_11	spc,pte,prot
 
@@ -1368,7 +1378,7 @@ dtlb_miss_20:
 
 	L2_ptep		ptp,pte,t0,va,dtlb_check_alias_20
 
-	update_ptep	ptp,pte,t0,t1
+	update_ptep	ptp,pte,spc,t0,t1
 
 	make_insert_tlb	spc,pte,prot
 
@@ -1394,7 +1404,7 @@ nadtlb_miss_20:
 
 	L2_ptep		ptp,pte,t0,va,nadtlb_check_flush_20
 
-	update_ptep	ptp,pte,t0,t1
+	update_ptep	ptp,pte,spc,t0,t1
 
 	make_insert_tlb	spc,pte,prot
 
@@ -1508,7 +1518,7 @@ itlb_miss_20w:
 
 	L3_ptep		ptp,pte,t0,va,itlb_fault
 
-	update_ptep	ptp,pte,t0,t1
+	update_ptep	ptp,pte,spc,t0,t1
 
 	make_insert_tlb	spc,pte,prot
 	
@@ -1526,7 +1536,7 @@ itlb_miss_11:
 
 	L2_ptep		ptp,pte,t0,va,itlb_fault
 
-	update_ptep	ptp,pte,t0,t1
+	update_ptep	ptp,pte,spc,t0,t1
 
 	make_insert_tlb_11	spc,pte,prot
 
@@ -1548,7 +1558,7 @@ itlb_miss_20:
 
 	L2_ptep		ptp,pte,t0,va,itlb_fault
 
-	update_ptep	ptp,pte,t0,t1
+	update_ptep	ptp,pte,spc,t0,t1
 
 	make_insert_tlb	spc,pte,prot
 
@@ -1570,29 +1580,11 @@ dbit_trap_20w:
 
 	L3_ptep		ptp,pte,t0,va,dbit_fault
 
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nolock_20w
-	load32		PA(pa_dbit_lock),t0
-
-dbit_spin_20w:
-	LDCW		0(t0),t1
-	cmpib,COND(=)         0,t1,dbit_spin_20w
-	nop
-
-dbit_nolock_20w:
-#endif
-	update_dirty	ptp,pte,t1
+	update_dirty	ptp,pte,spc,t0,t1
 
 	make_insert_tlb	spc,pte,prot
 		
 	idtlbt          pte,prot
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nounlock_20w
-	ldi             1,t1
-	stw             t1,0(t0)
-
-dbit_nounlock_20w:
-#endif
 
 	rfir
 	nop
@@ -1606,18 +1598,7 @@ dbit_trap_11:
 
 	L2_ptep		ptp,pte,t0,va,dbit_fault
 
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nolock_11
-	load32		PA(pa_dbit_lock),t0
-
-dbit_spin_11:
-	LDCW		0(t0),t1
-	cmpib,=         0,t1,dbit_spin_11
-	nop
-
-dbit_nolock_11:
-#endif
-	update_dirty	ptp,pte,t1
+	update_dirty	ptp,pte,spc,t0,t1
 
 	make_insert_tlb_11	spc,pte,prot
 
@@ -1628,13 +1609,6 @@ dbit_nolock_11:
 	idtlbp		prot,(%sr1,va)
 
 	mtsp            t1, %sr1     /* Restore sr1 */
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nounlock_11
-	ldi             1,t1
-	stw             t1,0(t0)
-
-dbit_nounlock_11:
-#endif
 
 	rfir
 	nop
@@ -1646,18 +1620,7 @@ dbit_trap_20:
 
 	L2_ptep		ptp,pte,t0,va,dbit_fault
 
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nolock_20
-	load32		PA(pa_dbit_lock),t0
-
-dbit_spin_20:
-	LDCW		0(t0),t1
-	cmpib,=         0,t1,dbit_spin_20
-	nop
-
-dbit_nolock_20:
-#endif
-	update_dirty	ptp,pte,t1
+	update_dirty	ptp,pte,spc,t0,t1
 
 	make_insert_tlb	spc,pte,prot
 
@@ -1665,14 +1628,6 @@ dbit_nolock_20:
 	
         idtlbt          pte,prot
 
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nounlock_20
-	ldi             1,t1
-	stw             t1,0(t0)
-
-dbit_nounlock_20:
-#endif

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-04-19 16:26                                         ` John David Anglin
@ 2010-04-20 17:59                                           ` Helge Deller
  2010-04-20 18:52                                             ` John David Anglin
  2010-05-09 12:43                                             ` John David Anglin
  2010-05-01 18:34                                           ` Thibaut VARENE
  1 sibling, 2 replies; 74+ messages in thread
From: Helge Deller @ 2010-04-20 17:59 UTC (permalink / raw)
  To: John David Anglin; +Cc: John David Anglin, linux-parisc, gniibe, carlos

Hi Dave,

On 04/19/2010 06:26 PM, John David Anglin wrote:
> On Tue, 13 Apr 2010, Helge Deller wrote:
>> Still crashes.
> 
> Can you try the patch below?  The change to cacheflush.h is the same
> as before.

Thanks for the patch.
I applied it on top of a clean 2.6.33.2 kernel and ran multiple parallel 
minifail programs on my B2000 (2 CPUs, SMP kernel, 32bit kernel).
Sadly minifail still crashed the same way as before.

Should I have applied other patches as well?

Helge

> I have lightly tested the attached change on rp3440 with SMP 2.6.33.2
> kernel.  It got through a GCC build at -j8, which is something of a
> record.  However, I did see one issue this morning in the ada testsuite:
> 
> malloc: ../bash/make_cmd.c:100: assertion botched
> malloc: block on free list clobbered
> Aborting.../home/dave/gnu/gcc/gcc/gcc/testsuite/ada/acats/run_all.sh: line 67: 29176 Aborted                 (core dumped) ls ${i}.adb >> ${i}.lst 2> /dev/null
> 
> I have seen this before.
> 
> The change reworks all code that manipulates ptes to use the pa_dbit_lock
> to ensure that we don't lose state information during updates.  I also
> added code to purge the tlb associated with the pte as it wasn't obvious
> to me how for example the write protect bit got set in the tlb.
> 
> Someone had clearly tried to fix the dirty bit handling in the past,
> but the change was incomplete.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-04-20 17:59                                           ` Helge Deller
@ 2010-04-20 18:52                                             ` John David Anglin
  2010-05-09 12:43                                             ` John David Anglin
  1 sibling, 0 replies; 74+ messages in thread
From: John David Anglin @ 2010-04-20 18:52 UTC (permalink / raw)
  To: Helge Deller; +Cc: dave.anglin, linux-parisc, gniibe, carlos

Helge,

Ok, I think the next thing to try is a patch with the tlb purges and
inserts inside the locked region, so that the pte and tlb updates are
fully consistent.
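
The rough shape of the idea, as a sketch only (it reuses the pa_dbit_lock
and the purge_tlb_page helper from the previous patch; this is not the
actual change I will send):

static inline void ptep_set_wrprotect(struct mm_struct *mm,
				      unsigned long addr, pte_t *ptep)
{
	unsigned long flags;
	pte_t old_pte;

	spin_lock_irqsave(&pa_dbit_lock, flags);
	old_pte = *ptep;
	*ptep = pte_wrprotect(old_pte);
	/* purge the stale translation before dropping the lock, so the
	 * PTE update and the TLB purge can't be interleaved with
	 * another update of the same entry */
	purge_tlb_page(mm, addr);
	spin_unlock_irqrestore(&pa_dbit_lock, flags);
}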

I fired up four windows and started minifail on gsyprf11 and got
a core dump in the thread within a couple of minutes.  The parent
was in pthread_join.

Overall stability is definitely better with the change on my rp3440.
I have now got through three GCC builds at -j8.  This never worked before,
so I think we have some progress.

Thanks for testing,
Dave

> Thanks for the patch.
> I applied it on top of a clean 2.6.33.2 kernel and ran multiple parallel 
> minifail programs on my B2000 (2 CPUs, SMP kernel, 32bit kernel).
> Sadly minifail still crashed the same way as before.
> 
> Should I have applied other patches as well?
> 
> Helge
> 
> > I have lightly tested the attached change on rp3440 with SMP 2.6.33.2
> > kernel.  It got through a GCC build at -j8, which is something of a
> > record.  However, I did see one issue this morning in the ada testsuite:
> > 
> > malloc: ../bash/make_cmd.c:100: assertion botched
> > malloc: block on free list clobbered
> > Aborting.../home/dave/gnu/gcc/gcc/gcc/testsuite/ada/acats/run_all.sh: line 67: 29176 Aborted                 (core dumped) ls ${i}.adb >> ${i}.lst 2> /dev/null
> > 
> > I have seen this before.
> > 
> > The change reworks all code that manipulates ptes to use the pa_dbit_lock
> > to ensure that we don't lose state information during updates.  I also
> > added code to purge the tlb associated with the pte as it wasn't obvious
> > to me how for example the write protect bit got set in the tlb.
> > 
> > Someone had clearly tried to fix the dirty bit handling in the past,
> > but the change was incomplete.
> 


-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-04-19 16:26                                         ` John David Anglin
  2010-04-20 17:59                                           ` Helge Deller
@ 2010-05-01 18:34                                           ` Thibaut VARENE
  2010-05-01 20:17                                             ` John David Anglin
  1 sibling, 1 reply; 74+ messages in thread
From: Thibaut VARENE @ 2010-05-01 18:34 UTC (permalink / raw)
  To: John David Anglin; +Cc: Helge Deller, linux-parisc, gniibe, carlos

On Mon, Apr 19, 2010 at 6:26 PM, John David Anglin
<dave@hiauly1.hia.nrc.ca> wrote:
> Hi Helge,
>
> On Tue, 13 Apr 2010, Helge Deller wrote:
>
>> Still crashes.
>
> Can you try the patch below?  The change to cacheflush.h is the same
> as before.

For the record, while setting up the wiki's TestCases page, I
noticed that the initial large patch that you sent (see
https://patchwork.kernel.org/patch/91525/ ) contained bits that
weren't part of the split chunks you sent afterwards.

This patch (pte.d.2) seems to update some of those chunks and also
contains bits that weren't part of them either.

I mention this so that we do not lose track of potentially useful
code.  Though maybe Kyle has all of this sorted out already and I'm
just unable to figure it out myself ;-)

HTH

T-Bone

--
Thibaut VARENE
http://www.parisc-linux.org/~varenet/

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-05-01 18:34                                           ` Thibaut VARENE
@ 2010-05-01 20:17                                             ` John David Anglin
  2010-05-02 10:53                                               ` Thibaut VARÈNE
  0 siblings, 1 reply; 74+ messages in thread
From: John David Anglin @ 2010-05-01 20:17 UTC (permalink / raw)
  To: Thibaut VARENE; +Cc: dave.anglin, deller, linux-parisc, gniibe, carlos

> On Mon, Apr 19, 2010 at 6:26 PM, John David Anglin
> <dave@hiauly1.hia.nrc.ca> wrote:
> > Hi Helge,
> >
> > On Tue, 13 Apr 2010, Helge Deller wrote:
> >
> >> Still crashes.
> >
> > Can you try the patch below?  The change to cacheflush.h is the same
> > as before.
> 
> For the records, while setting up the wiki's TestCases page, it
> noticed that the initial large patch that you sent (see
> https://patchwork.kernel.org/patch/91525/ ) contained bits that
> weren't part of the split chunks you sent afterwards.
> 
> This patch (pte.d.2) seems to update some of those chunks and also
> contains bits that weren't either part of them.

The split chunks were mainly cleanups.  As far as I know, they are
obvious and provide no significant change in functionality.  I didn't
intentionally change any of the split hunks in patch4 (pte.d.2) although
this patch does touch some of the same files.  Possibly, the LWS fixes
should be split into two (obvious and UP locking).

Both the original patch and pte.d.2 were experimental.  Since I sent it,
I continued to experiment and reached a change that appears to fix the
minifail bug in a somewhat different manner than proposed by James.  However,
I'm still seeing some issues that appear to be PTE related (segmentation
faults in sh mainly).

At this point, I don't know why I still see problems.  I have one idea left
to try.  I also would like to implement copy_user_page with equivalent
aliasing.  My first attempt didn't work.  I just enabled code in pacache.S.

I have more or less reached the conclusion that our PTE/TLB management
is quite broken on SMP.  I tried James' patch but had trouble with segmentation
faults on my rp3440 and a GCC build died early in stage 1 (make -j8
bootstrap).  I need to try it with a clean build.

I may be wrong but I think a flush in kmap(_atomic) won't work on SMP
because another user may just redirty the page when it is shared.

> That being said so that we do not loose track of potentially useful
> code. Though maybe kyle has all of this sorted out already and I'm
> just unable to figure it out myself ;-)

I don't think there's a clear path.  I've come to realize that I don't
understand what's required of the higher level code.  The documentation
doesn't help much.  Looking at other archs provides some clues.  I've
looked at ia64 a bit (see for example TLB shootdown support and retry in
TLB miss handler).

Regarding the wiki, it's a useful summary.  However, #561203 (minifail
bug) is not a "Futex wait failure".  We may have futex bugs, but I'm not
aware of a testcase.  The minifail bug is a "Threads and fork" problem
arising from cache corruption.  Mainly, copy_user_page is broken when
copying memory shared by more than one process.  There are also issues
in PTE/TLB management on SMP systems.  Probably, the vfork/execve bug
is caused by the same problem.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-05-01 20:17                                             ` John David Anglin
@ 2010-05-02 10:53                                               ` Thibaut VARÈNE
  0 siblings, 0 replies; 74+ messages in thread
From: Thibaut VARÈNE @ 2010-05-02 10:53 UTC (permalink / raw)
  To: John David Anglin; +Cc: dave.anglin, linux-parisc

On May 1, 2010, at 22:17, John David Anglin wrote:
>
> Regarding the wiki, it's a useful summary.  However, #561203 (minifail
> bug) is not a "Futex wait failure".  We may have futex bugs, but I'm
> not aware of a testcase.  The minifail bug is a "Threads and fork" problem
> arising from cache corruption.  Mainly, copy_user_page is broken when
> copying memory shared by more than one process.  There are also issues
> in PTE/TLB management on SMP systems.  Probably, the vfork/execve bug
> is caused by the same problem.


Many thanks for the feedback. The reason why I initially put the
minifail bug under "Futex wait failure" was because I found it
discussed under such a thread ;-)

I've merged this section under "Threads & fork", and have quoted your
summary at the top of the section.

HTH

T-Bone

--
Thibaut VARÈNE
http://www.parisc-linux.org/~varenet/


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-04-20 17:59                                           ` Helge Deller
  2010-04-20 18:52                                             ` John David Anglin
@ 2010-05-09 12:43                                             ` John David Anglin
  2010-05-09 14:14                                               ` Carlos O'Donell
  2010-05-10  9:56                                               ` Helge Deller
  1 sibling, 2 replies; 74+ messages in thread
From: John David Anglin @ 2010-05-09 12:43 UTC (permalink / raw)
  To: Helge Deller; +Cc: John David Anglin, linux-parisc, gniibe, carlos

[-- Attachment #1: Type: text/plain, Size: 1729 bytes --]

On Tue, 20 Apr 2010, Helge Deller wrote:

> Hi Dave,
> 
> On 04/19/2010 06:26 PM, John David Anglin wrote:
> > On Tue, 13 Apr 2010, Helge Deller wrote:
> >> Still crashes.
> > 
> > Can you try the patch below?  The change to cacheflush.h is the same
> > as before.
> 
> Thanks for the patch.
> I applied it on top of a clean 2.6.33.2 kernel and ran multiple parallel 
> minifail programs on my B2000 (2 CPUs, SMP kernel, 32bit kernel).
> Sadly minifail still crashed the same way as before.

Attached is my latest 2.6.33.3 patch bundle.  It uses a slightly modified
version of James' minifail fix.

The big change is the management of PTE updates and the TLB exception
support on SMP configs.  I have modified what was formerly the pa_dbit_lock
and used it for all user page table updates.  I also added a recheck of the
PTE after TLB inserts.  The idea for this was derived from a similar check
in arch/ia64/kernel/ivt.S.  I also whack the TLB page in ptep_set_wrprotect
and modified the TLB locking for clear_user_page (it's now in asm).

So far, the change is lightly tested.  I've been burned enough to know
that there are likely still problems.  However, so far I haven't seen any
random segvs on my rp3440 or gsyprf11.

I would appreciate pa'ers testing this change.  If it looks good, I'll
extract the new PTE handling and formally submit.

There are some obvious performance improvements that could be made like
lock hashing.  However, I just wanted something that works as a first
step.  It's hard to test this stuff because the failures are random.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

[-- Attachment #2: diff-20100508.d.3 --]
[-- Type: text/plain, Size: 36397 bytes --]

diff --git a/arch/parisc/hpux/wrappers.S b/arch/parisc/hpux/wrappers.S
index 58c53c8..bdcea33 100644
--- a/arch/parisc/hpux/wrappers.S
+++ b/arch/parisc/hpux/wrappers.S
@@ -88,7 +88,7 @@ ENTRY(hpux_fork_wrapper)
 
 	STREG	%r2,-20(%r30)
 	ldo	64(%r30),%r30
-	STREG	%r2,PT_GR19(%r1)	;! save for child
+	STREG	%r2,PT_SYSCALL_RP(%r1)	;! save for child
 	STREG	%r30,PT_GR21(%r1)	;! save for child
 
 	LDREG	PT_GR30(%r1),%r25
@@ -132,7 +132,7 @@ ENTRY(hpux_child_return)
 	bl,n	schedule_tail, %r2
 #endif
 
-	LDREG	TASK_PT_GR19-TASK_SZ_ALGN-128(%r30),%r2
+	LDREG	TASK_PT_SYSCALL_RP-TASK_SZ_ALGN-128(%r30),%r2
 	b fork_return
 	copy %r0,%r28
 ENDPROC(hpux_child_return)
diff --git a/arch/parisc/include/asm/atomic.h b/arch/parisc/include/asm/atomic.h
index 716634d..ad7df44 100644
--- a/arch/parisc/include/asm/atomic.h
+++ b/arch/parisc/include/asm/atomic.h
@@ -24,29 +24,46 @@
  * Hash function to index into a different SPINLOCK.
  * Since "a" is usually an address, use one spinlock per cacheline.
  */
-#  define ATOMIC_HASH_SIZE 4
-#  define ATOMIC_HASH(a) (&(__atomic_hash[ (((unsigned long) (a))/L1_CACHE_BYTES) & (ATOMIC_HASH_SIZE-1) ]))
+#  define ATOMIC_HASH_SIZE (4096/L1_CACHE_BYTES)  /* 4 */
+#  define ATOMIC_HASH(a)      (&(__atomic_hash[ (((unsigned long) (a))/L1_CACHE_BYTES) & (ATOMIC_HASH_SIZE-1) ]))
+#  define ATOMIC_USER_HASH(a) (&(__atomic_user_hash[ (((unsigned long) (a))/L1_CACHE_BYTES) & (ATOMIC_HASH_SIZE-1) ]))
 
 extern arch_spinlock_t __atomic_hash[ATOMIC_HASH_SIZE] __lock_aligned;
+extern arch_spinlock_t __atomic_user_hash[ATOMIC_HASH_SIZE] __lock_aligned;
 
 /* Can't use raw_spin_lock_irq because of #include problems, so
  * this is the substitute */
-#define _atomic_spin_lock_irqsave(l,f) do {	\
-	arch_spinlock_t *s = ATOMIC_HASH(l);		\
+#define _atomic_spin_lock_irqsave_template(l,f,hash_func) do {	\
+	arch_spinlock_t *s = hash_func;		\
 	local_irq_save(f);			\
 	arch_spin_lock(s);			\
 } while(0)
 
-#define _atomic_spin_unlock_irqrestore(l,f) do {	\
-	arch_spinlock_t *s = ATOMIC_HASH(l);			\
+#define _atomic_spin_unlock_irqrestore_template(l,f,hash_func) do {	\
+	arch_spinlock_t *s = hash_func;			\
 	arch_spin_unlock(s);				\
 	local_irq_restore(f);				\
 } while(0)
 
+/* kernel memory locks */
+#define _atomic_spin_lock_irqsave(l,f)	\
+	_atomic_spin_lock_irqsave_template(l,f,ATOMIC_HASH(l))
+
+#define _atomic_spin_unlock_irqrestore(l,f)	\
+	_atomic_spin_unlock_irqrestore_template(l,f,ATOMIC_HASH(l))
+
+/* userspace memory locks */
+#define _atomic_spin_lock_irqsave_user(l,f)	\
+	_atomic_spin_lock_irqsave_template(l,f,ATOMIC_USER_HASH(l))
+
+#define _atomic_spin_unlock_irqrestore_user(l,f)	\
+	_atomic_spin_unlock_irqrestore_template(l,f,ATOMIC_USER_HASH(l))
 
 #else
 #  define _atomic_spin_lock_irqsave(l,f) do { local_irq_save(f); } while (0)
 #  define _atomic_spin_unlock_irqrestore(l,f) do { local_irq_restore(f); } while (0)
+#  define _atomic_spin_lock_irqsave_user(l,f) _atomic_spin_lock_irqsave(l,f)
+#  define _atomic_spin_unlock_irqrestore_user(l,f) _atomic_spin_unlock_irqrestore(l,f)
 #endif
 
 /* This should get optimized out since it's never called.
diff --git a/arch/parisc/include/asm/cacheflush.h b/arch/parisc/include/asm/cacheflush.h
index 7a73b61..b90c895 100644
--- a/arch/parisc/include/asm/cacheflush.h
+++ b/arch/parisc/include/asm/cacheflush.h
@@ -2,6 +2,7 @@
 #define _PARISC_CACHEFLUSH_H
 
 #include <linux/mm.h>
+#include <linux/uaccess.h>
 
 /* The usual comment is "Caches aren't brain-dead on the <architecture>".
  * Unfortunately, that doesn't apply to PA-RISC. */
@@ -104,21 +105,32 @@ void mark_rodata_ro(void);
 #define ARCH_HAS_KMAP
 
 void kunmap_parisc(void *addr);
+void *kmap_parisc(struct page *page);
 
 static inline void *kmap(struct page *page)
 {
 	might_sleep();
-	return page_address(page);
+	return kmap_parisc(page);
 }
 
 #define kunmap(page)			kunmap_parisc(page_address(page))
 
-#define kmap_atomic(page, idx)		page_address(page)
+static inline void *kmap_atomic(struct page *page, enum km_type idx)
+{
+	pagefault_disable();
+	return kmap_parisc(page);
+}
 
-#define kunmap_atomic(addr, idx)	kunmap_parisc(addr)
+static inline void kunmap_atomic(void *addr, enum km_type idx)
+{
+	kunmap_parisc(addr);
+	pagefault_enable();
+}
 
-#define kmap_atomic_pfn(pfn, idx)	page_address(pfn_to_page(pfn))
-#define kmap_atomic_to_page(ptr)	virt_to_page(ptr)
+#define kmap_atomic_prot(page, idx, prot)	kmap_atomic(page, idx)
+#define kmap_atomic_pfn(pfn, idx)	kmap_atomic(pfn_to_page(pfn), (idx))
+#define kmap_atomic_to_page(ptr)	virt_to_page(kmap_atomic(virt_to_page(ptr), (enum km_type) 0))
+#define kmap_flush_unused()	do {} while(0)
 #endif
 
 #endif /* _PARISC_CACHEFLUSH_H */
diff --git a/arch/parisc/include/asm/futex.h b/arch/parisc/include/asm/futex.h
index 0c705c3..7bc963e 100644
--- a/arch/parisc/include/asm/futex.h
+++ b/arch/parisc/include/asm/futex.h
@@ -55,6 +55,7 @@ futex_atomic_cmpxchg_inatomic(int __user *uaddr, int oldval, int newval)
 {
 	int err = 0;
 	int uval;
+	unsigned long flags;
 
 	/* futex.c wants to do a cmpxchg_inatomic on kernel NULL, which is
 	 * our gateway page, and causes no end of trouble...
@@ -65,10 +66,15 @@ futex_atomic_cmpxchg_inatomic(int __user *uaddr, int oldval, int newval)
 	if (!access_ok(VERIFY_WRITE, uaddr, sizeof(int)))
 		return -EFAULT;
 
+	_atomic_spin_lock_irqsave_user(uaddr, flags);
+
 	err = get_user(uval, uaddr);
-	if (err) return -EFAULT;
-	if (uval == oldval)
-		err = put_user(newval, uaddr);
+	if (!err)
+		if (uval == oldval)
+			err = put_user(newval, uaddr);
+
+	_atomic_spin_unlock_irqrestore_user(uaddr, flags);
+
 	if (err) return -EFAULT;
 	return uval;
 }
diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h
index a27d2e2..4de5bb1 100644
--- a/arch/parisc/include/asm/pgtable.h
+++ b/arch/parisc/include/asm/pgtable.h
@@ -30,15 +30,21 @@
  */
 #define kern_addr_valid(addr)	(1)
 
+extern spinlock_t pa_pte_lock;
+extern spinlock_t pa_tlb_lock;
+
 /* Certain architectures need to do special things when PTEs
  * within a page table are directly modified.  Thus, the following
  * hook is made available.
  */
-#define set_pte(pteptr, pteval)                                 \
-        do{                                                     \
+#define set_pte(pteptr, pteval)					\
+        do {							\
+		unsigned long flags;				\
+		spin_lock_irqsave(&pa_pte_lock, flags);		\
                 *(pteptr) = (pteval);                           \
+		spin_unlock_irqrestore(&pa_pte_lock, flags);	\
         } while(0)
-#define set_pte_at(mm,addr,ptep,pteval) set_pte(ptep,pteval)
+#define set_pte_at(mm,addr,ptep,pteval)	set_pte(ptep, pteval)
 
 #endif /* !__ASSEMBLY__ */
 
@@ -262,6 +268,7 @@ extern unsigned long *empty_zero_page;
 #define pte_none(x)     ((pte_val(x) == 0) || (pte_val(x) & _PAGE_FLUSH))
 #define pte_present(x)	(pte_val(x) & _PAGE_PRESENT)
 #define pte_clear(mm,addr,xp)	do { pte_val(*(xp)) = 0; } while (0)
+#define pte_same(A,B)	(pte_val(A) == pte_val(B))
 
 #define pmd_flag(x)	(pmd_val(x) & PxD_FLAG_MASK)
 #define pmd_address(x)	((unsigned long)(pmd_val(x) &~ PxD_FLAG_MASK) << PxD_VALUE_SHIFT)
@@ -423,56 +430,82 @@ extern void update_mmu_cache(struct vm_area_struct *, unsigned long, pte_t);
 #define __pte_to_swp_entry(pte)		((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)		((pte_t) { (x).val })
 
-static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep)
+static inline void __flush_tlb_page(struct mm_struct *mm, unsigned long addr)
 {
-#ifdef CONFIG_SMP
-	if (!pte_young(*ptep))
-		return 0;
-	return test_and_clear_bit(xlate_pabit(_PAGE_ACCESSED_BIT), &pte_val(*ptep));
-#else
-	pte_t pte = *ptep;
-	if (!pte_young(pte))
-		return 0;
-	set_pte_at(vma->vm_mm, addr, ptep, pte_mkold(pte));
-	return 1;
-#endif
+	unsigned long flags;
+
+	/* For one page, it's not worth testing the split_tlb variable.  */
+	spin_lock_irqsave(&pa_tlb_lock, flags);
+	mtsp(mm->context,1);
+	pdtlb(addr);
+	pitlb(addr);
+	spin_unlock_irqrestore(&pa_tlb_lock, flags);
 }
 
-extern spinlock_t pa_dbit_lock;
+static inline int ptep_set_access_flags(struct vm_area_struct *vma, unsigned
+ long addr, pte_t *ptep, pte_t entry, int dirty)
+{
+	int changed;
+	unsigned long flags;
+	spin_lock_irqsave(&pa_pte_lock, flags);
+	changed = !pte_same(*ptep, entry);
+	if (changed) {
+		*ptep = entry;
+	}
+	spin_unlock_irqrestore(&pa_pte_lock, flags);
+	if (changed) {
+		__flush_tlb_page(vma->vm_mm, addr);
+	}
+	return changed;
+}
+
+static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep)
+{
+	pte_t pte;
+	unsigned long flags;
+	int r;
+
+	spin_lock_irqsave(&pa_pte_lock, flags);
+	pte = *ptep;
+	if (pte_young(pte)) {
+		*ptep = pte_mkold(pte);
+		r = 1;
+	} else {
+		r = 0;
+	}
+	spin_unlock_irqrestore(&pa_pte_lock, flags);
+
+	return r;
+}
 
 struct mm_struct;
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 {
-	pte_t old_pte;
-	pte_t pte;
+	pte_t pte, old_pte;
+	unsigned long flags;
 
-	spin_lock(&pa_dbit_lock);
+	spin_lock_irqsave(&pa_pte_lock, flags);
 	pte = old_pte = *ptep;
 	pte_val(pte) &= ~_PAGE_PRESENT;
 	pte_val(pte) |= _PAGE_FLUSH;
-	set_pte_at(mm,addr,ptep,pte);
-	spin_unlock(&pa_dbit_lock);
+	*ptep = pte;
+	spin_unlock_irqrestore(&pa_pte_lock, flags);
 
 	return old_pte;
 }
 
 static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 {
-#ifdef CONFIG_SMP
-	unsigned long new, old;
+	pte_t old_pte;
+	unsigned long flags;
 
-	do {
-		old = pte_val(*ptep);
-		new = pte_val(pte_wrprotect(__pte (old)));
-	} while (cmpxchg((unsigned long *) ptep, old, new) != old);
-#else
-	pte_t old_pte = *ptep;
-	set_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
-#endif
+	spin_lock_irqsave(&pa_pte_lock, flags);
+	old_pte = *ptep;
+	*ptep = pte_wrprotect(old_pte);
+	spin_unlock_irqrestore(&pa_pte_lock, flags);
+	__flush_tlb_page(mm, addr);
 }
 
-#define pte_same(A,B)	(pte_val(A) == pte_val(B))
-
 #endif /* !__ASSEMBLY__ */
 
 
@@ -504,6 +537,7 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
 
 #define HAVE_ARCH_UNMAPPED_AREA
 
+#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
 #define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
diff --git a/arch/parisc/include/asm/system.h b/arch/parisc/include/asm/system.h
index d91357b..4653c77 100644
--- a/arch/parisc/include/asm/system.h
+++ b/arch/parisc/include/asm/system.h
@@ -160,7 +160,7 @@ static inline void set_eiem(unsigned long val)
    ldcd). */
 
 #define __PA_LDCW_ALIGNMENT	4
-#define __ldcw_align(a) ((volatile unsigned int *)a)
+#define __ldcw_align(a) (&(a)->slock)
 #define __LDCW	"ldcw,co"
 
 #endif /*!CONFIG_PA20*/
diff --git a/arch/parisc/kernel/asm-offsets.c b/arch/parisc/kernel/asm-offsets.c
index ec787b4..b2f35b2 100644
--- a/arch/parisc/kernel/asm-offsets.c
+++ b/arch/parisc/kernel/asm-offsets.c
@@ -137,6 +137,7 @@ int main(void)
 	DEFINE(TASK_PT_IAOQ0, offsetof(struct task_struct, thread.regs.iaoq[0]));
 	DEFINE(TASK_PT_IAOQ1, offsetof(struct task_struct, thread.regs.iaoq[1]));
 	DEFINE(TASK_PT_CR27, offsetof(struct task_struct, thread.regs.cr27));
+	DEFINE(TASK_PT_SYSCALL_RP, offsetof(struct task_struct, thread.regs.pad0));
 	DEFINE(TASK_PT_ORIG_R28, offsetof(struct task_struct, thread.regs.orig_r28));
 	DEFINE(TASK_PT_KSP, offsetof(struct task_struct, thread.regs.ksp));
 	DEFINE(TASK_PT_KPC, offsetof(struct task_struct, thread.regs.kpc));
@@ -225,6 +226,7 @@ int main(void)
 	DEFINE(PT_IAOQ0, offsetof(struct pt_regs, iaoq[0]));
 	DEFINE(PT_IAOQ1, offsetof(struct pt_regs, iaoq[1]));
 	DEFINE(PT_CR27, offsetof(struct pt_regs, cr27));
+	DEFINE(PT_SYSCALL_RP, offsetof(struct pt_regs, pad0));
 	DEFINE(PT_ORIG_R28, offsetof(struct pt_regs, orig_r28));
 	DEFINE(PT_KSP, offsetof(struct pt_regs, ksp));
 	DEFINE(PT_KPC, offsetof(struct pt_regs, kpc));
@@ -290,5 +292,11 @@ int main(void)
 	BLANK();
 	DEFINE(ASM_PDC_RESULT_SIZE, NUM_PDC_RESULT * sizeof(unsigned long));
 	BLANK();
+
+#ifdef CONFIG_SMP
+	DEFINE(ASM_ATOMIC_HASH_SIZE_SHIFT, __builtin_ffs(ATOMIC_HASH_SIZE)-1);
+	DEFINE(ASM_ATOMIC_HASH_ENTRY_SHIFT, __builtin_ffs(sizeof(__atomic_hash[0]))-1);
+#endif
+
 	return 0;
 }
diff --git a/arch/parisc/kernel/cache.c b/arch/parisc/kernel/cache.c
index b6ed34d..a9a4e44 100644
--- a/arch/parisc/kernel/cache.c
+++ b/arch/parisc/kernel/cache.c
@@ -336,9 +336,9 @@ __flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr)
 	}
 }
 
-void flush_dcache_page(struct page *page)
+static void flush_user_dcache_page_internal(struct address_space *mapping,
+					    struct page *page)
 {
-	struct address_space *mapping = page_mapping(page);
 	struct vm_area_struct *mpnt;
 	struct prio_tree_iter iter;
 	unsigned long offset;
@@ -346,14 +346,6 @@ void flush_dcache_page(struct page *page)
 	pgoff_t pgoff;
 	unsigned long pfn = page_to_pfn(page);
 
-
-	if (mapping && !mapping_mapped(mapping)) {
-		set_bit(PG_dcache_dirty, &page->flags);
-		return;
-	}
-
-	flush_kernel_dcache_page(page);
-
 	if (!mapping)
 		return;
 
@@ -387,6 +379,19 @@ void flush_dcache_page(struct page *page)
 	}
 	flush_dcache_mmap_unlock(mapping);
 }
+
+void flush_dcache_page(struct page *page)
+{
+	struct address_space *mapping = page_mapping(page);
+
+	if (mapping && !mapping_mapped(mapping)) {
+		set_bit(PG_dcache_dirty, &page->flags);
+		return;
+	}
+
+	flush_kernel_dcache_page(page);
+	flush_user_dcache_page_internal(mapping, page);
+}
 EXPORT_SYMBOL(flush_dcache_page);
 
 /* Defined in arch/parisc/kernel/pacache.S */
@@ -395,15 +400,12 @@ EXPORT_SYMBOL(flush_kernel_dcache_page_asm);
 EXPORT_SYMBOL(flush_data_cache_local);
 EXPORT_SYMBOL(flush_kernel_icache_range_asm);
 
-void clear_user_page_asm(void *page, unsigned long vaddr)
+static void clear_user_page_asm(void *page, unsigned long vaddr)
 {
-	unsigned long flags;
 	/* This function is implemented in assembly in pacache.S */
 	extern void __clear_user_page_asm(void *page, unsigned long vaddr);
 
-	purge_tlb_start(flags);
 	__clear_user_page_asm(page, vaddr);
-	purge_tlb_end(flags);
 }
 
 #define FLUSH_THRESHOLD 0x80000 /* 0.5MB */
@@ -440,7 +442,6 @@ void __init parisc_setup_cache_timing(void)
 }
 
 extern void purge_kernel_dcache_page(unsigned long);
-extern void clear_user_page_asm(void *page, unsigned long vaddr);
 
 void clear_user_page(void *page, unsigned long vaddr, struct page *pg)
 {
@@ -470,21 +471,9 @@ void copy_user_page(void *vto, void *vfrom, unsigned long vaddr,
 {
 	/* no coherency needed (all in kmap/kunmap) */
 	copy_user_page_asm(vto, vfrom);
-	if (!parisc_requires_coherency())
-		flush_kernel_dcache_page_asm(vto);
 }
 EXPORT_SYMBOL(copy_user_page);
 
-#ifdef CONFIG_PA8X00
-
-void kunmap_parisc(void *addr)
-{
-	if (parisc_requires_coherency())
-		flush_kernel_dcache_page_addr(addr);
-}
-EXPORT_SYMBOL(kunmap_parisc);
-#endif
-
 void __flush_tlb_range(unsigned long sid, unsigned long start,
 		       unsigned long end)
 {
@@ -577,3 +566,25 @@ flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr, unsigned long
 		__flush_cache_page(vma, vmaddr);
 
 }
+
+void *kmap_parisc(struct page *page)
+{
+	/* this is a killer.  There's no easy way to test quickly if
+	 * this page is dirty in any userspace.  Additionally, for
+	 * kernel alterations of the page, we'd need it invalidated
+	 * here anyway, so currently flush (and invalidate)
+	 * universally */
+	flush_user_dcache_page_internal(page_mapping(page), page);
+	return page_address(page);
+}
+EXPORT_SYMBOL(kmap_parisc);
+
+void kunmap_parisc(void *addr)
+{
+	/* flush and invalidate the kernel mapping.  We need the
+	 * invalidate so we don't have stale data at this cache
+	 * location the next time the page is mapped */
+	flush_kernel_dcache_page_addr(addr);
+}
+EXPORT_SYMBOL(kunmap_parisc);
+
diff --git a/arch/parisc/kernel/entry.S b/arch/parisc/kernel/entry.S
index 3a44f7f..e1c0128 100644
--- a/arch/parisc/kernel/entry.S
+++ b/arch/parisc/kernel/entry.S
@@ -45,7 +45,7 @@
 	.level 2.0
 #endif
 
-	.import         pa_dbit_lock,data
+	.import         pa_pte_lock,data
 
 	/* space_to_prot macro creates a prot id from a space id */
 
@@ -364,32 +364,6 @@
 	.align		32
 	.endm
 
-	/* The following are simple 32 vs 64 bit instruction
-	 * abstractions for the macros */
-	.macro		EXTR	reg1,start,length,reg2
-#ifdef CONFIG_64BIT
-	extrd,u		\reg1,32+(\start),\length,\reg2
-#else
-	extrw,u		\reg1,\start,\length,\reg2
-#endif
-	.endm
-
-	.macro		DEP	reg1,start,length,reg2
-#ifdef CONFIG_64BIT
-	depd		\reg1,32+(\start),\length,\reg2
-#else
-	depw		\reg1,\start,\length,\reg2
-#endif
-	.endm
-
-	.macro		DEPI	val,start,length,reg
-#ifdef CONFIG_64BIT
-	depdi		\val,32+(\start),\length,\reg
-#else
-	depwi		\val,\start,\length,\reg
-#endif
-	.endm
-
 	/* In LP64, the space contains part of the upper 32 bits of the
 	 * fault.  We have to extract this and place it in the va,
 	 * zeroing the corresponding bits in the space register */
@@ -442,19 +416,19 @@
 	 */
 	.macro		L2_ptep	pmd,pte,index,va,fault
 #if PT_NLEVELS == 3
-	EXTR		\va,31-ASM_PMD_SHIFT,ASM_BITS_PER_PMD,\index
+	extru		\va,31-ASM_PMD_SHIFT,ASM_BITS_PER_PMD,\index
 #else
-	EXTR		\va,31-ASM_PGDIR_SHIFT,ASM_BITS_PER_PGD,\index
+	extru		\va,31-ASM_PGDIR_SHIFT,ASM_BITS_PER_PGD,\index
 #endif
-	DEP             %r0,31,PAGE_SHIFT,\pmd  /* clear offset */
+	dep             %r0,31,PAGE_SHIFT,\pmd  /* clear offset */
 	copy		%r0,\pte
 	ldw,s		\index(\pmd),\pmd
 	bb,>=,n		\pmd,_PxD_PRESENT_BIT,\fault
-	DEP		%r0,31,PxD_FLAG_SHIFT,\pmd /* clear flags */
+	dep		%r0,31,PxD_FLAG_SHIFT,\pmd /* clear flags */
 	copy		\pmd,%r9
 	SHLREG		%r9,PxD_VALUE_SHIFT,\pmd
-	EXTR		\va,31-PAGE_SHIFT,ASM_BITS_PER_PTE,\index
-	DEP		%r0,31,PAGE_SHIFT,\pmd  /* clear offset */
+	extru		\va,31-PAGE_SHIFT,ASM_BITS_PER_PTE,\index
+	dep		%r0,31,PAGE_SHIFT,\pmd  /* clear offset */
 	shladd		\index,BITS_PER_PTE_ENTRY,\pmd,\pmd
 	LDREG		%r0(\pmd),\pte		/* pmd is now pte */
 	bb,>=,n		\pte,_PAGE_PRESENT_BIT,\fault
@@ -488,13 +462,44 @@
 	L2_ptep		\pgd,\pte,\index,\va,\fault
 	.endm
 
+	/* SMP lock for consistent PTE updates.  Unlocks and jumps
+	   to FAULT if the page is not present.  Note the preceeding
+	   load of the PTE can't be deleted since we can't fault holding
+	   the lock.  */ 
+	.macro		pte_lock	ptep,pte,spc,tmp,tmp1,fault
+#ifdef CONFIG_SMP
+	cmpib,COND(=),n        0,\spc,2f
+	load32		PA(pa_pte_lock),\tmp1
+1:
+	LDCW		0(\tmp1),\tmp
+	cmpib,COND(=)         0,\tmp,1b
+	nop
+	LDREG		%r0(\ptep),\pte
+	bb,<,n		\pte,_PAGE_PRESENT_BIT,2f
+	ldi             1,\tmp
+	stw             \tmp,0(\tmp1)
+	b,n		\fault
+2:
+#endif
+	.endm
+
+	.macro		pte_unlock	spc,tmp,tmp1
+#ifdef CONFIG_SMP
+	cmpib,COND(=),n        0,\spc,1f
+	ldi             1,\tmp
+	stw             \tmp,0(\tmp1)
+1:
+#endif
+	.endm
+
 	/* Set the _PAGE_ACCESSED bit of the PTE.  Be clever and
 	 * don't needlessly dirty the cache line if it was already set */
-	.macro		update_ptep	ptep,pte,tmp,tmp1
-	ldi		_PAGE_ACCESSED,\tmp1
-	or		\tmp1,\pte,\tmp
-	and,COND(<>)	\tmp1,\pte,%r0
-	STREG		\tmp,0(\ptep)
+	.macro		update_ptep	ptep,pte,tmp
+	bb,<,n		\pte,_PAGE_ACCESSED_BIT,1f
+	ldi		_PAGE_ACCESSED,\tmp
+	or		\tmp,\pte,\pte
+	STREG		\pte,0(\ptep)
+1:
 	.endm
 
 	/* Set the dirty bit (and accessed bit).  No need to be
@@ -605,7 +610,7 @@
 	depdi		0,31,32,\tmp
 #endif
 	copy		\va,\tmp1
-	DEPI		0,31,23,\tmp1
+	depi		0,31,23,\tmp1
 	cmpb,COND(<>),n	\tmp,\tmp1,\fault
 	ldi		(_PAGE_DIRTY|_PAGE_WRITE|_PAGE_READ),\prot
 	depd,z		\prot,8,7,\prot
@@ -622,6 +627,39 @@
 	or		%r26,%r0,\pte
 	.endm 
 
+	/* Save PTE for recheck if SMP.  */
+	.macro		save_pte	pte,tmp
+#ifdef CONFIG_SMP
+	copy		\pte,\tmp
+#endif
+	.endm
+
+	/* Reload the PTE and purge the data TLB entry if the new
+	   value is different from the old one.  */
+	.macro		dtlb_recheck	ptep,old_pte,spc,va,tmp
+#ifdef CONFIG_SMP
+	LDREG		%r0(\ptep),\tmp
+	cmpb,COND(=),n	\old_pte,\tmp,1f
+	mfsp		%sr1,\tmp
+	mtsp		\spc,%sr1
+	pdtlb,l		%r0(%sr1,\va)
+	mtsp		\tmp,%sr1
+1:
+#endif
+	.endm
+
+	.macro		itlb_recheck	ptep,old_pte,spc,va,tmp
+#ifdef CONFIG_SMP
+	LDREG		%r0(\ptep),\tmp
+	cmpb,COND(=),n	\old_pte,\tmp,1f
+	mfsp		%sr1,\tmp
+	mtsp		\spc,%sr1
+	pitlb,l		%r0(%sr1,\va)
+	mtsp		\tmp,%sr1
+1:
+#endif
+	.endm
+
 
 	/*
 	 * Align fault_vector_20 on 4K boundary so that both
@@ -758,6 +796,10 @@ ENTRY(__kernel_thread)
 
 	STREG	%r22, PT_GR22(%r1)	/* save r22 (arg5) */
 	copy	%r0, %r22		/* user_tid */
+	copy	%r0, %r21		/* child_tid */
+#else
+	stw	%r0, -52(%r30)	     	/* user_tid */
+	stw	%r0, -56(%r30)	     	/* child_tid */
 #endif
 	STREG	%r26, PT_GR26(%r1)  /* Store function & argument for child */
 	STREG	%r25, PT_GR25(%r1)
@@ -765,7 +807,7 @@ ENTRY(__kernel_thread)
 	ldo	CLONE_VM(%r26), %r26   /* Force CLONE_VM since only init_mm */
 	or	%r26, %r24, %r26      /* will have kernel mappings.	 */
 	ldi	1, %r25			/* stack_start, signals kernel thread */
-	stw	%r0, -52(%r30)	     	/* user_tid */
+	ldi	0, %r23			/* child_stack_size */
 #ifdef CONFIG_64BIT
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
@@ -972,7 +1014,10 @@ intr_check_sig:
 	BL	do_notify_resume,%r2
 	copy	%r16, %r26			/* struct pt_regs *regs */
 
-	b,n	intr_check_sig
+	mfctl   %cr30,%r16		/* Reload */
+	LDREG	TI_TASK(%r16), %r16	/* thread_info -> task_struct */
+	b	intr_check_sig
+	ldo	TASK_REGS(%r16),%r16
 
 intr_restore:
 	copy            %r16,%r29
@@ -997,13 +1042,6 @@ intr_restore:
 
 	rfi
 	nop
-	nop
-	nop
-	nop
-	nop
-	nop
-	nop
-	nop
 
 #ifndef CONFIG_PREEMPT
 # define intr_do_preempt	intr_restore
@@ -1026,14 +1064,12 @@ intr_do_resched:
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
 
-	ldil	L%intr_check_sig, %r2
-#ifndef CONFIG_64BIT
-	b	schedule
-#else
-	load32	schedule, %r20
-	bv	%r0(%r20)
-#endif
-	ldo	R%intr_check_sig(%r2), %r2
+	BL	schedule,%r2
+	nop
+	mfctl   %cr30,%r16		/* Reload */
+	LDREG	TI_TASK(%r16), %r16	/* thread_info -> task_struct */
+	b	intr_check_sig
+	ldo	TASK_REGS(%r16),%r16
 
 	/* preempt the current task on returning to kernel
 	 * mode from an interrupt, iff need_resched is set,
@@ -1214,11 +1250,14 @@ dtlb_miss_20w:
 
 	L3_ptep		ptp,pte,t0,va,dtlb_check_alias_20w
 
-	update_ptep	ptp,pte,t0,t1
+	pte_lock	ptp,pte,spc,t0,t1,dtlb_check_alias_20w
+	update_ptep	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
-	
 	idtlbt          pte,prot
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1238,11 +1277,10 @@ nadtlb_miss_20w:
 
 	L3_ptep		ptp,pte,t0,va,nadtlb_check_flush_20w
 
-	update_ptep	ptp,pte,t0,t1
-
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
-
 	idtlbt          pte,prot
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1272,8 +1310,11 @@ dtlb_miss_11:
 
 	L2_ptep		ptp,pte,t0,va,dtlb_check_alias_11
 
-	update_ptep	ptp,pte,t0,t1
+	pte_lock	ptp,pte,spc,t0,t1,dtlb_check_alias_11
+	update_ptep	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb_11	spc,pte,prot
 
 	mfsp		%sr1,t0  /* Save sr1 so we can use it in tlb inserts */
@@ -1283,6 +1324,7 @@ dtlb_miss_11:
 	idtlbp		prot,(%sr1,va)
 
 	mtsp		t0, %sr1	/* Restore sr1 */
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1321,11 +1363,9 @@ nadtlb_miss_11:
 
 	L2_ptep		ptp,pte,t0,va,nadtlb_check_flush_11
 
-	update_ptep	ptp,pte,t0,t1
-
+	save_pte	pte,t1
 	make_insert_tlb_11	spc,pte,prot
 
-
 	mfsp		%sr1,t0  /* Save sr1 so we can use it in tlb inserts */
 	mtsp		spc,%sr1
 
@@ -1333,6 +1373,7 @@ nadtlb_miss_11:
 	idtlbp		prot,(%sr1,va)
 
 	mtsp		t0, %sr1	/* Restore sr1 */
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1368,13 +1409,17 @@ dtlb_miss_20:
 
 	L2_ptep		ptp,pte,t0,va,dtlb_check_alias_20
 
-	update_ptep	ptp,pte,t0,t1
+	pte_lock	ptp,pte,spc,t0,t1,dtlb_check_alias_20
+	update_ptep	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
 
 	f_extend	pte,t0
 
 	idtlbt          pte,prot
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1394,13 +1439,13 @@ nadtlb_miss_20:
 
 	L2_ptep		ptp,pte,t0,va,nadtlb_check_flush_20
 
-	update_ptep	ptp,pte,t0,t1
-
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
 
 	f_extend	pte,t0
 	
         idtlbt          pte,prot
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1508,11 +1553,14 @@ itlb_miss_20w:
 
 	L3_ptep		ptp,pte,t0,va,itlb_fault
 
-	update_ptep	ptp,pte,t0,t1
+	pte_lock	ptp,pte,spc,t0,t1,itlb_fault
+	update_ptep	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
-	
 	iitlbt          pte,prot
+	itlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1526,8 +1574,11 @@ itlb_miss_11:
 
 	L2_ptep		ptp,pte,t0,va,itlb_fault
 
-	update_ptep	ptp,pte,t0,t1
+	pte_lock	ptp,pte,spc,t0,t1,itlb_fault
+	update_ptep	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb_11	spc,pte,prot
 
 	mfsp		%sr1,t0  /* Save sr1 so we can use it in tlb inserts */
@@ -1537,6 +1588,7 @@ itlb_miss_11:
 	iitlbp		prot,(%sr1,va)
 
 	mtsp		t0, %sr1	/* Restore sr1 */
+	itlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1548,13 +1600,17 @@ itlb_miss_20:
 
 	L2_ptep		ptp,pte,t0,va,itlb_fault
 
-	update_ptep	ptp,pte,t0,t1
+	pte_lock	ptp,pte,spc,t0,t1,itlb_fault
+	update_ptep	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
 
 	f_extend	pte,t0	
 
 	iitlbt          pte,prot
+	itlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1570,29 +1626,14 @@ dbit_trap_20w:
 
 	L3_ptep		ptp,pte,t0,va,dbit_fault
 
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nolock_20w
-	load32		PA(pa_dbit_lock),t0
-
-dbit_spin_20w:
-	LDCW		0(t0),t1
-	cmpib,COND(=)         0,t1,dbit_spin_20w
-	nop
-
-dbit_nolock_20w:
-#endif
-	update_dirty	ptp,pte,t1
+	pte_lock	ptp,pte,spc,t0,t1,dbit_fault
+	update_dirty	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
-		
 	idtlbt          pte,prot
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nounlock_20w
-	ldi             1,t1
-	stw             t1,0(t0)
-
-dbit_nounlock_20w:
-#endif
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1606,35 +1647,21 @@ dbit_trap_11:
 
 	L2_ptep		ptp,pte,t0,va,dbit_fault
 
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nolock_11
-	load32		PA(pa_dbit_lock),t0
-
-dbit_spin_11:
-	LDCW		0(t0),t1
-	cmpib,=         0,t1,dbit_spin_11
-	nop
-
-dbit_nolock_11:
-#endif
-	update_dirty	ptp,pte,t1
+	pte_lock	ptp,pte,spc,t0,t1,dbit_fault
+	update_dirty	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb_11	spc,pte,prot
 
-	mfsp            %sr1,t1  /* Save sr1 so we can use it in tlb inserts */
+	mfsp            %sr1,t0  /* Save sr1 so we can use it in tlb inserts */
 	mtsp		spc,%sr1
 
 	idtlba		pte,(%sr1,va)
 	idtlbp		prot,(%sr1,va)
 
-	mtsp            t1, %sr1     /* Restore sr1 */
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nounlock_11
-	ldi             1,t1
-	stw             t1,0(t0)
-
-dbit_nounlock_11:
-#endif
+	mtsp            t0, %sr1     /* Restore sr1 */
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1646,32 +1673,17 @@ dbit_trap_20:
 
 	L2_ptep		ptp,pte,t0,va,dbit_fault
 
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nolock_20
-	load32		PA(pa_dbit_lock),t0
-
-dbit_spin_20:
-	LDCW		0(t0),t1
-	cmpib,=         0,t1,dbit_spin_20
-	nop
-
-dbit_nolock_20:
-#endif
-	update_dirty	ptp,pte,t1
+	pte_lock	ptp,pte,spc,t0,t1,dbit_fault
+	update_dirty	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
 
-	f_extend	pte,t1
+	f_extend	pte,t0
 	
         idtlbt          pte,prot
-
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nounlock_20
-	ldi             1,t1
-	stw             t1,0(t0)
-
-dbit_nounlock_20:
-#endif
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1772,9 +1784,9 @@ ENTRY(sys_fork_wrapper)
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
 
-	/* These are call-clobbered registers and therefore
-	   also syscall-clobbered (we hope). */
-	STREG	%r2,PT_GR19(%r1)	/* save for child */
+	STREG	%r2,PT_SYSCALL_RP(%r1)
+
+	/* WARNING - Clobbers r21, userspace must save! */
 	STREG	%r30,PT_GR21(%r1)
 
 	LDREG	PT_GR30(%r1),%r25
@@ -1804,7 +1816,7 @@ ENTRY(child_return)
 	nop
 
 	LDREG	TI_TASK-THREAD_SZ_ALGN-FRAME_SIZE-FRAME_SIZE(%r30), %r1
-	LDREG	TASK_PT_GR19(%r1),%r2
+	LDREG	TASK_PT_SYSCALL_RP(%r1),%r2
 	b	wrapper_exit
 	copy	%r0,%r28
 ENDPROC(child_return)
@@ -1823,8 +1835,9 @@ ENTRY(sys_clone_wrapper)
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
 
-	/* WARNING - Clobbers r19 and r21, userspace must save these! */
-	STREG	%r2,PT_GR19(%r1)	/* save for child */
+	STREG	%r2,PT_SYSCALL_RP(%r1)
+
+	/* WARNING - Clobbers r21, userspace must save! */
 	STREG	%r30,PT_GR21(%r1)
 	BL	sys_clone,%r2
 	copy	%r1,%r24
@@ -1847,7 +1860,9 @@ ENTRY(sys_vfork_wrapper)
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
 
-	STREG	%r2,PT_GR19(%r1)	/* save for child */
+	STREG	%r2,PT_SYSCALL_RP(%r1)
+
+	/* WARNING - Clobbers r21, userspace must save! */
 	STREG	%r30,PT_GR21(%r1)
 
 	BL	sys_vfork,%r2
@@ -2076,9 +2091,10 @@ syscall_restore:
 	LDREG	TASK_PT_GR31(%r1),%r31	   /* restore syscall rp */
 
 	/* NOTE: We use rsm/ssm pair to make this operation atomic */
+	LDREG   TASK_PT_GR30(%r1),%r1              /* Get user sp */
 	rsm     PSW_SM_I, %r0
-	LDREG   TASK_PT_GR30(%r1),%r30             /* restore user sp */
-	mfsp	%sr3,%r1			   /* Get users space id */
+	copy    %r1,%r30                           /* Restore user sp */
+	mfsp    %sr3,%r1                           /* Get user space id */
 	mtsp    %r1,%sr7                           /* Restore sr7 */
 	ssm     PSW_SM_I, %r0
 
diff --git a/arch/parisc/kernel/pacache.S b/arch/parisc/kernel/pacache.S
index 09b77b2..4f0d975 100644
--- a/arch/parisc/kernel/pacache.S
+++ b/arch/parisc/kernel/pacache.S
@@ -277,6 +277,7 @@ ENDPROC(flush_data_cache_local)
 
 	.align	16
 
+#if 1
 ENTRY(copy_user_page_asm)
 	.proc
 	.callinfo NO_CALLS
@@ -400,6 +401,7 @@ ENTRY(copy_user_page_asm)
 
 	.procend
 ENDPROC(copy_user_page_asm)
+#endif
 
 /*
  * NOTE: Code in clear_user_page has a hard coded dependency on the
@@ -548,17 +550,33 @@ ENTRY(__clear_user_page_asm)
 	depwi		0, 31,12, %r28		/* Clear any offset bits */
 #endif
 
+#ifdef CONFIG_SMP
+	ldil		L%pa_tlb_lock, %r1
+	ldo		R%pa_tlb_lock(%r1), %r24
+	rsm		PSW_SM_I, %r22
+1:
+	LDCW		0(%r24),%r25
+	cmpib,COND(=)	0,%r25,1b
+	nop
+#endif
+
 	/* Purge any old translation */
 
 	pdtlb		0(%r28)
 
+#ifdef CONFIG_SMP
+	ldi		1,%r25
+	stw		%r25,0(%r24)
+	mtsm		%r22
+#endif
+
 #ifdef CONFIG_64BIT
 	ldi		(PAGE_SIZE / 128), %r1
 
 	/* PREFETCH (Write) has not (yet) been proven to help here */
 	/* #define	PREFETCHW_OP	ldd		256(%0), %r0 */
 
-1:	std		%r0, 0(%r28)
+2:	std		%r0, 0(%r28)
 	std		%r0, 8(%r28)
 	std		%r0, 16(%r28)
 	std		%r0, 24(%r28)
@@ -574,13 +592,13 @@ ENTRY(__clear_user_page_asm)
 	std		%r0, 104(%r28)
 	std		%r0, 112(%r28)
 	std		%r0, 120(%r28)
-	addib,COND(>)		-1, %r1, 1b
+	addib,COND(>)		-1, %r1, 2b
 	ldo		128(%r28), %r28
 
 #else	/* ! CONFIG_64BIT */
 	ldi		(PAGE_SIZE / 64), %r1
 
-1:
+2:
 	stw		%r0, 0(%r28)
 	stw		%r0, 4(%r28)
 	stw		%r0, 8(%r28)
@@ -597,7 +615,7 @@ ENTRY(__clear_user_page_asm)
 	stw		%r0, 52(%r28)
 	stw		%r0, 56(%r28)
 	stw		%r0, 60(%r28)
-	addib,COND(>)		-1, %r1, 1b
+	addib,COND(>)		-1, %r1, 2b
 	ldo		64(%r28), %r28
 #endif	/* CONFIG_64BIT */
 
diff --git a/arch/parisc/kernel/setup.c b/arch/parisc/kernel/setup.c
index cb71f3d..84b3239 100644
--- a/arch/parisc/kernel/setup.c
+++ b/arch/parisc/kernel/setup.c
@@ -128,6 +128,14 @@ void __init setup_arch(char **cmdline_p)
 	printk(KERN_INFO "The 32-bit Kernel has started...\n");
 #endif
 
+	/* Consistency check on the size and alignments of our spinlocks */
+#ifdef CONFIG_SMP
+	BUILD_BUG_ON(sizeof(arch_spinlock_t) != __PA_LDCW_ALIGNMENT);
+	BUG_ON((unsigned long)&__atomic_hash[0] & (__PA_LDCW_ALIGNMENT-1));
+	BUG_ON((unsigned long)&__atomic_hash[1] & (__PA_LDCW_ALIGNMENT-1));
+#endif
+	BUILD_BUG_ON((1<<L1_CACHE_SHIFT) != L1_CACHE_BYTES);
+
 	pdc_console_init();
 
 #ifdef CONFIG_64BIT
diff --git a/arch/parisc/kernel/syscall.S b/arch/parisc/kernel/syscall.S
index f5f9602..68e75ce 100644
--- a/arch/parisc/kernel/syscall.S
+++ b/arch/parisc/kernel/syscall.S
@@ -47,18 +47,17 @@ ENTRY(linux_gateway_page)
 	KILL_INSN
 	.endr
 
-	/* ADDRESS 0xb0 to 0xb4, lws uses 1 insns for entry */
+	/* ADDRESS 0xb0 to 0xb8, lws uses two insns for entry */
 	/* Light-weight-syscall entry must always be located at 0xb0 */
 	/* WARNING: Keep this number updated with table size changes */
 #define __NR_lws_entries (2)
 
 lws_entry:
-	/* Unconditional branch to lws_start, located on the 
-	   same gateway page */
-	b,n	lws_start
+	gate	lws_start, %r0		/* increase privilege */
+	depi	3, 31, 2, %r31		/* Ensure we return into user mode. */
 
-	/* Fill from 0xb4 to 0xe0 */
-	.rept 11
+	/* Fill from 0xb8 to 0xe0 */
+	.rept 10
 	KILL_INSN
 	.endr
 
@@ -423,9 +422,6 @@ tracesys_sigexit:
 
 	*********************************************************/
 lws_start:
-	/* Gate and ensure we return to userspace */
-	gate	.+8, %r0
-	depi	3, 31, 2, %r31	/* Ensure we return to userspace */
 
 #ifdef CONFIG_64BIT
 	/* FIXME: If we are a 64-bit kernel just
@@ -442,7 +438,7 @@ lws_start:
 #endif	
 
         /* Is the lws entry number valid? */
-	comiclr,>>=	__NR_lws_entries, %r20, %r0
+	comiclr,>>	__NR_lws_entries, %r20, %r0
 	b,n	lws_exit_nosys
 
 	/* WARNING: Trashing sr2 and sr3 */
@@ -473,7 +469,7 @@ lws_exit:
 	/* now reset the lowest bit of sp if it was set */
 	xor	%r30,%r1,%r30
 #endif
-	be,n	0(%sr3, %r31)
+	be,n	0(%sr7, %r31)
 
 
 	
@@ -529,7 +525,6 @@ lws_compare_and_swap32:
 #endif
 
 lws_compare_and_swap:
-#ifdef CONFIG_SMP
 	/* Load start of lock table */
 	ldil	L%lws_lock_start, %r20
 	ldo	R%lws_lock_start(%r20), %r28
@@ -572,8 +567,6 @@ cas_wouldblock:
 	ldo	2(%r0), %r28				/* 2nd case */
 	b	lws_exit				/* Contended... */
 	ldo	-EAGAIN(%r0), %r21			/* Spin in userspace */
-#endif
-/* CONFIG_SMP */
 
 	/*
 		prev = *addr;
@@ -601,13 +594,11 @@ cas_action:
 1:	ldw	0(%sr3,%r26), %r28
 	sub,<>	%r28, %r25, %r0
 2:	stw	%r24, 0(%sr3,%r26)
-#ifdef CONFIG_SMP
 	/* Free lock */
 	stw	%r20, 0(%sr2,%r20)
-# if ENABLE_LWS_DEBUG
+#if ENABLE_LWS_DEBUG
 	/* Clear thread register indicator */
 	stw	%r0, 4(%sr2,%r20)
-# endif
 #endif
 	/* Return to userspace, set no error */
 	b	lws_exit
@@ -615,12 +606,10 @@ cas_action:
 
 3:		
 	/* Error occured on load or store */
-#ifdef CONFIG_SMP
 	/* Free lock */
 	stw	%r20, 0(%sr2,%r20)
-# if ENABLE_LWS_DEBUG
+#if ENABLE_LWS_DEBUG
 	stw	%r0, 4(%sr2,%r20)
-# endif
 #endif
 	b	lws_exit
 	ldo	-EFAULT(%r0),%r21	/* set errno */
@@ -672,7 +661,6 @@ ENTRY(sys_call_table64)
 END(sys_call_table64)
 #endif
 
-#ifdef CONFIG_SMP
 	/*
 		All light-weight-syscall atomic operations 
 		will use this set of locks 
@@ -694,8 +682,6 @@ ENTRY(lws_lock_start)
 	.endr
 END(lws_lock_start)
 	.previous
-#endif
-/* CONFIG_SMP for lws_lock_start */
 
 .end
 
diff --git a/arch/parisc/kernel/traps.c b/arch/parisc/kernel/traps.c
index 8b58bf0..804b024 100644
--- a/arch/parisc/kernel/traps.c
+++ b/arch/parisc/kernel/traps.c
@@ -47,7 +47,7 @@
 			  /*  dumped to the console via printk)          */
 
 #if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
-DEFINE_SPINLOCK(pa_dbit_lock);
+DEFINE_SPINLOCK(pa_pte_lock);
 #endif
 
 static void parisc_show_stack(struct task_struct *task, unsigned long *sp,
diff --git a/arch/parisc/lib/bitops.c b/arch/parisc/lib/bitops.c
index 353963d..bae6a86 100644
--- a/arch/parisc/lib/bitops.c
+++ b/arch/parisc/lib/bitops.c
@@ -15,6 +15,9 @@
 arch_spinlock_t __atomic_hash[ATOMIC_HASH_SIZE] __lock_aligned = {
 	[0 ... (ATOMIC_HASH_SIZE-1)]  = __ARCH_SPIN_LOCK_UNLOCKED
 };
+arch_spinlock_t __atomic_user_hash[ATOMIC_HASH_SIZE] __lock_aligned = {
+	[0 ... (ATOMIC_HASH_SIZE-1)]  = __ARCH_SPIN_LOCK_UNLOCKED
+};
 #endif
 
 #ifdef CONFIG_64BIT
diff --git a/arch/parisc/math-emu/decode_exc.c b/arch/parisc/math-emu/decode_exc.c
index 3ca1c61..27a7492 100644
--- a/arch/parisc/math-emu/decode_exc.c
+++ b/arch/parisc/math-emu/decode_exc.c
@@ -342,6 +342,7 @@ decode_fpu(unsigned int Fpu_register[], unsigned int trap_counts[])
 		return SIGNALCODE(SIGFPE, FPE_FLTINV);
 	  case DIVISIONBYZEROEXCEPTION:
 		update_trap_counts(Fpu_register, aflags, bflags, trap_counts);
+		Clear_excp_register(exception_index);
 	  	return SIGNALCODE(SIGFPE, FPE_FLTDIV);
 	  case INEXACTEXCEPTION:
 		update_trap_counts(Fpu_register, aflags, bflags, trap_counts);

* Re: threads and fork on machine with VIPT-WB cache
  2010-05-09 12:43                                             ` John David Anglin
@ 2010-05-09 14:14                                               ` Carlos O'Donell
  2010-05-10  9:56                                               ` Helge Deller
  1 sibling, 0 replies; 74+ messages in thread
From: Carlos O'Donell @ 2010-05-09 14:14 UTC (permalink / raw)
  To: John David Anglin; +Cc: Helge Deller, linux-parisc, gniibe

On Sun, May 9, 2010 at 8:43 AM, John David Anglin
<dave@hiauly1.hia.nrc.ca> wrote:
> Attached is my latest 2.6.33.3 patch bundle.  It uses a slightly modified
> version of James' minifail fix.
>
> The big change is the management of PTE updates and the TLB exception
> support on SMP configs.  I have modified what was formerly the pa_dbit_lock
> and used it for all user page table updates.  I also added a recheck of the
> PTE after TLB inserts.  The idea for this was derived from a similar check
> in arch/ia64/kernel/ivt.S.  I also whack the TLB page in ptep_set_wrprotect
> and modified the TLB locking for clear_user_page (it's now in asm).
>
> So far, the change is lightly tested.  I've been burned enough to know
> that there are likely still problems.  However, so far I haven't seen any
> random segvs on my rp3440 or gsyprf11.
>
> I would appreciate pa'ers testing this change.  If it looks good, I'll
> extract the new PTE handling and formally submit.

Thanks Dave! I'll run this through the glibc testsuite a couple of
times to see if anything shakes out.

Cheers,
Carlos.

* Re: threads and fork on machine with VIPT-WB cache
  2010-05-09 12:43                                             ` John David Anglin
  2010-05-09 14:14                                               ` Carlos O'Donell
@ 2010-05-10  9:56                                               ` Helge Deller
  2010-05-10 14:56                                                 ` John David Anglin
  1 sibling, 1 reply; 74+ messages in thread
From: Helge Deller @ 2010-05-10  9:56 UTC (permalink / raw)
  To: John David Anglin; +Cc: carlos, gniibe, linux-parisc, dave.anglin

> Attached is my latest 2.6.33.3 patch bundle.  It uses a slightly modified
> version of James' minifail fix.

Thanks Dave!

I ran the patch my usual way (SMP kernel, on top of 2.6.33.3, multiple screens, minifail_dave.cpp testcase).
In summary, after 10 seconds I got 2 segfaults, and in all of the screens the minifail_dave testcases just hang.

"ps -ef" gives this:
root      2018     1  0 11:29 ?        00:00:00 SCREEN
root      2019  2018  0 11:29 pts/1    00:00:01 /bin/bash
root      2623  2019  0 11:30 pts/1    00:00:00 ./minifail_dave
root      2625  2623  0 11:30 pts/1    00:00:00 [minifail_dave] <defunct>

ls3017:~# strace -p 2623
Process 2623 attached - interrupt to quit
futex(0x4113c4e8, FUTEX_WAIT, 2624, NULL^C <unfinished ...>
Process 2623 detached

ls3017:~# strace -p 2625
attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted

Interestingly, I got only 2 segfaults but 10 screens/minifails were running (and all were zombies).
This means to me that the segfaults and the zombie/futex-waits are most likely unrelated.

Helge

* Re: threads and fork on machine with VIPT-WB cache
  2010-05-10  9:56                                               ` Helge Deller
@ 2010-05-10 14:56                                                 ` John David Anglin
  2010-05-10 19:20                                                   ` Helge Deller
  0 siblings, 1 reply; 74+ messages in thread
From: John David Anglin @ 2010-05-10 14:56 UTC (permalink / raw)
  To: Helge Deller; +Cc: John David Anglin, carlos, gniibe, linux-parisc

[-- Attachment #1: Type: text/plain, Size: 1478 bytes --]

On Mon, 10 May 2010, Helge Deller wrote:

> > Attached is my latest 2.6.33.3 patch bundle.  It uses a slightly modified
> > version of James' minifail fix.
> 
> Thanks Dave!
> 
> I ran the patch my usual way (SMP kernel, on top of 2.6.33.3, multiple screens, minifail_dave.cpp testcase).
> In summary I got after 10 seconds 2 segfaults, and in all of the screens the minifail_dave testcases just hang.

Yes, just after sending, I noticed the gcc testsuite and minifail were
broken on gsyprf11.  I did modify James' change a bit because I
thought we had a double flush in kunmap on machines that don't require
coherency.  However, I think the minifail problem is with kmap_parisc.
Possibly, it needs to handle non-current contexts.

The attached works better on gsyprf11.  I haven't tested it on anything
else.

It goes back to the original flush scheme, which requires some modification
to the ptep_set_wrprotect API.  There might be some race issues between the
write protect and the cache flush.  That's why I added the preempt_disable/
preempt_enable pair.  I think the API can be changed without affecting other
backends.  Also, I think cache flushes can be minimized by checking
pte_dirty after the page is write protected.
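
Roughly, the write-protect path I have in mind looks like the sketch
below.  It is untested; the lock/flush sequence follows the attached
patch, while the pte_dirty test is only the idea mentioned above and
is not in the patch.  pa_pte_lock, __flush_tlb_page and
flush_cache_page are as in the diff:

static inline void ptep_set_wrprotect(struct vm_area_struct *vma,
	struct mm_struct *mm, unsigned long addr, pte_t *ptep)
{
	pte_t old_pte;
	unsigned long flags;

	preempt_disable();
	spin_lock_irqsave(&pa_pte_lock, flags);
	old_pte = *ptep;
	*ptep = pte_wrprotect(old_pte);
	spin_unlock_irqrestore(&pa_pte_lock, flags);

	/* Purge the stale writable translation before anyone can
	   use it again.  */
	__flush_tlb_page(mm, addr);

	/* Only a dirty page can have data in the cache that is not
	   yet in memory; skipping the flush otherwise is the
	   minimization idea, not part of the attached patch.  */
	if (pte_dirty(old_pte))
		flush_cache_page(vma, addr, pte_pfn(old_pte));

	preempt_enable();
}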

If this doesn't work, I think copy_user_page will have to use
equivalent aliasing.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

[-- Attachment #2: diff-20100510.d --]
[-- Type: text/plain, Size: 34544 bytes --]

diff --git a/arch/parisc/hpux/wrappers.S b/arch/parisc/hpux/wrappers.S
index 58c53c8..bdcea33 100644
--- a/arch/parisc/hpux/wrappers.S
+++ b/arch/parisc/hpux/wrappers.S
@@ -88,7 +88,7 @@ ENTRY(hpux_fork_wrapper)
 
 	STREG	%r2,-20(%r30)
 	ldo	64(%r30),%r30
-	STREG	%r2,PT_GR19(%r1)	;! save for child
+	STREG	%r2,PT_SYSCALL_RP(%r1)	;! save for child
 	STREG	%r30,PT_GR21(%r1)	;! save for child
 
 	LDREG	PT_GR30(%r1),%r25
@@ -132,7 +132,7 @@ ENTRY(hpux_child_return)
 	bl,n	schedule_tail, %r2
 #endif
 
-	LDREG	TASK_PT_GR19-TASK_SZ_ALGN-128(%r30),%r2
+	LDREG	TASK_PT_SYSCALL_RP-TASK_SZ_ALGN-128(%r30),%r2
 	b fork_return
 	copy %r0,%r28
 ENDPROC(hpux_child_return)
diff --git a/arch/parisc/include/asm/atomic.h b/arch/parisc/include/asm/atomic.h
index 716634d..ad7df44 100644
--- a/arch/parisc/include/asm/atomic.h
+++ b/arch/parisc/include/asm/atomic.h
@@ -24,29 +24,46 @@
  * Hash function to index into a different SPINLOCK.
  * Since "a" is usually an address, use one spinlock per cacheline.
  */
-#  define ATOMIC_HASH_SIZE 4
-#  define ATOMIC_HASH(a) (&(__atomic_hash[ (((unsigned long) (a))/L1_CACHE_BYTES) & (ATOMIC_HASH_SIZE-1) ]))
+#  define ATOMIC_HASH_SIZE (4096/L1_CACHE_BYTES)  /* 4 */
+#  define ATOMIC_HASH(a)      (&(__atomic_hash[ (((unsigned long) (a))/L1_CACHE_BYTES) & (ATOMIC_HASH_SIZE-1) ]))
+#  define ATOMIC_USER_HASH(a) (&(__atomic_user_hash[ (((unsigned long) (a))/L1_CACHE_BYTES) & (ATOMIC_HASH_SIZE-1) ]))
 
 extern arch_spinlock_t __atomic_hash[ATOMIC_HASH_SIZE] __lock_aligned;
+extern arch_spinlock_t __atomic_user_hash[ATOMIC_HASH_SIZE] __lock_aligned;
 
 /* Can't use raw_spin_lock_irq because of #include problems, so
  * this is the substitute */
-#define _atomic_spin_lock_irqsave(l,f) do {	\
-	arch_spinlock_t *s = ATOMIC_HASH(l);		\
+#define _atomic_spin_lock_irqsave_template(l,f,hash_func) do {	\
+	arch_spinlock_t *s = hash_func;		\
 	local_irq_save(f);			\
 	arch_spin_lock(s);			\
 } while(0)
 
-#define _atomic_spin_unlock_irqrestore(l,f) do {	\
-	arch_spinlock_t *s = ATOMIC_HASH(l);			\
+#define _atomic_spin_unlock_irqrestore_template(l,f,hash_func) do {	\
+	arch_spinlock_t *s = hash_func;			\
 	arch_spin_unlock(s);				\
 	local_irq_restore(f);				\
 } while(0)
 
+/* kernel memory locks */
+#define _atomic_spin_lock_irqsave(l,f)	\
+	_atomic_spin_lock_irqsave_template(l,f,ATOMIC_HASH(l))
+
+#define _atomic_spin_unlock_irqrestore(l,f)	\
+	_atomic_spin_unlock_irqrestore_template(l,f,ATOMIC_HASH(l))
+
+/* userspace memory locks */
+#define _atomic_spin_lock_irqsave_user(l,f)	\
+	_atomic_spin_lock_irqsave_template(l,f,ATOMIC_USER_HASH(l))
+
+#define _atomic_spin_unlock_irqrestore_user(l,f)	\
+	_atomic_spin_unlock_irqrestore_template(l,f,ATOMIC_USER_HASH(l))
 
 #else
 #  define _atomic_spin_lock_irqsave(l,f) do { local_irq_save(f); } while (0)
 #  define _atomic_spin_unlock_irqrestore(l,f) do { local_irq_restore(f); } while (0)
+#  define _atomic_spin_lock_irqsave_user(l,f) _atomic_spin_lock_irqsave(l,f)
+#  define _atomic_spin_unlock_irqrestore_user(l,f) _atomic_spin_unlock_irqrestore(l,f)
 #endif
 
 /* This should get optimized out since it's never called.
diff --git a/arch/parisc/include/asm/cacheflush.h b/arch/parisc/include/asm/cacheflush.h
index 7a73b61..89dce4f 100644
--- a/arch/parisc/include/asm/cacheflush.h
+++ b/arch/parisc/include/asm/cacheflush.h
@@ -2,6 +2,7 @@
 #define _PARISC_CACHEFLUSH_H
 
 #include <linux/mm.h>
+#include <linux/uaccess.h>
 
 /* The usual comment is "Caches aren't brain-dead on the <architecture>".
  * Unfortunately, that doesn't apply to PA-RISC. */
@@ -113,12 +114,22 @@ static inline void *kmap(struct page *page)
 
 #define kunmap(page)			kunmap_parisc(page_address(page))
 
-#define kmap_atomic(page, idx)		page_address(page)
+static inline void *kmap_atomic(struct page *page, enum km_type idx)
+{
+	pagefault_disable();
+	return page_address(page);
+}
 
-#define kunmap_atomic(addr, idx)	kunmap_parisc(addr)
+static inline void kunmap_atomic(void *addr, enum km_type idx)
+{
+	kunmap_parisc(addr);
+	pagefault_enable();
+}
 
-#define kmap_atomic_pfn(pfn, idx)	page_address(pfn_to_page(pfn))
-#define kmap_atomic_to_page(ptr)	virt_to_page(ptr)
+#define kmap_atomic_prot(page, idx, prot)	kmap_atomic(page, idx)
+#define kmap_atomic_pfn(pfn, idx)	kmap_atomic(pfn_to_page(pfn), (idx))
+#define kmap_atomic_to_page(ptr)	virt_to_page(kmap_atomic(virt_to_page(ptr), (enum km_type) 0))
+#define kmap_flush_unused()	do {} while(0)
 #endif
 
 #endif /* _PARISC_CACHEFLUSH_H */
diff --git a/arch/parisc/include/asm/futex.h b/arch/parisc/include/asm/futex.h
index 0c705c3..7bc963e 100644
--- a/arch/parisc/include/asm/futex.h
+++ b/arch/parisc/include/asm/futex.h
@@ -55,6 +55,7 @@ futex_atomic_cmpxchg_inatomic(int __user *uaddr, int oldval, int newval)
 {
 	int err = 0;
 	int uval;
+	unsigned long flags;
 
 	/* futex.c wants to do a cmpxchg_inatomic on kernel NULL, which is
 	 * our gateway page, and causes no end of trouble...
@@ -65,10 +66,15 @@ futex_atomic_cmpxchg_inatomic(int __user *uaddr, int oldval, int newval)
 	if (!access_ok(VERIFY_WRITE, uaddr, sizeof(int)))
 		return -EFAULT;
 
+	_atomic_spin_lock_irqsave_user(uaddr, flags);
+
 	err = get_user(uval, uaddr);
-	if (err) return -EFAULT;
-	if (uval == oldval)
-		err = put_user(newval, uaddr);
+	if (!err)
+		if (uval == oldval)
+			err = put_user(newval, uaddr);
+
+	_atomic_spin_unlock_irqrestore_user(uaddr, flags);
+
 	if (err) return -EFAULT;
 	return uval;
 }
diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h
index a27d2e2..f2d8866 100644
--- a/arch/parisc/include/asm/pgtable.h
+++ b/arch/parisc/include/asm/pgtable.h
@@ -30,15 +30,21 @@
  */
 #define kern_addr_valid(addr)	(1)
 
+extern spinlock_t pa_pte_lock;
+extern spinlock_t pa_tlb_lock;
+
 /* Certain architectures need to do special things when PTEs
  * within a page table are directly modified.  Thus, the following
  * hook is made available.
  */
-#define set_pte(pteptr, pteval)                                 \
-        do{                                                     \
+#define set_pte(pteptr, pteval)					\
+        do {							\
+		unsigned long flags;				\
+		spin_lock_irqsave(&pa_pte_lock, flags);		\
                 *(pteptr) = (pteval);                           \
+		spin_unlock_irqrestore(&pa_pte_lock, flags);	\
         } while(0)
-#define set_pte_at(mm,addr,ptep,pteval) set_pte(ptep,pteval)
+#define set_pte_at(mm,addr,ptep,pteval)	set_pte(ptep, pteval)
 
 #endif /* !__ASSEMBLY__ */
 
@@ -262,6 +268,7 @@ extern unsigned long *empty_zero_page;
 #define pte_none(x)     ((pte_val(x) == 0) || (pte_val(x) & _PAGE_FLUSH))
 #define pte_present(x)	(pte_val(x) & _PAGE_PRESENT)
 #define pte_clear(mm,addr,xp)	do { pte_val(*(xp)) = 0; } while (0)
+#define pte_same(A,B)	(pte_val(A) == pte_val(B))
 
 #define pmd_flag(x)	(pmd_val(x) & PxD_FLAG_MASK)
 #define pmd_address(x)	((unsigned long)(pmd_val(x) &~ PxD_FLAG_MASK) << PxD_VALUE_SHIFT)
@@ -410,6 +417,7 @@ extern void paging_init (void);
 
 #define PG_dcache_dirty         PG_arch_1
 
+extern void flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr, unsigned long pfn);
 extern void update_mmu_cache(struct vm_area_struct *, unsigned long, pte_t);
 
 /* Encode and de-code a swap entry */
@@ -423,56 +431,85 @@ extern void update_mmu_cache(struct vm_area_struct *, unsigned long, pte_t);
 #define __pte_to_swp_entry(pte)		((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)		((pte_t) { (x).val })
 
-static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep)
+static inline void __flush_tlb_page(struct mm_struct *mm, unsigned long addr)
 {
-#ifdef CONFIG_SMP
-	if (!pte_young(*ptep))
-		return 0;
-	return test_and_clear_bit(xlate_pabit(_PAGE_ACCESSED_BIT), &pte_val(*ptep));
-#else
-	pte_t pte = *ptep;
-	if (!pte_young(pte))
-		return 0;
-	set_pte_at(vma->vm_mm, addr, ptep, pte_mkold(pte));
-	return 1;
-#endif
+	unsigned long flags;
+
+	/* For one page, it's not worth testing the split_tlb variable.  */
+	spin_lock_irqsave(&pa_tlb_lock, flags);
+	mtsp(mm->context,1);
+	pdtlb(addr);
+	pitlb(addr);
+	spin_unlock_irqrestore(&pa_tlb_lock, flags);
 }
 
-extern spinlock_t pa_dbit_lock;
+static inline int ptep_set_access_flags(struct vm_area_struct *vma, unsigned
+ long addr, pte_t *ptep, pte_t entry, int dirty)
+{
+	int changed;
+	unsigned long flags;
+	spin_lock_irqsave(&pa_pte_lock, flags);
+	changed = !pte_same(*ptep, entry);
+	if (changed) {
+		*ptep = entry;
+	}
+	spin_unlock_irqrestore(&pa_pte_lock, flags);
+	if (changed) {
+		__flush_tlb_page(vma->vm_mm, addr);
+	}
+	return changed;
+}
+
+static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep)
+{
+	pte_t pte;
+	unsigned long flags;
+	int r;
+
+	spin_lock_irqsave(&pa_pte_lock, flags);
+	pte = *ptep;
+	if (pte_young(pte)) {
+		*ptep = pte_mkold(pte);
+		r = 1;
+	} else {
+		r = 0;
+	}
+	spin_unlock_irqrestore(&pa_pte_lock, flags);
+
+	return r;
+}
 
 struct mm_struct;
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 {
-	pte_t old_pte;
-	pte_t pte;
+	pte_t pte, old_pte;
+	unsigned long flags;
 
-	spin_lock(&pa_dbit_lock);
+	spin_lock_irqsave(&pa_pte_lock, flags);
 	pte = old_pte = *ptep;
 	pte_val(pte) &= ~_PAGE_PRESENT;
 	pte_val(pte) |= _PAGE_FLUSH;
-	set_pte_at(mm,addr,ptep,pte);
-	spin_unlock(&pa_dbit_lock);
+	*ptep = pte;
+	spin_unlock_irqrestore(&pa_pte_lock, flags);
 
 	return old_pte;
 }
 
-static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
+static inline void ptep_set_wrprotect(struct vm_area_struct *vma, struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 {
-#ifdef CONFIG_SMP
-	unsigned long new, old;
-
-	do {
-		old = pte_val(*ptep);
-		new = pte_val(pte_wrprotect(__pte (old)));
-	} while (cmpxchg((unsigned long *) ptep, old, new) != old);
-#else
-	pte_t old_pte = *ptep;
-	set_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
-#endif
+	pte_t old_pte;
+	unsigned long flags;
+
+	preempt_disable();
+	spin_lock_irqsave(&pa_pte_lock, flags);
+	old_pte = *ptep;
+	*ptep = pte_wrprotect(old_pte);
+	spin_unlock_irqrestore(&pa_pte_lock, flags);
+	__flush_tlb_page(mm, addr);
+	flush_cache_page(vma, addr, pte_pfn(old_pte));
+	preempt_enable();
 }
 
-#define pte_same(A,B)	(pte_val(A) == pte_val(B))
-
 #endif /* !__ASSEMBLY__ */
 
 
@@ -504,6 +541,7 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
 
 #define HAVE_ARCH_UNMAPPED_AREA
 
+#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
 #define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
diff --git a/arch/parisc/include/asm/system.h b/arch/parisc/include/asm/system.h
index d91357b..4653c77 100644
--- a/arch/parisc/include/asm/system.h
+++ b/arch/parisc/include/asm/system.h
@@ -160,7 +160,7 @@ static inline void set_eiem(unsigned long val)
    ldcd). */
 
 #define __PA_LDCW_ALIGNMENT	4
-#define __ldcw_align(a) ((volatile unsigned int *)a)
+#define __ldcw_align(a) (&(a)->slock)
 #define __LDCW	"ldcw,co"
 
 #endif /*!CONFIG_PA20*/
diff --git a/arch/parisc/kernel/asm-offsets.c b/arch/parisc/kernel/asm-offsets.c
index ec787b4..b2f35b2 100644
--- a/arch/parisc/kernel/asm-offsets.c
+++ b/arch/parisc/kernel/asm-offsets.c
@@ -137,6 +137,7 @@ int main(void)
 	DEFINE(TASK_PT_IAOQ0, offsetof(struct task_struct, thread.regs.iaoq[0]));
 	DEFINE(TASK_PT_IAOQ1, offsetof(struct task_struct, thread.regs.iaoq[1]));
 	DEFINE(TASK_PT_CR27, offsetof(struct task_struct, thread.regs.cr27));
+	DEFINE(TASK_PT_SYSCALL_RP, offsetof(struct task_struct, thread.regs.pad0));
 	DEFINE(TASK_PT_ORIG_R28, offsetof(struct task_struct, thread.regs.orig_r28));
 	DEFINE(TASK_PT_KSP, offsetof(struct task_struct, thread.regs.ksp));
 	DEFINE(TASK_PT_KPC, offsetof(struct task_struct, thread.regs.kpc));
@@ -225,6 +226,7 @@ int main(void)
 	DEFINE(PT_IAOQ0, offsetof(struct pt_regs, iaoq[0]));
 	DEFINE(PT_IAOQ1, offsetof(struct pt_regs, iaoq[1]));
 	DEFINE(PT_CR27, offsetof(struct pt_regs, cr27));
+	DEFINE(PT_SYSCALL_RP, offsetof(struct pt_regs, pad0));
 	DEFINE(PT_ORIG_R28, offsetof(struct pt_regs, orig_r28));
 	DEFINE(PT_KSP, offsetof(struct pt_regs, ksp));
 	DEFINE(PT_KPC, offsetof(struct pt_regs, kpc));
@@ -290,5 +292,11 @@ int main(void)
 	BLANK();
 	DEFINE(ASM_PDC_RESULT_SIZE, NUM_PDC_RESULT * sizeof(unsigned long));
 	BLANK();
+
+#ifdef CONFIG_SMP
+	DEFINE(ASM_ATOMIC_HASH_SIZE_SHIFT, __builtin_ffs(ATOMIC_HASH_SIZE)-1);
+	DEFINE(ASM_ATOMIC_HASH_ENTRY_SHIFT, __builtin_ffs(sizeof(__atomic_hash[0]))-1);
+#endif
+
 	return 0;
 }
diff --git a/arch/parisc/kernel/cache.c b/arch/parisc/kernel/cache.c
index b6ed34d..67241ac 100644
--- a/arch/parisc/kernel/cache.c
+++ b/arch/parisc/kernel/cache.c
@@ -395,15 +395,12 @@ EXPORT_SYMBOL(flush_kernel_dcache_page_asm);
 EXPORT_SYMBOL(flush_data_cache_local);
 EXPORT_SYMBOL(flush_kernel_icache_range_asm);
 
-void clear_user_page_asm(void *page, unsigned long vaddr)
+static void clear_user_page_asm(void *page, unsigned long vaddr)
 {
-	unsigned long flags;
 	/* This function is implemented in assembly in pacache.S */
 	extern void __clear_user_page_asm(void *page, unsigned long vaddr);
 
-	purge_tlb_start(flags);
 	__clear_user_page_asm(page, vaddr);
-	purge_tlb_end(flags);
 }
 
 #define FLUSH_THRESHOLD 0x80000 /* 0.5MB */
@@ -440,7 +437,6 @@ void __init parisc_setup_cache_timing(void)
 }
 
 extern void purge_kernel_dcache_page(unsigned long);
-extern void clear_user_page_asm(void *page, unsigned long vaddr);
 
 void clear_user_page(void *page, unsigned long vaddr, struct page *pg)
 {
diff --git a/arch/parisc/kernel/entry.S b/arch/parisc/kernel/entry.S
index 3a44f7f..e1c0128 100644
--- a/arch/parisc/kernel/entry.S
+++ b/arch/parisc/kernel/entry.S
@@ -45,7 +45,7 @@
 	.level 2.0
 #endif
 
-	.import         pa_dbit_lock,data
+	.import         pa_pte_lock,data
 
 	/* space_to_prot macro creates a prot id from a space id */
 
@@ -364,32 +364,6 @@
 	.align		32
 	.endm
 
-	/* The following are simple 32 vs 64 bit instruction
-	 * abstractions for the macros */
-	.macro		EXTR	reg1,start,length,reg2
-#ifdef CONFIG_64BIT
-	extrd,u		\reg1,32+(\start),\length,\reg2
-#else
-	extrw,u		\reg1,\start,\length,\reg2
-#endif
-	.endm
-
-	.macro		DEP	reg1,start,length,reg2
-#ifdef CONFIG_64BIT
-	depd		\reg1,32+(\start),\length,\reg2
-#else
-	depw		\reg1,\start,\length,\reg2
-#endif
-	.endm
-
-	.macro		DEPI	val,start,length,reg
-#ifdef CONFIG_64BIT
-	depdi		\val,32+(\start),\length,\reg
-#else
-	depwi		\val,\start,\length,\reg
-#endif
-	.endm
-
 	/* In LP64, the space contains part of the upper 32 bits of the
 	 * fault.  We have to extract this and place it in the va,
 	 * zeroing the corresponding bits in the space register */
@@ -442,19 +416,19 @@
 	 */
 	.macro		L2_ptep	pmd,pte,index,va,fault
 #if PT_NLEVELS == 3
-	EXTR		\va,31-ASM_PMD_SHIFT,ASM_BITS_PER_PMD,\index
+	extru		\va,31-ASM_PMD_SHIFT,ASM_BITS_PER_PMD,\index
 #else
-	EXTR		\va,31-ASM_PGDIR_SHIFT,ASM_BITS_PER_PGD,\index
+	extru		\va,31-ASM_PGDIR_SHIFT,ASM_BITS_PER_PGD,\index
 #endif
-	DEP             %r0,31,PAGE_SHIFT,\pmd  /* clear offset */
+	dep             %r0,31,PAGE_SHIFT,\pmd  /* clear offset */
 	copy		%r0,\pte
 	ldw,s		\index(\pmd),\pmd
 	bb,>=,n		\pmd,_PxD_PRESENT_BIT,\fault
-	DEP		%r0,31,PxD_FLAG_SHIFT,\pmd /* clear flags */
+	dep		%r0,31,PxD_FLAG_SHIFT,\pmd /* clear flags */
 	copy		\pmd,%r9
 	SHLREG		%r9,PxD_VALUE_SHIFT,\pmd
-	EXTR		\va,31-PAGE_SHIFT,ASM_BITS_PER_PTE,\index
-	DEP		%r0,31,PAGE_SHIFT,\pmd  /* clear offset */
+	extru		\va,31-PAGE_SHIFT,ASM_BITS_PER_PTE,\index
+	dep		%r0,31,PAGE_SHIFT,\pmd  /* clear offset */
 	shladd		\index,BITS_PER_PTE_ENTRY,\pmd,\pmd
 	LDREG		%r0(\pmd),\pte		/* pmd is now pte */
 	bb,>=,n		\pte,_PAGE_PRESENT_BIT,\fault
@@ -488,13 +462,44 @@
 	L2_ptep		\pgd,\pte,\index,\va,\fault
 	.endm
 
+	/* SMP lock for consistent PTE updates.  Unlocks and jumps
+	   to FAULT if the page is not present.  Note the preceeding
+	   load of the PTE can't be deleted since we can't fault holding
+	   the lock.  */ 
+	.macro		pte_lock	ptep,pte,spc,tmp,tmp1,fault
+#ifdef CONFIG_SMP
+	cmpib,COND(=),n        0,\spc,2f
+	load32		PA(pa_pte_lock),\tmp1
+1:
+	LDCW		0(\tmp1),\tmp
+	cmpib,COND(=)         0,\tmp,1b
+	nop
+	LDREG		%r0(\ptep),\pte
+	bb,<,n		\pte,_PAGE_PRESENT_BIT,2f
+	ldi             1,\tmp
+	stw             \tmp,0(\tmp1)
+	b,n		\fault
+2:
+#endif
+	.endm
+
+	.macro		pte_unlock	spc,tmp,tmp1
+#ifdef CONFIG_SMP
+	cmpib,COND(=),n        0,\spc,1f
+	ldi             1,\tmp
+	stw             \tmp,0(\tmp1)
+1:
+#endif
+	.endm
+
 	/* Set the _PAGE_ACCESSED bit of the PTE.  Be clever and
 	 * don't needlessly dirty the cache line if it was already set */
-	.macro		update_ptep	ptep,pte,tmp,tmp1
-	ldi		_PAGE_ACCESSED,\tmp1
-	or		\tmp1,\pte,\tmp
-	and,COND(<>)	\tmp1,\pte,%r0
-	STREG		\tmp,0(\ptep)
+	.macro		update_ptep	ptep,pte,tmp
+	bb,<,n		\pte,_PAGE_ACCESSED_BIT,1f
+	ldi		_PAGE_ACCESSED,\tmp
+	or		\tmp,\pte,\pte
+	STREG		\pte,0(\ptep)
+1:
 	.endm
 
 	/* Set the dirty bit (and accessed bit).  No need to be
@@ -605,7 +610,7 @@
 	depdi		0,31,32,\tmp
 #endif
 	copy		\va,\tmp1
-	DEPI		0,31,23,\tmp1
+	depi		0,31,23,\tmp1
 	cmpb,COND(<>),n	\tmp,\tmp1,\fault
 	ldi		(_PAGE_DIRTY|_PAGE_WRITE|_PAGE_READ),\prot
 	depd,z		\prot,8,7,\prot
@@ -622,6 +627,39 @@
 	or		%r26,%r0,\pte
 	.endm 
 
+	/* Save PTE for recheck if SMP.  */
+	.macro		save_pte	pte,tmp
+#ifdef CONFIG_SMP
+	copy		\pte,\tmp
+#endif
+	.endm
+
+	/* Reload the PTE and purge the data TLB entry if the new
+	   value is different from the old one.  */
+	.macro		dtlb_recheck	ptep,old_pte,spc,va,tmp
+#ifdef CONFIG_SMP
+	LDREG		%r0(\ptep),\tmp
+	cmpb,COND(=),n	\old_pte,\tmp,1f
+	mfsp		%sr1,\tmp
+	mtsp		\spc,%sr1
+	pdtlb,l		%r0(%sr1,\va)
+	mtsp		\tmp,%sr1
+1:
+#endif
+	.endm
+
+	.macro		itlb_recheck	ptep,old_pte,spc,va,tmp
+#ifdef CONFIG_SMP
+	LDREG		%r0(\ptep),\tmp
+	cmpb,COND(=),n	\old_pte,\tmp,1f
+	mfsp		%sr1,\tmp
+	mtsp		\spc,%sr1
+	pitlb,l		%r0(%sr1,\va)
+	mtsp		\tmp,%sr1
+1:
+#endif
+	.endm
+
 
 	/*
 	 * Align fault_vector_20 on 4K boundary so that both
@@ -758,6 +796,10 @@ ENTRY(__kernel_thread)
 
 	STREG	%r22, PT_GR22(%r1)	/* save r22 (arg5) */
 	copy	%r0, %r22		/* user_tid */
+	copy	%r0, %r21		/* child_tid */
+#else
+	stw	%r0, -52(%r30)	     	/* user_tid */
+	stw	%r0, -56(%r30)	     	/* child_tid */
 #endif
 	STREG	%r26, PT_GR26(%r1)  /* Store function & argument for child */
 	STREG	%r25, PT_GR25(%r1)
@@ -765,7 +807,7 @@ ENTRY(__kernel_thread)
 	ldo	CLONE_VM(%r26), %r26   /* Force CLONE_VM since only init_mm */
 	or	%r26, %r24, %r26      /* will have kernel mappings.	 */
 	ldi	1, %r25			/* stack_start, signals kernel thread */
-	stw	%r0, -52(%r30)	     	/* user_tid */
+	ldi	0, %r23			/* child_stack_size */
 #ifdef CONFIG_64BIT
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
@@ -972,7 +1014,10 @@ intr_check_sig:
 	BL	do_notify_resume,%r2
 	copy	%r16, %r26			/* struct pt_regs *regs */
 
-	b,n	intr_check_sig
+	mfctl   %cr30,%r16		/* Reload */
+	LDREG	TI_TASK(%r16), %r16	/* thread_info -> task_struct */
+	b	intr_check_sig
+	ldo	TASK_REGS(%r16),%r16
 
 intr_restore:
 	copy            %r16,%r29
@@ -997,13 +1042,6 @@ intr_restore:
 
 	rfi
 	nop
-	nop
-	nop
-	nop
-	nop
-	nop
-	nop
-	nop
 
 #ifndef CONFIG_PREEMPT
 # define intr_do_preempt	intr_restore
@@ -1026,14 +1064,12 @@ intr_do_resched:
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
 
-	ldil	L%intr_check_sig, %r2
-#ifndef CONFIG_64BIT
-	b	schedule
-#else
-	load32	schedule, %r20
-	bv	%r0(%r20)
-#endif
-	ldo	R%intr_check_sig(%r2), %r2
+	BL	schedule,%r2
+	nop
+	mfctl   %cr30,%r16		/* Reload */
+	LDREG	TI_TASK(%r16), %r16	/* thread_info -> task_struct */
+	b	intr_check_sig
+	ldo	TASK_REGS(%r16),%r16
 
 	/* preempt the current task on returning to kernel
 	 * mode from an interrupt, iff need_resched is set,
@@ -1214,11 +1250,14 @@ dtlb_miss_20w:
 
 	L3_ptep		ptp,pte,t0,va,dtlb_check_alias_20w
 
-	update_ptep	ptp,pte,t0,t1
+	pte_lock	ptp,pte,spc,t0,t1,dtlb_check_alias_20w
+	update_ptep	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
-	
 	idtlbt          pte,prot
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1238,11 +1277,10 @@ nadtlb_miss_20w:
 
 	L3_ptep		ptp,pte,t0,va,nadtlb_check_flush_20w
 
-	update_ptep	ptp,pte,t0,t1
-
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
-
 	idtlbt          pte,prot
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1272,8 +1310,11 @@ dtlb_miss_11:
 
 	L2_ptep		ptp,pte,t0,va,dtlb_check_alias_11
 
-	update_ptep	ptp,pte,t0,t1
+	pte_lock	ptp,pte,spc,t0,t1,dtlb_check_alias_11
+	update_ptep	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb_11	spc,pte,prot
 
 	mfsp		%sr1,t0  /* Save sr1 so we can use it in tlb inserts */
@@ -1283,6 +1324,7 @@ dtlb_miss_11:
 	idtlbp		prot,(%sr1,va)
 
 	mtsp		t0, %sr1	/* Restore sr1 */
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1321,11 +1363,9 @@ nadtlb_miss_11:
 
 	L2_ptep		ptp,pte,t0,va,nadtlb_check_flush_11
 
-	update_ptep	ptp,pte,t0,t1
-
+	save_pte	pte,t1
 	make_insert_tlb_11	spc,pte,prot
 
-
 	mfsp		%sr1,t0  /* Save sr1 so we can use it in tlb inserts */
 	mtsp		spc,%sr1
 
@@ -1333,6 +1373,7 @@ nadtlb_miss_11:
 	idtlbp		prot,(%sr1,va)
 
 	mtsp		t0, %sr1	/* Restore sr1 */
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1368,13 +1409,17 @@ dtlb_miss_20:
 
 	L2_ptep		ptp,pte,t0,va,dtlb_check_alias_20
 
-	update_ptep	ptp,pte,t0,t1
+	pte_lock	ptp,pte,spc,t0,t1,dtlb_check_alias_20
+	update_ptep	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
 
 	f_extend	pte,t0
 
 	idtlbt          pte,prot
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1394,13 +1439,13 @@ nadtlb_miss_20:
 
 	L2_ptep		ptp,pte,t0,va,nadtlb_check_flush_20
 
-	update_ptep	ptp,pte,t0,t1
-
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
 
 	f_extend	pte,t0
 	
         idtlbt          pte,prot
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1508,11 +1553,14 @@ itlb_miss_20w:
 
 	L3_ptep		ptp,pte,t0,va,itlb_fault
 
-	update_ptep	ptp,pte,t0,t1
+	pte_lock	ptp,pte,spc,t0,t1,itlb_fault
+	update_ptep	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
-	
 	iitlbt          pte,prot
+	itlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1526,8 +1574,11 @@ itlb_miss_11:
 
 	L2_ptep		ptp,pte,t0,va,itlb_fault
 
-	update_ptep	ptp,pte,t0,t1
+	pte_lock	ptp,pte,spc,t0,t1,itlb_fault
+	update_ptep	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb_11	spc,pte,prot
 
 	mfsp		%sr1,t0  /* Save sr1 so we can use it in tlb inserts */
@@ -1537,6 +1588,7 @@ itlb_miss_11:
 	iitlbp		prot,(%sr1,va)
 
 	mtsp		t0, %sr1	/* Restore sr1 */
+	itlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1548,13 +1600,17 @@ itlb_miss_20:
 
 	L2_ptep		ptp,pte,t0,va,itlb_fault
 
-	update_ptep	ptp,pte,t0,t1
+	pte_lock	ptp,pte,spc,t0,t1,itlb_fault
+	update_ptep	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
 
 	f_extend	pte,t0	
 
 	iitlbt          pte,prot
+	itlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1570,29 +1626,14 @@ dbit_trap_20w:
 
 	L3_ptep		ptp,pte,t0,va,dbit_fault
 
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nolock_20w
-	load32		PA(pa_dbit_lock),t0
-
-dbit_spin_20w:
-	LDCW		0(t0),t1
-	cmpib,COND(=)         0,t1,dbit_spin_20w
-	nop
-
-dbit_nolock_20w:
-#endif
-	update_dirty	ptp,pte,t1
+	pte_lock	ptp,pte,spc,t0,t1,dbit_fault
+	update_dirty	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
-		
 	idtlbt          pte,prot
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nounlock_20w
-	ldi             1,t1
-	stw             t1,0(t0)
-
-dbit_nounlock_20w:
-#endif
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1606,35 +1647,21 @@ dbit_trap_11:
 
 	L2_ptep		ptp,pte,t0,va,dbit_fault
 
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nolock_11
-	load32		PA(pa_dbit_lock),t0
-
-dbit_spin_11:
-	LDCW		0(t0),t1
-	cmpib,=         0,t1,dbit_spin_11
-	nop
-
-dbit_nolock_11:
-#endif
-	update_dirty	ptp,pte,t1
+	pte_lock	ptp,pte,spc,t0,t1,dbit_fault
+	update_dirty	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb_11	spc,pte,prot
 
-	mfsp            %sr1,t1  /* Save sr1 so we can use it in tlb inserts */
+	mfsp            %sr1,t0  /* Save sr1 so we can use it in tlb inserts */
 	mtsp		spc,%sr1
 
 	idtlba		pte,(%sr1,va)
 	idtlbp		prot,(%sr1,va)
 
-	mtsp            t1, %sr1     /* Restore sr1 */
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nounlock_11
-	ldi             1,t1
-	stw             t1,0(t0)
-
-dbit_nounlock_11:
-#endif
+	mtsp            t0, %sr1     /* Restore sr1 */
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1646,32 +1673,17 @@ dbit_trap_20:
 
 	L2_ptep		ptp,pte,t0,va,dbit_fault
 
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nolock_20
-	load32		PA(pa_dbit_lock),t0
-
-dbit_spin_20:
-	LDCW		0(t0),t1
-	cmpib,=         0,t1,dbit_spin_20
-	nop
-
-dbit_nolock_20:
-#endif
-	update_dirty	ptp,pte,t1
+	pte_lock	ptp,pte,spc,t0,t1,dbit_fault
+	update_dirty	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
 
-	f_extend	pte,t1
+	f_extend	pte,t0
 	
         idtlbt          pte,prot
-
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nounlock_20
-	ldi             1,t1
-	stw             t1,0(t0)
-
-dbit_nounlock_20:
-#endif
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1772,9 +1784,9 @@ ENTRY(sys_fork_wrapper)
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
 
-	/* These are call-clobbered registers and therefore
-	   also syscall-clobbered (we hope). */
-	STREG	%r2,PT_GR19(%r1)	/* save for child */
+	STREG	%r2,PT_SYSCALL_RP(%r1)
+
+	/* WARNING - Clobbers r21, userspace must save! */
 	STREG	%r30,PT_GR21(%r1)
 
 	LDREG	PT_GR30(%r1),%r25
@@ -1804,7 +1816,7 @@ ENTRY(child_return)
 	nop
 
 	LDREG	TI_TASK-THREAD_SZ_ALGN-FRAME_SIZE-FRAME_SIZE(%r30), %r1
-	LDREG	TASK_PT_GR19(%r1),%r2
+	LDREG	TASK_PT_SYSCALL_RP(%r1),%r2
 	b	wrapper_exit
 	copy	%r0,%r28
 ENDPROC(child_return)
@@ -1823,8 +1835,9 @@ ENTRY(sys_clone_wrapper)
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
 
-	/* WARNING - Clobbers r19 and r21, userspace must save these! */
-	STREG	%r2,PT_GR19(%r1)	/* save for child */
+	STREG	%r2,PT_SYSCALL_RP(%r1)
+
+	/* WARNING - Clobbers r21, userspace must save! */
 	STREG	%r30,PT_GR21(%r1)
 	BL	sys_clone,%r2
 	copy	%r1,%r24
@@ -1847,7 +1860,9 @@ ENTRY(sys_vfork_wrapper)
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
 
-	STREG	%r2,PT_GR19(%r1)	/* save for child */
+	STREG	%r2,PT_SYSCALL_RP(%r1)
+
+	/* WARNING - Clobbers r21, userspace must save! */
 	STREG	%r30,PT_GR21(%r1)
 
 	BL	sys_vfork,%r2
@@ -2076,9 +2091,10 @@ syscall_restore:
 	LDREG	TASK_PT_GR31(%r1),%r31	   /* restore syscall rp */
 
 	/* NOTE: We use rsm/ssm pair to make this operation atomic */
+	LDREG   TASK_PT_GR30(%r1),%r1              /* Get user sp */
 	rsm     PSW_SM_I, %r0
-	LDREG   TASK_PT_GR30(%r1),%r30             /* restore user sp */
-	mfsp	%sr3,%r1			   /* Get users space id */
+	copy    %r1,%r30                           /* Restore user sp */
+	mfsp    %sr3,%r1                           /* Get user space id */
 	mtsp    %r1,%sr7                           /* Restore sr7 */
 	ssm     PSW_SM_I, %r0
 
diff --git a/arch/parisc/kernel/pacache.S b/arch/parisc/kernel/pacache.S
index 09b77b2..4f0d975 100644
--- a/arch/parisc/kernel/pacache.S
+++ b/arch/parisc/kernel/pacache.S
@@ -277,6 +277,7 @@ ENDPROC(flush_data_cache_local)
 
 	.align	16
 
+#if 1
 ENTRY(copy_user_page_asm)
 	.proc
 	.callinfo NO_CALLS
@@ -400,6 +401,7 @@ ENTRY(copy_user_page_asm)
 
 	.procend
 ENDPROC(copy_user_page_asm)
+#endif
 
 /*
  * NOTE: Code in clear_user_page has a hard coded dependency on the
@@ -548,17 +550,33 @@ ENTRY(__clear_user_page_asm)
 	depwi		0, 31,12, %r28		/* Clear any offset bits */
 #endif
 
+#ifdef CONFIG_SMP
+	ldil		L%pa_tlb_lock, %r1
+	ldo		R%pa_tlb_lock(%r1), %r24
+	rsm		PSW_SM_I, %r22
+1:
+	LDCW		0(%r24),%r25
+	cmpib,COND(=)	0,%r25,1b
+	nop
+#endif
+
 	/* Purge any old translation */
 
 	pdtlb		0(%r28)
 
+#ifdef CONFIG_SMP
+	ldi		1,%r25
+	stw		%r25,0(%r24)
+	mtsm		%r22
+#endif
+
 #ifdef CONFIG_64BIT
 	ldi		(PAGE_SIZE / 128), %r1
 
 	/* PREFETCH (Write) has not (yet) been proven to help here */
 	/* #define	PREFETCHW_OP	ldd		256(%0), %r0 */
 
-1:	std		%r0, 0(%r28)
+2:	std		%r0, 0(%r28)
 	std		%r0, 8(%r28)
 	std		%r0, 16(%r28)
 	std		%r0, 24(%r28)
@@ -574,13 +592,13 @@ ENTRY(__clear_user_page_asm)
 	std		%r0, 104(%r28)
 	std		%r0, 112(%r28)
 	std		%r0, 120(%r28)
-	addib,COND(>)		-1, %r1, 1b
+	addib,COND(>)		-1, %r1, 2b
 	ldo		128(%r28), %r28
 
 #else	/* ! CONFIG_64BIT */
 	ldi		(PAGE_SIZE / 64), %r1
 
-1:
+2:
 	stw		%r0, 0(%r28)
 	stw		%r0, 4(%r28)
 	stw		%r0, 8(%r28)
@@ -597,7 +615,7 @@ ENTRY(__clear_user_page_asm)
 	stw		%r0, 52(%r28)
 	stw		%r0, 56(%r28)
 	stw		%r0, 60(%r28)
-	addib,COND(>)		-1, %r1, 1b
+	addib,COND(>)		-1, %r1, 2b
 	ldo		64(%r28), %r28
 #endif	/* CONFIG_64BIT */
 
diff --git a/arch/parisc/kernel/setup.c b/arch/parisc/kernel/setup.c
index cb71f3d..84b3239 100644
--- a/arch/parisc/kernel/setup.c
+++ b/arch/parisc/kernel/setup.c
@@ -128,6 +128,14 @@ void __init setup_arch(char **cmdline_p)
 	printk(KERN_INFO "The 32-bit Kernel has started...\n");
 #endif
 
+	/* Consistency check on the size and alignments of our spinlocks */
+#ifdef CONFIG_SMP
+	BUILD_BUG_ON(sizeof(arch_spinlock_t) != __PA_LDCW_ALIGNMENT);
+	BUG_ON((unsigned long)&__atomic_hash[0] & (__PA_LDCW_ALIGNMENT-1));
+	BUG_ON((unsigned long)&__atomic_hash[1] & (__PA_LDCW_ALIGNMENT-1));
+#endif
+	BUILD_BUG_ON((1<<L1_CACHE_SHIFT) != L1_CACHE_BYTES);
+
 	pdc_console_init();
 
 #ifdef CONFIG_64BIT
diff --git a/arch/parisc/kernel/syscall.S b/arch/parisc/kernel/syscall.S
index f5f9602..68e75ce 100644
--- a/arch/parisc/kernel/syscall.S
+++ b/arch/parisc/kernel/syscall.S
@@ -47,18 +47,17 @@ ENTRY(linux_gateway_page)
 	KILL_INSN
 	.endr
 
-	/* ADDRESS 0xb0 to 0xb4, lws uses 1 insns for entry */
+	/* ADDRESS 0xb0 to 0xb8, lws uses two insns for entry */
 	/* Light-weight-syscall entry must always be located at 0xb0 */
 	/* WARNING: Keep this number updated with table size changes */
 #define __NR_lws_entries (2)
 
 lws_entry:
-	/* Unconditional branch to lws_start, located on the 
-	   same gateway page */
-	b,n	lws_start
+	gate	lws_start, %r0		/* increase privilege */
+	depi	3, 31, 2, %r31		/* Ensure we return into user mode. */
 
-	/* Fill from 0xb4 to 0xe0 */
-	.rept 11
+	/* Fill from 0xb8 to 0xe0 */
+	.rept 10
 	KILL_INSN
 	.endr
 
@@ -423,9 +422,6 @@ tracesys_sigexit:
 
 	*********************************************************/
 lws_start:
-	/* Gate and ensure we return to userspace */
-	gate	.+8, %r0
-	depi	3, 31, 2, %r31	/* Ensure we return to userspace */
 
 #ifdef CONFIG_64BIT
 	/* FIXME: If we are a 64-bit kernel just
@@ -442,7 +438,7 @@ lws_start:
 #endif	
 
         /* Is the lws entry number valid? */
-	comiclr,>>=	__NR_lws_entries, %r20, %r0
+	comiclr,>>	__NR_lws_entries, %r20, %r0
 	b,n	lws_exit_nosys
 
 	/* WARNING: Trashing sr2 and sr3 */
@@ -473,7 +469,7 @@ lws_exit:
 	/* now reset the lowest bit of sp if it was set */
 	xor	%r30,%r1,%r30
 #endif
-	be,n	0(%sr3, %r31)
+	be,n	0(%sr7, %r31)
 
 
 	
@@ -529,7 +525,6 @@ lws_compare_and_swap32:
 #endif
 
 lws_compare_and_swap:
-#ifdef CONFIG_SMP
 	/* Load start of lock table */
 	ldil	L%lws_lock_start, %r20
 	ldo	R%lws_lock_start(%r20), %r28
@@ -572,8 +567,6 @@ cas_wouldblock:
 	ldo	2(%r0), %r28				/* 2nd case */
 	b	lws_exit				/* Contended... */
 	ldo	-EAGAIN(%r0), %r21			/* Spin in userspace */
-#endif
-/* CONFIG_SMP */
 
 	/*
 		prev = *addr;
@@ -601,13 +594,11 @@ cas_action:
 1:	ldw	0(%sr3,%r26), %r28
 	sub,<>	%r28, %r25, %r0
 2:	stw	%r24, 0(%sr3,%r26)
-#ifdef CONFIG_SMP
 	/* Free lock */
 	stw	%r20, 0(%sr2,%r20)
-# if ENABLE_LWS_DEBUG
+#if ENABLE_LWS_DEBUG
 	/* Clear thread register indicator */
 	stw	%r0, 4(%sr2,%r20)
-# endif
 #endif
 	/* Return to userspace, set no error */
 	b	lws_exit
@@ -615,12 +606,10 @@ cas_action:
 
 3:		
 	/* Error occured on load or store */
-#ifdef CONFIG_SMP
 	/* Free lock */
 	stw	%r20, 0(%sr2,%r20)
-# if ENABLE_LWS_DEBUG
+#if ENABLE_LWS_DEBUG
 	stw	%r0, 4(%sr2,%r20)
-# endif
 #endif
 	b	lws_exit
 	ldo	-EFAULT(%r0),%r21	/* set errno */
@@ -672,7 +661,6 @@ ENTRY(sys_call_table64)
 END(sys_call_table64)
 #endif
 
-#ifdef CONFIG_SMP
 	/*
 		All light-weight-syscall atomic operations 
 		will use this set of locks 
@@ -694,8 +682,6 @@ ENTRY(lws_lock_start)
 	.endr
 END(lws_lock_start)
 	.previous
-#endif
-/* CONFIG_SMP for lws_lock_start */
 
 .end
 
diff --git a/arch/parisc/kernel/traps.c b/arch/parisc/kernel/traps.c
index 8b58bf0..804b024 100644
--- a/arch/parisc/kernel/traps.c
+++ b/arch/parisc/kernel/traps.c
@@ -47,7 +47,7 @@
 			  /*  dumped to the console via printk)          */
 
 #if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
-DEFINE_SPINLOCK(pa_dbit_lock);
+DEFINE_SPINLOCK(pa_pte_lock);
 #endif
 
 static void parisc_show_stack(struct task_struct *task, unsigned long *sp,
diff --git a/arch/parisc/lib/bitops.c b/arch/parisc/lib/bitops.c
index 353963d..bae6a86 100644
--- a/arch/parisc/lib/bitops.c
+++ b/arch/parisc/lib/bitops.c
@@ -15,6 +15,9 @@
 arch_spinlock_t __atomic_hash[ATOMIC_HASH_SIZE] __lock_aligned = {
 	[0 ... (ATOMIC_HASH_SIZE-1)]  = __ARCH_SPIN_LOCK_UNLOCKED
 };
+arch_spinlock_t __atomic_user_hash[ATOMIC_HASH_SIZE] __lock_aligned = {
+	[0 ... (ATOMIC_HASH_SIZE-1)]  = __ARCH_SPIN_LOCK_UNLOCKED
+};
 #endif
 
 #ifdef CONFIG_64BIT
diff --git a/arch/parisc/math-emu/decode_exc.c b/arch/parisc/math-emu/decode_exc.c
index 3ca1c61..27a7492 100644
--- a/arch/parisc/math-emu/decode_exc.c
+++ b/arch/parisc/math-emu/decode_exc.c
@@ -342,6 +342,7 @@ decode_fpu(unsigned int Fpu_register[], unsigned int trap_counts[])
 		return SIGNALCODE(SIGFPE, FPE_FLTINV);
 	  case DIVISIONBYZEROEXCEPTION:
 		update_trap_counts(Fpu_register, aflags, bflags, trap_counts);
+		Clear_excp_register(exception_index);
 	  	return SIGNALCODE(SIGFPE, FPE_FLTDIV);
 	  case INEXACTEXCEPTION:
 		update_trap_counts(Fpu_register, aflags, bflags, trap_counts);
diff --git a/mm/memory.c b/mm/memory.c
index 09e4b1b..21c2916 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -616,7 +616,7 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	 * in the parent and the child
 	 */
 	if (is_cow_mapping(vm_flags)) {
-		ptep_set_wrprotect(src_mm, addr, src_pte);
+		ptep_set_wrprotect(vma, src_mm, addr, src_pte);
 		pte = pte_wrprotect(pte);
 	}
 

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-05-10 14:56                                                 ` John David Anglin
@ 2010-05-10 19:20                                                   ` Helge Deller
  2010-05-10 21:07                                                     ` John David Anglin
  2010-05-11 20:41                                                     ` Helge Deller
  0 siblings, 2 replies; 74+ messages in thread
From: Helge Deller @ 2010-05-10 19:20 UTC (permalink / raw)
  To: John David Anglin; +Cc: John David Anglin, carlos, gniibe, linux-parisc

On 05/10/2010 04:56 PM, John David Anglin wrote:
> Yes, just after sending, I noticed gcc testsuite and minifail were
> broken on gsyprf11.  [...]
> 
> The attached works better on gsyprf11.  I haven't tested it on anything
> else.

Hi Dave,

Ugh... :-(
At boot I get:
...
INIT: Entering runlevel: 2
Starting enhanced syslogd: rsyslogd.
Starting system message bus: dbuseth0: Setting full-duplex based on MII#1 link partner capability of 41e1.
Starting OpenBSD Secure Shell server: sshd

Backtrace:

High Priority Machine Check (HPMC): Code=1 regs=106f5080 (Addr=00000000)
Kernel panic - not syncing: High Priority Machine Check (HPMC)

Backtrace:
 [<1011ebc4>] show_stack+0x18/0x28
 [<10117b90>] dump_stack+0x1c/0x2c
 [<10117c18>] panic+0x78/0x1e8
 [<1011f134>] parisc_terminate+0xe4/0xfc
 [<1011f504>] handle_interruption+0x1f0/

I'm not sure if this relates to your patch though...

Anyway, the machine is located at work and I can't restart it remotely.
I'll test it again tomorrow...

Helge

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-05-10 19:20                                                   ` Helge Deller
@ 2010-05-10 21:07                                                     ` John David Anglin
  2010-05-11 16:37                                                       ` John David Anglin
  2010-05-11 20:44                                                       ` Helge Deller
  2010-05-11 20:41                                                     ` Helge Deller
  1 sibling, 2 replies; 74+ messages in thread
From: John David Anglin @ 2010-05-10 21:07 UTC (permalink / raw)
  To: Helge Deller; +Cc: John David Anglin, carlos, gniibe, linux-parisc

On Mon, 10 May 2010, Helge Deller wrote:

> On 05/10/2010 04:56 PM, John David Anglin wrote:
> > Yes, just after sending, I noticed gcc testsuite and minifail were
> > broken on gsyprf11.  [...]
> > 
> > The attached works better on gsyprf11.  I haven't tested it on anything
> > else.
> 
> Hi Dave,
> 
> Ugh... :-(
> At boot I get:
> ...
> INIT: Entering runlevel: 2
> Starting enhanced syslogd: rsyslogd.
> Starting system message bus: dbuseth0: Setting full-duplex based on MII#1 link partner capability of 41e1.
> Starting OpenBSD Secure Shell server: sshd

I'll do a little more testing tonight.  In testing, I have had a
couple of boot failures on gsyprf11 starting iptables.  Haven't
seen any hpmc's starting sshd.

If you find that the location of the hpmc is in code modified by the
change, that would be useful information.  Possibly, there is an issue
with the pdtlb,l instruction.
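
For context, the change serializes the data TLB purge in
__clear_user_page_asm roughly as in the following C-level sketch (an
illustration only; the real code is the LDCW sequence in pacache.S
above, and purge_dtlb_entry() is a hypothetical stand-in for the pdtlb
instruction):

#include <linux/spinlock.h>
#include <linux/irqflags.h>

extern spinlock_t pa_tlb_lock;	/* the lock word the patch spins on */

static inline void serialized_dtlb_purge(unsigned long vaddr)
{
	unsigned long flags;

	local_irq_save(flags);		/* rsm PSW_SM_I in the assembly */
	spin_lock(&pa_tlb_lock);	/* LDCW spin loop on pa_tlb_lock */
	purge_dtlb_entry(vaddr);	/* pdtlb on the stale translation */
	spin_unlock(&pa_tlb_lock);	/* plain store frees the lock word */
	local_irq_restore(flags);	/* mtsm restores the I bit */
}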

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-05-10 21:07                                                     ` John David Anglin
@ 2010-05-11 16:37                                                       ` John David Anglin
  2010-05-11 21:39                                                         ` John David Anglin
  2010-05-11 20:44                                                       ` Helge Deller
  1 sibling, 1 reply; 74+ messages in thread
From: John David Anglin @ 2010-05-11 16:37 UTC (permalink / raw)
  To: Helge Deller; +Cc: John David Anglin, carlos, gniibe, linux-parisc

On Mon, 10 May 2010, John David Anglin wrote:

> I'll do a little more testing tonight.  In testing, I have had a
> couple of boot failures on gsyprf11 starting iptables.  Haven't
> seen any hpmc's starting sshd.

It appears James' kmap change is needed on PA8800 (rp3440) for cache
coherency.  I see segmentation faults in sh with just the flush in
ptep_set_wrprotect.  I haven't had a chance to test whether the flush
in ptep_set_wrprotect is needed to fix minifail on this machine.
However, a GCC build completed successfully with the kmap change and
without the ptep_set_wrprotect flush.

On gsyprf11 (PA8700), James' kmap change doesn't fix minifail and we need
the cache flush in ptep_set_wrprotect to fix minifail.  The kmap change
causes general instability by itself.

Thus, it appears necessary to merge the two approaches and treat PA8700
and PA8800 differently.
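
One way that merge might be wired up is sketched below (hypothetical
names, untested; using boot_cpu_data.cpu_type and the mako identifier
to spot the PA8800 is an assumption, not something from the posted
patches): decide once at boot whether the COW path needs the extra
flush.

#include <linux/init.h>
#include <asm/processor.h>	/* boot_cpu_data; cpu type enum via asm/pdc.h */

int parisc_cow_needs_flush;	/* hypothetical flag, tested in ptep_set_wrprotect() */

static int __init choose_cow_flush_policy(void)
{
	/* Assumption: mako identifies the PA8800 here.  PA8700 and
	 * earlier would keep the flush in ptep_set_wrprotect(). */
	parisc_cow_needs_flush = (boot_cpu_data.cpu_type != mako);
	return 0;
}
arch_initcall(choose_cow_flush_policy);

ptep_set_wrprotect() could then test the flag instead of flushing
unconditionally, while the kmap change stays in for the PA8800.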

I think we are really killing performance with the cache flushing...

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-05-10 19:20                                                   ` Helge Deller
  2010-05-10 21:07                                                     ` John David Anglin
@ 2010-05-11 20:41                                                     ` Helge Deller
  2010-05-11 21:26                                                       ` John David Anglin
  1 sibling, 1 reply; 74+ messages in thread
From: Helge Deller @ 2010-05-11 20:41 UTC (permalink / raw)
  To: John David Anglin; +Cc: John David Anglin, carlos, gniibe, linux-parisc

On 05/10/2010 09:20 PM, Helge Deller wrote:
> On 05/10/2010 04:56 PM, John David Anglin wrote:
>> Yes, just after sending, I noticed gcc testsuite and minifail were
>> broken on gsyprf11.  [...]
>>
>> The attached works better on gsyprf11.  I haven't tested it on anything
>> else.

Hi Dave,

I can still see segfaults from the minifail_dave.cpp program.

In addition, I see some memory corruption.
The corruption is not new with your patch; I could see it with
other patches (and maybe even the plain 2.6.32.3 kernel) as well...

pagealloc: memory corruption
6e8291c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
6e8291d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
6e8291e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
6e8291f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

Backtrace:
 [<1011ebc4>] show_stack+0x18/0x28
 [<10117b90>] dump_stack+0x1c/0x2c
 [<101c7020>] kernel_map_pages+0x2a0/0x2b8
 [<1019ebc8>] get_page_from_freelist+0x3d4/0x630
 [<1019ef58>] __alloc_pages_nodemask+0x134/0x610
 [<101b23d8>] do_wp_page+0x2c0/0xb50
 [<101b4290>] handle_mm_fault+0x514/0x844
 [<1011d870>] do_page_fault+0x1f8/0x2fc
 [<1011f400>] handle_interruption+0xec/0x730
 [<10103078>] intr_check_sig+0x0/0x40


Helge

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-05-10 21:07                                                     ` John David Anglin
  2010-05-11 16:37                                                       ` John David Anglin
@ 2010-05-11 20:44                                                       ` Helge Deller
  1 sibling, 0 replies; 74+ messages in thread
From: Helge Deller @ 2010-05-11 20:44 UTC (permalink / raw)
  To: John David Anglin; +Cc: John David Anglin, carlos, gniibe, linux-parisc

On 05/10/2010 11:07 PM, John David Anglin wrote:
> If you find that the location of the hpmc is in code modified by the
> change, that would be useful information.  Possibly, there is an issue
> with the pdtlb,l instruction.

Sadly I don't have any more info than this:

High Priority Machine Check (HPMC): Code=1 regs=106f5080 (Addr=00000000)
Kernel panic - not syncing: High Priority Machine Check (HPMC)

Backtrace:
 [<1011ebc4>] show_stack+0x18/0x28
 [<10117b90>] dump_stack+0x1c/0x2c
 [<10117c18>] panic+0x78/0x1e8
 [<1011f134>] parisc_terminate+0xe4/0xfc
 [<1011f504>] handle_interruption+0x1f0/0x730
 [<10103078>] intr_check_sig+0x0/0x40


The second time the kernel booted without problems, even though
I didn't change anything.

Helge

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-05-11 20:41                                                     ` Helge Deller
@ 2010-05-11 21:26                                                       ` John David Anglin
  2010-05-11 21:41                                                         ` Helge Deller
  0 siblings, 1 reply; 74+ messages in thread
From: John David Anglin @ 2010-05-11 21:26 UTC (permalink / raw)
  To: Helge Deller; +Cc: dave.anglin, carlos, gniibe, linux-parisc

> On 05/10/2010 09:20 PM, Helge Deller wrote:
> > On 05/10/2010 04:56 PM, John David Anglin wrote:
> >> Yes, just after sending, I noticed gcc testsuite and minifail were
> >> broken on gsyprf11.  [...]
> >>
> >> The attached works better on gsyprf11.  I haven't tested it on anything
> >> else.
> 
> Hi Dave,
> 
> I can still see segfaults from the minifail_dave.cpp program.
> 
> In addition, I see some memory corruption.
> The corruption is not new with your patch; I could see it with
> other patches (and maybe even the plain 2.6.32.3 kernel) as well...
> 
> pagealloc: memory corruption

I haven't seen this.  Which allocator are you using?  I have
CONFIG_SLAB=y.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-05-11 16:37                                                       ` John David Anglin
@ 2010-05-11 21:39                                                         ` John David Anglin
  0 siblings, 0 replies; 74+ messages in thread
From: John David Anglin @ 2010-05-11 21:39 UTC (permalink / raw)
  To: dave.anglin; +Cc: deller, carlos, gniibe, linux-parisc

> On gsyprf11 (PA8700), James' kmap change doesn't fix minifail and we need
> the cache flush in ptep_set_wrprotect to fix minifail.  The kmap change
> causes general instability by itself.

I'm going to merge the two changes and see what happens.  With just
the flush in ptep_set_wrprotect, I'm seeing some segvs in sh on
gsyprf11.
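
For anyone reproducing this, the "flush in ptep_set_wrprotect" being
discussed looks roughly like the sketch below.  It assumes the stock
flush_cache_page()/set_pte_at() helpers and the vma-taking signature
that the copy_one_pte() hunk calls; it is an illustration, not the
exact pgtable.h hunk from the posted patch.

/* Sketch of an arch/parisc/include/asm/pgtable.h variant (illustrative). */
static inline void ptep_set_wrprotect(struct vm_area_struct *vma,
				      struct mm_struct *mm,
				      unsigned long addr, pte_t *ptep)
{
	pte_t old_pte = *ptep;

	/* Push dirty user cache lines for this page out to memory
	 * before the PTE goes read-only for COW, so the child cannot
	 * copy stale data from memory afterwards. */
	flush_cache_page(vma, addr, pte_pfn(old_pte));
	set_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
}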

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-05-11 21:26                                                       ` John David Anglin
@ 2010-05-11 21:41                                                         ` Helge Deller
  2010-05-15 21:02                                                           ` John David Anglin
  0 siblings, 1 reply; 74+ messages in thread
From: Helge Deller @ 2010-05-11 21:41 UTC (permalink / raw)
  To: John David Anglin; +Cc: dave.anglin, carlos, gniibe, linux-parisc

[-- Attachment #1: Type: text/plain, Size: 771 bytes --]

On 05/11/2010 11:26 PM, John David Anglin wrote:
>> On 05/10/2010 09:20 PM, Helge Deller wrote:
>>> On 05/10/2010 04:56 PM, John David Anglin wrote:
>>>> Yes, just after sending, I noticed gcc testsuite and minifail were
>>>> broken on gsyprf11.  [...]
>>>>
>>>> The attached works better on gsyprf11.  I haven't tested it on anything
>>>> else.
>>
>> Hi Dave,
>>
>> I can still see segfaults from the minifail_dave.cpp program.
>>
>> In addition, I see some memory corruption.
>> The corruption is not new with your patch; I could see it with
>> other patches (and maybe even the plain 2.6.32.3 kernel) as well...
>>
>> pagealloc: memory corruption
> 
> I haven't seen this.  Which allocator are you using?  I have
> CONFIG_SLAB=y.

Yes.
Full .config attached.

Helge

[-- Attachment #2: .config --]
[-- Type: text/plain, Size: 44707 bytes --]

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.33.3
# Mon May 10 20:12:09 2010
#
CONFIG_PARISC=y
CONFIG_MMU=y
CONFIG_STACK_GROWSUP=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_GENERIC_FIND_NEXT_BIT=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_GENERIC_TIME=y
CONFIG_TIME_LOW_RES=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_IRQ_PER_CPU=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_CONSTRUCTORS=y

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION="-32bit"
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
# CONFIG_TASKSTATS is not set
# CONFIG_AUDIT is not set

#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
# CONFIG_TREE_PREEMPT_RCU is not set
# CONFIG_TINY_RCU is not set
# CONFIG_RCU_TRACE is not set
CONFIG_RCU_FANOUT=32
# CONFIG_RCU_FANOUT_EXACT is not set
# CONFIG_TREE_RCU_TRACE is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=16
# CONFIG_GROUP_SCHED is not set
# CONFIG_CGROUPS is not set
# CONFIG_SYSFS_DEPRECATED_V2 is not set
# CONFIG_RELAY is not set
CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
# CONFIG_IPC_NS is not set
# CONFIG_USER_NS is not set
# CONFIG_PID_NS is not set
# CONFIG_NET_NS is not set
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_RD_LZO=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
CONFIG_ANON_INODES=y
# CONFIG_EMBEDDED is not set
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_HAVE_PERF_EVENTS=y

#
# Kernel Performance Events And Counters
#
CONFIG_PERF_EVENTS=y
# CONFIG_PERF_COUNTERS is not set
# CONFIG_DEBUG_PERF_USE_VMALLOC is not set
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_PCI_QUIRKS=y
CONFIG_COMPAT_BRK=y
CONFIG_SLAB=y
# CONFIG_SLUB is not set
# CONFIG_SLOB is not set
CONFIG_PROFILING=y
# CONFIG_OPROFILE is not set
CONFIG_HAVE_OPROFILE=y
CONFIG_USE_GENERIC_SMP_HELPERS=y

#
# GCOV-based kernel profiling
#
# CONFIG_GCOV_KERNEL is not set
CONFIG_SLOW_WORK=y
# CONFIG_SLOW_WORK_DEBUG is not set
# CONFIG_HAVE_GENERIC_DMA_COHERENT is not set
CONFIG_SLABINFO=y
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
# CONFIG_MODULE_FORCE_LOAD is not set
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_INIT_ALL_POSSIBLE=y
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
# CONFIG_LBDAF is not set
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_BLK_DEV_INTEGRITY is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"
# CONFIG_INLINE_SPIN_TRYLOCK is not set
# CONFIG_INLINE_SPIN_TRYLOCK_BH is not set
# CONFIG_INLINE_SPIN_LOCK is not set
# CONFIG_INLINE_SPIN_LOCK_BH is not set
# CONFIG_INLINE_SPIN_LOCK_IRQ is not set
# CONFIG_INLINE_SPIN_LOCK_IRQSAVE is not set
# CONFIG_INLINE_SPIN_UNLOCK is not set
# CONFIG_INLINE_SPIN_UNLOCK_BH is not set
# CONFIG_INLINE_SPIN_UNLOCK_IRQ is not set
# CONFIG_INLINE_SPIN_UNLOCK_IRQRESTORE is not set
# CONFIG_INLINE_READ_TRYLOCK is not set
# CONFIG_INLINE_READ_LOCK is not set
# CONFIG_INLINE_READ_LOCK_BH is not set
# CONFIG_INLINE_READ_LOCK_IRQ is not set
# CONFIG_INLINE_READ_LOCK_IRQSAVE is not set
# CONFIG_INLINE_READ_UNLOCK is not set
# CONFIG_INLINE_READ_UNLOCK_BH is not set
# CONFIG_INLINE_READ_UNLOCK_IRQ is not set
# CONFIG_INLINE_READ_UNLOCK_IRQRESTORE is not set
# CONFIG_INLINE_WRITE_TRYLOCK is not set
# CONFIG_INLINE_WRITE_LOCK is not set
# CONFIG_INLINE_WRITE_LOCK_BH is not set
# CONFIG_INLINE_WRITE_LOCK_IRQ is not set
# CONFIG_INLINE_WRITE_LOCK_IRQSAVE is not set
# CONFIG_INLINE_WRITE_UNLOCK is not set
# CONFIG_INLINE_WRITE_UNLOCK_BH is not set
# CONFIG_INLINE_WRITE_UNLOCK_IRQ is not set
# CONFIG_INLINE_WRITE_UNLOCK_IRQRESTORE is not set
# CONFIG_MUTEX_SPIN_ON_OWNER is not set
# CONFIG_FREEZER is not set

#
# Processor type and features
#
# CONFIG_PA7000 is not set
# CONFIG_PA7100LC is not set
# CONFIG_PA7200 is not set
# CONFIG_PA7300LC is not set
CONFIG_PA8X00=y
CONFIG_PA20=y
CONFIG_PREFETCH=y
# CONFIG_64BIT is not set
CONFIG_PARISC_PAGE_SIZE_4KB=y
# CONFIG_PARISC_PAGE_SIZE_16KB is not set
# CONFIG_PARISC_PAGE_SIZE_64KB is not set
CONFIG_SMP=y
CONFIG_HOTPLUG_CPU=y
CONFIG_ARCH_FLATMEM_ENABLE=y
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
CONFIG_HZ_100=y
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=100
# CONFIG_SCHED_HRTICK is not set
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
# CONFIG_DISCONTIGMEM_MANUAL is not set
# CONFIG_SPARSEMEM_MANUAL is not set
CONFIG_FLATMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
CONFIG_PAGEFLAGS_EXTENDED=y
CONFIG_SPLIT_PTLOCK_CPUS=999999
# CONFIG_PHYS_ADDR_T_64BIT is not set
CONFIG_ZONE_DMA_FLAG=0
CONFIG_VIRT_TO_BUS=y
# CONFIG_KSM is not set
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
CONFIG_HPUX=y
CONFIG_NR_CPUS=32

#
# Bus options (PCI, PCMCIA, EISA, GSC, ISA)
#
CONFIG_GSC=y
# CONFIG_HPPB is not set
CONFIG_IOMMU_CCIO=y
CONFIG_GSC_LASI=y
CONFIG_GSC_WAX=y
CONFIG_EISA=y
CONFIG_EISA_NAMES=y
# CONFIG_ISA is not set
CONFIG_PCI=y
# CONFIG_ARCH_SUPPORTS_MSI is not set
CONFIG_PCI_LEGACY=y
# CONFIG_PCI_DEBUG is not set
# CONFIG_PCI_STUB is not set
# CONFIG_PCI_IOV is not set
CONFIG_GSC_DINO=y
CONFIG_PCI_LBA=y
CONFIG_IOSAPIC=y
CONFIG_IOMMU_SBA=y
CONFIG_IOMMU_HELPER=y
CONFIG_PCCARD=y
CONFIG_PCMCIA=y
CONFIG_PCMCIA_LOAD_CIS=y
CONFIG_PCMCIA_IOCTL=y
CONFIG_CARDBUS=y

#
# PC-card bridges
#
CONFIG_YENTA=y
CONFIG_YENTA_O2=y
CONFIG_YENTA_RICOH=y
CONFIG_YENTA_TI=y
CONFIG_YENTA_ENE_TUNE=y
CONFIG_YENTA_TOSHIBA=y
CONFIG_PD6729=y
CONFIG_I82092=y
CONFIG_PCCARD_NONSTATIC=y
# CONFIG_HOTPLUG_PCI is not set

#
# PA-RISC specific drivers
#
CONFIG_SUPERIO=y
CONFIG_CHASSIS_LCD_LED=y
# CONFIG_PDC_CHASSIS is not set
CONFIG_PDC_CHASSIS_WARN=y
CONFIG_PDC_STABLE=y

#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
# CONFIG_HAVE_AOUT is not set
CONFIG_BINFMT_SOM=y
CONFIG_BINFMT_MISC=m
CONFIG_NET=y

#
# Networking options
#
CONFIG_PACKET=y
# CONFIG_PACKET_MMAP is not set
CONFIG_UNIX=y
CONFIG_XFRM=y
CONFIG_XFRM_USER=m
# CONFIG_XFRM_SUB_POLICY is not set
# CONFIG_XFRM_MIGRATE is not set
# CONFIG_XFRM_STATISTICS is not set
CONFIG_NET_KEY=m
# CONFIG_NET_KEY_MIGRATE is not set
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
# CONFIG_IP_ADVANCED_ROUTER is not set
CONFIG_IP_FIB_HASH=y
CONFIG_IP_PNP=y
CONFIG_IP_PNP_DHCP=y
CONFIG_IP_PNP_BOOTP=y
# CONFIG_IP_PNP_RARP is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE is not set
# CONFIG_IP_MROUTE is not set
# CONFIG_ARPD is not set
# CONFIG_SYN_COOKIES is not set
CONFIG_INET_AH=m
CONFIG_INET_ESP=m
# CONFIG_INET_IPCOMP is not set
# CONFIG_INET_XFRM_TUNNEL is not set
# CONFIG_INET_TUNNEL is not set
# CONFIG_INET_XFRM_MODE_TRANSPORT is not set
# CONFIG_INET_XFRM_MODE_TUNNEL is not set
# CONFIG_INET_XFRM_MODE_BEET is not set
# CONFIG_INET_LRO is not set
CONFIG_INET_DIAG=m
CONFIG_INET_TCP_DIAG=m
# CONFIG_TCP_CONG_ADVANCED is not set
CONFIG_TCP_CONG_CUBIC=y
CONFIG_DEFAULT_TCP_CONG="cubic"
# CONFIG_TCP_MD5SIG is not set
# CONFIG_IPV6 is not set
# CONFIG_NETWORK_SECMARK is not set
# CONFIG_NETFILTER is not set
# CONFIG_IP_DCCP is not set
# CONFIG_IP_SCTP is not set
# CONFIG_RDS is not set
# CONFIG_TIPC is not set
# CONFIG_ATM is not set
# CONFIG_BRIDGE is not set
# CONFIG_NET_DSA is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_DECNET is not set
CONFIG_LLC=m
CONFIG_LLC2=m
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_PHONET is not set
# CONFIG_IEEE802154 is not set
# CONFIG_NET_SCHED is not set
# CONFIG_DCB is not set

#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_HAMRADIO is not set
# CONFIG_CAN is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
# CONFIG_AF_RXRPC is not set
CONFIG_WIRELESS=y
CONFIG_WIRELESS_EXT=y
CONFIG_WEXT_CORE=y
CONFIG_WEXT_PROC=y
CONFIG_WEXT_SPY=y
CONFIG_WEXT_PRIV=y
# CONFIG_CFG80211 is not set
CONFIG_WIRELESS_EXT_SYSFS=y
CONFIG_LIB80211=m
CONFIG_LIB80211_CRYPT_WEP=m
CONFIG_LIB80211_CRYPT_CCMP=m
CONFIG_LIB80211_CRYPT_TKIP=m
# CONFIG_LIB80211_DEBUG is not set

#
# CFG80211 needs to be enabled for MAC80211
#
# CONFIG_WIMAX is not set
# CONFIG_RFKILL is not set
# CONFIG_NET_9P is not set

#
# Device Drivers
#

#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
# CONFIG_DEVTMPFS is not set
# CONFIG_STANDALONE is not set
# CONFIG_PREVENT_FIRMWARE_BUILD is not set
CONFIG_FW_LOADER=y
CONFIG_FIRMWARE_IN_KERNEL=y
CONFIG_EXTRA_FIRMWARE=""
# CONFIG_DEBUG_DRIVER is not set
# CONFIG_DEBUG_DEVRES is not set
# CONFIG_SYS_HYPERVISOR is not set
# CONFIG_CONNECTOR is not set
# CONFIG_MTD is not set
CONFIG_PARPORT=y
CONFIG_PARPORT_PC=m
# CONFIG_PARPORT_SERIAL is not set
# CONFIG_PARPORT_PC_FIFO is not set
# CONFIG_PARPORT_PC_SUPERIO is not set
CONFIG_PARPORT_PC_PCMCIA=m
CONFIG_PARPORT_GSC=y
# CONFIG_PARPORT_AX88796 is not set
CONFIG_PARPORT_1284=y
CONFIG_PARPORT_NOT_PC=y
CONFIG_BLK_DEV=y
# CONFIG_PARIDE is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
# CONFIG_BLK_DEV_COW_COMMON is not set
CONFIG_BLK_DEV_LOOP=y
CONFIG_BLK_DEV_CRYPTOLOOP=y

#
# DRBD disabled because PROC_FS, INET or CONNECTOR not selected
#
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_SX8 is not set
# CONFIG_BLK_DEV_UB is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_COUNT=16
CONFIG_BLK_DEV_RAM_SIZE=6144
# CONFIG_BLK_DEV_XIP is not set
# CONFIG_CDROM_PKTCDVD is not set
# CONFIG_ATA_OVER_ETH is not set
# CONFIG_BLK_DEV_HD is not set
CONFIG_MISC_DEVICES=y
# CONFIG_PHANTOM is not set
# CONFIG_SGI_IOC4 is not set
# CONFIG_TIFM_CORE is not set
# CONFIG_ENCLOSURE_SERVICES is not set
# CONFIG_HP_ILO is not set
# CONFIG_C2PORT is not set

#
# EEPROM support
#
# CONFIG_EEPROM_93CX6 is not set
# CONFIG_CB710_CORE is not set
CONFIG_HAVE_IDE=y
CONFIG_IDE=y

#
# Please see Documentation/ide/ide.txt for help/info on IDE drives
#
CONFIG_IDE_XFER_MODE=y
CONFIG_IDE_ATAPI=y
# CONFIG_BLK_DEV_IDE_SATA is not set
CONFIG_IDE_GD=y
CONFIG_IDE_GD_ATA=y
# CONFIG_IDE_GD_ATAPI is not set
CONFIG_BLK_DEV_IDECS=y
# CONFIG_BLK_DEV_DELKIN is not set
CONFIG_BLK_DEV_IDECD=y
CONFIG_BLK_DEV_IDECD_VERBOSE_ERRORS=y
# CONFIG_BLK_DEV_IDETAPE is not set
# CONFIG_IDE_TASK_IOCTL is not set
CONFIG_IDE_PROC_FS=y

#
# IDE chipset support/bugfixes
#
# CONFIG_BLK_DEV_PLATFORM is not set
CONFIG_BLK_DEV_IDEDMA_SFF=y

#
# PCI IDE chipsets support
#
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_PCIBUS_ORDER=y
# CONFIG_BLK_DEV_OFFBOARD is not set
CONFIG_BLK_DEV_GENERIC=y
# CONFIG_BLK_DEV_OPTI621 is not set
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_AEC62XX is not set
# CONFIG_BLK_DEV_ALI15X3 is not set
# CONFIG_BLK_DEV_AMD74XX is not set
# CONFIG_BLK_DEV_CMD64X is not set
# CONFIG_BLK_DEV_TRIFLEX is not set
# CONFIG_BLK_DEV_CS5520 is not set
# CONFIG_BLK_DEV_CS5530 is not set
# CONFIG_BLK_DEV_HPT366 is not set
# CONFIG_BLK_DEV_JMICRON is not set
# CONFIG_BLK_DEV_SC1200 is not set
# CONFIG_BLK_DEV_PIIX is not set
# CONFIG_BLK_DEV_IT8172 is not set
# CONFIG_BLK_DEV_IT8213 is not set
# CONFIG_BLK_DEV_IT821X is not set
CONFIG_BLK_DEV_NS87415=y
# CONFIG_BLK_DEV_PDC202XX_OLD is not set
# CONFIG_BLK_DEV_PDC202XX_NEW is not set
# CONFIG_BLK_DEV_SVWKS is not set
# CONFIG_BLK_DEV_SIIMAGE is not set
# CONFIG_BLK_DEV_SLC90E66 is not set
# CONFIG_BLK_DEV_TRM290 is not set
# CONFIG_BLK_DEV_VIA82CXXX is not set
# CONFIG_BLK_DEV_TC86C001 is not set
CONFIG_BLK_DEV_IDEDMA=y

#
# SCSI device support
#
# CONFIG_RAID_ATTRS is not set
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
# CONFIG_SCSI_TGT is not set
# CONFIG_SCSI_NETLINK is not set
CONFIG_SCSI_PROC_FS=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
CONFIG_CHR_DEV_ST=y
# CONFIG_CHR_DEV_OSST is not set
CONFIG_BLK_DEV_SR=y
# CONFIG_BLK_DEV_SR_VENDOR is not set
CONFIG_CHR_DEV_SG=y
# CONFIG_CHR_DEV_SCH is not set
# CONFIG_SCSI_MULTI_LUN is not set
# CONFIG_SCSI_CONSTANTS is not set
# CONFIG_SCSI_LOGGING is not set
# CONFIG_SCSI_SCAN_ASYNC is not set
CONFIG_SCSI_WAIT_SCAN=m

#
# SCSI Transports
#
CONFIG_SCSI_SPI_ATTRS=y
# CONFIG_SCSI_FC_ATTRS is not set
# CONFIG_SCSI_ISCSI_ATTRS is not set
# CONFIG_SCSI_SAS_LIBSAS is not set
# CONFIG_SCSI_SRP_ATTRS is not set
CONFIG_SCSI_LOWLEVEL=y
# CONFIG_ISCSI_TCP is not set
# CONFIG_SCSI_BNX2_ISCSI is not set
# CONFIG_BE2ISCSI is not set
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_HPSA is not set
# CONFIG_SCSI_3W_9XXX is not set
# CONFIG_SCSI_3W_SAS is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AHA1740 is not set
# CONFIG_SCSI_AACRAID is not set
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_AIC94XX is not set
# CONFIG_SCSI_MVSAS is not set
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_SCSI_ADVANSYS is not set
# CONFIG_SCSI_ARCMSR is not set
# CONFIG_MEGARAID_NEWGEN is not set
# CONFIG_MEGARAID_LEGACY is not set
# CONFIG_MEGARAID_SAS is not set
# CONFIG_SCSI_MPT2SAS is not set
# CONFIG_SCSI_HPTIOP is not set
# CONFIG_LIBFC is not set
# CONFIG_LIBFCOE is not set
# CONFIG_FCOE is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_PPA is not set
# CONFIG_SCSI_IMM is not set
CONFIG_SCSI_LASI700=y
CONFIG_53C700_LE_ON_BE=y
# CONFIG_SCSI_STEX is not set
CONFIG_SCSI_SYM53C8XX_2=y
CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE=1
CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16
CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64
CONFIG_SCSI_SYM53C8XX_MMIO=y
# CONFIG_SCSI_IPR is not set
CONFIG_SCSI_ZALON=y
CONFIG_SCSI_NCR53C8XX_DEFAULT_TAGS=8
CONFIG_SCSI_NCR53C8XX_MAX_TAGS=32
CONFIG_SCSI_NCR53C8XX_SYNC=20
# CONFIG_SCSI_QLOGIC_1280 is not set
# CONFIG_SCSI_QLA_FC is not set
# CONFIG_SCSI_QLA_ISCSI is not set
# CONFIG_SCSI_LPFC is not set
# CONFIG_SCSI_SIM710 is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_NSP32 is not set
# CONFIG_SCSI_DEBUG is not set
# CONFIG_SCSI_PMCRAID is not set
# CONFIG_SCSI_PM8001 is not set
# CONFIG_SCSI_SRP is not set
# CONFIG_SCSI_BFA_FC is not set
# CONFIG_SCSI_LOWLEVEL_PCMCIA is not set
CONFIG_SCSI_DH=y
# CONFIG_SCSI_DH_RDAC is not set
# CONFIG_SCSI_DH_HP_SW is not set
# CONFIG_SCSI_DH_EMC is not set
# CONFIG_SCSI_DH_ALUA is not set
# CONFIG_SCSI_OSD_INITIATOR is not set
CONFIG_ATA=y
# CONFIG_ATA_NONSTANDARD is not set
CONFIG_ATA_VERBOSE_ERROR=y
CONFIG_SATA_PMP=y
# CONFIG_SATA_AHCI is not set
# CONFIG_SATA_SIL24 is not set
CONFIG_ATA_SFF=y
# CONFIG_SATA_SVW is not set
# CONFIG_ATA_PIIX is not set
# CONFIG_SATA_MV is not set
# CONFIG_SATA_NV is not set
# CONFIG_PDC_ADMA is not set
# CONFIG_SATA_QSTOR is not set
# CONFIG_SATA_PROMISE is not set
# CONFIG_SATA_SX4 is not set
# CONFIG_SATA_SIL is not set
# CONFIG_SATA_SIS is not set
# CONFIG_SATA_ULI is not set
# CONFIG_SATA_VIA is not set
# CONFIG_SATA_VITESSE is not set
# CONFIG_SATA_INIC162X is not set
# CONFIG_PATA_ALI is not set
# CONFIG_PATA_AMD is not set
# CONFIG_PATA_ARTOP is not set
# CONFIG_PATA_ATP867X is not set
# CONFIG_PATA_ATIIXP is not set
# CONFIG_PATA_CMD640_PCI is not set
# CONFIG_PATA_CMD64X is not set
# CONFIG_PATA_CS5520 is not set
# CONFIG_PATA_CS5530 is not set
# CONFIG_PATA_CYPRESS is not set
# CONFIG_PATA_EFAR is not set
# CONFIG_ATA_GENERIC is not set
# CONFIG_PATA_HPT366 is not set
# CONFIG_PATA_HPT37X is not set
# CONFIG_PATA_HPT3X2N is not set
# CONFIG_PATA_HPT3X3 is not set
# CONFIG_PATA_IT821X is not set
# CONFIG_PATA_IT8213 is not set
# CONFIG_PATA_JMICRON is not set
# CONFIG_PATA_TRIFLEX is not set
# CONFIG_PATA_MARVELL is not set
# CONFIG_PATA_MPIIX is not set
# CONFIG_PATA_OLDPIIX is not set
# CONFIG_PATA_NETCELL is not set
# CONFIG_PATA_NINJA32 is not set
# CONFIG_PATA_NS87410 is not set
# CONFIG_PATA_NS87415 is not set
# CONFIG_PATA_OPTI is not set
# CONFIG_PATA_OPTIDMA is not set
# CONFIG_PATA_PCMCIA is not set
# CONFIG_PATA_PDC2027X is not set
# CONFIG_PATA_PDC_OLD is not set
# CONFIG_PATA_RADISYS is not set
# CONFIG_PATA_RDC is not set
# CONFIG_PATA_RZ1000 is not set
# CONFIG_PATA_SC1200 is not set
# CONFIG_PATA_SERVERWORKS is not set
# CONFIG_PATA_SIL680 is not set
# CONFIG_PATA_SIS is not set
# CONFIG_PATA_TOSHIBA is not set
# CONFIG_PATA_VIA is not set
# CONFIG_PATA_WINBOND is not set
# CONFIG_PATA_SCH is not set
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
CONFIG_MD_AUTODETECT=y
CONFIG_MD_LINEAR=y
CONFIG_MD_RAID0=y
CONFIG_MD_RAID1=y
CONFIG_MD_RAID10=y
CONFIG_MD_RAID456=y
# CONFIG_MULTICORE_RAID456 is not set
CONFIG_MD_RAID6_PQ=y
# CONFIG_ASYNC_RAID6_TEST is not set
# CONFIG_MD_MULTIPATH is not set
# CONFIG_MD_FAULTY is not set
CONFIG_BLK_DEV_DM=y
# CONFIG_DM_DEBUG is not set
# CONFIG_DM_CRYPT is not set
# CONFIG_DM_SNAPSHOT is not set
# CONFIG_DM_MIRROR is not set
# CONFIG_DM_ZERO is not set
# CONFIG_DM_MULTIPATH is not set
# CONFIG_DM_DELAY is not set
CONFIG_DM_UEVENT=y
# CONFIG_FUSION is not set

#
# IEEE 1394 (FireWire) support
#

#
# You can enable one or both FireWire driver stacks.
#

#
# The newer stack is recommended.
#
# CONFIG_FIREWIRE is not set
# CONFIG_IEEE1394 is not set
# CONFIG_I2O is not set
CONFIG_NETDEVICES=y
CONFIG_DUMMY=m
CONFIG_BONDING=m
# CONFIG_MACVLAN is not set
# CONFIG_EQUALIZER is not set
CONFIG_TUN=m
# CONFIG_VETH is not set
# CONFIG_ARCNET is not set
# CONFIG_PHYLIB is not set
CONFIG_NET_ETHERNET=y
CONFIG_MII=m
CONFIG_LASI_82596=y
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
# CONFIG_CASSINI is not set
# CONFIG_NET_VENDOR_3COM is not set
# CONFIG_NET_VENDOR_SMC is not set
# CONFIG_ETHOC is not set
# CONFIG_DNET is not set
CONFIG_NET_TULIP=y
# CONFIG_DE2104X is not set
CONFIG_TULIP=y
# CONFIG_TULIP_MWI is not set
# CONFIG_TULIP_MMIO is not set
# CONFIG_TULIP_NAPI is not set
# CONFIG_DE4X5 is not set
# CONFIG_WINBOND_840 is not set
# CONFIG_DM9102 is not set
# CONFIG_ULI526X is not set
# CONFIG_PCMCIA_XIRCOM is not set
# CONFIG_DEPCA is not set
# CONFIG_HP100 is not set
# CONFIG_IBM_NEW_EMAC_ZMII is not set
# CONFIG_IBM_NEW_EMAC_RGMII is not set
# CONFIG_IBM_NEW_EMAC_TAH is not set
# CONFIG_IBM_NEW_EMAC_EMAC4 is not set
# CONFIG_IBM_NEW_EMAC_NO_FLOW_CTRL is not set
# CONFIG_IBM_NEW_EMAC_MAL_CLR_ICINTSTAT is not set
# CONFIG_IBM_NEW_EMAC_MAL_COMMON_ERR is not set
CONFIG_NET_PCI=y
# CONFIG_PCNET32 is not set
# CONFIG_AMD8111_ETH is not set
# CONFIG_ADAPTEC_STARFIRE is not set
# CONFIG_AC3200 is not set
# CONFIG_B44 is not set
# CONFIG_FORCEDETH is not set
# CONFIG_CS89x0 is not set
# CONFIG_E100 is not set
# CONFIG_LNE390 is not set
# CONFIG_FEALNX is not set
# CONFIG_NATSEMI is not set
# CONFIG_NE2K_PCI is not set
# CONFIG_NE3210 is not set
# CONFIG_ES3210 is not set
# CONFIG_8139CP is not set
# CONFIG_8139TOO is not set
# CONFIG_R6040 is not set
# CONFIG_SIS900 is not set
# CONFIG_EPIC100 is not set
# CONFIG_SMSC9420 is not set
# CONFIG_SUNDANCE is not set
# CONFIG_TLAN is not set
# CONFIG_KS8842 is not set
# CONFIG_KS8851_MLL is not set
# CONFIG_VIA_RHINE is not set
# CONFIG_SC92031 is not set
# CONFIG_NET_POCKET is not set
# CONFIG_ATL2 is not set
# CONFIG_NETDEV_1000 is not set
# CONFIG_NETDEV_10000 is not set
# CONFIG_TR is not set
CONFIG_WLAN=y
# CONFIG_PCMCIA_RAYCS is not set
# CONFIG_ATMEL is not set
# CONFIG_AIRO_CS is not set
# CONFIG_PCMCIA_WL3501 is not set
# CONFIG_PRISM54 is not set
# CONFIG_USB_ZD1201 is not set
CONFIG_HOSTAP=m
CONFIG_HOSTAP_FIRMWARE=y
CONFIG_HOSTAP_FIRMWARE_NVRAM=y
# CONFIG_HOSTAP_PLX is not set
CONFIG_HOSTAP_PCI=m
CONFIG_HOSTAP_CS=m

#
# Enable WiMAX (Networking options) to see the WiMAX drivers
#

#
# USB Network Adapters
#
# CONFIG_USB_CATC is not set
# CONFIG_USB_KAWETH is not set
# CONFIG_USB_PEGASUS is not set
# CONFIG_USB_RTL8150 is not set
# CONFIG_USB_USBNET is not set
CONFIG_NET_PCMCIA=y
# CONFIG_PCMCIA_3C589 is not set
# CONFIG_PCMCIA_3C574 is not set
# CONFIG_PCMCIA_FMVJ18X is not set
# CONFIG_PCMCIA_PCNET is not set
# CONFIG_PCMCIA_NMCLAN is not set
# CONFIG_PCMCIA_SMC91C92 is not set
# CONFIG_PCMCIA_XIRC2PS is not set
# CONFIG_PCMCIA_AXNET is not set
# CONFIG_WAN is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_PLIP is not set
CONFIG_PPP=m
# CONFIG_PPP_MULTILINK is not set
# CONFIG_PPP_FILTER is not set
CONFIG_PPP_ASYNC=m
CONFIG_PPP_SYNC_TTY=m
CONFIG_PPP_DEFLATE=m
CONFIG_PPP_BSDCOMP=m
# CONFIG_PPP_MPPE is not set
CONFIG_PPPOE=m
# CONFIG_PPPOL2TP is not set
# CONFIG_SLIP is not set
CONFIG_SLHC=m
# CONFIG_NET_FC is not set
# CONFIG_NETCONSOLE is not set
# CONFIG_NETPOLL is not set
# CONFIG_NET_POLL_CONTROLLER is not set
# CONFIG_VMXNET3 is not set
# CONFIG_ISDN is not set
# CONFIG_PHONE is not set

#
# Input device support
#
CONFIG_INPUT=y
# CONFIG_INPUT_FF_MEMLESS is not set
CONFIG_INPUT_POLLDEV=y
# CONFIG_INPUT_SPARSEKMAP is not set

#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
# CONFIG_INPUT_JOYDEV is not set
# CONFIG_INPUT_EVDEV is not set
# CONFIG_INPUT_EVBUG is not set

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
CONFIG_KEYBOARD_ATKBD_HP_KEYCODES=y
# CONFIG_KEYBOARD_ATKBD_RDI_KEYCODES is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_HIL_OLD is not set
CONFIG_KEYBOARD_HIL=m
# CONFIG_KEYBOARD_NEWTON is not set
# CONFIG_KEYBOARD_OPENCORES is not set
# CONFIG_KEYBOARD_STOWAWAY is not set
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_MOUSE_PS2_ALPS=y
CONFIG_MOUSE_PS2_LOGIPS2PP=y
CONFIG_MOUSE_PS2_SYNAPTICS=y
CONFIG_MOUSE_PS2_TRACKPOINT=y
# CONFIG_MOUSE_PS2_ELANTECH is not set
# CONFIG_MOUSE_PS2_SENTELIC is not set
# CONFIG_MOUSE_PS2_TOUCHKIT is not set
CONFIG_MOUSE_SERIAL=y
# CONFIG_MOUSE_APPLETOUCH is not set
# CONFIG_MOUSE_BCM5974 is not set
# CONFIG_MOUSE_VSXXXAA is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TABLET is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
CONFIG_INPUT_MISC=y
# CONFIG_INPUT_ATI_REMOTE is not set
# CONFIG_INPUT_ATI_REMOTE2 is not set
# CONFIG_INPUT_KEYSPAN_REMOTE is not set
# CONFIG_INPUT_POWERMATE is not set
# CONFIG_INPUT_YEALINK is not set
# CONFIG_INPUT_CM109 is not set
CONFIG_INPUT_UINPUT=m
CONFIG_HP_SDC_RTC=m

#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_SERIO_SERPORT=y
# CONFIG_SERIO_PARKBD is not set
CONFIG_SERIO_GSCPS2=y
CONFIG_HP_SDC=m
CONFIG_HIL_MLC=m
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
# CONFIG_SERIO_RAW is not set
# CONFIG_SERIO_ALTERA_PS2 is not set
# CONFIG_GAMEPORT is not set

#
# Character devices
#
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
CONFIG_VT_HW_CONSOLE_BINDING=y
CONFIG_DEVKMEM=y
# CONFIG_SERIAL_NONSTANDARD is not set
# CONFIG_NOZOMI is not set

#
# Serial drivers
#
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_8250_GSC=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_CS=y
CONFIG_SERIAL_8250_NR_UARTS=17
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
CONFIG_SERIAL_8250_EXTENDED=y
CONFIG_SERIAL_8250_MANY_PORTS=y
CONFIG_SERIAL_8250_SHARE_IRQ=y
# CONFIG_SERIAL_8250_DETECT_IRQ is not set
# CONFIG_SERIAL_8250_RSA is not set

#
# Non-8250 serial port support
#
CONFIG_SERIAL_MUX=y
CONFIG_SERIAL_MUX_CONSOLE=y
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
# CONFIG_SERIAL_JSM is not set
CONFIG_UNIX98_PTYS=y
# CONFIG_DEVPTS_MULTIPLE_INSTANCES is not set
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=64
CONFIG_PRINTER=m
# CONFIG_LP_CONSOLE is not set
CONFIG_PPDEV=m
# CONFIG_IPMI_HANDLER is not set
# CONFIG_HW_RANDOM is not set
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set

#
# PCMCIA character devices
#
# CONFIG_SYNCLINK_CS is not set
# CONFIG_CARDMAN_4000 is not set
# CONFIG_CARDMAN_4040 is not set
# CONFIG_IPWIRELESS is not set
# CONFIG_RAW_DRIVER is not set
# CONFIG_TCG_TPM is not set
CONFIG_DEVPORT=y
# CONFIG_I2C is not set
# CONFIG_SPI is not set

#
# PPS support
#
# CONFIG_PPS is not set
# CONFIG_W1 is not set
CONFIG_POWER_SUPPLY=y
# CONFIG_POWER_SUPPLY_DEBUG is not set
# CONFIG_PDA_POWER is not set
# CONFIG_BATTERY_DS2760 is not set
# CONFIG_HWMON is not set
# CONFIG_THERMAL is not set
# CONFIG_WATCHDOG is not set
CONFIG_SSB_POSSIBLE=y

#
# Sonics Silicon Backplane
#
# CONFIG_SSB is not set

#
# Multifunction device drivers
#
# CONFIG_MFD_CORE is not set
# CONFIG_MFD_SM501 is not set
# CONFIG_HTC_PASIC3 is not set
# CONFIG_MFD_TMIO is not set
# CONFIG_REGULATOR is not set
CONFIG_MEDIA_SUPPORT=m

#
# Multimedia core support
#
# CONFIG_VIDEO_DEV is not set
# CONFIG_DVB_CORE is not set
# CONFIG_VIDEO_MEDIA is not set

#
# Multimedia drivers
#
CONFIG_IR_CORE=m
CONFIG_VIDEO_IR=m
# CONFIG_DAB is not set

#
# Graphics support
#
# CONFIG_AGP is not set
CONFIG_VGA_ARB=y
# CONFIG_DRM is not set
# CONFIG_VGASTATE is not set
CONFIG_VIDEO_OUTPUT_CONTROL=y
CONFIG_FB=y
# CONFIG_FIRMWARE_EDID is not set
# CONFIG_FB_DDC is not set
# CONFIG_FB_BOOT_VESA_SUPPORT is not set
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
# CONFIG_FB_CFB_REV_PIXELS_IN_BYTE is not set
# CONFIG_FB_SYS_FILLRECT is not set
# CONFIG_FB_SYS_COPYAREA is not set
# CONFIG_FB_SYS_IMAGEBLIT is not set
CONFIG_FB_FOREIGN_ENDIAN=y
CONFIG_FB_BOTH_ENDIAN=y
# CONFIG_FB_BIG_ENDIAN is not set
# CONFIG_FB_LITTLE_ENDIAN is not set
# CONFIG_FB_SYS_FOPS is not set
# CONFIG_FB_SVGALIB is not set
# CONFIG_FB_MACMODES is not set
# CONFIG_FB_BACKLIGHT is not set
CONFIG_FB_MODE_HELPERS=y
CONFIG_FB_TILEBLITTING=y

#
# Frame buffer hardware drivers
#
# CONFIG_FB_CIRRUS is not set
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
CONFIG_FB_STI=y
# CONFIG_FB_S1D13XXX is not set
# CONFIG_FB_NVIDIA is not set
# CONFIG_FB_RIVA is not set
# CONFIG_FB_MATROX is not set
# CONFIG_FB_RADEON is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
# CONFIG_FB_S3 is not set
# CONFIG_FB_SAVAGE is not set
# CONFIG_FB_SIS is not set
# CONFIG_FB_VIA is not set
# CONFIG_FB_NEOMAGIC is not set
# CONFIG_FB_KYRO is not set
# CONFIG_FB_3DFX is not set
CONFIG_FB_VOODOO1=m
# CONFIG_FB_VT8623 is not set
# CONFIG_FB_TRIDENT is not set
# CONFIG_FB_ARK is not set
# CONFIG_FB_PM3 is not set
# CONFIG_FB_CARMINE is not set
# CONFIG_FB_VIRTUAL is not set
# CONFIG_FB_METRONOME is not set
# CONFIG_FB_MB862XX is not set
# CONFIG_FB_BROADSHEET is not set
# CONFIG_BACKLIGHT_LCD_SUPPORT is not set

#
# Display device support
#
# CONFIG_DISPLAY_SUPPORT is not set

#
# Console display driver support
#
CONFIG_DUMMY_CONSOLE=y
CONFIG_DUMMY_CONSOLE_COLUMNS=128
CONFIG_DUMMY_CONSOLE_ROWS=48
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y
# CONFIG_FRAMEBUFFER_CONSOLE_ROTATION is not set
CONFIG_STI_CONSOLE=y
CONFIG_FONTS=y
# CONFIG_FONT_8x8 is not set
CONFIG_FONT_8x16=y
# CONFIG_FONT_6x11 is not set
# CONFIG_FONT_7x14 is not set
# CONFIG_FONT_PEARL_8x8 is not set
# CONFIG_FONT_ACORN_8x8 is not set
# CONFIG_FONT_MINI_4x6 is not set
# CONFIG_FONT_SUN8x16 is not set
# CONFIG_FONT_SUN12x22 is not set
# CONFIG_FONT_10x18 is not set
CONFIG_LOGO=y
# CONFIG_LOGO_LINUX_MONO is not set
# CONFIG_LOGO_LINUX_VGA16 is not set
# CONFIG_LOGO_LINUX_CLUT224 is not set
CONFIG_LOGO_PARISC_CLUT224=y
CONFIG_SOUND=m
CONFIG_SOUND_OSS_CORE=y
CONFIG_SOUND_OSS_CORE_PRECLAIM=y
CONFIG_SND=m
CONFIG_SND_TIMER=m
CONFIG_SND_PCM=m
CONFIG_SND_SEQUENCER=m
# CONFIG_SND_SEQ_DUMMY is not set
CONFIG_SND_OSSEMUL=y
CONFIG_SND_MIXER_OSS=m
CONFIG_SND_PCM_OSS=m
CONFIG_SND_PCM_OSS_PLUGINS=y
CONFIG_SND_SEQUENCER_OSS=y
CONFIG_SND_DYNAMIC_MINORS=y
CONFIG_SND_SUPPORT_OLD_API=y
CONFIG_SND_VERBOSE_PROCFS=y
# CONFIG_SND_VERBOSE_PRINTK is not set
# CONFIG_SND_DEBUG is not set
CONFIG_SND_VMASTER=y
# CONFIG_SND_RAWMIDI_SEQ is not set
# CONFIG_SND_OPL3_LIB_SEQ is not set
# CONFIG_SND_OPL4_LIB_SEQ is not set
# CONFIG_SND_SBAWE_SEQ is not set
# CONFIG_SND_EMU10K1_SEQ is not set
CONFIG_SND_AC97_CODEC=m
CONFIG_SND_DRIVERS=y
# CONFIG_SND_DUMMY is not set
# CONFIG_SND_VIRMIDI is not set
# CONFIG_SND_MTPAV is not set
# CONFIG_SND_MTS64 is not set
# CONFIG_SND_SERIAL_U16550 is not set
# CONFIG_SND_MPU401 is not set
# CONFIG_SND_PORTMAN2X4 is not set
# CONFIG_SND_AC97_POWER_SAVE is not set
CONFIG_SND_PCI=y
CONFIG_SND_AD1889=m
# CONFIG_SND_ALS300 is not set
# CONFIG_SND_ALI5451 is not set
# CONFIG_SND_ATIIXP is not set
# CONFIG_SND_ATIIXP_MODEM is not set
# CONFIG_SND_AU8810 is not set
# CONFIG_SND_AU8820 is not set
# CONFIG_SND_AU8830 is not set
# CONFIG_SND_AW2 is not set
# CONFIG_SND_AZT3328 is not set
# CONFIG_SND_BT87X is not set
# CONFIG_SND_CA0106 is not set
# CONFIG_SND_CMIPCI is not set
# CONFIG_SND_OXYGEN is not set
# CONFIG_SND_CS4281 is not set
# CONFIG_SND_CS46XX is not set
# CONFIG_SND_CS5535AUDIO is not set
# CONFIG_SND_CTXFI is not set
# CONFIG_SND_DARLA20 is not set
# CONFIG_SND_GINA20 is not set
# CONFIG_SND_LAYLA20 is not set
# CONFIG_SND_DARLA24 is not set
# CONFIG_SND_GINA24 is not set
# CONFIG_SND_LAYLA24 is not set
# CONFIG_SND_MONA is not set
# CONFIG_SND_MIA is not set
# CONFIG_SND_ECHO3G is not set
# CONFIG_SND_INDIGO is not set
# CONFIG_SND_INDIGOIO is not set
# CONFIG_SND_INDIGODJ is not set
# CONFIG_SND_INDIGOIOX is not set
# CONFIG_SND_INDIGODJX is not set
# CONFIG_SND_EMU10K1 is not set
# CONFIG_SND_EMU10K1X is not set
# CONFIG_SND_ENS1370 is not set
# CONFIG_SND_ENS1371 is not set
# CONFIG_SND_ES1938 is not set
# CONFIG_SND_ES1968 is not set
# CONFIG_SND_FM801 is not set
# CONFIG_SND_HDA_INTEL is not set
# CONFIG_SND_HDSP is not set
# CONFIG_SND_HDSPM is not set
# CONFIG_SND_HIFIER is not set
# CONFIG_SND_ICE1712 is not set
# CONFIG_SND_ICE1724 is not set
# CONFIG_SND_INTEL8X0 is not set
# CONFIG_SND_INTEL8X0M is not set
# CONFIG_SND_KORG1212 is not set
# CONFIG_SND_LX6464ES is not set
# CONFIG_SND_MAESTRO3 is not set
# CONFIG_SND_MIXART is not set
# CONFIG_SND_NM256 is not set
# CONFIG_SND_PCXHR is not set
# CONFIG_SND_RIPTIDE is not set
# CONFIG_SND_RME32 is not set
# CONFIG_SND_RME96 is not set
# CONFIG_SND_RME9652 is not set
# CONFIG_SND_SONICVIBES is not set
# CONFIG_SND_TRIDENT is not set
# CONFIG_SND_VIA82XX is not set
# CONFIG_SND_VIA82XX_MODEM is not set
# CONFIG_SND_VIRTUOSO is not set
# CONFIG_SND_VX222 is not set
# CONFIG_SND_YMFPCI is not set
CONFIG_SND_USB=y
# CONFIG_SND_USB_AUDIO is not set
# CONFIG_SND_USB_CAIAQ is not set
# CONFIG_SND_PCMCIA is not set
CONFIG_SND_GSC=y
CONFIG_SND_HARMONY=m
# CONFIG_SND_SOC is not set
# CONFIG_SOUND_PRIME is not set
CONFIG_AC97_BUS=m
CONFIG_HID_SUPPORT=y
CONFIG_HID=y
CONFIG_HIDRAW=y

#
# USB Input Devices
#
CONFIG_USB_HID=y
# CONFIG_HID_PID is not set
# CONFIG_USB_HIDDEV is not set

#
# Special HID drivers
#
CONFIG_HID_A4TECH=y
CONFIG_HID_APPLE=y
CONFIG_HID_BELKIN=y
CONFIG_HID_CHERRY=y
CONFIG_HID_CHICONY=y
CONFIG_HID_CYPRESS=y
CONFIG_HID_DRAGONRISE=y
# CONFIG_DRAGONRISE_FF is not set
CONFIG_HID_EZKEY=y
CONFIG_HID_KYE=y
CONFIG_HID_GYRATION=y
CONFIG_HID_TWINHAN=y
CONFIG_HID_KENSINGTON=y
CONFIG_HID_LOGITECH=y
# CONFIG_LOGITECH_FF is not set
# CONFIG_LOGIRUMBLEPAD2_FF is not set
CONFIG_HID_MICROSOFT=y
CONFIG_HID_MONTEREY=y
CONFIG_HID_NTRIG=y
CONFIG_HID_PANTHERLORD=y
# CONFIG_PANTHERLORD_FF is not set
CONFIG_HID_PETALYNX=y
CONFIG_HID_SAMSUNG=y
CONFIG_HID_SONY=y
CONFIG_HID_SUNPLUS=y
CONFIG_HID_GREENASIA=y
# CONFIG_GREENASIA_FF is not set
CONFIG_HID_SMARTJOYPLUS=y
# CONFIG_SMARTJOYPLUS_FF is not set
CONFIG_HID_TOPSEED=y
CONFIG_HID_THRUSTMASTER=y
# CONFIG_THRUSTMASTER_FF is not set
CONFIG_HID_ZEROPLUS=y
# CONFIG_ZEROPLUS_FF is not set
CONFIG_USB_SUPPORT=y
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB_ARCH_HAS_OHCI=y
CONFIG_USB_ARCH_HAS_EHCI=y
CONFIG_USB=y
# CONFIG_USB_DEBUG is not set
CONFIG_USB_ANNOUNCE_NEW_DEVICES=y

#
# Miscellaneous USB options
#
CONFIG_USB_DEVICEFS=y
# CONFIG_USB_DEVICE_CLASS is not set
# CONFIG_USB_DYNAMIC_MINORS is not set
# CONFIG_USB_OTG is not set
CONFIG_USB_MON=y
# CONFIG_USB_WUSB is not set
# CONFIG_USB_WUSB_CBAF is not set

#
# USB Host Controller Drivers
#
# CONFIG_USB_C67X00_HCD is not set
# CONFIG_USB_XHCI_HCD is not set
# CONFIG_USB_EHCI_HCD is not set
# CONFIG_USB_OXU210HP_HCD is not set
# CONFIG_USB_ISP116X_HCD is not set
# CONFIG_USB_ISP1760_HCD is not set
# CONFIG_USB_ISP1362_HCD is not set
CONFIG_USB_OHCI_HCD=y
# CONFIG_USB_OHCI_BIG_ENDIAN_DESC is not set
# CONFIG_USB_OHCI_BIG_ENDIAN_MMIO is not set
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
CONFIG_USB_UHCI_HCD=y
# CONFIG_USB_SL811_HCD is not set
# CONFIG_USB_R8A66597_HCD is not set
# CONFIG_USB_WHCI_HCD is not set
# CONFIG_USB_HWA_HCD is not set

#
# USB Device Class drivers
#
# CONFIG_USB_ACM is not set
# CONFIG_USB_PRINTER is not set
# CONFIG_USB_WDM is not set
# CONFIG_USB_TMC is not set

#
# NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may
#

#
# also be needed; see USB_STORAGE Help for more info
#
# CONFIG_USB_STORAGE is not set
# CONFIG_USB_LIBUSUAL is not set

#
# USB Imaging devices
#
# CONFIG_USB_MDC800 is not set
# CONFIG_USB_MICROTEK is not set

#
# USB port drivers
#
# CONFIG_USB_USS720 is not set
# CONFIG_USB_SERIAL is not set

#
# USB Miscellaneous drivers
#
# CONFIG_USB_EMI62 is not set
# CONFIG_USB_EMI26 is not set
# CONFIG_USB_ADUTUX is not set
# CONFIG_USB_SEVSEG is not set
# CONFIG_USB_RIO500 is not set
# CONFIG_USB_LEGOTOWER is not set
# CONFIG_USB_LCD is not set
# CONFIG_USB_BERRY_CHARGE is not set
# CONFIG_USB_LED is not set
# CONFIG_USB_CYPRESS_CY7C63 is not set
# CONFIG_USB_CYTHERM is not set
# CONFIG_USB_IDMOUSE is not set
# CONFIG_USB_FTDI_ELAN is not set
# CONFIG_USB_APPLEDISPLAY is not set
# CONFIG_USB_LD is not set
# CONFIG_USB_TRANCEVIBRATOR is not set
# CONFIG_USB_IOWARRIOR is not set
# CONFIG_USB_TEST is not set
# CONFIG_USB_ISIGHTFW is not set
# CONFIG_USB_VST is not set
# CONFIG_USB_GADGET is not set

#
# OTG and related infrastructure
#
# CONFIG_NOP_USB_XCEIV is not set
# CONFIG_UWB is not set
# CONFIG_MMC is not set
# CONFIG_MEMSTICK is not set
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=y

#
# LED drivers
#

#
# LED Triggers
#
CONFIG_LEDS_TRIGGERS=y
CONFIG_LEDS_TRIGGER_TIMER=y
CONFIG_LEDS_TRIGGER_IDE_DISK=y
CONFIG_LEDS_TRIGGER_HEARTBEAT=y
# CONFIG_LEDS_TRIGGER_BACKLIGHT is not set
CONFIG_LEDS_TRIGGER_DEFAULT_ON=y

#
# iptables trigger is under Netfilter config (LED target)
#
# CONFIG_ACCESSIBILITY is not set
# CONFIG_INFINIBAND is not set
CONFIG_RTC_LIB=y
CONFIG_RTC_CLASS=y
CONFIG_RTC_HCTOSYS=y
CONFIG_RTC_HCTOSYS_DEVICE="rtc0"
# CONFIG_RTC_DEBUG is not set

#
# RTC interfaces
#
CONFIG_RTC_INTF_SYSFS=y
CONFIG_RTC_INTF_PROC=y
CONFIG_RTC_INTF_DEV=y
# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set
# CONFIG_RTC_DRV_TEST is not set

#
# SPI RTC drivers
#

#
# Platform RTC drivers
#
# CONFIG_RTC_DRV_DS1286 is not set
# CONFIG_RTC_DRV_DS1511 is not set
# CONFIG_RTC_DRV_DS1553 is not set
# CONFIG_RTC_DRV_DS1742 is not set
# CONFIG_RTC_DRV_STK17TA8 is not set
# CONFIG_RTC_DRV_M48T86 is not set
# CONFIG_RTC_DRV_M48T35 is not set
# CONFIG_RTC_DRV_M48T59 is not set
# CONFIG_RTC_DRV_MSM6242 is not set
# CONFIG_RTC_DRV_BQ4802 is not set
# CONFIG_RTC_DRV_RP5C01 is not set
# CONFIG_RTC_DRV_V3020 is not set

#
# on-CPU RTC drivers
#
CONFIG_RTC_DRV_GENERIC=y
CONFIG_DMADEVICES=y

#
# DMA Devices
#
CONFIG_AUXDISPLAY=y
# CONFIG_KS0108 is not set
# CONFIG_UIO is not set

#
# TI VLYNQ
#
# CONFIG_STAGING is not set

#
# File systems
#
CONFIG_EXT2_FS=y
CONFIG_EXT2_FS_XATTR=y
CONFIG_EXT2_FS_POSIX_ACL=y
CONFIG_EXT2_FS_SECURITY=y
# CONFIG_EXT2_FS_XIP is not set
CONFIG_EXT3_FS=y
# CONFIG_EXT3_DEFAULTS_TO_ORDERED is not set
CONFIG_EXT3_FS_XATTR=y
CONFIG_EXT3_FS_POSIX_ACL=y
CONFIG_EXT3_FS_SECURITY=y
# CONFIG_EXT4_FS is not set
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
CONFIG_FS_MBCACHE=y
CONFIG_REISERFS_FS=m
# CONFIG_REISERFS_CHECK is not set
CONFIG_REISERFS_PROC_INFO=y
# CONFIG_REISERFS_FS_XATTR is not set
# CONFIG_JFS_FS is not set
CONFIG_FS_POSIX_ACL=y
CONFIG_XFS_FS=m
CONFIG_XFS_QUOTA=y
# CONFIG_XFS_POSIX_ACL is not set
# CONFIG_XFS_RT is not set
# CONFIG_XFS_DEBUG is not set
# CONFIG_OCFS2_FS is not set
# CONFIG_BTRFS_FS is not set
# CONFIG_NILFS2_FS is not set
CONFIG_FILE_LOCKING=y
CONFIG_FSNOTIFY=y
CONFIG_DNOTIFY=y
CONFIG_INOTIFY=y
CONFIG_INOTIFY_USER=y
CONFIG_QUOTA=y
CONFIG_QUOTA_NETLINK_INTERFACE=y
CONFIG_PRINT_QUOTA_WARNING=y
CONFIG_QUOTA_TREE=y
# CONFIG_QFMT_V1 is not set
CONFIG_QFMT_V2=y
CONFIG_QUOTACTL=y
CONFIG_AUTOFS_FS=y
# CONFIG_AUTOFS4_FS is not set
# CONFIG_FUSE_FS is not set
CONFIG_GENERIC_ACL=y

#
# Caches
#
# CONFIG_FSCACHE is not set

#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
# CONFIG_ZISOFS is not set
# CONFIG_UDF_FS is not set

#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=y
# CONFIG_MSDOS_FS is not set
CONFIG_VFAT_FS=y
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
# CONFIG_NTFS_FS is not set

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
# CONFIG_HUGETLB_PAGE is not set
# CONFIG_CONFIGFS_FS is not set
CONFIG_MISC_FILESYSTEMS=y
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_ECRYPT_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_CRAMFS is not set
# CONFIG_SQUASHFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_OMFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_ROMFS_FS is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
# CONFIG_NFS_V3_ACL is not set
# CONFIG_NFS_V4 is not set
CONFIG_ROOT_NFS=y
CONFIG_NFSD=y
CONFIG_NFSD_V3=y
# CONFIG_NFSD_V3_ACL is not set
CONFIG_NFSD_V4=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_EXPORTFS=y
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=y
CONFIG_SUNRPC_GSS=y
CONFIG_RPCSEC_GSS_KRB5=y
CONFIG_RPCSEC_GSS_SPKM3=m
CONFIG_SMB_FS=m
CONFIG_SMB_NLS_DEFAULT=y
CONFIG_SMB_NLS_REMOTE="cp437"
CONFIG_CIFS=m
# CONFIG_CIFS_STATS is not set
CONFIG_CIFS_WEAK_PW_HASH=y
# CONFIG_CIFS_UPCALL is not set
CONFIG_CIFS_XATTR=y
CONFIG_CIFS_POSIX=y
# CONFIG_CIFS_DEBUG2 is not set
# CONFIG_CIFS_DFS_UPCALL is not set
# CONFIG_CIFS_EXPERIMENTAL is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set

#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_CODEPAGE_737=m
CONFIG_NLS_CODEPAGE_775=m
CONFIG_NLS_CODEPAGE_850=m
CONFIG_NLS_CODEPAGE_852=m
CONFIG_NLS_CODEPAGE_855=m
CONFIG_NLS_CODEPAGE_857=m
CONFIG_NLS_CODEPAGE_860=m
CONFIG_NLS_CODEPAGE_861=m
CONFIG_NLS_CODEPAGE_862=m
CONFIG_NLS_CODEPAGE_863=m
CONFIG_NLS_CODEPAGE_864=m
CONFIG_NLS_CODEPAGE_865=m
CONFIG_NLS_CODEPAGE_866=m
CONFIG_NLS_CODEPAGE_869=m
CONFIG_NLS_CODEPAGE_936=m
CONFIG_NLS_CODEPAGE_950=m
CONFIG_NLS_CODEPAGE_932=m
CONFIG_NLS_CODEPAGE_949=m
CONFIG_NLS_CODEPAGE_874=m
CONFIG_NLS_ISO8859_8=m
CONFIG_NLS_CODEPAGE_1250=y
CONFIG_NLS_CODEPAGE_1251=m
CONFIG_NLS_ASCII=m
CONFIG_NLS_ISO8859_1=y
CONFIG_NLS_ISO8859_2=m
CONFIG_NLS_ISO8859_3=m
CONFIG_NLS_ISO8859_4=m
CONFIG_NLS_ISO8859_5=m
CONFIG_NLS_ISO8859_6=m
CONFIG_NLS_ISO8859_7=m
CONFIG_NLS_ISO8859_9=m
CONFIG_NLS_ISO8859_13=m
CONFIG_NLS_ISO8859_14=m
CONFIG_NLS_ISO8859_15=m
CONFIG_NLS_KOI8_R=m
CONFIG_NLS_KOI8_U=m
CONFIG_NLS_UTF8=y
# CONFIG_DLM is not set

#
# Kernel hacking
#
# CONFIG_PRINTK_TIME is not set
CONFIG_ENABLE_WARN_DEPRECATED=y
CONFIG_ENABLE_MUST_CHECK=y
CONFIG_FRAME_WARN=1024
CONFIG_MAGIC_SYSRQ=y
# CONFIG_STRIP_ASM_SYMS is not set
CONFIG_UNUSED_SYMBOLS=y
CONFIG_DEBUG_FS=y
# CONFIG_HEADERS_CHECK is not set
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_SHIRQ=y
CONFIG_DETECT_SOFTLOCKUP=y
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=1
CONFIG_DETECT_HUNG_TASK=y
# CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=0
CONFIG_SCHED_DEBUG=y
CONFIG_SCHEDSTATS=y
CONFIG_TIMER_STATS=y
# CONFIG_DEBUG_OBJECTS is not set
# CONFIG_DEBUG_SLAB is not set
CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_PI_LIST=y
CONFIG_RT_MUTEX_TESTER=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_SPINLOCK_SLEEP=y
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
CONFIG_STACKTRACE=y
# CONFIG_DEBUG_KOBJECT is not set
CONFIG_DEBUG_BUGVERBOSE=y
# CONFIG_DEBUG_INFO is not set
CONFIG_DEBUG_VM=y
# CONFIG_DEBUG_WRITECOUNT is not set
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_DEBUG_LIST=y
CONFIG_DEBUG_SG=y
# CONFIG_DEBUG_NOTIFIERS is not set
# CONFIG_DEBUG_CREDENTIALS is not set
CONFIG_FRAME_POINTER=y
# CONFIG_BOOT_PRINTK_DELAY is not set
# CONFIG_RCU_TORTURE_TEST is not set
# CONFIG_RCU_CPU_STALL_DETECTOR is not set
# CONFIG_BACKTRACE_SELF_TEST is not set
# CONFIG_DEBUG_BLOCK_EXT_DEVT is not set
# CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set
# CONFIG_FAULT_INJECTION is not set
CONFIG_LATENCYTOP=y
CONFIG_SYSCTL_SYSCALL_CHECK=y
CONFIG_DEBUG_PAGEALLOC=y
CONFIG_WANT_PAGE_DEBUG_FLAGS=y
CONFIG_PAGE_POISONING=y
CONFIG_DYNAMIC_DEBUG=y
# CONFIG_SAMPLES is not set
# CONFIG_DEBUG_RODATA is not set

#
# Security options
#
CONFIG_KEYS=y
CONFIG_KEYS_DEBUG_PROC_KEYS=y
# CONFIG_SECURITY is not set
# CONFIG_SECURITYFS is not set
# CONFIG_DEFAULT_SECURITY_SELINUX is not set
# CONFIG_DEFAULT_SECURITY_SMACK is not set
# CONFIG_DEFAULT_SECURITY_TOMOYO is not set
CONFIG_DEFAULT_SECURITY_DAC=y
CONFIG_DEFAULT_SECURITY=""
CONFIG_XOR_BLOCKS=y
CONFIG_ASYNC_CORE=y
CONFIG_ASYNC_MEMCPY=y
CONFIG_ASYNC_XOR=y
CONFIG_ASYNC_PQ=y
CONFIG_ASYNC_RAID6_RECOV=y
CONFIG_CRYPTO=y

#
# Crypto core or helper
#
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_AEAD=m
CONFIG_CRYPTO_AEAD2=y
CONFIG_CRYPTO_BLKCIPHER=y
CONFIG_CRYPTO_BLKCIPHER2=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_PCOMP=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_MANAGER2=y
# CONFIG_CRYPTO_GF128MUL is not set
CONFIG_CRYPTO_NULL=m
CONFIG_CRYPTO_WORKQUEUE=y
# CONFIG_CRYPTO_CRYPTD is not set
CONFIG_CRYPTO_AUTHENC=m
CONFIG_CRYPTO_TEST=m

#
# Authenticated Encryption with Associated Data
#
# CONFIG_CRYPTO_CCM is not set
# CONFIG_CRYPTO_GCM is not set
# CONFIG_CRYPTO_SEQIV is not set

#
# Block modes
#
CONFIG_CRYPTO_CBC=y
# CONFIG_CRYPTO_CTR is not set
# CONFIG_CRYPTO_CTS is not set
CONFIG_CRYPTO_ECB=m
# CONFIG_CRYPTO_LRW is not set
# CONFIG_CRYPTO_PCBC is not set
# CONFIG_CRYPTO_XTS is not set

#
# Hash modes
#
CONFIG_CRYPTO_HMAC=y
# CONFIG_CRYPTO_XCBC is not set
# CONFIG_CRYPTO_VMAC is not set

#
# Digest
#
CONFIG_CRYPTO_CRC32C=m
# CONFIG_CRYPTO_GHASH is not set
CONFIG_CRYPTO_MD4=m
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_MICHAEL_MIC=m
# CONFIG_CRYPTO_RMD128 is not set
# CONFIG_CRYPTO_RMD160 is not set
# CONFIG_CRYPTO_RMD256 is not set
# CONFIG_CRYPTO_RMD320 is not set
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_SHA256=m
CONFIG_CRYPTO_SHA512=m
CONFIG_CRYPTO_TGR192=m
CONFIG_CRYPTO_WP512=m

#
# Ciphers
#
CONFIG_CRYPTO_AES=m
CONFIG_CRYPTO_ANUBIS=m
CONFIG_CRYPTO_ARC4=m
CONFIG_CRYPTO_BLOWFISH=m
# CONFIG_CRYPTO_CAMELLIA is not set
CONFIG_CRYPTO_CAST5=m
CONFIG_CRYPTO_CAST6=m
CONFIG_CRYPTO_DES=y
# CONFIG_CRYPTO_FCRYPT is not set
CONFIG_CRYPTO_KHAZAD=m
# CONFIG_CRYPTO_SALSA20 is not set
# CONFIG_CRYPTO_SEED is not set
CONFIG_CRYPTO_SERPENT=m
CONFIG_CRYPTO_TEA=m
CONFIG_CRYPTO_TWOFISH=m
CONFIG_CRYPTO_TWOFISH_COMMON=m

#
# Compression
#
CONFIG_CRYPTO_DEFLATE=y
# CONFIG_CRYPTO_ZLIB is not set
# CONFIG_CRYPTO_LZO is not set

#
# Random Number Generation
#
# CONFIG_CRYPTO_ANSI_CPRNG is not set
CONFIG_CRYPTO_HW=y
# CONFIG_CRYPTO_DEV_HIFN_795X is not set
# CONFIG_BINARY_PRINTF is not set

#
# Library routines
#
CONFIG_BITREVERSE=y
CONFIG_GENERIC_FIND_LAST_BIT=y
CONFIG_CRC_CCITT=m
# CONFIG_CRC16 is not set
CONFIG_CRC_T10DIF=y
# CONFIG_CRC_ITU_T is not set
CONFIG_CRC32=y
# CONFIG_CRC7 is not set
CONFIG_LIBCRC32C=m
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=y
CONFIG_LZO_DECOMPRESS=y
CONFIG_DECOMPRESS_GZIP=y
CONFIG_DECOMPRESS_BZIP2=y
CONFIG_DECOMPRESS_LZMA=y
CONFIG_DECOMPRESS_LZO=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT=y
CONFIG_HAS_DMA=y
CONFIG_NLATTR=y
CONFIG_GENERIC_ATOMIC64=y


* Re: threads and fork on machine with VIPT-WB cache
  2010-05-11 21:41                                                         ` Helge Deller
@ 2010-05-15 21:02                                                           ` John David Anglin
  2010-05-16 20:22                                                             ` Helge Deller
  2010-05-23 13:11                                                             ` Carlos O'Donell
  0 siblings, 2 replies; 74+ messages in thread
From: John David Anglin @ 2010-05-15 21:02 UTC (permalink / raw)
  To: Helge Deller; +Cc: dave.anglin, carlos, gniibe, linux-parisc

[-- Attachment #1: Type: text/plain, Size: 979 bytes --]

On Tue, 11 May 2010, Helge Deller wrote:

> On 05/11/2010 11:26 PM, John David Anglin wrote:
> >> On 05/10/2010 09:20 PM, Helge Deller wrote:
> >>> On 05/10/2010 04:56 PM, John David Anglin wrote:
> >>>> Yes, just after sending, I noticed gcc testsuite and minifail were
> >>>> broken on gsyprf11.  [...]
> >>>>
> >>>> The attached works better on gsyprf11.  I haven't tested it on anything
> >>>> else.
> >>
> >> Hi Dave,
> >>
> >> I still can see segfaults of the minifail_dave.cpp program.

I haven't seen any abnormal segfaults with the attached change on
my rp3440 and gsyprf11.  Need to flush in pte_wrprotect for minifail,
and to flush in copy_user_page to fix occasional segfaults in sh.
The flushes in kmap/kunmap weren't sufficient.

Still experimenting to see if number of flushes can be reduced, etc.
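
For anyone who doesn't want to wade through the whole attachment, the
heart of it is the ptep_set_wrprotect() rework.  Condensed from the
diff below (so the lock and helper names are the ones the patch
introduces, not what mainline has today, and the WARN_ON is dropped):

static inline void ptep_set_wrprotect(struct vm_area_struct *vma,
		struct mm_struct *mm, unsigned long addr, pte_t *ptep)
{
	pte_t old_pte;
	unsigned long flags;

	spin_lock_irqsave(&pa_pte_lock, flags);		/* serialize the PTE update */
	old_pte = *ptep;
	*ptep = pte_wrprotect(old_pte);
	__flush_tlb_page(mm, addr);			/* drop the writable TLB entry */
	flush_cache_page(vma, addr, pte_pfn(old_pte));	/* push dirty lines to memory */
	spin_unlock_irqrestore(&pa_pte_lock, flags);
}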

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

[-- Attachment #2: diff-20100514.d --]
[-- Type: text/plain, Size: 37392 bytes --]

diff --git a/arch/parisc/hpux/wrappers.S b/arch/parisc/hpux/wrappers.S
index 58c53c8..bdcea33 100644
--- a/arch/parisc/hpux/wrappers.S
+++ b/arch/parisc/hpux/wrappers.S
@@ -88,7 +88,7 @@ ENTRY(hpux_fork_wrapper)
 
 	STREG	%r2,-20(%r30)
 	ldo	64(%r30),%r30
-	STREG	%r2,PT_GR19(%r1)	;! save for child
+	STREG	%r2,PT_SYSCALL_RP(%r1)	;! save for child
 	STREG	%r30,PT_GR21(%r1)	;! save for child
 
 	LDREG	PT_GR30(%r1),%r25
@@ -132,7 +132,7 @@ ENTRY(hpux_child_return)
 	bl,n	schedule_tail, %r2
 #endif
 
-	LDREG	TASK_PT_GR19-TASK_SZ_ALGN-128(%r30),%r2
+	LDREG	TASK_PT_SYSCALL_RP-TASK_SZ_ALGN-128(%r30),%r2
 	b fork_return
 	copy %r0,%r28
 ENDPROC(hpux_child_return)
diff --git a/arch/parisc/include/asm/atomic.h b/arch/parisc/include/asm/atomic.h
index 716634d..ad7df44 100644
--- a/arch/parisc/include/asm/atomic.h
+++ b/arch/parisc/include/asm/atomic.h
@@ -24,29 +24,46 @@
  * Hash function to index into a different SPINLOCK.
  * Since "a" is usually an address, use one spinlock per cacheline.
  */
-#  define ATOMIC_HASH_SIZE 4
-#  define ATOMIC_HASH(a) (&(__atomic_hash[ (((unsigned long) (a))/L1_CACHE_BYTES) & (ATOMIC_HASH_SIZE-1) ]))
+#  define ATOMIC_HASH_SIZE (4096/L1_CACHE_BYTES)  /* 4 */
+#  define ATOMIC_HASH(a)      (&(__atomic_hash[ (((unsigned long) (a))/L1_CACHE_BYTES) & (ATOMIC_HASH_SIZE-1) ]))
+#  define ATOMIC_USER_HASH(a) (&(__atomic_user_hash[ (((unsigned long) (a))/L1_CACHE_BYTES) & (ATOMIC_HASH_SIZE-1) ]))
 
 extern arch_spinlock_t __atomic_hash[ATOMIC_HASH_SIZE] __lock_aligned;
+extern arch_spinlock_t __atomic_user_hash[ATOMIC_HASH_SIZE] __lock_aligned;
 
 /* Can't use raw_spin_lock_irq because of #include problems, so
  * this is the substitute */
-#define _atomic_spin_lock_irqsave(l,f) do {	\
-	arch_spinlock_t *s = ATOMIC_HASH(l);		\
+#define _atomic_spin_lock_irqsave_template(l,f,hash_func) do {	\
+	arch_spinlock_t *s = hash_func;		\
 	local_irq_save(f);			\
 	arch_spin_lock(s);			\
 } while(0)
 
-#define _atomic_spin_unlock_irqrestore(l,f) do {	\
-	arch_spinlock_t *s = ATOMIC_HASH(l);			\
+#define _atomic_spin_unlock_irqrestore_template(l,f,hash_func) do {	\
+	arch_spinlock_t *s = hash_func;			\
 	arch_spin_unlock(s);				\
 	local_irq_restore(f);				\
 } while(0)
 
+/* kernel memory locks */
+#define _atomic_spin_lock_irqsave(l,f)	\
+	_atomic_spin_lock_irqsave_template(l,f,ATOMIC_HASH(l))
+
+#define _atomic_spin_unlock_irqrestore(l,f)	\
+	_atomic_spin_unlock_irqrestore_template(l,f,ATOMIC_HASH(l))
+
+/* userspace memory locks */
+#define _atomic_spin_lock_irqsave_user(l,f)	\
+	_atomic_spin_lock_irqsave_template(l,f,ATOMIC_USER_HASH(l))
+
+#define _atomic_spin_unlock_irqrestore_user(l,f)	\
+	_atomic_spin_unlock_irqrestore_template(l,f,ATOMIC_USER_HASH(l))
 
 #else
 #  define _atomic_spin_lock_irqsave(l,f) do { local_irq_save(f); } while (0)
 #  define _atomic_spin_unlock_irqrestore(l,f) do { local_irq_restore(f); } while (0)
+#  define _atomic_spin_lock_irqsave_user(l,f) _atomic_spin_lock_irqsave(l,f)
+#  define _atomic_spin_unlock_irqrestore_user(l,f) _atomic_spin_unlock_irqrestore(l,f)
 #endif
 
 /* This should get optimized out since it's never called.
diff --git a/arch/parisc/include/asm/cacheflush.h b/arch/parisc/include/asm/cacheflush.h
index 7a73b61..b90c895 100644
--- a/arch/parisc/include/asm/cacheflush.h
+++ b/arch/parisc/include/asm/cacheflush.h
@@ -2,6 +2,7 @@
 #define _PARISC_CACHEFLUSH_H
 
 #include <linux/mm.h>
+#include <linux/uaccess.h>
 
 /* The usual comment is "Caches aren't brain-dead on the <architecture>".
  * Unfortunately, that doesn't apply to PA-RISC. */
@@ -104,21 +105,32 @@ void mark_rodata_ro(void);
 #define ARCH_HAS_KMAP
 
 void kunmap_parisc(void *addr);
+void *kmap_parisc(struct page *page);
 
 static inline void *kmap(struct page *page)
 {
 	might_sleep();
-	return page_address(page);
+	return kmap_parisc(page);
 }
 
 #define kunmap(page)			kunmap_parisc(page_address(page))
 
-#define kmap_atomic(page, idx)		page_address(page)
+static inline void *kmap_atomic(struct page *page, enum km_type idx)
+{
+	pagefault_disable();
+	return kmap_parisc(page);
+}
 
-#define kunmap_atomic(addr, idx)	kunmap_parisc(addr)
+static inline void kunmap_atomic(void *addr, enum km_type idx)
+{
+	kunmap_parisc(addr);
+	pagefault_enable();
+}
 
-#define kmap_atomic_pfn(pfn, idx)	page_address(pfn_to_page(pfn))
-#define kmap_atomic_to_page(ptr)	virt_to_page(ptr)
+#define kmap_atomic_prot(page, idx, prot)	kmap_atomic(page, idx)
+#define kmap_atomic_pfn(pfn, idx)	kmap_atomic(pfn_to_page(pfn), (idx))
+#define kmap_atomic_to_page(ptr)	virt_to_page(kmap_atomic(virt_to_page(ptr), (enum km_type) 0))
+#define kmap_flush_unused()	do {} while(0)
 #endif
 
 #endif /* _PARISC_CACHEFLUSH_H */
diff --git a/arch/parisc/include/asm/futex.h b/arch/parisc/include/asm/futex.h
index 0c705c3..7bc963e 100644
--- a/arch/parisc/include/asm/futex.h
+++ b/arch/parisc/include/asm/futex.h
@@ -55,6 +55,7 @@ futex_atomic_cmpxchg_inatomic(int __user *uaddr, int oldval, int newval)
 {
 	int err = 0;
 	int uval;
+	unsigned long flags;
 
 	/* futex.c wants to do a cmpxchg_inatomic on kernel NULL, which is
 	 * our gateway page, and causes no end of trouble...
@@ -65,10 +66,15 @@ futex_atomic_cmpxchg_inatomic(int __user *uaddr, int oldval, int newval)
 	if (!access_ok(VERIFY_WRITE, uaddr, sizeof(int)))
 		return -EFAULT;
 
+	_atomic_spin_lock_irqsave_user(uaddr, flags);
+
 	err = get_user(uval, uaddr);
-	if (err) return -EFAULT;
-	if (uval == oldval)
-		err = put_user(newval, uaddr);
+	if (!err)
+		if (uval == oldval)
+			err = put_user(newval, uaddr);
+
+	_atomic_spin_unlock_irqrestore_user(uaddr, flags);
+
 	if (err) return -EFAULT;
 	return uval;
 }
diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h
index a27d2e2..2c6da94 100644
--- a/arch/parisc/include/asm/pgtable.h
+++ b/arch/parisc/include/asm/pgtable.h
@@ -30,15 +30,21 @@
  */
 #define kern_addr_valid(addr)	(1)
 
+extern spinlock_t pa_pte_lock;
+extern spinlock_t pa_tlb_lock;
+
 /* Certain architectures need to do special things when PTEs
  * within a page table are directly modified.  Thus, the following
  * hook is made available.
  */
-#define set_pte(pteptr, pteval)                                 \
-        do{                                                     \
+#define set_pte(pteptr, pteval)					\
+        do {							\
+		unsigned long flags;				\
+		spin_lock_irqsave(&pa_pte_lock, flags);		\
                 *(pteptr) = (pteval);                           \
+		spin_unlock_irqrestore(&pa_pte_lock, flags);	\
         } while(0)
-#define set_pte_at(mm,addr,ptep,pteval) set_pte(ptep,pteval)
+#define set_pte_at(mm,addr,ptep,pteval)	set_pte(ptep, pteval)
 
 #endif /* !__ASSEMBLY__ */
 
@@ -262,6 +268,7 @@ extern unsigned long *empty_zero_page;
 #define pte_none(x)     ((pte_val(x) == 0) || (pte_val(x) & _PAGE_FLUSH))
 #define pte_present(x)	(pte_val(x) & _PAGE_PRESENT)
 #define pte_clear(mm,addr,xp)	do { pte_val(*(xp)) = 0; } while (0)
+#define pte_same(A,B)	(pte_val(A) == pte_val(B))
 
 #define pmd_flag(x)	(pmd_val(x) & PxD_FLAG_MASK)
 #define pmd_address(x)	((unsigned long)(pmd_val(x) &~ PxD_FLAG_MASK) << PxD_VALUE_SHIFT)
@@ -410,6 +417,7 @@ extern void paging_init (void);
 
 #define PG_dcache_dirty         PG_arch_1
 
+extern void flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr, unsigned long pfn);
 extern void update_mmu_cache(struct vm_area_struct *, unsigned long, pte_t);
 
 /* Encode and de-code a swap entry */
@@ -423,56 +431,84 @@ extern void update_mmu_cache(struct vm_area_struct *, unsigned long, pte_t);
 #define __pte_to_swp_entry(pte)		((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)		((pte_t) { (x).val })
 
-static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep)
+static inline void __flush_tlb_page(struct mm_struct *mm, unsigned long addr)
 {
-#ifdef CONFIG_SMP
-	if (!pte_young(*ptep))
-		return 0;
-	return test_and_clear_bit(xlate_pabit(_PAGE_ACCESSED_BIT), &pte_val(*ptep));
-#else
-	pte_t pte = *ptep;
-	if (!pte_young(pte))
-		return 0;
-	set_pte_at(vma->vm_mm, addr, ptep, pte_mkold(pte));
-	return 1;
-#endif
+	unsigned long flags;
+
+	/* For one page, it's not worth testing the split_tlb variable.  */
+	spin_lock_irqsave(&pa_tlb_lock, flags);
+	mtsp(mm->context,1);
+	pdtlb(addr);
+	pitlb(addr);
+	spin_unlock_irqrestore(&pa_tlb_lock, flags);
 }
 
-extern spinlock_t pa_dbit_lock;
+static inline int ptep_set_access_flags(struct vm_area_struct *vma, unsigned
+ long addr, pte_t *ptep, pte_t entry, int dirty)
+{
+	int changed;
+	unsigned long flags;
+	spin_lock_irqsave(&pa_pte_lock, flags);
+	changed = !pte_same(*ptep, entry);
+	if (changed) {
+		*ptep = entry;
+	}
+	spin_unlock_irqrestore(&pa_pte_lock, flags);
+	if (changed) {
+		__flush_tlb_page(vma->vm_mm, addr);
+	}
+	return changed;
+}
+
+static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep)
+{
+	pte_t pte;
+	unsigned long flags;
+	int r;
+
+	spin_lock_irqsave(&pa_pte_lock, flags);
+	pte = *ptep;
+	if (pte_young(pte)) {
+		*ptep = pte_mkold(pte);
+		r = 1;
+	} else {
+		r = 0;
+	}
+	spin_unlock_irqrestore(&pa_pte_lock, flags);
+
+	return r;
+}
 
 struct mm_struct;
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 {
-	pte_t old_pte;
-	pte_t pte;
+	pte_t pte, old_pte;
+	unsigned long flags;
 
-	spin_lock(&pa_dbit_lock);
+	spin_lock_irqsave(&pa_pte_lock, flags);
 	pte = old_pte = *ptep;
 	pte_val(pte) &= ~_PAGE_PRESENT;
 	pte_val(pte) |= _PAGE_FLUSH;
-	set_pte_at(mm,addr,ptep,pte);
-	spin_unlock(&pa_dbit_lock);
+	*ptep = pte;
+	spin_unlock_irqrestore(&pa_pte_lock, flags);
 
 	return old_pte;
 }
 
-static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
+static inline void ptep_set_wrprotect(struct vm_area_struct *vma, struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 {
-#ifdef CONFIG_SMP
-	unsigned long new, old;
-
-	do {
-		old = pte_val(*ptep);
-		new = pte_val(pte_wrprotect(__pte (old)));
-	} while (cmpxchg((unsigned long *) ptep, old, new) != old);
-#else
-	pte_t old_pte = *ptep;
-	set_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
-#endif
+	pte_t old_pte;
+	unsigned long flags;
+
+	spin_lock_irqsave(&pa_pte_lock, flags);
+	old_pte = *ptep;
+	*ptep = pte_wrprotect(old_pte);
+	WARN_ON(!pte_present(old_pte) && !(pte_val(old_pte) & _PAGE_FLUSH));
+	__flush_tlb_page(mm, addr);
+	flush_cache_page(vma, addr, pte_pfn(old_pte));
+	spin_unlock_irqrestore(&pa_pte_lock, flags);
 }
 
-#define pte_same(A,B)	(pte_val(A) == pte_val(B))
-
 #endif /* !__ASSEMBLY__ */
 
 
@@ -504,6 +540,7 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
 
 #define HAVE_ARCH_UNMAPPED_AREA
 
+#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
 #define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
diff --git a/arch/parisc/include/asm/system.h b/arch/parisc/include/asm/system.h
index d91357b..4653c77 100644
--- a/arch/parisc/include/asm/system.h
+++ b/arch/parisc/include/asm/system.h
@@ -160,7 +160,7 @@ static inline void set_eiem(unsigned long val)
    ldcd). */
 
 #define __PA_LDCW_ALIGNMENT	4
-#define __ldcw_align(a) ((volatile unsigned int *)a)
+#define __ldcw_align(a) (&(a)->slock)
 #define __LDCW	"ldcw,co"
 
 #endif /*!CONFIG_PA20*/
diff --git a/arch/parisc/kernel/asm-offsets.c b/arch/parisc/kernel/asm-offsets.c
index ec787b4..b2f35b2 100644
--- a/arch/parisc/kernel/asm-offsets.c
+++ b/arch/parisc/kernel/asm-offsets.c
@@ -137,6 +137,7 @@ int main(void)
 	DEFINE(TASK_PT_IAOQ0, offsetof(struct task_struct, thread.regs.iaoq[0]));
 	DEFINE(TASK_PT_IAOQ1, offsetof(struct task_struct, thread.regs.iaoq[1]));
 	DEFINE(TASK_PT_CR27, offsetof(struct task_struct, thread.regs.cr27));
+	DEFINE(TASK_PT_SYSCALL_RP, offsetof(struct task_struct, thread.regs.pad0));
 	DEFINE(TASK_PT_ORIG_R28, offsetof(struct task_struct, thread.regs.orig_r28));
 	DEFINE(TASK_PT_KSP, offsetof(struct task_struct, thread.regs.ksp));
 	DEFINE(TASK_PT_KPC, offsetof(struct task_struct, thread.regs.kpc));
@@ -225,6 +226,7 @@ int main(void)
 	DEFINE(PT_IAOQ0, offsetof(struct pt_regs, iaoq[0]));
 	DEFINE(PT_IAOQ1, offsetof(struct pt_regs, iaoq[1]));
 	DEFINE(PT_CR27, offsetof(struct pt_regs, cr27));
+	DEFINE(PT_SYSCALL_RP, offsetof(struct pt_regs, pad0));
 	DEFINE(PT_ORIG_R28, offsetof(struct pt_regs, orig_r28));
 	DEFINE(PT_KSP, offsetof(struct pt_regs, ksp));
 	DEFINE(PT_KPC, offsetof(struct pt_regs, kpc));
@@ -290,5 +292,11 @@ int main(void)
 	BLANK();
 	DEFINE(ASM_PDC_RESULT_SIZE, NUM_PDC_RESULT * sizeof(unsigned long));
 	BLANK();
+
+#ifdef CONFIG_SMP
+	DEFINE(ASM_ATOMIC_HASH_SIZE_SHIFT, __builtin_ffs(ATOMIC_HASH_SIZE)-1);
+	DEFINE(ASM_ATOMIC_HASH_ENTRY_SHIFT, __builtin_ffs(sizeof(__atomic_hash[0]))-1);
+#endif
+
 	return 0;
 }
diff --git a/arch/parisc/kernel/cache.c b/arch/parisc/kernel/cache.c
index b6ed34d..517537d 100644
--- a/arch/parisc/kernel/cache.c
+++ b/arch/parisc/kernel/cache.c
@@ -336,9 +336,9 @@ __flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr)
 	}
 }
 
-void flush_dcache_page(struct page *page)
+static void flush_user_dcache_page_internal(struct address_space *mapping,
+					    struct page *page)
 {
-	struct address_space *mapping = page_mapping(page);
 	struct vm_area_struct *mpnt;
 	struct prio_tree_iter iter;
 	unsigned long offset;
@@ -346,14 +346,6 @@ void flush_dcache_page(struct page *page)
 	pgoff_t pgoff;
 	unsigned long pfn = page_to_pfn(page);
 
-
-	if (mapping && !mapping_mapped(mapping)) {
-		set_bit(PG_dcache_dirty, &page->flags);
-		return;
-	}
-
-	flush_kernel_dcache_page(page);
-
 	if (!mapping)
 		return;
 
@@ -387,6 +379,19 @@ void flush_dcache_page(struct page *page)
 	}
 	flush_dcache_mmap_unlock(mapping);
 }
+
+void flush_dcache_page(struct page *page)
+{
+	struct address_space *mapping = page_mapping(page);
+
+	if (mapping && !mapping_mapped(mapping)) {
+		set_bit(PG_dcache_dirty, &page->flags);
+		return;
+	}
+
+	flush_kernel_dcache_page(page);
+	flush_user_dcache_page_internal(mapping, page);
+}
 EXPORT_SYMBOL(flush_dcache_page);
 
 /* Defined in arch/parisc/kernel/pacache.S */
@@ -395,15 +400,12 @@ EXPORT_SYMBOL(flush_kernel_dcache_page_asm);
 EXPORT_SYMBOL(flush_data_cache_local);
 EXPORT_SYMBOL(flush_kernel_icache_range_asm);
 
-void clear_user_page_asm(void *page, unsigned long vaddr)
+static void clear_user_page_asm(void *page, unsigned long vaddr)
 {
-	unsigned long flags;
 	/* This function is implemented in assembly in pacache.S */
 	extern void __clear_user_page_asm(void *page, unsigned long vaddr);
 
-	purge_tlb_start(flags);
 	__clear_user_page_asm(page, vaddr);
-	purge_tlb_end(flags);
 }
 
 #define FLUSH_THRESHOLD 0x80000 /* 0.5MB */
@@ -440,7 +442,6 @@ void __init parisc_setup_cache_timing(void)
 }
 
 extern void purge_kernel_dcache_page(unsigned long);
-extern void clear_user_page_asm(void *page, unsigned long vaddr);
 
 void clear_user_page(void *page, unsigned long vaddr, struct page *pg)
 {
@@ -470,21 +471,10 @@ void copy_user_page(void *vto, void *vfrom, unsigned long vaddr,
 {
 	/* no coherency needed (all in kmap/kunmap) */
 	copy_user_page_asm(vto, vfrom);
-	if (!parisc_requires_coherency())
-		flush_kernel_dcache_page_asm(vto);
+	flush_kernel_dcache_page_asm(vto);
 }
 EXPORT_SYMBOL(copy_user_page);
 
-#ifdef CONFIG_PA8X00
-
-void kunmap_parisc(void *addr)
-{
-	if (parisc_requires_coherency())
-		flush_kernel_dcache_page_addr(addr);
-}
-EXPORT_SYMBOL(kunmap_parisc);
-#endif
-
 void __flush_tlb_range(unsigned long sid, unsigned long start,
 		       unsigned long end)
 {
@@ -577,3 +567,25 @@ flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr, unsigned long
 		__flush_cache_page(vma, vmaddr);
 
 }
+
+void *kmap_parisc(struct page *page)
+{
+	/* this is a killer.  There's no easy way to test quickly if
+	 * this page is dirty in any userspace.  Additionally, for
+	 * kernel alterations of the page, we'd need it invalidated
+	 * here anyway, so currently flush (and invalidate)
+	 * universally */
+	flush_user_dcache_page_internal(page_mapping(page), page);
+	return page_address(page);
+}
+EXPORT_SYMBOL(kmap_parisc);
+
+void kunmap_parisc(void *addr)
+{
+	/* flush and invalidate the kernel mapping.  We need the
+	 * invalidate so we don't have stale data at this cache
+	 * location the next time the page is mapped */
+	flush_kernel_dcache_page_addr(addr);
+}
+EXPORT_SYMBOL(kunmap_parisc);
+
diff --git a/arch/parisc/kernel/entry.S b/arch/parisc/kernel/entry.S
index 3a44f7f..e1c0128 100644
--- a/arch/parisc/kernel/entry.S
+++ b/arch/parisc/kernel/entry.S
@@ -45,7 +45,7 @@
 	.level 2.0
 #endif
 
-	.import         pa_dbit_lock,data
+	.import         pa_pte_lock,data
 
 	/* space_to_prot macro creates a prot id from a space id */
 
@@ -364,32 +364,6 @@
 	.align		32
 	.endm
 
-	/* The following are simple 32 vs 64 bit instruction
-	 * abstractions for the macros */
-	.macro		EXTR	reg1,start,length,reg2
-#ifdef CONFIG_64BIT
-	extrd,u		\reg1,32+(\start),\length,\reg2
-#else
-	extrw,u		\reg1,\start,\length,\reg2
-#endif
-	.endm
-
-	.macro		DEP	reg1,start,length,reg2
-#ifdef CONFIG_64BIT
-	depd		\reg1,32+(\start),\length,\reg2
-#else
-	depw		\reg1,\start,\length,\reg2
-#endif
-	.endm
-
-	.macro		DEPI	val,start,length,reg
-#ifdef CONFIG_64BIT
-	depdi		\val,32+(\start),\length,\reg
-#else
-	depwi		\val,\start,\length,\reg
-#endif
-	.endm
-
 	/* In LP64, the space contains part of the upper 32 bits of the
 	 * fault.  We have to extract this and place it in the va,
 	 * zeroing the corresponding bits in the space register */
@@ -442,19 +416,19 @@
 	 */
 	.macro		L2_ptep	pmd,pte,index,va,fault
 #if PT_NLEVELS == 3
-	EXTR		\va,31-ASM_PMD_SHIFT,ASM_BITS_PER_PMD,\index
+	extru		\va,31-ASM_PMD_SHIFT,ASM_BITS_PER_PMD,\index
 #else
-	EXTR		\va,31-ASM_PGDIR_SHIFT,ASM_BITS_PER_PGD,\index
+	extru		\va,31-ASM_PGDIR_SHIFT,ASM_BITS_PER_PGD,\index
 #endif
-	DEP             %r0,31,PAGE_SHIFT,\pmd  /* clear offset */
+	dep             %r0,31,PAGE_SHIFT,\pmd  /* clear offset */
 	copy		%r0,\pte
 	ldw,s		\index(\pmd),\pmd
 	bb,>=,n		\pmd,_PxD_PRESENT_BIT,\fault
-	DEP		%r0,31,PxD_FLAG_SHIFT,\pmd /* clear flags */
+	dep		%r0,31,PxD_FLAG_SHIFT,\pmd /* clear flags */
 	copy		\pmd,%r9
 	SHLREG		%r9,PxD_VALUE_SHIFT,\pmd
-	EXTR		\va,31-PAGE_SHIFT,ASM_BITS_PER_PTE,\index
-	DEP		%r0,31,PAGE_SHIFT,\pmd  /* clear offset */
+	extru		\va,31-PAGE_SHIFT,ASM_BITS_PER_PTE,\index
+	dep		%r0,31,PAGE_SHIFT,\pmd  /* clear offset */
 	shladd		\index,BITS_PER_PTE_ENTRY,\pmd,\pmd
 	LDREG		%r0(\pmd),\pte		/* pmd is now pte */
 	bb,>=,n		\pte,_PAGE_PRESENT_BIT,\fault
@@ -488,13 +462,44 @@
 	L2_ptep		\pgd,\pte,\index,\va,\fault
 	.endm
 
+	/* SMP lock for consistent PTE updates.  Unlocks and jumps
+	   to FAULT if the page is not present.  Note the preceeding
+	   load of the PTE can't be deleted since we can't fault holding
+	   the lock.  */ 
+	.macro		pte_lock	ptep,pte,spc,tmp,tmp1,fault
+#ifdef CONFIG_SMP
+	cmpib,COND(=),n        0,\spc,2f
+	load32		PA(pa_pte_lock),\tmp1
+1:
+	LDCW		0(\tmp1),\tmp
+	cmpib,COND(=)         0,\tmp,1b
+	nop
+	LDREG		%r0(\ptep),\pte
+	bb,<,n		\pte,_PAGE_PRESENT_BIT,2f
+	ldi             1,\tmp
+	stw             \tmp,0(\tmp1)
+	b,n		\fault
+2:
+#endif
+	.endm
+
+	.macro		pte_unlock	spc,tmp,tmp1
+#ifdef CONFIG_SMP
+	cmpib,COND(=),n        0,\spc,1f
+	ldi             1,\tmp
+	stw             \tmp,0(\tmp1)
+1:
+#endif
+	.endm
+
 	/* Set the _PAGE_ACCESSED bit of the PTE.  Be clever and
 	 * don't needlessly dirty the cache line if it was already set */
-	.macro		update_ptep	ptep,pte,tmp,tmp1
-	ldi		_PAGE_ACCESSED,\tmp1
-	or		\tmp1,\pte,\tmp
-	and,COND(<>)	\tmp1,\pte,%r0
-	STREG		\tmp,0(\ptep)
+	.macro		update_ptep	ptep,pte,tmp
+	bb,<,n		\pte,_PAGE_ACCESSED_BIT,1f
+	ldi		_PAGE_ACCESSED,\tmp
+	or		\tmp,\pte,\pte
+	STREG		\pte,0(\ptep)
+1:
 	.endm
 
 	/* Set the dirty bit (and accessed bit).  No need to be
@@ -605,7 +610,7 @@
 	depdi		0,31,32,\tmp
 #endif
 	copy		\va,\tmp1
-	DEPI		0,31,23,\tmp1
+	depi		0,31,23,\tmp1
 	cmpb,COND(<>),n	\tmp,\tmp1,\fault
 	ldi		(_PAGE_DIRTY|_PAGE_WRITE|_PAGE_READ),\prot
 	depd,z		\prot,8,7,\prot
@@ -622,6 +627,39 @@
 	or		%r26,%r0,\pte
 	.endm 
 
+	/* Save PTE for recheck if SMP.  */
+	.macro		save_pte	pte,tmp
+#ifdef CONFIG_SMP
+	copy		\pte,\tmp
+#endif
+	.endm
+
+	/* Reload the PTE and purge the data TLB entry if the new
+	   value is different from the old one.  */
+	.macro		dtlb_recheck	ptep,old_pte,spc,va,tmp
+#ifdef CONFIG_SMP
+	LDREG		%r0(\ptep),\tmp
+	cmpb,COND(=),n	\old_pte,\tmp,1f
+	mfsp		%sr1,\tmp
+	mtsp		\spc,%sr1
+	pdtlb,l		%r0(%sr1,\va)
+	mtsp		\tmp,%sr1
+1:
+#endif
+	.endm
+
+	.macro		itlb_recheck	ptep,old_pte,spc,va,tmp
+#ifdef CONFIG_SMP
+	LDREG		%r0(\ptep),\tmp
+	cmpb,COND(=),n	\old_pte,\tmp,1f
+	mfsp		%sr1,\tmp
+	mtsp		\spc,%sr1
+	pitlb,l		%r0(%sr1,\va)
+	mtsp		\tmp,%sr1
+1:
+#endif
+	.endm
+
 
 	/*
 	 * Align fault_vector_20 on 4K boundary so that both
@@ -758,6 +796,10 @@ ENTRY(__kernel_thread)
 
 	STREG	%r22, PT_GR22(%r1)	/* save r22 (arg5) */
 	copy	%r0, %r22		/* user_tid */
+	copy	%r0, %r21		/* child_tid */
+#else
+	stw	%r0, -52(%r30)	     	/* user_tid */
+	stw	%r0, -56(%r30)	     	/* child_tid */
 #endif
 	STREG	%r26, PT_GR26(%r1)  /* Store function & argument for child */
 	STREG	%r25, PT_GR25(%r1)
@@ -765,7 +807,7 @@ ENTRY(__kernel_thread)
 	ldo	CLONE_VM(%r26), %r26   /* Force CLONE_VM since only init_mm */
 	or	%r26, %r24, %r26      /* will have kernel mappings.	 */
 	ldi	1, %r25			/* stack_start, signals kernel thread */
-	stw	%r0, -52(%r30)	     	/* user_tid */
+	ldi	0, %r23			/* child_stack_size */
 #ifdef CONFIG_64BIT
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
@@ -972,7 +1014,10 @@ intr_check_sig:
 	BL	do_notify_resume,%r2
 	copy	%r16, %r26			/* struct pt_regs *regs */
 
-	b,n	intr_check_sig
+	mfctl   %cr30,%r16		/* Reload */
+	LDREG	TI_TASK(%r16), %r16	/* thread_info -> task_struct */
+	b	intr_check_sig
+	ldo	TASK_REGS(%r16),%r16
 
 intr_restore:
 	copy            %r16,%r29
@@ -997,13 +1042,6 @@ intr_restore:
 
 	rfi
 	nop
-	nop
-	nop
-	nop
-	nop
-	nop
-	nop
-	nop
 
 #ifndef CONFIG_PREEMPT
 # define intr_do_preempt	intr_restore
@@ -1026,14 +1064,12 @@ intr_do_resched:
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
 
-	ldil	L%intr_check_sig, %r2
-#ifndef CONFIG_64BIT
-	b	schedule
-#else
-	load32	schedule, %r20
-	bv	%r0(%r20)
-#endif
-	ldo	R%intr_check_sig(%r2), %r2
+	BL	schedule,%r2
+	nop
+	mfctl   %cr30,%r16		/* Reload */
+	LDREG	TI_TASK(%r16), %r16	/* thread_info -> task_struct */
+	b	intr_check_sig
+	ldo	TASK_REGS(%r16),%r16
 
 	/* preempt the current task on returning to kernel
 	 * mode from an interrupt, iff need_resched is set,
@@ -1214,11 +1250,14 @@ dtlb_miss_20w:
 
 	L3_ptep		ptp,pte,t0,va,dtlb_check_alias_20w
 
-	update_ptep	ptp,pte,t0,t1
+	pte_lock	ptp,pte,spc,t0,t1,dtlb_check_alias_20w
+	update_ptep	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
-	
 	idtlbt          pte,prot
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1238,11 +1277,10 @@ nadtlb_miss_20w:
 
 	L3_ptep		ptp,pte,t0,va,nadtlb_check_flush_20w
 
-	update_ptep	ptp,pte,t0,t1
-
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
-
 	idtlbt          pte,prot
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1272,8 +1310,11 @@ dtlb_miss_11:
 
 	L2_ptep		ptp,pte,t0,va,dtlb_check_alias_11
 
-	update_ptep	ptp,pte,t0,t1
+	pte_lock	ptp,pte,spc,t0,t1,dtlb_check_alias_11
+	update_ptep	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb_11	spc,pte,prot
 
 	mfsp		%sr1,t0  /* Save sr1 so we can use it in tlb inserts */
@@ -1283,6 +1324,7 @@ dtlb_miss_11:
 	idtlbp		prot,(%sr1,va)
 
 	mtsp		t0, %sr1	/* Restore sr1 */
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1321,11 +1363,9 @@ nadtlb_miss_11:
 
 	L2_ptep		ptp,pte,t0,va,nadtlb_check_flush_11
 
-	update_ptep	ptp,pte,t0,t1
-
+	save_pte	pte,t1
 	make_insert_tlb_11	spc,pte,prot
 
-
 	mfsp		%sr1,t0  /* Save sr1 so we can use it in tlb inserts */
 	mtsp		spc,%sr1
 
@@ -1333,6 +1373,7 @@ nadtlb_miss_11:
 	idtlbp		prot,(%sr1,va)
 
 	mtsp		t0, %sr1	/* Restore sr1 */
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1368,13 +1409,17 @@ dtlb_miss_20:
 
 	L2_ptep		ptp,pte,t0,va,dtlb_check_alias_20
 
-	update_ptep	ptp,pte,t0,t1
+	pte_lock	ptp,pte,spc,t0,t1,dtlb_check_alias_20
+	update_ptep	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
 
 	f_extend	pte,t0
 
 	idtlbt          pte,prot
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1394,13 +1439,13 @@ nadtlb_miss_20:
 
 	L2_ptep		ptp,pte,t0,va,nadtlb_check_flush_20
 
-	update_ptep	ptp,pte,t0,t1
-
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
 
 	f_extend	pte,t0
 	
         idtlbt          pte,prot
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1508,11 +1553,14 @@ itlb_miss_20w:
 
 	L3_ptep		ptp,pte,t0,va,itlb_fault
 
-	update_ptep	ptp,pte,t0,t1
+	pte_lock	ptp,pte,spc,t0,t1,itlb_fault
+	update_ptep	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
-	
 	iitlbt          pte,prot
+	itlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1526,8 +1574,11 @@ itlb_miss_11:
 
 	L2_ptep		ptp,pte,t0,va,itlb_fault
 
-	update_ptep	ptp,pte,t0,t1
+	pte_lock	ptp,pte,spc,t0,t1,itlb_fault
+	update_ptep	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb_11	spc,pte,prot
 
 	mfsp		%sr1,t0  /* Save sr1 so we can use it in tlb inserts */
@@ -1537,6 +1588,7 @@ itlb_miss_11:
 	iitlbp		prot,(%sr1,va)
 
 	mtsp		t0, %sr1	/* Restore sr1 */
+	itlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1548,13 +1600,17 @@ itlb_miss_20:
 
 	L2_ptep		ptp,pte,t0,va,itlb_fault
 
-	update_ptep	ptp,pte,t0,t1
+	pte_lock	ptp,pte,spc,t0,t1,itlb_fault
+	update_ptep	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
 
 	f_extend	pte,t0	
 
 	iitlbt          pte,prot
+	itlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1570,29 +1626,14 @@ dbit_trap_20w:
 
 	L3_ptep		ptp,pte,t0,va,dbit_fault
 
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nolock_20w
-	load32		PA(pa_dbit_lock),t0
-
-dbit_spin_20w:
-	LDCW		0(t0),t1
-	cmpib,COND(=)         0,t1,dbit_spin_20w
-	nop
-
-dbit_nolock_20w:
-#endif
-	update_dirty	ptp,pte,t1
+	pte_lock	ptp,pte,spc,t0,t1,dbit_fault
+	update_dirty	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
-		
 	idtlbt          pte,prot
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nounlock_20w
-	ldi             1,t1
-	stw             t1,0(t0)
-
-dbit_nounlock_20w:
-#endif
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1606,35 +1647,21 @@ dbit_trap_11:
 
 	L2_ptep		ptp,pte,t0,va,dbit_fault
 
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nolock_11
-	load32		PA(pa_dbit_lock),t0
-
-dbit_spin_11:
-	LDCW		0(t0),t1
-	cmpib,=         0,t1,dbit_spin_11
-	nop
-
-dbit_nolock_11:
-#endif
-	update_dirty	ptp,pte,t1
+	pte_lock	ptp,pte,spc,t0,t1,dbit_fault
+	update_dirty	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb_11	spc,pte,prot
 
-	mfsp            %sr1,t1  /* Save sr1 so we can use it in tlb inserts */
+	mfsp            %sr1,t0  /* Save sr1 so we can use it in tlb inserts */
 	mtsp		spc,%sr1
 
 	idtlba		pte,(%sr1,va)
 	idtlbp		prot,(%sr1,va)
 
-	mtsp            t1, %sr1     /* Restore sr1 */
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nounlock_11
-	ldi             1,t1
-	stw             t1,0(t0)
-
-dbit_nounlock_11:
-#endif
+	mtsp            t0, %sr1     /* Restore sr1 */
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1646,32 +1673,17 @@ dbit_trap_20:
 
 	L2_ptep		ptp,pte,t0,va,dbit_fault
 
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nolock_20
-	load32		PA(pa_dbit_lock),t0
-
-dbit_spin_20:
-	LDCW		0(t0),t1
-	cmpib,=         0,t1,dbit_spin_20
-	nop
-
-dbit_nolock_20:
-#endif
-	update_dirty	ptp,pte,t1
+	pte_lock	ptp,pte,spc,t0,t1,dbit_fault
+	update_dirty	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
 
-	f_extend	pte,t1
+	f_extend	pte,t0
 	
         idtlbt          pte,prot
-
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nounlock_20
-	ldi             1,t1
-	stw             t1,0(t0)
-
-dbit_nounlock_20:
-#endif
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1772,9 +1784,9 @@ ENTRY(sys_fork_wrapper)
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
 
-	/* These are call-clobbered registers and therefore
-	   also syscall-clobbered (we hope). */
-	STREG	%r2,PT_GR19(%r1)	/* save for child */
+	STREG	%r2,PT_SYSCALL_RP(%r1)
+
+	/* WARNING - Clobbers r21, userspace must save! */
 	STREG	%r30,PT_GR21(%r1)
 
 	LDREG	PT_GR30(%r1),%r25
@@ -1804,7 +1816,7 @@ ENTRY(child_return)
 	nop
 
 	LDREG	TI_TASK-THREAD_SZ_ALGN-FRAME_SIZE-FRAME_SIZE(%r30), %r1
-	LDREG	TASK_PT_GR19(%r1),%r2
+	LDREG	TASK_PT_SYSCALL_RP(%r1),%r2
 	b	wrapper_exit
 	copy	%r0,%r28
 ENDPROC(child_return)
@@ -1823,8 +1835,9 @@ ENTRY(sys_clone_wrapper)
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
 
-	/* WARNING - Clobbers r19 and r21, userspace must save these! */
-	STREG	%r2,PT_GR19(%r1)	/* save for child */
+	STREG	%r2,PT_SYSCALL_RP(%r1)
+
+	/* WARNING - Clobbers r21, userspace must save! */
 	STREG	%r30,PT_GR21(%r1)
 	BL	sys_clone,%r2
 	copy	%r1,%r24
@@ -1847,7 +1860,9 @@ ENTRY(sys_vfork_wrapper)
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
 
-	STREG	%r2,PT_GR19(%r1)	/* save for child */
+	STREG	%r2,PT_SYSCALL_RP(%r1)
+
+	/* WARNING - Clobbers r21, userspace must save! */
 	STREG	%r30,PT_GR21(%r1)
 
 	BL	sys_vfork,%r2
@@ -2076,9 +2091,10 @@ syscall_restore:
 	LDREG	TASK_PT_GR31(%r1),%r31	   /* restore syscall rp */
 
 	/* NOTE: We use rsm/ssm pair to make this operation atomic */
+	LDREG   TASK_PT_GR30(%r1),%r1              /* Get user sp */
 	rsm     PSW_SM_I, %r0
-	LDREG   TASK_PT_GR30(%r1),%r30             /* restore user sp */
-	mfsp	%sr3,%r1			   /* Get users space id */
+	copy    %r1,%r30                           /* Restore user sp */
+	mfsp    %sr3,%r1                           /* Get user space id */
 	mtsp    %r1,%sr7                           /* Restore sr7 */
 	ssm     PSW_SM_I, %r0
 
diff --git a/arch/parisc/kernel/pacache.S b/arch/parisc/kernel/pacache.S
index 09b77b2..4f0d975 100644
--- a/arch/parisc/kernel/pacache.S
+++ b/arch/parisc/kernel/pacache.S
@@ -277,6 +277,7 @@ ENDPROC(flush_data_cache_local)
 
 	.align	16
 
+#if 1
 ENTRY(copy_user_page_asm)
 	.proc
 	.callinfo NO_CALLS
@@ -400,6 +401,7 @@ ENTRY(copy_user_page_asm)
 
 	.procend
 ENDPROC(copy_user_page_asm)
+#endif
 
 /*
  * NOTE: Code in clear_user_page has a hard coded dependency on the
@@ -548,17 +550,33 @@ ENTRY(__clear_user_page_asm)
 	depwi		0, 31,12, %r28		/* Clear any offset bits */
 #endif
 
+#ifdef CONFIG_SMP
+	ldil		L%pa_tlb_lock, %r1
+	ldo		R%pa_tlb_lock(%r1), %r24
+	rsm		PSW_SM_I, %r22
+1:
+	LDCW		0(%r24),%r25
+	cmpib,COND(=)	0,%r25,1b
+	nop
+#endif
+
 	/* Purge any old translation */
 
 	pdtlb		0(%r28)
 
+#ifdef CONFIG_SMP
+	ldi		1,%r25
+	stw		%r25,0(%r24)
+	mtsm		%r22
+#endif
+
 #ifdef CONFIG_64BIT
 	ldi		(PAGE_SIZE / 128), %r1
 
 	/* PREFETCH (Write) has not (yet) been proven to help here */
 	/* #define	PREFETCHW_OP	ldd		256(%0), %r0 */
 
-1:	std		%r0, 0(%r28)
+2:	std		%r0, 0(%r28)
 	std		%r0, 8(%r28)
 	std		%r0, 16(%r28)
 	std		%r0, 24(%r28)
@@ -574,13 +592,13 @@ ENTRY(__clear_user_page_asm)
 	std		%r0, 104(%r28)
 	std		%r0, 112(%r28)
 	std		%r0, 120(%r28)
-	addib,COND(>)		-1, %r1, 1b
+	addib,COND(>)		-1, %r1, 2b
 	ldo		128(%r28), %r28
 
 #else	/* ! CONFIG_64BIT */
 	ldi		(PAGE_SIZE / 64), %r1
 
-1:
+2:
 	stw		%r0, 0(%r28)
 	stw		%r0, 4(%r28)
 	stw		%r0, 8(%r28)
@@ -597,7 +615,7 @@ ENTRY(__clear_user_page_asm)
 	stw		%r0, 52(%r28)
 	stw		%r0, 56(%r28)
 	stw		%r0, 60(%r28)
-	addib,COND(>)		-1, %r1, 1b
+	addib,COND(>)		-1, %r1, 2b
 	ldo		64(%r28), %r28
 #endif	/* CONFIG_64BIT */
 
diff --git a/arch/parisc/kernel/setup.c b/arch/parisc/kernel/setup.c
index cb71f3d..84b3239 100644
--- a/arch/parisc/kernel/setup.c
+++ b/arch/parisc/kernel/setup.c
@@ -128,6 +128,14 @@ void __init setup_arch(char **cmdline_p)
 	printk(KERN_INFO "The 32-bit Kernel has started...\n");
 #endif
 
+	/* Consistency check on the size and alignments of our spinlocks */
+#ifdef CONFIG_SMP
+	BUILD_BUG_ON(sizeof(arch_spinlock_t) != __PA_LDCW_ALIGNMENT);
+	BUG_ON((unsigned long)&__atomic_hash[0] & (__PA_LDCW_ALIGNMENT-1));
+	BUG_ON((unsigned long)&__atomic_hash[1] & (__PA_LDCW_ALIGNMENT-1));
+#endif
+	BUILD_BUG_ON((1<<L1_CACHE_SHIFT) != L1_CACHE_BYTES);
+
 	pdc_console_init();
 
 #ifdef CONFIG_64BIT
diff --git a/arch/parisc/kernel/syscall.S b/arch/parisc/kernel/syscall.S
index f5f9602..68e75ce 100644
--- a/arch/parisc/kernel/syscall.S
+++ b/arch/parisc/kernel/syscall.S
@@ -47,18 +47,17 @@ ENTRY(linux_gateway_page)
 	KILL_INSN
 	.endr
 
-	/* ADDRESS 0xb0 to 0xb4, lws uses 1 insns for entry */
+	/* ADDRESS 0xb0 to 0xb8, lws uses two insns for entry */
 	/* Light-weight-syscall entry must always be located at 0xb0 */
 	/* WARNING: Keep this number updated with table size changes */
 #define __NR_lws_entries (2)
 
 lws_entry:
-	/* Unconditional branch to lws_start, located on the 
-	   same gateway page */
-	b,n	lws_start
+	gate	lws_start, %r0		/* increase privilege */
+	depi	3, 31, 2, %r31		/* Ensure we return into user mode. */
 
-	/* Fill from 0xb4 to 0xe0 */
-	.rept 11
+	/* Fill from 0xb8 to 0xe0 */
+	.rept 10
 	KILL_INSN
 	.endr
 
@@ -423,9 +422,6 @@ tracesys_sigexit:
 
 	*********************************************************/
 lws_start:
-	/* Gate and ensure we return to userspace */
-	gate	.+8, %r0
-	depi	3, 31, 2, %r31	/* Ensure we return to userspace */
 
 #ifdef CONFIG_64BIT
 	/* FIXME: If we are a 64-bit kernel just
@@ -442,7 +438,7 @@ lws_start:
 #endif	
 
         /* Is the lws entry number valid? */
-	comiclr,>>=	__NR_lws_entries, %r20, %r0
+	comiclr,>>	__NR_lws_entries, %r20, %r0
 	b,n	lws_exit_nosys
 
 	/* WARNING: Trashing sr2 and sr3 */
@@ -473,7 +469,7 @@ lws_exit:
 	/* now reset the lowest bit of sp if it was set */
 	xor	%r30,%r1,%r30
 #endif
-	be,n	0(%sr3, %r31)
+	be,n	0(%sr7, %r31)
 
 
 	
@@ -529,7 +525,6 @@ lws_compare_and_swap32:
 #endif
 
 lws_compare_and_swap:
-#ifdef CONFIG_SMP
 	/* Load start of lock table */
 	ldil	L%lws_lock_start, %r20
 	ldo	R%lws_lock_start(%r20), %r28
@@ -572,8 +567,6 @@ cas_wouldblock:
 	ldo	2(%r0), %r28				/* 2nd case */
 	b	lws_exit				/* Contended... */
 	ldo	-EAGAIN(%r0), %r21			/* Spin in userspace */
-#endif
-/* CONFIG_SMP */
 
 	/*
 		prev = *addr;
@@ -601,13 +594,11 @@ cas_action:
 1:	ldw	0(%sr3,%r26), %r28
 	sub,<>	%r28, %r25, %r0
 2:	stw	%r24, 0(%sr3,%r26)
-#ifdef CONFIG_SMP
 	/* Free lock */
 	stw	%r20, 0(%sr2,%r20)
-# if ENABLE_LWS_DEBUG
+#if ENABLE_LWS_DEBUG
 	/* Clear thread register indicator */
 	stw	%r0, 4(%sr2,%r20)
-# endif
 #endif
 	/* Return to userspace, set no error */
 	b	lws_exit
@@ -615,12 +606,10 @@ cas_action:
 
 3:		
 	/* Error occured on load or store */
-#ifdef CONFIG_SMP
 	/* Free lock */
 	stw	%r20, 0(%sr2,%r20)
-# if ENABLE_LWS_DEBUG
+#if ENABLE_LWS_DEBUG
 	stw	%r0, 4(%sr2,%r20)
-# endif
 #endif
 	b	lws_exit
 	ldo	-EFAULT(%r0),%r21	/* set errno */
@@ -672,7 +661,6 @@ ENTRY(sys_call_table64)
 END(sys_call_table64)
 #endif
 
-#ifdef CONFIG_SMP
 	/*
 		All light-weight-syscall atomic operations 
 		will use this set of locks 
@@ -694,8 +682,6 @@ ENTRY(lws_lock_start)
 	.endr
 END(lws_lock_start)
 	.previous
-#endif
-/* CONFIG_SMP for lws_lock_start */
 
 .end
 
diff --git a/arch/parisc/kernel/traps.c b/arch/parisc/kernel/traps.c
index 8b58bf0..804b024 100644
--- a/arch/parisc/kernel/traps.c
+++ b/arch/parisc/kernel/traps.c
@@ -47,7 +47,7 @@
 			  /*  dumped to the console via printk)          */
 
 #if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
-DEFINE_SPINLOCK(pa_dbit_lock);
+DEFINE_SPINLOCK(pa_pte_lock);
 #endif
 
 static void parisc_show_stack(struct task_struct *task, unsigned long *sp,
diff --git a/arch/parisc/lib/bitops.c b/arch/parisc/lib/bitops.c
index 353963d..bae6a86 100644
--- a/arch/parisc/lib/bitops.c
+++ b/arch/parisc/lib/bitops.c
@@ -15,6 +15,9 @@
 arch_spinlock_t __atomic_hash[ATOMIC_HASH_SIZE] __lock_aligned = {
 	[0 ... (ATOMIC_HASH_SIZE-1)]  = __ARCH_SPIN_LOCK_UNLOCKED
 };
+arch_spinlock_t __atomic_user_hash[ATOMIC_HASH_SIZE] __lock_aligned = {
+	[0 ... (ATOMIC_HASH_SIZE-1)]  = __ARCH_SPIN_LOCK_UNLOCKED
+};
 #endif
 
 #ifdef CONFIG_64BIT
diff --git a/arch/parisc/math-emu/decode_exc.c b/arch/parisc/math-emu/decode_exc.c
index 3ca1c61..27a7492 100644
--- a/arch/parisc/math-emu/decode_exc.c
+++ b/arch/parisc/math-emu/decode_exc.c
@@ -342,6 +342,7 @@ decode_fpu(unsigned int Fpu_register[], unsigned int trap_counts[])
 		return SIGNALCODE(SIGFPE, FPE_FLTINV);
 	  case DIVISIONBYZEROEXCEPTION:
 		update_trap_counts(Fpu_register, aflags, bflags, trap_counts);
+		Clear_excp_register(exception_index);
 	  	return SIGNALCODE(SIGFPE, FPE_FLTDIV);
 	  case INEXACTEXCEPTION:
 		update_trap_counts(Fpu_register, aflags, bflags, trap_counts);
diff --git a/mm/memory.c b/mm/memory.c
index 09e4b1b..21c2916 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -616,7 +616,7 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	 * in the parent and the child
 	 */
 	if (is_cow_mapping(vm_flags)) {
-		ptep_set_wrprotect(src_mm, addr, src_pte);
+		ptep_set_wrprotect(vma, src_mm, addr, src_pte);
 		pte = pte_wrprotect(pte);
 	}
 


* Re: threads and fork on machine with VIPT-WB cache
  2010-05-15 21:02                                                           ` John David Anglin
@ 2010-05-16 20:22                                                             ` Helge Deller
  2010-05-16 21:38                                                               ` John David Anglin
  2010-05-22 17:25                                                               ` John David Anglin
  2010-05-23 13:11                                                             ` Carlos O'Donell
  1 sibling, 2 replies; 74+ messages in thread
From: Helge Deller @ 2010-05-16 20:22 UTC (permalink / raw)
  To: John David Anglin; +Cc: John David Anglin, carlos, gniibe, linux-parisc

On 05/15/2010 11:02 PM, John David Anglin wrote:
> On Tue, 11 May 2010, Helge Deller wrote:
> 
>> On 05/11/2010 11:26 PM, John David Anglin wrote:
>>>> On 05/10/2010 09:20 PM, Helge Deller wrote:
>>>>> On 05/10/2010 04:56 PM, John David Anglin wrote:
>>>>>> Yes, just after sending, I noticed gcc testsuite and minifail were
>>>>>> broken on gsyprf11.  [...]
>>>>>>
>>>>>> The attached works better on gsyprf11.  I haven't tested it on anything
>>>>>> else.
>>>>
>>>> Hi Dave,
>>>>
>>>> I still can see segfaults of the minifail_dave.cpp program.
> 
> I haven't seen any abnormal segfaults with the attached change on
> my rp3440 and gsyprf11.  Need to flush in pte_wrprotect for minifail,
> and to flush in copy_user_page to fix occasional segfaults in sh.
> The flushes in kmap/kunmap weren't sufficient.

I wish I could report success as well....
But I still do see segfaults (although they seem to happen not as often as before).
Which kernel do you test? I'm on 2.6.33.3 + your patches.

Helge


* Re: threads and fork on machine with VIPT-WB cache
  2010-05-16 20:22                                                             ` Helge Deller
@ 2010-05-16 21:38                                                               ` John David Anglin
  2010-05-22 17:25                                                               ` John David Anglin
  1 sibling, 0 replies; 74+ messages in thread
From: John David Anglin @ 2010-05-16 21:38 UTC (permalink / raw)
  To: Helge Deller; +Cc: John David Anglin, carlos, gniibe, linux-parisc

On Sun, 16 May 2010, Helge Deller wrote:

> I wish I could report success as well....
> But I still do see segfaults (although they seem to happen not as often as before).

Thanks again for testing.

I'm convinced that we have various race conditions in doing cache
user and kernel flushes that are not easily fixed.  For example,
in pte_wrprotect, it is difficult to ensure that the cache is clean
in an SMP environment.  I think there are similar problems with
copy_user_page.
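
To make the worry concrete, here is one interleaving I have in mind
(illustration only, assuming the flush-in-wrprotect approach from the
earlier patch; nothing below is verbatim from it):

/*
 *   CPU0 (thread doing the fork)          CPU1 (sibling thread)
 *   ----------------------------          ---------------------
 *   ptep_set_wrprotect()
 *     *ptep = pte_wrprotect(old_pte)
 *     __flush_tlb_page(mm, addr)
 *     flush_cache_page(vma, addr, pfn)
 *                                         store to the same page
 *                                         (writable state not yet gone
 *                                          on CPU1, line dirtied again)
 *   copy_one_pte() carries on assuming
 *   memory now holds the latest data,
 *   but it is stale again.
 */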

At the moment, I'm fairly pessimistic about finding a solution with
our current implementation of copy_user_page.  I've done more than a
hundred kernel builds, and while things are better, there are still
problems that look like cache corruption.  Often, it takes more than
a day for problems to appear.

> Which kernel do you test? I'm on 2.6.33.3 + your patches.

I'm on to 2.6.33.4.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)


* Re: threads and fork on machine with VIPT-WB cache
  2010-05-16 20:22                                                             ` Helge Deller
  2010-05-16 21:38                                                               ` John David Anglin
@ 2010-05-22 17:25                                                               ` John David Anglin
  1 sibling, 0 replies; 74+ messages in thread
From: John David Anglin @ 2010-05-22 17:25 UTC (permalink / raw)
  To: Helge Deller; +Cc: John David Anglin, carlos, gniibe, linux-parisc

On Sun, 16 May 2010, Helge Deller wrote:

> I wish I could report success as well....
> But I still do see segfaults (although they seem to happen not as often as before).

For reference, see this discussion:
http://readlist.com/lists/vger.kernel.org/linux-kernel/54/270861.html

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)


* Re: threads and fork on machine with VIPT-WB cache
  2010-05-15 21:02                                                           ` John David Anglin
  2010-05-16 20:22                                                             ` Helge Deller
@ 2010-05-23 13:11                                                             ` Carlos O'Donell
  2010-05-23 14:43                                                               ` John David Anglin
  1 sibling, 1 reply; 74+ messages in thread
From: Carlos O'Donell @ 2010-05-23 13:11 UTC (permalink / raw)
  To: John David Anglin; +Cc: Helge Deller, linux-parisc

On Sat, May 15, 2010 at 5:02 PM, John David Anglin
<dave@hiauly1.hia.nrc.ca> wrote:
> I haven't seen any abnormal segfaults with the attached change on
> my rp3440 and gsyprf11.  Need to flush in pte_wrprotect for minifail,
> and to flush in copy_user_page to fix occasional segfaults in sh.
> The flushes in kmap/kunmap weren't sufficient.
>
> Still experimenting to see if number of flushes can be reduced, etc.

Applying your patch to 2.6.33, I had 2 hunk rejects.

I fixed up the rejects, but I was curious, what is your patch based on?

Cheers,
Carlos.
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: threads and fork on machine with VIPT-WB cache
  2010-05-23 13:11                                                             ` Carlos O'Donell
@ 2010-05-23 14:43                                                               ` John David Anglin
  0 siblings, 0 replies; 74+ messages in thread
From: John David Anglin @ 2010-05-23 14:43 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: dave.anglin, deller, linux-parisc

> Applying your patch to 2.6.33, I had 2 hunk rejects.

Where were the rejects?  I don't think anything that I have changed
has changed since 2.6.33 was released.

> I fixed up the rejects, but I was curious, what is your patch based on?

git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.33.y.git

Included below is my current diff.  I have reworked pacache.S and cache.c
to make it easier to test various alternatives.  I added 64-bit support
to copy_user_page_asm and an implementation of clear_page_asm.  Routine
names have been revamped to distinguish implementations using the tmp
alias region.

I haven't tested all permutations, but I don't have a stable fix.

I think we need to do something similar to mips.  See their implementation
of kmap_coherent, kunmap_coherent, copy_user_highpage, copy_to_user_page,
copy_from_user_page.  Currently, our implementations of copy_user_page,
copy_to_user_page and copy_from_user_page all use non equivalent aliasing.
<http://readlist.com/lists/vger.kernel.org/linux-kernel/54/271417.html>
discusses why this is a ispecial problem on PA8800.

I like the mips approach in that the pte is set up in kmap_coherent.
This avoids doing anything special in the tlb handler.  However, the
downside may be that our tmp alias region is quite large, and we
may need multiple regions for each cpu.

Possibly, the simplest thing to try is to implement copy_to_user_page
and copy_from_user_page using the tmp alias region.
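
Something along these lines is what I have in mind -- a sketch only,
not tested, and kmap_coherent()/kunmap_coherent() are assumed helpers
modelled on the mips ones (they map the page at a kernel alias
congruent with vaddr and flush/purge it on unmap); we don't have them
on parisc yet:

void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
		       unsigned long vaddr, void *dst, void *src, int len)
{
	unsigned long offset = (unsigned long)dst & (PAGE_SIZE - 1);
	void *kto = kmap_coherent(page, vaddr);	/* congruent alias */

	/* the write goes through the congruent alias, so no
	   non-equivalent aliasing is left behind */
	memcpy(kto + offset, src, len);
	kunmap_coherent(kto);
	/* an executable mapping would still need an icache flush,
	   left out of this sketch */
}

void copy_from_user_page(struct vm_area_struct *vma, struct page *page,
			 unsigned long vaddr, void *dst, void *src, int len)
{
	unsigned long offset = (unsigned long)src & (PAGE_SIZE - 1);
	void *kfrom = kmap_coherent(page, vaddr);

	memcpy(dst, kfrom + offset, len);
	kunmap_coherent(kfrom);
}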

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

diff --git a/arch/parisc/hpux/wrappers.S b/arch/parisc/hpux/wrappers.S
index 58c53c8..bdcea33 100644
--- a/arch/parisc/hpux/wrappers.S
+++ b/arch/parisc/hpux/wrappers.S
@@ -88,7 +88,7 @@ ENTRY(hpux_fork_wrapper)
 
 	STREG	%r2,-20(%r30)
 	ldo	64(%r30),%r30
-	STREG	%r2,PT_GR19(%r1)	;! save for child
+	STREG	%r2,PT_SYSCALL_RP(%r1)	;! save for child
 	STREG	%r30,PT_GR21(%r1)	;! save for child
 
 	LDREG	PT_GR30(%r1),%r25
@@ -132,7 +132,7 @@ ENTRY(hpux_child_return)
 	bl,n	schedule_tail, %r2
 #endif
 
-	LDREG	TASK_PT_GR19-TASK_SZ_ALGN-128(%r30),%r2
+	LDREG	TASK_PT_SYSCALL_RP-TASK_SZ_ALGN-128(%r30),%r2
 	b fork_return
 	copy %r0,%r28
 ENDPROC(hpux_child_return)
diff --git a/arch/parisc/include/asm/atomic.h b/arch/parisc/include/asm/atomic.h
index 716634d..ad7df44 100644
--- a/arch/parisc/include/asm/atomic.h
+++ b/arch/parisc/include/asm/atomic.h
@@ -24,29 +24,46 @@
  * Hash function to index into a different SPINLOCK.
  * Since "a" is usually an address, use one spinlock per cacheline.
  */
-#  define ATOMIC_HASH_SIZE 4
-#  define ATOMIC_HASH(a) (&(__atomic_hash[ (((unsigned long) (a))/L1_CACHE_BYTES) & (ATOMIC_HASH_SIZE-1) ]))
+#  define ATOMIC_HASH_SIZE (4096/L1_CACHE_BYTES)  /* 4 */
+#  define ATOMIC_HASH(a)      (&(__atomic_hash[ (((unsigned long) (a))/L1_CACHE_BYTES) & (ATOMIC_HASH_SIZE-1) ]))
+#  define ATOMIC_USER_HASH(a) (&(__atomic_user_hash[ (((unsigned long) (a))/L1_CACHE_BYTES) & (ATOMIC_HASH_SIZE-1) ]))
 
 extern arch_spinlock_t __atomic_hash[ATOMIC_HASH_SIZE] __lock_aligned;
+extern arch_spinlock_t __atomic_user_hash[ATOMIC_HASH_SIZE] __lock_aligned;
 
 /* Can't use raw_spin_lock_irq because of #include problems, so
  * this is the substitute */
-#define _atomic_spin_lock_irqsave(l,f) do {	\
-	arch_spinlock_t *s = ATOMIC_HASH(l);		\
+#define _atomic_spin_lock_irqsave_template(l,f,hash_func) do {	\
+	arch_spinlock_t *s = hash_func;		\
 	local_irq_save(f);			\
 	arch_spin_lock(s);			\
 } while(0)
 
-#define _atomic_spin_unlock_irqrestore(l,f) do {	\
-	arch_spinlock_t *s = ATOMIC_HASH(l);			\
+#define _atomic_spin_unlock_irqrestore_template(l,f,hash_func) do {	\
+	arch_spinlock_t *s = hash_func;			\
 	arch_spin_unlock(s);				\
 	local_irq_restore(f);				\
 } while(0)
 
+/* kernel memory locks */
+#define _atomic_spin_lock_irqsave(l,f)	\
+	_atomic_spin_lock_irqsave_template(l,f,ATOMIC_HASH(l))
+
+#define _atomic_spin_unlock_irqrestore(l,f)	\
+	_atomic_spin_unlock_irqrestore_template(l,f,ATOMIC_HASH(l))
+
+/* userspace memory locks */
+#define _atomic_spin_lock_irqsave_user(l,f)	\
+	_atomic_spin_lock_irqsave_template(l,f,ATOMIC_USER_HASH(l))
+
+#define _atomic_spin_unlock_irqrestore_user(l,f)	\
+	_atomic_spin_unlock_irqrestore_template(l,f,ATOMIC_USER_HASH(l))
 
 #else
 #  define _atomic_spin_lock_irqsave(l,f) do { local_irq_save(f); } while (0)
 #  define _atomic_spin_unlock_irqrestore(l,f) do { local_irq_restore(f); } while (0)
+#  define _atomic_spin_lock_irqsave_user(l,f) _atomic_spin_lock_irqsave(l,f)
+#  define _atomic_spin_unlock_irqrestore_user(l,f) _atomic_spin_unlock_irqrestore(l,f)
 #endif
 
 /* This should get optimized out since it's never called.
diff --git a/arch/parisc/include/asm/cacheflush.h b/arch/parisc/include/asm/cacheflush.h
index 7a73b61..b90c895 100644
--- a/arch/parisc/include/asm/cacheflush.h
+++ b/arch/parisc/include/asm/cacheflush.h
@@ -2,6 +2,7 @@
 #define _PARISC_CACHEFLUSH_H
 
 #include <linux/mm.h>
+#include <linux/uaccess.h>
 
 /* The usual comment is "Caches aren't brain-dead on the <architecture>".
  * Unfortunately, that doesn't apply to PA-RISC. */
@@ -104,21 +105,32 @@ void mark_rodata_ro(void);
 #define ARCH_HAS_KMAP
 
 void kunmap_parisc(void *addr);
+void *kmap_parisc(struct page *page);
 
 static inline void *kmap(struct page *page)
 {
 	might_sleep();
-	return page_address(page);
+	return kmap_parisc(page);
 }
 
 #define kunmap(page)			kunmap_parisc(page_address(page))
 
-#define kmap_atomic(page, idx)		page_address(page)
+static inline void *kmap_atomic(struct page *page, enum km_type idx)
+{
+	pagefault_disable();
+	return kmap_parisc(page);
+}
 
-#define kunmap_atomic(addr, idx)	kunmap_parisc(addr)
+static inline void kunmap_atomic(void *addr, enum km_type idx)
+{
+	kunmap_parisc(addr);
+	pagefault_enable();
+}
 
-#define kmap_atomic_pfn(pfn, idx)	page_address(pfn_to_page(pfn))
-#define kmap_atomic_to_page(ptr)	virt_to_page(ptr)
+#define kmap_atomic_prot(page, idx, prot)	kmap_atomic(page, idx)
+#define kmap_atomic_pfn(pfn, idx)	kmap_atomic(pfn_to_page(pfn), (idx))
+#define kmap_atomic_to_page(ptr)	virt_to_page(kmap_atomic(virt_to_page(ptr), (enum km_type) 0))
+#define kmap_flush_unused()	do {} while(0)
 #endif
 
 #endif /* _PARISC_CACHEFLUSH_H */
diff --git a/arch/parisc/include/asm/futex.h b/arch/parisc/include/asm/futex.h
index 0c705c3..7bc963e 100644
--- a/arch/parisc/include/asm/futex.h
+++ b/arch/parisc/include/asm/futex.h
@@ -55,6 +55,7 @@ futex_atomic_cmpxchg_inatomic(int __user *uaddr, int oldval, int newval)
 {
 	int err = 0;
 	int uval;
+	unsigned long flags;
 
 	/* futex.c wants to do a cmpxchg_inatomic on kernel NULL, which is
 	 * our gateway page, and causes no end of trouble...
@@ -65,10 +66,15 @@ futex_atomic_cmpxchg_inatomic(int __user *uaddr, int oldval, int newval)
 	if (!access_ok(VERIFY_WRITE, uaddr, sizeof(int)))
 		return -EFAULT;
 
+	_atomic_spin_lock_irqsave_user(uaddr, flags);
+
 	err = get_user(uval, uaddr);
-	if (err) return -EFAULT;
-	if (uval == oldval)
-		err = put_user(newval, uaddr);
+	if (!err)
+		if (uval == oldval)
+			err = put_user(newval, uaddr);
+
+	_atomic_spin_unlock_irqrestore_user(uaddr, flags);
+
 	if (err) return -EFAULT;
 	return uval;
 }
diff --git a/arch/parisc/include/asm/page.h b/arch/parisc/include/asm/page.h
index a84cc1f..cca0f53 100644
--- a/arch/parisc/include/asm/page.h
+++ b/arch/parisc/include/asm/page.h
@@ -21,15 +21,18 @@
 #include <asm/types.h>
 #include <asm/cache.h>
 
-#define clear_page(page)	memset((void *)(page), 0, PAGE_SIZE)
-#define copy_page(to,from)      copy_user_page_asm((void *)(to), (void *)(from))
+#define clear_page(page)	clear_page_asm((void *)(page))
+#define copy_page(to,from)      copy_page_asm((void *)(to), (void *)(from))
 
 struct page;
 
-void copy_user_page_asm(void *to, void *from);
-void copy_user_page(void *vto, void *vfrom, unsigned long vaddr,
+extern void copy_page_asm(void *to, void *from);
+extern void clear_page_asm(void *page);
+extern void copy_user_page_asm(void *to, void *from, unsigned long vaddr);
+extern void clear_user_page_asm(void *page, unsigned long vaddr);
+extern void copy_user_page(void *vto, void *vfrom, unsigned long vaddr,
 			   struct page *pg);
-void clear_user_page(void *page, unsigned long vaddr, struct page *pg);
+extern void clear_user_page(void *page, unsigned long vaddr, struct page *pg);
 
 /*
  * These are used to make use of C type-checking..
diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h
index a27d2e2..8050948 100644
--- a/arch/parisc/include/asm/pgtable.h
+++ b/arch/parisc/include/asm/pgtable.h
@@ -14,6 +14,7 @@
 #include <linux/bitops.h>
 #include <asm/processor.h>
 #include <asm/cache.h>
+#include <linux/uaccess.h>
 
 /*
  * kern_addr_valid(ADDR) tests if ADDR is pointing to valid kernel
@@ -30,15 +31,21 @@
  */
 #define kern_addr_valid(addr)	(1)
 
+extern spinlock_t pa_pte_lock;
+extern spinlock_t pa_tlb_lock;
+
 /* Certain architectures need to do special things when PTEs
  * within a page table are directly modified.  Thus, the following
  * hook is made available.
  */
-#define set_pte(pteptr, pteval)                                 \
-        do{                                                     \
+#define set_pte(pteptr, pteval)					\
+        do {							\
+		unsigned long flags;				\
+		spin_lock_irqsave(&pa_pte_lock, flags);		\
                 *(pteptr) = (pteval);                           \
+		spin_unlock_irqrestore(&pa_pte_lock, flags);	\
         } while(0)
-#define set_pte_at(mm,addr,ptep,pteval) set_pte(ptep,pteval)
+#define set_pte_at(mm,addr,ptep,pteval)	set_pte(ptep, pteval)
 
 #endif /* !__ASSEMBLY__ */
 
@@ -262,6 +269,7 @@ extern unsigned long *empty_zero_page;
 #define pte_none(x)     ((pte_val(x) == 0) || (pte_val(x) & _PAGE_FLUSH))
 #define pte_present(x)	(pte_val(x) & _PAGE_PRESENT)
 #define pte_clear(mm,addr,xp)	do { pte_val(*(xp)) = 0; } while (0)
+#define pte_same(A,B)	(pte_val(A) == pte_val(B))
 
 #define pmd_flag(x)	(pmd_val(x) & PxD_FLAG_MASK)
 #define pmd_address(x)	((unsigned long)(pmd_val(x) &~ PxD_FLAG_MASK) << PxD_VALUE_SHIFT)
@@ -410,6 +418,7 @@ extern void paging_init (void);
 
 #define PG_dcache_dirty         PG_arch_1
 
+extern void flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr, unsigned long pfn);
 extern void update_mmu_cache(struct vm_area_struct *, unsigned long, pte_t);
 
 /* Encode and de-code a swap entry */
@@ -423,56 +432,83 @@ extern void update_mmu_cache(struct vm_area_struct *, unsigned long, pte_t);
 #define __pte_to_swp_entry(pte)		((swp_entry_t) { pte_val(pte) })
 #define __swp_entry_to_pte(x)		((pte_t) { (x).val })
 
-static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep)
+static inline void __flush_tlb_page(struct mm_struct *mm, unsigned long addr)
 {
-#ifdef CONFIG_SMP
-	if (!pte_young(*ptep))
-		return 0;
-	return test_and_clear_bit(xlate_pabit(_PAGE_ACCESSED_BIT), &pte_val(*ptep));
-#else
-	pte_t pte = *ptep;
-	if (!pte_young(pte))
-		return 0;
-	set_pte_at(vma->vm_mm, addr, ptep, pte_mkold(pte));
-	return 1;
-#endif
+	unsigned long flags;
+
+	/* For one page, it's not worth testing the split_tlb variable.  */
+	spin_lock_irqsave(&pa_tlb_lock, flags);
+	mtsp(mm->context,1);
+	pdtlb(addr);
+	pitlb(addr);
+	spin_unlock_irqrestore(&pa_tlb_lock, flags);
 }
 
-extern spinlock_t pa_dbit_lock;
+static inline int ptep_set_access_flags(struct vm_area_struct *vma, unsigned
+ long addr, pte_t *ptep, pte_t entry, int dirty)
+{
+	int changed;
+	unsigned long flags;
+	spin_lock_irqsave(&pa_pte_lock, flags);
+	changed = !pte_same(*ptep, entry);
+	if (changed) {
+		*ptep = entry;
+	}
+	spin_unlock_irqrestore(&pa_pte_lock, flags);
+	if (changed) {
+		__flush_tlb_page(vma->vm_mm, addr);
+	}
+	return changed;
+}
+
+static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep)
+{
+	pte_t pte;
+	unsigned long flags;
+	int r;
+
+	spin_lock_irqsave(&pa_pte_lock, flags);
+	pte = *ptep;
+	if (pte_young(pte)) {
+		*ptep = pte_mkold(pte);
+		r = 1;
+	} else {
+		r = 0;
+	}
+	spin_unlock_irqrestore(&pa_pte_lock, flags);
+
+	return r;
+}
 
 struct mm_struct;
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 {
-	pte_t old_pte;
-	pte_t pte;
+	pte_t pte, old_pte;
+	unsigned long flags;
 
-	spin_lock(&pa_dbit_lock);
+	spin_lock_irqsave(&pa_pte_lock, flags);
 	pte = old_pte = *ptep;
 	pte_val(pte) &= ~_PAGE_PRESENT;
 	pte_val(pte) |= _PAGE_FLUSH;
-	set_pte_at(mm,addr,ptep,pte);
-	spin_unlock(&pa_dbit_lock);
+	*ptep = pte;
+	spin_unlock_irqrestore(&pa_pte_lock, flags);
 
 	return old_pte;
 }
 
-static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
+static inline void ptep_set_wrprotect(struct vm_area_struct *vma, struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 {
-#ifdef CONFIG_SMP
-	unsigned long new, old;
-
-	do {
-		old = pte_val(*ptep);
-		new = pte_val(pte_wrprotect(__pte (old)));
-	} while (cmpxchg((unsigned long *) ptep, old, new) != old);
-#else
-	pte_t old_pte = *ptep;
-	set_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
-#endif
+	pte_t old_pte;
+	unsigned long flags;
+
+	spin_lock_irqsave(&pa_pte_lock, flags);
+	old_pte = *ptep;
+	*ptep = pte_wrprotect(old_pte);
+	__flush_tlb_page(mm, addr);
+	flush_cache_page(vma, addr, pte_pfn(old_pte));
+	spin_unlock_irqrestore(&pa_pte_lock, flags);
 }
 
-#define pte_same(A,B)	(pte_val(A) == pte_val(B))
-
 #endif /* !__ASSEMBLY__ */
 
 
@@ -504,6 +540,7 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
 
 #define HAVE_ARCH_UNMAPPED_AREA
 
+#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
 #define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
diff --git a/arch/parisc/include/asm/system.h b/arch/parisc/include/asm/system.h
index d91357b..4653c77 100644
--- a/arch/parisc/include/asm/system.h
+++ b/arch/parisc/include/asm/system.h
@@ -160,7 +160,7 @@ static inline void set_eiem(unsigned long val)
    ldcd). */
 
 #define __PA_LDCW_ALIGNMENT	4
-#define __ldcw_align(a) ((volatile unsigned int *)a)
+#define __ldcw_align(a) (&(a)->slock)
 #define __LDCW	"ldcw,co"
 
 #endif /*!CONFIG_PA20*/
diff --git a/arch/parisc/kernel/asm-offsets.c b/arch/parisc/kernel/asm-offsets.c
index ec787b4..b2f35b2 100644
--- a/arch/parisc/kernel/asm-offsets.c
+++ b/arch/parisc/kernel/asm-offsets.c
@@ -137,6 +137,7 @@ int main(void)
 	DEFINE(TASK_PT_IAOQ0, offsetof(struct task_struct, thread.regs.iaoq[0]));
 	DEFINE(TASK_PT_IAOQ1, offsetof(struct task_struct, thread.regs.iaoq[1]));
 	DEFINE(TASK_PT_CR27, offsetof(struct task_struct, thread.regs.cr27));
+	DEFINE(TASK_PT_SYSCALL_RP, offsetof(struct task_struct, thread.regs.pad0));
 	DEFINE(TASK_PT_ORIG_R28, offsetof(struct task_struct, thread.regs.orig_r28));
 	DEFINE(TASK_PT_KSP, offsetof(struct task_struct, thread.regs.ksp));
 	DEFINE(TASK_PT_KPC, offsetof(struct task_struct, thread.regs.kpc));
@@ -225,6 +226,7 @@ int main(void)
 	DEFINE(PT_IAOQ0, offsetof(struct pt_regs, iaoq[0]));
 	DEFINE(PT_IAOQ1, offsetof(struct pt_regs, iaoq[1]));
 	DEFINE(PT_CR27, offsetof(struct pt_regs, cr27));
+	DEFINE(PT_SYSCALL_RP, offsetof(struct pt_regs, pad0));
 	DEFINE(PT_ORIG_R28, offsetof(struct pt_regs, orig_r28));
 	DEFINE(PT_KSP, offsetof(struct pt_regs, ksp));
 	DEFINE(PT_KPC, offsetof(struct pt_regs, kpc));
@@ -290,5 +292,11 @@ int main(void)
 	BLANK();
 	DEFINE(ASM_PDC_RESULT_SIZE, NUM_PDC_RESULT * sizeof(unsigned long));
 	BLANK();
+
+#ifdef CONFIG_SMP
+	DEFINE(ASM_ATOMIC_HASH_SIZE_SHIFT, __builtin_ffs(ATOMIC_HASH_SIZE)-1);
+	DEFINE(ASM_ATOMIC_HASH_ENTRY_SHIFT, __builtin_ffs(sizeof(__atomic_hash[0]))-1);
+#endif
+
 	return 0;
 }
diff --git a/arch/parisc/kernel/cache.c b/arch/parisc/kernel/cache.c
index b6ed34d..7952ae4 100644
--- a/arch/parisc/kernel/cache.c
+++ b/arch/parisc/kernel/cache.c
@@ -336,9 +336,9 @@ __flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr)
 	}
 }
 
-void flush_dcache_page(struct page *page)
+static void flush_user_dcache_page_internal(struct address_space *mapping,
+					    struct page *page)
 {
-	struct address_space *mapping = page_mapping(page);
 	struct vm_area_struct *mpnt;
 	struct prio_tree_iter iter;
 	unsigned long offset;
@@ -346,14 +346,6 @@ void flush_dcache_page(struct page *page)
 	pgoff_t pgoff;
 	unsigned long pfn = page_to_pfn(page);
 
-
-	if (mapping && !mapping_mapped(mapping)) {
-		set_bit(PG_dcache_dirty, &page->flags);
-		return;
-	}
-
-	flush_kernel_dcache_page(page);
-
 	if (!mapping)
 		return;
 
@@ -387,6 +379,19 @@ void flush_dcache_page(struct page *page)
 	}
 	flush_dcache_mmap_unlock(mapping);
 }
+
+void flush_dcache_page(struct page *page)
+{
+	struct address_space *mapping = page_mapping(page);
+
+	if (mapping && !mapping_mapped(mapping)) {
+		set_bit(PG_dcache_dirty, &page->flags);
+		return;
+	}
+
+	flush_kernel_dcache_page(page);
+	flush_user_dcache_page_internal(mapping, page);
+}
 EXPORT_SYMBOL(flush_dcache_page);
 
 /* Defined in arch/parisc/kernel/pacache.S */
@@ -395,17 +400,6 @@ EXPORT_SYMBOL(flush_kernel_dcache_page_asm);
 EXPORT_SYMBOL(flush_data_cache_local);
 EXPORT_SYMBOL(flush_kernel_icache_range_asm);
 
-void clear_user_page_asm(void *page, unsigned long vaddr)
-{
-	unsigned long flags;
-	/* This function is implemented in assembly in pacache.S */
-	extern void __clear_user_page_asm(void *page, unsigned long vaddr);
-
-	purge_tlb_start(flags);
-	__clear_user_page_asm(page, vaddr);
-	purge_tlb_end(flags);
-}
-
 #define FLUSH_THRESHOLD 0x80000 /* 0.5MB */
 int parisc_cache_flush_threshold __read_mostly = FLUSH_THRESHOLD;
 
@@ -440,17 +434,26 @@ void __init parisc_setup_cache_timing(void)
 }
 
 extern void purge_kernel_dcache_page(unsigned long);
-extern void clear_user_page_asm(void *page, unsigned long vaddr);
 
 void clear_user_page(void *page, unsigned long vaddr, struct page *pg)
 {
+#if 1
+	/* Clear user page using alias region.  */
+#if 0
 	unsigned long flags;
 
 	purge_kernel_dcache_page((unsigned long)page);
 	purge_tlb_start(flags);
 	pdtlb_kernel(page);
 	purge_tlb_end(flags);
+#endif
+
 	clear_user_page_asm(page, vaddr);
+#else
+	/* Clear user page using kernel mapping.  */
+	clear_page_asm(page);
+	flush_kernel_dcache_page_asm(page);
+#endif
 }
 EXPORT_SYMBOL(clear_user_page);
 
@@ -469,22 +472,15 @@ void copy_user_page(void *vto, void *vfrom, unsigned long vaddr,
 		    struct page *pg)
 {
 	/* no coherency needed (all in kmap/kunmap) */
-	copy_user_page_asm(vto, vfrom);
-	if (!parisc_requires_coherency())
-		flush_kernel_dcache_page_asm(vto);
+#if 0
+	copy_user_page_asm(vto, vfrom, vaddr);
+#else
+	copy_page_asm(vto, vfrom);
+	flush_kernel_dcache_page_asm(vto);
+#endif
 }
 EXPORT_SYMBOL(copy_user_page);
 
-#ifdef CONFIG_PA8X00
-
-void kunmap_parisc(void *addr)
-{
-	if (parisc_requires_coherency())
-		flush_kernel_dcache_page_addr(addr);
-}
-EXPORT_SYMBOL(kunmap_parisc);
-#endif
-
 void __flush_tlb_range(unsigned long sid, unsigned long start,
 		       unsigned long end)
 {
@@ -577,3 +573,25 @@ flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr, unsigned long
 		__flush_cache_page(vma, vmaddr);
 
 }
+
+void *kmap_parisc(struct page *page)
+{
+	/* this is a killer.  There's no easy way to test quickly if
+	 * this page is dirty in any userspace.  Additionally, for
+	 * kernel alterations of the page, we'd need it invalidated
+	 * here anyway, so currently flush (and invalidate)
+	 * universally */
+	flush_user_dcache_page_internal(page_mapping(page), page);
+	return page_address(page);
+}
+EXPORT_SYMBOL(kmap_parisc);
+
+void kunmap_parisc(void *addr)
+{
+	/* flush and invalidate the kernel mapping.  We need the
+	 * invalidate so we don't have stale data at this cache
+	 * location the next time the page is mapped */
+	flush_kernel_dcache_page_addr(addr);
+}
+EXPORT_SYMBOL(kunmap_parisc);
+
diff --git a/arch/parisc/kernel/entry.S b/arch/parisc/kernel/entry.S
index 3a44f7f..42dbf32 100644
--- a/arch/parisc/kernel/entry.S
+++ b/arch/parisc/kernel/entry.S
@@ -45,7 +45,7 @@
 	.level 2.0
 #endif
 
-	.import         pa_dbit_lock,data
+	.import         pa_pte_lock,data
 
 	/* space_to_prot macro creates a prot id from a space id */
 
@@ -364,32 +364,6 @@
 	.align		32
 	.endm
 
-	/* The following are simple 32 vs 64 bit instruction
-	 * abstractions for the macros */
-	.macro		EXTR	reg1,start,length,reg2
-#ifdef CONFIG_64BIT
-	extrd,u		\reg1,32+(\start),\length,\reg2
-#else
-	extrw,u		\reg1,\start,\length,\reg2
-#endif
-	.endm
-
-	.macro		DEP	reg1,start,length,reg2
-#ifdef CONFIG_64BIT
-	depd		\reg1,32+(\start),\length,\reg2
-#else
-	depw		\reg1,\start,\length,\reg2
-#endif
-	.endm
-
-	.macro		DEPI	val,start,length,reg
-#ifdef CONFIG_64BIT
-	depdi		\val,32+(\start),\length,\reg
-#else
-	depwi		\val,\start,\length,\reg
-#endif
-	.endm
-
 	/* In LP64, the space contains part of the upper 32 bits of the
 	 * fault.  We have to extract this and place it in the va,
 	 * zeroing the corresponding bits in the space register */
@@ -442,19 +416,19 @@
 	 */
 	.macro		L2_ptep	pmd,pte,index,va,fault
 #if PT_NLEVELS == 3
-	EXTR		\va,31-ASM_PMD_SHIFT,ASM_BITS_PER_PMD,\index
+	extru		\va,31-ASM_PMD_SHIFT,ASM_BITS_PER_PMD,\index
 #else
-	EXTR		\va,31-ASM_PGDIR_SHIFT,ASM_BITS_PER_PGD,\index
+	extru		\va,31-ASM_PGDIR_SHIFT,ASM_BITS_PER_PGD,\index
 #endif
-	DEP             %r0,31,PAGE_SHIFT,\pmd  /* clear offset */
+	dep             %r0,31,PAGE_SHIFT,\pmd  /* clear offset */
 	copy		%r0,\pte
 	ldw,s		\index(\pmd),\pmd
 	bb,>=,n		\pmd,_PxD_PRESENT_BIT,\fault
-	DEP		%r0,31,PxD_FLAG_SHIFT,\pmd /* clear flags */
+	dep		%r0,31,PxD_FLAG_SHIFT,\pmd /* clear flags */
 	copy		\pmd,%r9
 	SHLREG		%r9,PxD_VALUE_SHIFT,\pmd
-	EXTR		\va,31-PAGE_SHIFT,ASM_BITS_PER_PTE,\index
-	DEP		%r0,31,PAGE_SHIFT,\pmd  /* clear offset */
+	extru		\va,31-PAGE_SHIFT,ASM_BITS_PER_PTE,\index
+	dep		%r0,31,PAGE_SHIFT,\pmd  /* clear offset */
 	shladd		\index,BITS_PER_PTE_ENTRY,\pmd,\pmd
 	LDREG		%r0(\pmd),\pte		/* pmd is now pte */
 	bb,>=,n		\pte,_PAGE_PRESENT_BIT,\fault
@@ -488,13 +462,46 @@
 	L2_ptep		\pgd,\pte,\index,\va,\fault
 	.endm
 
+	/* SMP lock for consistent PTE updates.  Unlocks and jumps
+	   to FAULT if the page is not present.  Note the preceding
+	   load of the PTE can't be deleted since we can't fault holding
+	   the lock.  */
+	.macro		pte_lock	ptep,pte,spc,tmp,tmp1,fault
+#ifdef CONFIG_SMP
+	cmpib,COND(=),n        0,\spc,2f
+	load32		PA(pa_pte_lock),\tmp1
+1:
+	LDCW		0(\tmp1),\tmp
+	cmpib,COND(=)         0,\tmp,1b
+	nop
+	LDREG		%r0(\ptep),\pte
+	bb,<,n		\pte,_PAGE_PRESENT_BIT,2f
+	ldi             1,\tmp
+	stw             \tmp,0(\tmp1)
+	b,n		\fault
+2:
+#endif
+	.endm
+
+	.macro		pte_unlock	spc,tmp,tmp1
+#ifdef CONFIG_SMP
+	cmpib,COND(=),n        0,\spc,1f
+	ldi             1,\tmp
+	stw             \tmp,0(\tmp1)
+1:
+#endif
+	.endm
+
 	/* Set the _PAGE_ACCESSED bit of the PTE.  Be clever and
 	 * don't needlessly dirty the cache line if it was already set */
-	.macro		update_ptep	ptep,pte,tmp,tmp1
-	ldi		_PAGE_ACCESSED,\tmp1
-	or		\tmp1,\pte,\tmp
-	and,COND(<>)	\tmp1,\pte,%r0
-	STREG		\tmp,0(\ptep)
+	.macro		update_ptep	ptep,pte,spc,tmp,tmp1,fault
+	bb,<,n		\pte,_PAGE_ACCESSED_BIT,3f
+	pte_lock	\ptep,\pte,\spc,\tmp,\tmp1,\fault
+	ldi		_PAGE_ACCESSED,\tmp
+	or		\tmp,\pte,\pte
+	STREG		\pte,0(\ptep)
+	pte_unlock	\spc,\tmp,\tmp1
+3:
 	.endm
 
 	/* Set the dirty bit (and accessed bit).  No need to be
@@ -605,7 +612,7 @@
 	depdi		0,31,32,\tmp
 #endif
 	copy		\va,\tmp1
-	DEPI		0,31,23,\tmp1
+	depi		0,31,23,\tmp1
 	cmpb,COND(<>),n	\tmp,\tmp1,\fault
 	ldi		(_PAGE_DIRTY|_PAGE_WRITE|_PAGE_READ),\prot
 	depd,z		\prot,8,7,\prot
@@ -622,6 +629,39 @@
 	or		%r26,%r0,\pte
 	.endm 
 
+	/* Save PTE for recheck if SMP.  */
+	.macro		save_pte	pte,tmp
+#ifdef CONFIG_SMP
+	copy		\pte,\tmp
+#endif
+	.endm
+
+	/* Reload the PTE and purge the data TLB entry if the new
+	   value is different from the old one.  */
+	.macro		dtlb_recheck	ptep,old_pte,spc,va,tmp
+#ifdef CONFIG_SMP
+	LDREG		%r0(\ptep),\tmp
+	cmpb,COND(=),n	\old_pte,\tmp,1f
+	mfsp		%sr1,\tmp
+	mtsp		\spc,%sr1
+	pdtlb,l		%r0(%sr1,\va)
+	mtsp		\tmp,%sr1
+1:
+#endif
+	.endm
+
+	.macro		itlb_recheck	ptep,old_pte,spc,va,tmp
+#ifdef CONFIG_SMP
+	LDREG		%r0(\ptep),\tmp
+	cmpb,COND(=),n	\old_pte,\tmp,1f
+	mfsp		%sr1,\tmp
+	mtsp		\spc,%sr1
+	pitlb,l		%r0(%sr1,\va)
+	mtsp		\tmp,%sr1
+1:
+#endif
+	.endm
+
 
 	/*
 	 * Align fault_vector_20 on 4K boundary so that both
@@ -758,6 +798,10 @@ ENTRY(__kernel_thread)
 
 	STREG	%r22, PT_GR22(%r1)	/* save r22 (arg5) */
 	copy	%r0, %r22		/* user_tid */
+	copy	%r0, %r21		/* child_tid */
+#else
+	stw	%r0, -52(%r30)	     	/* user_tid */
+	stw	%r0, -56(%r30)	     	/* child_tid */
 #endif
 	STREG	%r26, PT_GR26(%r1)  /* Store function & argument for child */
 	STREG	%r25, PT_GR25(%r1)
@@ -765,7 +809,7 @@ ENTRY(__kernel_thread)
 	ldo	CLONE_VM(%r26), %r26   /* Force CLONE_VM since only init_mm */
 	or	%r26, %r24, %r26      /* will have kernel mappings.	 */
 	ldi	1, %r25			/* stack_start, signals kernel thread */
-	stw	%r0, -52(%r30)	     	/* user_tid */
+	ldi	0, %r23			/* child_stack_size */
 #ifdef CONFIG_64BIT
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
@@ -972,7 +1016,10 @@ intr_check_sig:
 	BL	do_notify_resume,%r2
 	copy	%r16, %r26			/* struct pt_regs *regs */
 
-	b,n	intr_check_sig
+	mfctl   %cr30,%r16		/* Reload */
+	LDREG	TI_TASK(%r16), %r16	/* thread_info -> task_struct */
+	b	intr_check_sig
+	ldo	TASK_REGS(%r16),%r16
 
 intr_restore:
 	copy            %r16,%r29
@@ -997,13 +1044,6 @@ intr_restore:
 
 	rfi
 	nop
-	nop
-	nop
-	nop
-	nop
-	nop
-	nop
-	nop
 
 #ifndef CONFIG_PREEMPT
 # define intr_do_preempt	intr_restore
@@ -1026,14 +1066,12 @@ intr_do_resched:
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
 
-	ldil	L%intr_check_sig, %r2
-#ifndef CONFIG_64BIT
-	b	schedule
-#else
-	load32	schedule, %r20
-	bv	%r0(%r20)
-#endif
-	ldo	R%intr_check_sig(%r2), %r2
+	BL	schedule,%r2
+	nop
+	mfctl   %cr30,%r16		/* Reload */
+	LDREG	TI_TASK(%r16), %r16	/* thread_info -> task_struct */
+	b	intr_check_sig
+	ldo	TASK_REGS(%r16),%r16
 
 	/* preempt the current task on returning to kernel
 	 * mode from an interrupt, iff need_resched is set,
@@ -1214,11 +1252,12 @@ dtlb_miss_20w:
 
 	L3_ptep		ptp,pte,t0,va,dtlb_check_alias_20w
 
-	update_ptep	ptp,pte,t0,t1
+	update_ptep	ptp,pte,spc,t0,t1,dtlb_check_alias_20w
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
-	
 	idtlbt          pte,prot
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1238,11 +1277,10 @@ nadtlb_miss_20w:
 
 	L3_ptep		ptp,pte,t0,va,nadtlb_check_flush_20w
 
-	update_ptep	ptp,pte,t0,t1
-
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
-
 	idtlbt          pte,prot
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1272,8 +1310,9 @@ dtlb_miss_11:
 
 	L2_ptep		ptp,pte,t0,va,dtlb_check_alias_11
 
-	update_ptep	ptp,pte,t0,t1
+	update_ptep	ptp,pte,spc,t0,t1,dtlb_check_alias_11
 
+	save_pte	pte,t1
 	make_insert_tlb_11	spc,pte,prot
 
 	mfsp		%sr1,t0  /* Save sr1 so we can use it in tlb inserts */
@@ -1283,6 +1322,7 @@ dtlb_miss_11:
 	idtlbp		prot,(%sr1,va)
 
 	mtsp		t0, %sr1	/* Restore sr1 */
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1321,11 +1361,9 @@ nadtlb_miss_11:
 
 	L2_ptep		ptp,pte,t0,va,nadtlb_check_flush_11
 
-	update_ptep	ptp,pte,t0,t1
-
+	save_pte	pte,t1
 	make_insert_tlb_11	spc,pte,prot
 
-
 	mfsp		%sr1,t0  /* Save sr1 so we can use it in tlb inserts */
 	mtsp		spc,%sr1
 
@@ -1333,6 +1371,7 @@ nadtlb_miss_11:
 	idtlbp		prot,(%sr1,va)
 
 	mtsp		t0, %sr1	/* Restore sr1 */
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1368,13 +1407,15 @@ dtlb_miss_20:
 
 	L2_ptep		ptp,pte,t0,va,dtlb_check_alias_20
 
-	update_ptep	ptp,pte,t0,t1
+	update_ptep	ptp,pte,spc,t0,t1,dtlb_check_alias_20
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
 
 	f_extend	pte,t0
 
 	idtlbt          pte,prot
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1394,13 +1435,13 @@ nadtlb_miss_20:
 
 	L2_ptep		ptp,pte,t0,va,nadtlb_check_flush_20
 
-	update_ptep	ptp,pte,t0,t1
-
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
 
 	f_extend	pte,t0
 	
         idtlbt          pte,prot
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1508,11 +1549,12 @@ itlb_miss_20w:
 
 	L3_ptep		ptp,pte,t0,va,itlb_fault
 
-	update_ptep	ptp,pte,t0,t1
+	update_ptep	ptp,pte,spc,t0,t1,itlb_fault
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
-	
 	iitlbt          pte,prot
+	itlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1526,8 +1568,9 @@ itlb_miss_11:
 
 	L2_ptep		ptp,pte,t0,va,itlb_fault
 
-	update_ptep	ptp,pte,t0,t1
+	update_ptep	ptp,pte,spc,t0,t1,itlb_fault
 
+	save_pte	pte,t1
 	make_insert_tlb_11	spc,pte,prot
 
 	mfsp		%sr1,t0  /* Save sr1 so we can use it in tlb inserts */
@@ -1537,6 +1580,7 @@ itlb_miss_11:
 	iitlbp		prot,(%sr1,va)
 
 	mtsp		t0, %sr1	/* Restore sr1 */
+	itlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1548,13 +1592,15 @@ itlb_miss_20:
 
 	L2_ptep		ptp,pte,t0,va,itlb_fault
 
-	update_ptep	ptp,pte,t0,t1
+	update_ptep	ptp,pte,spc,t0,t1,itlb_fault
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
 
 	f_extend	pte,t0	
 
 	iitlbt          pte,prot
+	itlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1570,29 +1616,14 @@ dbit_trap_20w:
 
 	L3_ptep		ptp,pte,t0,va,dbit_fault
 
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nolock_20w
-	load32		PA(pa_dbit_lock),t0
-
-dbit_spin_20w:
-	LDCW		0(t0),t1
-	cmpib,COND(=)         0,t1,dbit_spin_20w
-	nop
-
-dbit_nolock_20w:
-#endif
-	update_dirty	ptp,pte,t1
+	pte_lock	ptp,pte,spc,t0,t1,dbit_fault
+	update_dirty	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
-		
 	idtlbt          pte,prot
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nounlock_20w
-	ldi             1,t1
-	stw             t1,0(t0)
-
-dbit_nounlock_20w:
-#endif
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1606,35 +1637,21 @@ dbit_trap_11:
 
 	L2_ptep		ptp,pte,t0,va,dbit_fault
 
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nolock_11
-	load32		PA(pa_dbit_lock),t0
-
-dbit_spin_11:
-	LDCW		0(t0),t1
-	cmpib,=         0,t1,dbit_spin_11
-	nop
-
-dbit_nolock_11:
-#endif
-	update_dirty	ptp,pte,t1
+	pte_lock	ptp,pte,spc,t0,t1,dbit_fault
+	update_dirty	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb_11	spc,pte,prot
 
-	mfsp            %sr1,t1  /* Save sr1 so we can use it in tlb inserts */
+	mfsp            %sr1,t0  /* Save sr1 so we can use it in tlb inserts */
 	mtsp		spc,%sr1
 
 	idtlba		pte,(%sr1,va)
 	idtlbp		prot,(%sr1,va)
 
-	mtsp            t1, %sr1     /* Restore sr1 */
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nounlock_11
-	ldi             1,t1
-	stw             t1,0(t0)
-
-dbit_nounlock_11:
-#endif
+	mtsp            t0, %sr1     /* Restore sr1 */
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1646,32 +1663,17 @@ dbit_trap_20:
 
 	L2_ptep		ptp,pte,t0,va,dbit_fault
 
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nolock_20
-	load32		PA(pa_dbit_lock),t0
-
-dbit_spin_20:
-	LDCW		0(t0),t1
-	cmpib,=         0,t1,dbit_spin_20
-	nop
-
-dbit_nolock_20:
-#endif
-	update_dirty	ptp,pte,t1
+	pte_lock	ptp,pte,spc,t0,t1,dbit_fault
+	update_dirty	ptp,pte,t0
+	pte_unlock	spc,t0,t1
 
+	save_pte	pte,t1
 	make_insert_tlb	spc,pte,prot
 
-	f_extend	pte,t1
+	f_extend	pte,t0
 	
         idtlbt          pte,prot
-
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n        0,spc,dbit_nounlock_20
-	ldi             1,t1
-	stw             t1,0(t0)
-
-dbit_nounlock_20:
-#endif
+	dtlb_recheck	ptp,t1,spc,va,t0
 
 	rfir
 	nop
@@ -1772,9 +1774,9 @@ ENTRY(sys_fork_wrapper)
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
 
-	/* These are call-clobbered registers and therefore
-	   also syscall-clobbered (we hope). */
-	STREG	%r2,PT_GR19(%r1)	/* save for child */
+	STREG	%r2,PT_SYSCALL_RP(%r1)
+
+	/* WARNING - Clobbers r21, userspace must save! */
 	STREG	%r30,PT_GR21(%r1)
 
 	LDREG	PT_GR30(%r1),%r25
@@ -1804,7 +1806,7 @@ ENTRY(child_return)
 	nop
 
 	LDREG	TI_TASK-THREAD_SZ_ALGN-FRAME_SIZE-FRAME_SIZE(%r30), %r1
-	LDREG	TASK_PT_GR19(%r1),%r2
+	LDREG	TASK_PT_SYSCALL_RP(%r1),%r2
 	b	wrapper_exit
 	copy	%r0,%r28
 ENDPROC(child_return)
@@ -1823,8 +1825,9 @@ ENTRY(sys_clone_wrapper)
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
 
-	/* WARNING - Clobbers r19 and r21, userspace must save these! */
-	STREG	%r2,PT_GR19(%r1)	/* save for child */
+	STREG	%r2,PT_SYSCALL_RP(%r1)
+
+	/* WARNING - Clobbers r21, userspace must save! */
 	STREG	%r30,PT_GR21(%r1)
 	BL	sys_clone,%r2
 	copy	%r1,%r24
@@ -1847,7 +1850,9 @@ ENTRY(sys_vfork_wrapper)
 	ldo	-16(%r30),%r29		/* Reference param save area */
 #endif
 
-	STREG	%r2,PT_GR19(%r1)	/* save for child */
+	STREG	%r2,PT_SYSCALL_RP(%r1)
+
+	/* WARNING - Clobbers r21, userspace must save! */
 	STREG	%r30,PT_GR21(%r1)
 
 	BL	sys_vfork,%r2
@@ -2076,9 +2081,10 @@ syscall_restore:
 	LDREG	TASK_PT_GR31(%r1),%r31	   /* restore syscall rp */
 
 	/* NOTE: We use rsm/ssm pair to make this operation atomic */
+	LDREG   TASK_PT_GR30(%r1),%r1              /* Get user sp */
 	rsm     PSW_SM_I, %r0
-	LDREG   TASK_PT_GR30(%r1),%r30             /* restore user sp */
-	mfsp	%sr3,%r1			   /* Get users space id */
+	copy    %r1,%r30                           /* Restore user sp */
+	mfsp    %sr3,%r1                           /* Get user space id */
 	mtsp    %r1,%sr7                           /* Restore sr7 */
 	ssm     PSW_SM_I, %r0
 
diff --git a/arch/parisc/kernel/pacache.S b/arch/parisc/kernel/pacache.S
index 09b77b2..b2f0d3d 100644
--- a/arch/parisc/kernel/pacache.S
+++ b/arch/parisc/kernel/pacache.S
@@ -277,7 +277,7 @@ ENDPROC(flush_data_cache_local)
 
 	.align	16
 
-ENTRY(copy_user_page_asm)
+ENTRY(copy_page_asm)
 	.proc
 	.callinfo NO_CALLS
 	.entry
@@ -288,54 +288,54 @@ ENTRY(copy_user_page_asm)
 	 * GCC probably can do this just as well.
 	 */
 
-	ldd		0(%r25), %r19
+	ldd		0(%r25), %r20
 	ldi		(PAGE_SIZE / 128), %r1
 
 	ldw		64(%r25), %r0		/* prefetch 1 cacheline ahead */
 	ldw		128(%r25), %r0		/* prefetch 2 */
 
-1:	ldd		8(%r25), %r20
+1:	ldd		8(%r25), %r21
 	ldw		192(%r25), %r0		/* prefetch 3 */
 	ldw		256(%r25), %r0		/* prefetch 4 */
 
-	ldd		16(%r25), %r21
-	ldd		24(%r25), %r22
-	std		%r19, 0(%r26)
-	std		%r20, 8(%r26)
-
-	ldd		32(%r25), %r19
-	ldd		40(%r25), %r20
-	std		%r21, 16(%r26)
-	std		%r22, 24(%r26)
-
-	ldd		48(%r25), %r21
-	ldd		56(%r25), %r22
-	std		%r19, 32(%r26)
-	std		%r20, 40(%r26)
-
-	ldd		64(%r25), %r19
-	ldd		72(%r25), %r20
-	std		%r21, 48(%r26)
-	std		%r22, 56(%r26)
-
-	ldd		80(%r25), %r21
-	ldd		88(%r25), %r22
-	std		%r19, 64(%r26)
-	std		%r20, 72(%r26)
-
-	ldd		 96(%r25), %r19
-	ldd		104(%r25), %r20
-	std		%r21, 80(%r26)
-	std		%r22, 88(%r26)
-
-	ldd		112(%r25), %r21
-	ldd		120(%r25), %r22
-	std		%r19, 96(%r26)
-	std		%r20, 104(%r26)
+	ldd		16(%r25), %r22
+	ldd		24(%r25), %r24
+	std		%r20, 0(%r26)
+	std		%r21, 8(%r26)
+
+	ldd		32(%r25), %r20
+	ldd		40(%r25), %r21
+	std		%r22, 16(%r26)
+	std		%r24, 24(%r26)
+
+	ldd		48(%r25), %r22
+	ldd		56(%r25), %r24
+	std		%r20, 32(%r26)
+	std		%r21, 40(%r26)
+
+	ldd		64(%r25), %r20
+	ldd		72(%r25), %r21
+	std		%r22, 48(%r26)
+	std		%r24, 56(%r26)
+
+	ldd		80(%r25), %r22
+	ldd		88(%r25), %r24
+	std		%r20, 64(%r26)
+	std		%r21, 72(%r26)
+
+	ldd		96(%r25), %r20
+	ldd		104(%r25), %r21
+	std		%r22, 80(%r26)
+	std		%r24, 88(%r26)
+
+	ldd		112(%r25), %r22
+	ldd		120(%r25), %r24
+	std		%r20, 96(%r26)
+	std		%r21, 104(%r26)
 
 	ldo		128(%r25), %r25
-	std		%r21, 112(%r26)
-	std		%r22, 120(%r26)
+	std		%r22, 112(%r26)
+	std		%r24, 120(%r26)
 	ldo		128(%r26), %r26
 
 	/* conditional branches nullify on forward taken branch, and on
@@ -343,7 +343,7 @@ ENTRY(copy_user_page_asm)
 	 * The ldd should only get executed if the branch is taken.
 	 */
 	addib,COND(>),n	-1, %r1, 1b		/* bundle 10 */
-	ldd		0(%r25), %r19		/* start next loads */
+	ldd		0(%r25), %r20		/* start next loads */
 
 #else
 
@@ -354,52 +354,116 @@ ENTRY(copy_user_page_asm)
 	 * the full 64 bit register values on interrupt, we can't
 	 * use ldd/std on a 32 bit kernel.
 	 */
-	ldw		0(%r25), %r19
+	ldw		0(%r25), %r20
 	ldi		(PAGE_SIZE / 64), %r1
 
 1:
-	ldw		4(%r25), %r20
-	ldw		8(%r25), %r21
-	ldw		12(%r25), %r22
-	stw		%r19, 0(%r26)
-	stw		%r20, 4(%r26)
-	stw		%r21, 8(%r26)
-	stw		%r22, 12(%r26)
-	ldw		16(%r25), %r19
-	ldw		20(%r25), %r20
-	ldw		24(%r25), %r21
-	ldw		28(%r25), %r22
-	stw		%r19, 16(%r26)
-	stw		%r20, 20(%r26)
-	stw		%r21, 24(%r26)
-	stw		%r22, 28(%r26)
-	ldw		32(%r25), %r19
-	ldw		36(%r25), %r20
-	ldw		40(%r25), %r21
-	ldw		44(%r25), %r22
-	stw		%r19, 32(%r26)
-	stw		%r20, 36(%r26)
-	stw		%r21, 40(%r26)
-	stw		%r22, 44(%r26)
-	ldw		48(%r25), %r19
-	ldw		52(%r25), %r20
-	ldw		56(%r25), %r21
-	ldw		60(%r25), %r22
-	stw		%r19, 48(%r26)
-	stw		%r20, 52(%r26)
+	ldw		4(%r25), %r21
+	ldw		8(%r25), %r22
+	ldw		12(%r25), %r24
+	stw		%r20, 0(%r26)
+	stw		%r21, 4(%r26)
+	stw		%r22, 8(%r26)
+	stw		%r24, 12(%r26)
+	ldw		16(%r25), %r20
+	ldw		20(%r25), %r21
+	ldw		24(%r25), %r22
+	ldw		28(%r25), %r24
+	stw		%r20, 16(%r26)
+	stw		%r21, 20(%r26)
+	stw		%r22, 24(%r26)
+	stw		%r24, 28(%r26)
+	ldw		32(%r25), %r20
+	ldw		36(%r25), %r21
+	ldw		40(%r25), %r22
+	ldw		44(%r25), %r24
+	stw		%r20, 32(%r26)
+	stw		%r21, 36(%r26)
+	stw		%r22, 40(%r26)
+	stw		%r24, 44(%r26)
+	ldw		48(%r25), %r20
+	ldw		52(%r25), %r21
+	ldw		56(%r25), %r22
+	ldw		60(%r25), %r24
+	stw		%r20, 48(%r26)
+	stw		%r21, 52(%r26)
 	ldo		64(%r25), %r25
-	stw		%r21, 56(%r26)
-	stw		%r22, 60(%r26)
+	stw		%r22, 56(%r26)
+	stw		%r24, 60(%r26)
 	ldo		64(%r26), %r26
 	addib,COND(>),n	-1, %r1, 1b
-	ldw		0(%r25), %r19
+	ldw		0(%r25), %r20
 #endif
 	bv		%r0(%r2)
 	nop
 	.exit
 
 	.procend
-ENDPROC(copy_user_page_asm)
+ENDPROC(copy_page_asm)
+
+ENTRY(clear_page_asm)
+	.proc
+	.callinfo NO_CALLS
+	.entry
+
+#ifdef CONFIG_64BIT
+	ldi		(PAGE_SIZE / 128), %r1
+
+1:
+	std		%r0, 0(%r26)
+	std		%r0, 8(%r26)
+	std		%r0, 16(%r26)
+	std		%r0, 24(%r26)
+	std		%r0, 32(%r26)
+	std		%r0, 40(%r26)
+	std		%r0, 48(%r26)
+	std		%r0, 56(%r26)
+	std		%r0, 64(%r26)
+	std		%r0, 72(%r26)
+	std		%r0, 80(%r26)
+	std		%r0, 88(%r26)
+	std		%r0, 96(%r26)
+	std		%r0, 104(%r26)
+	std		%r0, 112(%r26)
+	std		%r0, 120(%r26)
+
+	/* Conditional branches nullify on forward taken branch, and on
+	 * non-taken backward branch. Note that .+4 is a backwards branch.
+	 */
+	addib,COND(>),n	-1, %r1, 1b
+	ldo		128(%r26), %r26
+
+#else
+
+	ldi		(PAGE_SIZE / 64), %r1
+
+1:
+	stw		%r0, 0(%r26)
+	stw		%r0, 4(%r26)
+	stw		%r0, 8(%r26)
+	stw		%r0, 12(%r26)
+	stw		%r0, 16(%r26)
+	stw		%r0, 20(%r26)
+	stw		%r0, 24(%r26)
+	stw		%r0, 28(%r26)
+	stw		%r0, 32(%r26)
+	stw		%r0, 36(%r26)
+	stw		%r0, 40(%r26)
+	stw		%r0, 44(%r26)
+	stw		%r0, 48(%r26)
+	stw		%r0, 52(%r26)
+	stw		%r0, 56(%r26)
+	stw		%r0, 60(%r26)
+	addib,COND(>),n	-1, %r1, 1b
+	ldo		64(%r26), %r26
+#endif
+
+	bv		%r0(%r2)
+	nop
+	.exit
+
+	.procend
+ENDPROC(clear_page_asm)
 
 /*
  * NOTE: Code in clear_user_page has a hard coded dependency on the
@@ -422,7 +486,6 @@ ENDPROC(copy_user_page_asm)
  *          %r23 physical page (shifted for tlb insert) of "from" translation
  */
 
-#if 0
 
 	/*
 	 * We can't do this since copy_user_page is used to bring in
@@ -449,9 +512,9 @@ ENTRY(copy_user_page_asm)
 	ldil		L%(TMPALIAS_MAP_START), %r28
 	/* FIXME for different page sizes != 4k */
 #ifdef CONFIG_64BIT
-	extrd,u		%r26,56,32, %r26		/* convert phys addr to tlb insert format */
-	extrd,u		%r23,56,32, %r23		/* convert phys addr to tlb insert format */
-	depd		%r24,63,22, %r28		/* Form aliased virtual address 'to' */
+	extrd,u		%r26,56,32, %r26	/* convert phys addr to tlb insert format */
+	extrd,u		%r23,56,32, %r23	/* convert phys addr to tlb insert format */
+	depd		%r24,63,22, %r28	/* Form aliased virtual address 'to' */
 	depdi		0, 63,12, %r28		/* Clear any offset bits */
 	copy		%r28, %r29
 	depdi		1, 41,1, %r29		/* Form aliased virtual address 'from' */
@@ -464,12 +527,88 @@ ENTRY(copy_user_page_asm)
 	depwi		1, 9,1, %r29		/* Form aliased virtual address 'from' */
 #endif
 
+#ifdef CONFIG_SMP
+	ldil		L%pa_tlb_lock, %r1
+	ldo		R%pa_tlb_lock(%r1), %r24
+	rsm		PSW_SM_I, %r22
+1:
+	LDCW		0(%r24),%r25
+	cmpib,COND(=)	0,%r25,1b
+	nop
+#endif
+
 	/* Purge any old translations */
 
 	pdtlb		0(%r28)
 	pdtlb		0(%r29)
 
-	ldi		64, %r1
+#ifdef CONFIG_SMP
+	ldi		1,%r25
+	stw		%r25,0(%r24)
+	mtsm		%r22
+#endif
+
+#ifdef CONFIG_64BIT
+
+	ldd		0(%r29), %r20
+	ldi		(PAGE_SIZE / 128), %r1
+
+	ldw		64(%r29), %r0		/* prefetch 1 cacheline ahead */
+	ldw		128(%r29), %r0		/* prefetch 2 */
+
+2:	ldd		8(%r29), %r21
+	ldw		192(%r29), %r0		/* prefetch 3 */
+	ldw		256(%r29), %r0		/* prefetch 4 */
+
+	ldd		16(%r29), %r22
+	ldd		24(%r29), %r24
+	std		%r20, 0(%r28)
+	std		%r21, 8(%r28)
+
+	ldd		32(%r29), %r20
+	ldd		40(%r29), %r21
+	std		%r22, 16(%r28)
+	std		%r24, 24(%r28)
+
+	ldd		48(%r29), %r22
+	ldd		56(%r29), %r24
+	std		%r20, 32(%r28)
+	std		%r21, 40(%r28)
+
+	ldd		64(%r29), %r20
+	ldd		72(%r29), %r21
+	std		%r22, 48(%r28)
+	std		%r24, 56(%r28)
+
+	ldd		80(%r29), %r22
+	ldd		88(%r29), %r24
+	std		%r20, 64(%r28)
+	std		%r21, 72(%r28)
+
+	ldd		96(%r29), %r20
+	ldd		104(%r29), %r21
+	std		%r22, 80(%r28)
+	std		%r24, 88(%r28)
+
+	ldd		112(%r29), %r22
+	ldd		120(%r29), %r24
+	std		%r20, 96(%r28)
+	std		%r21, 104(%r28)
+
+	ldo		128(%r29), %r29
+	std		%r22, 112(%r28)
+	std		%r24, 120(%r28)
+
+	fdc		0(%r28)
+	ldo		64(%r28), %r28
+	fdc		0(%r28)
+	ldo		64(%r28), %r28
+	addib,COND(>),n	-1, %r1, 2b
+	ldd		0(%r29), %r20		/* start next loads */
+
+#else
+
+	ldi		(PAGE_SIZE / 64), %r1
 
 	/*
 	 * This loop is optimized for PCXL/PCXL2 ldw/ldw and stw/stw
@@ -480,53 +619,57 @@ ENTRY(copy_user_page_asm)
 	 * use ldd/std on a 32 bit kernel.
 	 */
 
-
-1:
-	ldw		0(%r29), %r19
-	ldw		4(%r29), %r20
-	ldw		8(%r29), %r21
-	ldw		12(%r29), %r22
-	stw		%r19, 0(%r28)
-	stw		%r20, 4(%r28)
-	stw		%r21, 8(%r28)
-	stw		%r22, 12(%r28)
-	ldw		16(%r29), %r19
-	ldw		20(%r29), %r20
-	ldw		24(%r29), %r21
-	ldw		28(%r29), %r22
-	stw		%r19, 16(%r28)
-	stw		%r20, 20(%r28)
-	stw		%r21, 24(%r28)
-	stw		%r22, 28(%r28)
-	ldw		32(%r29), %r19
-	ldw		36(%r29), %r20
-	ldw		40(%r29), %r21
-	ldw		44(%r29), %r22
-	stw		%r19, 32(%r28)
-	stw		%r20, 36(%r28)
-	stw		%r21, 40(%r28)
-	stw		%r22, 44(%r28)
-	ldw		48(%r29), %r19
-	ldw		52(%r29), %r20
-	ldw		56(%r29), %r21
-	ldw		60(%r29), %r22
-	stw		%r19, 48(%r28)
-	stw		%r20, 52(%r28)
-	stw		%r21, 56(%r28)
-	stw		%r22, 60(%r28)
-	ldo		64(%r28), %r28
-	addib,COND(>)		-1, %r1,1b
+2:
+	ldw		0(%r29), %r20
+	ldw		4(%r29), %r21
+	ldw		8(%r29), %r22
+	ldw		12(%r29), %r24
+	stw		%r20, 0(%r28)
+	stw		%r21, 4(%r28)
+	stw		%r22, 8(%r28)
+	stw		%r24, 12(%r28)
+	ldw		16(%r29), %r20
+	ldw		20(%r29), %r21
+	ldw		24(%r29), %r22
+	ldw		28(%r29), %r24
+	stw		%r20, 16(%r28)
+	stw		%r21, 20(%r28)
+	stw		%r22, 24(%r28)
+	stw		%r24, 28(%r28)
+	ldw		32(%r29), %r20
+	ldw		36(%r29), %r21
+	ldw		40(%r29), %r22
+	ldw		44(%r29), %r24
+	stw		%r20, 32(%r28)
+	stw		%r21, 36(%r28)
+	stw		%r22, 40(%r28)
+	stw		%r24, 44(%r28)
+	ldw		48(%r29), %r20
+	ldw		52(%r29), %r21
+	ldw		56(%r29), %r22
+	ldw		60(%r29), %r24
+	stw		%r20, 48(%r28)
+	stw		%r21, 52(%r28)
+	stw		%r22, 56(%r28)
+	stw		%r24, 60(%r28)
+	fdc		0(%r28)
+	ldo		32(%r28), %r28
+	fdc		0(%r28)
+	ldo		32(%r28), %r28
+	addib,COND(>)		-1, %r1,2b
 	ldo		64(%r29), %r29
 
+#endif
+
+	sync
 	bv		%r0(%r2)
 	nop
 	.exit
 
 	.procend
 ENDPROC(copy_user_page_asm)
-#endif
 
-ENTRY(__clear_user_page_asm)
+ENTRY(clear_user_page_asm)
 	.proc
 	.callinfo NO_CALLS
 	.entry
@@ -548,17 +691,33 @@ ENTRY(__clear_user_page_asm)
 	depwi		0, 31,12, %r28		/* Clear any offset bits */
 #endif
 
+#ifdef CONFIG_SMP
+	ldil		L%pa_tlb_lock, %r1
+	ldo		R%pa_tlb_lock(%r1), %r24
+	rsm		PSW_SM_I, %r22
+1:
+	LDCW		0(%r24),%r25
+	cmpib,COND(=)	0,%r25,1b
+	nop
+#endif
+
 	/* Purge any old translation */
 
 	pdtlb		0(%r28)
 
+#ifdef CONFIG_SMP
+	ldi		1,%r25
+	stw		%r25,0(%r24)
+	mtsm		%r22
+#endif
+
 #ifdef CONFIG_64BIT
 	ldi		(PAGE_SIZE / 128), %r1
 
 	/* PREFETCH (Write) has not (yet) been proven to help here */
 	/* #define	PREFETCHW_OP	ldd		256(%0), %r0 */
 
-1:	std		%r0, 0(%r28)
+2:	std		%r0, 0(%r28)
 	std		%r0, 8(%r28)
 	std		%r0, 16(%r28)
 	std		%r0, 24(%r28)
@@ -574,13 +733,13 @@ ENTRY(__clear_user_page_asm)
 	std		%r0, 104(%r28)
 	std		%r0, 112(%r28)
 	std		%r0, 120(%r28)
-	addib,COND(>)		-1, %r1, 1b
+	addib,COND(>)		-1, %r1, 2b
 	ldo		128(%r28), %r28
 
 #else	/* ! CONFIG_64BIT */
 	ldi		(PAGE_SIZE / 64), %r1
 
-1:
+2:
 	stw		%r0, 0(%r28)
 	stw		%r0, 4(%r28)
 	stw		%r0, 8(%r28)
@@ -597,7 +756,7 @@ ENTRY(__clear_user_page_asm)
 	stw		%r0, 52(%r28)
 	stw		%r0, 56(%r28)
 	stw		%r0, 60(%r28)
-	addib,COND(>)		-1, %r1, 1b
+	addib,COND(>)		-1, %r1, 2b
 	ldo		64(%r28), %r28
 #endif	/* CONFIG_64BIT */
 
@@ -606,7 +765,7 @@ ENTRY(__clear_user_page_asm)
 	.exit
 
 	.procend
-ENDPROC(__clear_user_page_asm)
+ENDPROC(clear_user_page_asm)
 
 ENTRY(flush_kernel_dcache_page_asm)
 	.proc
diff --git a/arch/parisc/kernel/parisc_ksyms.c b/arch/parisc/kernel/parisc_ksyms.c
index df65366..a5314df 100644
--- a/arch/parisc/kernel/parisc_ksyms.c
+++ b/arch/parisc/kernel/parisc_ksyms.c
@@ -159,4 +159,5 @@ EXPORT_SYMBOL(_mcount);
 #endif
 
 /* from pacache.S -- needed for copy_page */
-EXPORT_SYMBOL(copy_user_page_asm);
+EXPORT_SYMBOL(copy_page_asm);
+EXPORT_SYMBOL(clear_page_asm);
diff --git a/arch/parisc/kernel/setup.c b/arch/parisc/kernel/setup.c
index cb71f3d..84b3239 100644
--- a/arch/parisc/kernel/setup.c
+++ b/arch/parisc/kernel/setup.c
@@ -128,6 +128,14 @@ void __init setup_arch(char **cmdline_p)
 	printk(KERN_INFO "The 32-bit Kernel has started...\n");
 #endif
 
+	/* Consistency check on the size and alignments of our spinlocks */
+#ifdef CONFIG_SMP
+	BUILD_BUG_ON(sizeof(arch_spinlock_t) != __PA_LDCW_ALIGNMENT);
+	BUG_ON((unsigned long)&__atomic_hash[0] & (__PA_LDCW_ALIGNMENT-1));
+	BUG_ON((unsigned long)&__atomic_hash[1] & (__PA_LDCW_ALIGNMENT-1));
+#endif
+	BUILD_BUG_ON((1<<L1_CACHE_SHIFT) != L1_CACHE_BYTES);
+
 	pdc_console_init();
 
 #ifdef CONFIG_64BIT
diff --git a/arch/parisc/kernel/syscall.S b/arch/parisc/kernel/syscall.S
index f5f9602..68e75ce 100644
--- a/arch/parisc/kernel/syscall.S
+++ b/arch/parisc/kernel/syscall.S
@@ -47,18 +47,17 @@ ENTRY(linux_gateway_page)
 	KILL_INSN
 	.endr
 
-	/* ADDRESS 0xb0 to 0xb4, lws uses 1 insns for entry */
+	/* ADDRESS 0xb0 to 0xb8, lws uses two insns for entry */
 	/* Light-weight-syscall entry must always be located at 0xb0 */
 	/* WARNING: Keep this number updated with table size changes */
 #define __NR_lws_entries (2)
 
 lws_entry:
-	/* Unconditional branch to lws_start, located on the 
-	   same gateway page */
-	b,n	lws_start
+	gate	lws_start, %r0		/* increase privilege */
+	depi	3, 31, 2, %r31		/* Ensure we return into user mode. */
 
-	/* Fill from 0xb4 to 0xe0 */
-	.rept 11
+	/* Fill from 0xb8 to 0xe0 */
+	.rept 10
 	KILL_INSN
 	.endr
 
@@ -423,9 +422,6 @@ tracesys_sigexit:
 
 	*********************************************************/
 lws_start:
-	/* Gate and ensure we return to userspace */
-	gate	.+8, %r0
-	depi	3, 31, 2, %r31	/* Ensure we return to userspace */
 
 #ifdef CONFIG_64BIT
 	/* FIXME: If we are a 64-bit kernel just
@@ -442,7 +438,7 @@ lws_start:
 #endif	
 
         /* Is the lws entry number valid? */
-	comiclr,>>=	__NR_lws_entries, %r20, %r0
+	comiclr,>>	__NR_lws_entries, %r20, %r0
 	b,n	lws_exit_nosys
 
 	/* WARNING: Trashing sr2 and sr3 */
@@ -473,7 +469,7 @@ lws_exit:
 	/* now reset the lowest bit of sp if it was set */
 	xor	%r30,%r1,%r30
 #endif
-	be,n	0(%sr3, %r31)
+	be,n	0(%sr7, %r31)
 
 
 	
@@ -529,7 +525,6 @@ lws_compare_and_swap32:
 #endif
 
 lws_compare_and_swap:
-#ifdef CONFIG_SMP
 	/* Load start of lock table */
 	ldil	L%lws_lock_start, %r20
 	ldo	R%lws_lock_start(%r20), %r28
@@ -572,8 +567,6 @@ cas_wouldblock:
 	ldo	2(%r0), %r28				/* 2nd case */
 	b	lws_exit				/* Contended... */
 	ldo	-EAGAIN(%r0), %r21			/* Spin in userspace */
-#endif
-/* CONFIG_SMP */
 
 	/*
 		prev = *addr;
@@ -601,13 +594,11 @@ cas_action:
 1:	ldw	0(%sr3,%r26), %r28
 	sub,<>	%r28, %r25, %r0
 2:	stw	%r24, 0(%sr3,%r26)
-#ifdef CONFIG_SMP
 	/* Free lock */
 	stw	%r20, 0(%sr2,%r20)
-# if ENABLE_LWS_DEBUG
+#if ENABLE_LWS_DEBUG
 	/* Clear thread register indicator */
 	stw	%r0, 4(%sr2,%r20)
-# endif
 #endif
 	/* Return to userspace, set no error */
 	b	lws_exit
@@ -615,12 +606,10 @@ cas_action:
 
 3:		
 	/* Error occured on load or store */
-#ifdef CONFIG_SMP
 	/* Free lock */
 	stw	%r20, 0(%sr2,%r20)
-# if ENABLE_LWS_DEBUG
+#if ENABLE_LWS_DEBUG
 	stw	%r0, 4(%sr2,%r20)
-# endif
 #endif
 	b	lws_exit
 	ldo	-EFAULT(%r0),%r21	/* set errno */
@@ -672,7 +661,6 @@ ENTRY(sys_call_table64)
 END(sys_call_table64)
 #endif
 
-#ifdef CONFIG_SMP
 	/*
 		All light-weight-syscall atomic operations 
 		will use this set of locks 
@@ -694,8 +682,6 @@ ENTRY(lws_lock_start)
 	.endr
 END(lws_lock_start)
 	.previous
-#endif
-/* CONFIG_SMP for lws_lock_start */
 
 .end
 
diff --git a/arch/parisc/kernel/traps.c b/arch/parisc/kernel/traps.c
index 8b58bf0..804b024 100644
--- a/arch/parisc/kernel/traps.c
+++ b/arch/parisc/kernel/traps.c
@@ -47,7 +47,7 @@
 			  /*  dumped to the console via printk)          */
 
 #if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
-DEFINE_SPINLOCK(pa_dbit_lock);
+DEFINE_SPINLOCK(pa_pte_lock);
 #endif
 
 static void parisc_show_stack(struct task_struct *task, unsigned long *sp,
diff --git a/arch/parisc/lib/bitops.c b/arch/parisc/lib/bitops.c
index 353963d..bae6a86 100644
--- a/arch/parisc/lib/bitops.c
+++ b/arch/parisc/lib/bitops.c
@@ -15,6 +15,9 @@
 arch_spinlock_t __atomic_hash[ATOMIC_HASH_SIZE] __lock_aligned = {
 	[0 ... (ATOMIC_HASH_SIZE-1)]  = __ARCH_SPIN_LOCK_UNLOCKED
 };
+arch_spinlock_t __atomic_user_hash[ATOMIC_HASH_SIZE] __lock_aligned = {
+	[0 ... (ATOMIC_HASH_SIZE-1)]  = __ARCH_SPIN_LOCK_UNLOCKED
+};
 #endif
 
 #ifdef CONFIG_64BIT
diff --git a/arch/parisc/math-emu/decode_exc.c b/arch/parisc/math-emu/decode_exc.c
index 3ca1c61..27a7492 100644
--- a/arch/parisc/math-emu/decode_exc.c
+++ b/arch/parisc/math-emu/decode_exc.c
@@ -342,6 +342,7 @@ decode_fpu(unsigned int Fpu_register[], unsigned int trap_counts[])
 		return SIGNALCODE(SIGFPE, FPE_FLTINV);
 	  case DIVISIONBYZEROEXCEPTION:
 		update_trap_counts(Fpu_register, aflags, bflags, trap_counts);
+		Clear_excp_register(exception_index);
 	  	return SIGNALCODE(SIGFPE, FPE_FLTDIV);
 	  case INEXACTEXCEPTION:
 		update_trap_counts(Fpu_register, aflags, bflags, trap_counts);
diff --git a/mm/memory.c b/mm/memory.c
index 09e4b1b..21c2916 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -616,7 +616,7 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	 * in the parent and the child
 	 */
 	if (is_cow_mapping(vm_flags)) {
-		ptep_set_wrprotect(src_mm, addr, src_pte);
+		ptep_set_wrprotect(vma, src_mm, addr, src_pte);
 		pte = pte_wrprotect(pte);
 	}
 

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Bug#561203: threads and fork on machine with VIPT-WB cache
  2010-04-08 22:44                     ` John David Anglin
  2010-04-09 14:14                       ` Carlos O'Donell
@ 2010-06-02 15:33                       ` Modestas Vainius
  2010-06-02 17:16                         ` John David Anglin
  1 sibling, 1 reply; 74+ messages in thread
From: Modestas Vainius @ 2010-06-02 15:33 UTC (permalink / raw)
  To: John David Anglin
  Cc: dave.anglin, deller, gniibe, linux-parisc, pkg-gauche-devel, 561203

[-- Attachment #1: Type: Text/Plain, Size: 531 bytes --]

Hello,

this bug [1] is back to the "very common" department with eglibc 2.11 (libc6-
dev_2.11.1-1) builds. The majority of KDE applications are failing to build on 
hppa again. Is there really nothing that could be done to fix it?

1. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=561203
2. 
https://buildd.debian.org/fetch.cgi?pkg=kde4libs;ver=4%3A4.4.4-1;arch=hppa;stamp=1275467025
3. 
https://buildd.debian.org/fetch.cgi?pkg=basket;ver=1.80-1;arch=hppa;stamp=1275483241

-- 
Modestas Vainius <modestas@vainius.eu>

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: threads and fork on machine with VIPT-WB cache
  2010-06-02 15:33                       ` Bug#561203: threads and fork on machine with VIPT-WB cache Modestas Vainius
@ 2010-06-02 17:16                         ` John David Anglin
  2010-06-02 17:56                           ` Bug#561203: " dann frazier
  0 siblings, 1 reply; 74+ messages in thread
From: John David Anglin @ 2010-06-02 17:16 UTC (permalink / raw)
  To: Modestas Vainius
  Cc: dave.anglin, deller, gniibe, linux-parisc, pkg-gauche-devel, 561203

On Wed, 02 Jun 2010, Modestas Vainius wrote:

> Hello,
> 
> this bug [1] is back to the "very common" department with eglibc 2.11 (libc6-
> dev_2.11.1-1) builds. Majority of KDE applications are failing to build on 
> hppa again. Is there really nothing what could be done to fix it?

I will just say it is very tricky.  I think a fix is possible (arm and mips
had similar cache problems) but the victim replacement present in PA8800/PA8900
caches makes the problem especially difficult for hardware using these
processors.

I have spent the last few months testing various alternatives and have
now done hundreds of kernel builds.  I did post some experimental patches
that fix the problem on UP kernels.  However, the problem is not resolved
for SMP kernels.

The minifail test is a good one to demonstrate the problem.  Indeed,
a very similar test was given in the thread below:
http://readlist.com/lists/vger.kernel.org/linux-kernel/54/270861.html

This thread also discusses the PA8800 problem:
http://readlist.com/lists/vger.kernel.org/linux-kernel/54/271417.html

I currently surmise that we have a problem with the cache victim
replacement, although the cause isn't clear.  I did find recently
that the cache prefetch in copy_user_page_asm extends to the line
beyond the end of the page, but fixing this doesn't resolve the problem.

I am still experimenting with using equivalent aliasing.  It does
help to flush in ptep_set_wrprotect.
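
For illustration, the uniprocessor side of that idea looks roughly like
the following (a minimal sketch using the usual pgtable.h helpers and the
vma-taking prototype from the patches in this thread; it is not the exact
patch being tested):

static inline void ptep_set_wrprotect(struct vm_area_struct *vma,
				      struct mm_struct *mm,
				      unsigned long addr, pte_t *ptep)
{
	pte_t old_pte = *ptep;

	set_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
	/* Write any dirty user cache lines for this page back to memory
	 * before the COW copy is made, so parent and child start from
	 * the same data. */
	flush_cache_page(vma, addr, pte_pfn(old_pte));
}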

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Bug#561203: threads and fork on machine with VIPT-WB cache
  2010-06-02 17:16                         ` John David Anglin
@ 2010-06-02 17:56                           ` dann frazier
  2010-06-03  8:50                             ` Modestas Vainius
  0 siblings, 1 reply; 74+ messages in thread
From: dann frazier @ 2010-06-02 17:56 UTC (permalink / raw)
  To: John David Anglin, 561203
  Cc: Modestas Vainius, deller, gniibe, linux-parisc, pkg-gauche-devel

On Wed, Jun 02, 2010 at 01:16:01PM -0400, John David Anglin wrote:
> On Wed, 02 Jun 2010, Modestas Vainius wrote:
> 
> > Hello,
> > 
> > this bug [1] is back to the "very common" department with eglibc 2.11 (libc6-
> > dev_2.11.1-1) builds. Majority of KDE applications are failing to build on 
> > hppa again. Is there really nothing what could be done to fix it?
> 
> I will just say it is very tricky.  I think a fix is possible (arm and mips
> had similar cache problems) but the victim replacement present in PA8800/PA8900
> caches makes the problem especially difficult  for hardware using these
> processors.
> 
> I have spent the last few months testing various alternatives and have
> now done hundreds of kernel builds.  I did post some experimental patches
> that fix the problem on UP kernels.  However, the problem is not resolved
> for SMP kernels.

Note that Debian's buildds run a UP kernel, so as soon as those fixes
go upstream we can pull them in. Thanks for all your work here!

> The minifail test is a good one to demonstrate the problem.  Indeed,
> a very similar test was given in the thread below:
> http://readlist.com/lists/vger.kernel.org/linux-kernel/54/270861.html
> 
> This thread also discusses the PA8800 problem:
> http://readlist.com/lists/vger.kernel.org/linux-kernel/54/271417.html
> 
> I currently surmise that we have a problem with the cache victim
> replacement, although the cause isn't clear.  I did find recently
> that the cache prefetch in copy_user_page_asm extends to the line
> beyond the end of the page, but fixing this doesn't resolve the problem.
> 
> I am still experimenting with using equivalent aliasing.  It does
> help to flush in ptep_set_wrprotect.
> 
> Dave

-- 
dann frazier


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Bug#561203: threads and fork on machine with VIPT-WB cache
  2010-06-02 17:56                           ` Bug#561203: " dann frazier
@ 2010-06-03  8:50                             ` Modestas Vainius
  2010-06-04  1:03                               ` NIIBE Yutaka
  0 siblings, 1 reply; 74+ messages in thread
From: Modestas Vainius @ 2010-06-03  8:50 UTC (permalink / raw)
  To: dann frazier
  Cc: John David Anglin, 561203, deller, gniibe, linux-parisc,
	pkg-gauche-devel, carlos

[-- Attachment #1: Type: Text/Plain, Size: 1667 bytes --]

# Breaks unrelated applications
tags 561203 critical
thanks

Hello,

On Wednesday 02 June 2010 20:56:17 dann frazier wrote:
> On Wed, Jun 02, 2010 at 01:16:01PM -0400, John David Anglin wrote:
> > On Wed, 02 Jun 2010, Modestas Vainius wrote:
> > > Hello,
> > > 
> > > this bug [1] is back to the "very common" department with eglibc 2.11
> > > (libc6- dev_2.11.1-1) builds. Majority of KDE applications are failing
> > > to build on hppa again. Is there really nothing what could be done to
> > > fix it?
> > 
> > I will just say it is very tricky.  I think a fix is possible (arm and
> > mips had similar cache problems) but the victim replacement present in
> > PA8800/PA8900 caches makes the problem especially difficult  for
> > hardware using these processors.
> > 
> > I have spent the last few months testing various alternatives and have
> > now done hundreds of kernel builds.  I did post some experimental patches
> > that fix the problem on UP kernels.  However, the problem is not resolved
> > for SMP kernels.
> 
> Note that Debian's buildds run a UP kernel, so as soon as those fixes
> go upstream we can pull them in. Thanks for all your work here!
> 

Well, as long as this is unfixed or at least "common", I don't see how hppa 
can be considered to be a release arch. Is that UP patch available somewhere?

All KDE applications have been stuck in unstable before due to this and 
history is about to repeat itself unless something is done. While apparently a 
failing test in eglibc can be ignored, other applications have to suffer real 
world problems...

-- 
Modestas Vainius <modestas@vainius.eu>

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Bug#561203: threads and fork on machine with VIPT-WB cache
  2010-06-03  8:50                             ` Modestas Vainius
@ 2010-06-04  1:03                               ` NIIBE Yutaka
  2010-06-04  5:21                                 ` dann frazier
  0 siblings, 1 reply; 74+ messages in thread
From: NIIBE Yutaka @ 2010-06-04  1:03 UTC (permalink / raw)
  To: Modestas Vainius
  Cc: dann frazier, John David Anglin, 561203, deller, linux-parisc,
	pkg-gauche-devel, carlos

Modestas Vainius wrote:
>> Note that Debian's buildds run a UP kernel, so as soon as those fixes
>> go upstream we can pull them in. Thanks for all your work here!
>>
>
> Well, as long as this is unfixed or at least "common", I don't see how hppa
> can be considered to be a release arch. Is that UP patch available somewhere?

My case and my analysis talked about the UP kernel, and John David Anglin
made a patch:
	http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=561203#144

After that, the discussion went to SMP cases.

It would be better to evaluate the patch again, make sure it works
for the UP case and fixes the buildd failures, and then apply it to Linux
in Debian (only) for HPPA.

I know that the patch is not ideal because it touches an
architecture-independent part of Linux, but it is worthwhile for Linux in
Debian (or at least for Linux on the HPPA buildd machine).
-- 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Bug#561203: threads and fork on machine with VIPT-WB cache
  2010-06-04  1:03                               ` NIIBE Yutaka
@ 2010-06-04  5:21                                 ` dann frazier
  2010-06-04 10:44                                   ` Thibaut VARENE
  2010-06-06  1:01                                   ` Modestas Vainius
  0 siblings, 2 replies; 74+ messages in thread
From: dann frazier @ 2010-06-04  5:21 UTC (permalink / raw)
  To: NIIBE Yutaka, 561203
  Cc: Modestas Vainius, John David Anglin, deller, linux-parisc,
	pkg-gauche-devel, carlos

On Fri, Jun 04, 2010 at 10:03:07AM +0900, NIIBE Yutaka wrote:
> Modestas Vainius wrote:
>>> Note that Debian's buildds run a UP kernel, so as soon as those fixes
>>> go upstream we can pull them in. Thanks for all your work here!
>>>
>>
>> Well, as long as this is unfixed or at least "common", I don't see how hppa
>> can be considered to be a release arch. Is that UP patch available somewhere?
>
> My case and my analysis talked about UP kernel, and John David Anglin
> made a patch:
> 	http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=561203#144
>
> After that, the discussion went to SMP cases.
>
> It would be better to evaluate the patch again, and make sure it works
> for UP case and fix failures of buildd, then apply for Linux in Debian
> (only) for HPPA.
>
> I know that the patch is not that ideal because it touches
> architecture independent part of Linux, but it is worth for Linux in
> Debian (or Linux for the HPPA machine of buildd, at least).

I'm happy to test the patch if necessary to help push this change
upstream. However, we do need the change to go upstream before we can
include it in the Debian kernel.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Bug#561203: threads and fork on machine with VIPT-WB cache
  2010-06-04  5:21                                 ` dann frazier
@ 2010-06-04 10:44                                   ` Thibaut VARENE
  2010-06-07 17:11                                     ` dann frazier
  2010-06-06  1:01                                   ` Modestas Vainius
  1 sibling, 1 reply; 74+ messages in thread
From: Thibaut VARENE @ 2010-06-04 10:44 UTC (permalink / raw)
  To: dann frazier
  Cc: NIIBE Yutaka, 561203, Modestas Vainius, John David Anglin,
	deller, linux-parisc, pkg-gauche-devel, carlos

On Fri, Jun 4, 2010 at 7:21 AM, dann frazier <dannf@debian.org> wrote:
> On Fri, Jun 04, 2010 at 10:03:07AM +0900, NIIBE Yutaka wrote:
>> Modestas Vainius wrote:
>>>> Note that Debian's buildds run a UP kernel, so as soon as those fixes
>>>> go upstream we can pull them in. Thanks for all your work here!
>>>>
>>>
>>> Well, as long as this is unfixed or at least "common", I don't see how hppa
>>> can be considered to be a release arch. Is that UP patch available somewhere?
>>
>> My case and my analysis talked about UP kernel, and John David Anglin
>> made a patch:
>> 	http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=561203#144
>>
>> After that, the discussion went to SMP cases.
>>
>> It would be better to evaluate the patch again, and make sure it works
>> for UP case and fix failures of buildd, then apply for Linux in Debian
>> (only) for HPPA.
>>
>> I know that the patch is not that ideal because it touches
>> architecture independent part of Linux, but it is worth for Linux in
>> Debian (or Linux for the HPPA machine of buildd, at least).
>
> I'm happy to test the patch if necessary to help push this change
> upstream. However, we do need the change to go upstream before we can
> include it in the Debian kernel.

Just for reference, I've summarized the test cases and related patches here:
http://wiki.parisc-linux.org/TestCases

HTH

-- 
Thibaut VARENE
http://www.parisc-linux.org/~varenet/
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Bug#561203: threads and fork on machine with VIPT-WB cache
  2010-06-04  5:21                                 ` dann frazier
  2010-06-04 10:44                                   ` Thibaut VARENE
@ 2010-06-06  1:01                                   ` Modestas Vainius
  1 sibling, 0 replies; 74+ messages in thread
From: Modestas Vainius @ 2010-06-06  1:01 UTC (permalink / raw)
  To: dann frazier
  Cc: NIIBE Yutaka, 561203, John David Anglin, deller, linux-parisc,
	pkg-gauche-devel, carlos

[-- Attachment #1: Type: Text/Plain, Size: 1328 bytes --]

Hello,

On Friday, 04 June 2010 08:21:06 dann frazier wrote:
> > My case and my analysis talked about UP kernel, and John David Anglin
> > 
> > made a patch:
> > 	http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=561203#144
> > 
> > After that, the discussion went to SMP cases.
> > 
> > It would be better to evaluate the patch again, and make sure it works
> > for UP case and fix failures of buildd, then apply for Linux in Debian
> > (only) for HPPA.
> > 
> > I know that the patch is not that ideal because it touches
> > architecture independent part of Linux, but it is worth for Linux in
> > Debian (or Linux for the HPPA machine of buildd, at least).
> 
> I'm happy to test the patch if necessary to help push this change
> upstream. However, we do need the change to go upstream before we can
> include it in the Debian kernel.

I made a hackish patch for QProcess in Qt (usleep(1000) before fork()) which 
seems to make the failure very rare again. Once a new revision of qt4-x11 is 
uploaded to sid (soon, I believe), KDE applications should be able to build 
again (hopefully).

Obviously it would be better to get this bug fixed for real, but at least now 
the whole KDE stack won't be held up by it while we wait.
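
For anyone curious, the idea is roughly the following (a minimal sketch of
the workaround, not the actual qt4-x11 change; the helper name and the exact
delay are only illustrative):

#include <unistd.h>
#include <sys/types.h>

/* Hypothetical helper: pause briefly before fork() in a threaded
 * process so that pending writes from other threads are more likely
 * to have reached memory before the COW mappings are set up.  This
 * only narrows the race window; it does not fix the underlying
 * cache-flush problem. */
static pid_t fork_with_delay(void)
{
	usleep(1000);	/* roughly the 1 ms delay used in the QProcess hack */
	return fork();
}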

-- 
Modestas Vainius <modestas@vainius.eu>

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Bug#561203: threads and fork on machine with VIPT-WB cache
  2010-06-04 10:44                                   ` Thibaut VARENE
@ 2010-06-07 17:11                                     ` dann frazier
  2010-06-07 18:27                                       ` Thibaut VARÈNE
  0 siblings, 1 reply; 74+ messages in thread
From: dann frazier @ 2010-06-07 17:11 UTC (permalink / raw)
  To: Thibaut VARENE, 561203
  Cc: NIIBE Yutaka, Modestas Vainius, John David Anglin, deller,
	linux-parisc, pkg-gauche-devel, carlos

On Fri, Jun 04, 2010 at 12:44:55PM +0200, Thibaut VARENE wrote:
> On Fri, Jun 4, 2010 at 7:21 AM, dann frazier <dannf@debian.org> wrote:
> > On Fri, Jun 04, 2010 at 10:03:07AM +0900, NIIBE Yutaka wrote:
> >> Modestas Vainius wrote:
> >>>> Note that Debian's buildds run a UP kernel, so as soon as those fixes
> >>>> go upstream we can pull them in. Thanks for all your work here!
> >>>>
> >>>
> >>> Well, as long as this is unfixed or at least "common", I don't see how hppa
> >>> can be considered to be a release arch. Is that UP patch available somewhere?
> >>
> >> My case and my analysis talked about UP kernel, and John David Anglin
> >> made a patch:
> >> 	http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=561203#144
> >>
> >> After that, the discussion went to SMP cases.
> >>
> >> It would be better to evaluate the patch again, and make sure it works
> >> for UP case and fix failures of buildd, then apply for Linux in Debian
> >> (only) for HPPA.
> >>
> >> I know that the patch is not that ideal because it touches
> >> architecture independent part of Linux, but it is worth for Linux in
> >> Debian (or Linux for the HPPA machine of buildd, at least).
> >
> > I'm happy to test the patch if necessary to help push this change
> > upstream. However, we do need the change to go upstream before we can
> > include it in the Debian kernel.
>
> Just for reference, I've summarized the test cases and related patches here:
> http://wiki.parisc-linux.org/TestCases

Cool - that is helpful. I've updated the kernel on peri/penalosa with
the various patches listed there that have gone upstream, but I'm not
seeing better results with any failing packages.

btw, I thought it would be useful to edit that page and tag each patch
with its status in Debian (in-official-kernel, installed-on-buildds,
etc), but the page appears to be immutable.

-- 
dann frazier


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Bug#561203: threads and fork on machine with VIPT-WB cache
  2010-06-07 17:11                                     ` dann frazier
@ 2010-06-07 18:27                                       ` Thibaut VARÈNE
  2010-06-07 23:33                                         ` dann frazier
  0 siblings, 1 reply; 74+ messages in thread
From: Thibaut VARÈNE @ 2010-06-07 18:27 UTC (permalink / raw)
  To: dann frazier; +Cc: linux-parisc

On 7 Jun 2010, at 19:11, dann frazier wrote:

> btw, I thought it would be useful to edit that page and tag each patch
> with its status in Debian (in-official-kernel, installed-on-buildds,
> etc), but the page appears to be immutable.


The wiki can only be edited by registered (and approved) users (we had
a lot of defacing), see [1] for details.

Please register, I'd be happy to approve your account ;-)

Cheers,
T-Bone

[1] http://wiki.parisc-linux.org/WikiAccessPolicy

-- 
Thibaut VARÈNE
http://www.parisc-linux.org/~varenet/


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: Bug#561203: threads and fork on machine with VIPT-WB cache
  2010-06-07 18:27                                       ` Thibaut VARÈNE
@ 2010-06-07 23:33                                         ` dann frazier
  0 siblings, 0 replies; 74+ messages in thread
From: dann frazier @ 2010-06-07 23:33 UTC (permalink / raw)
  To: Thibaut VARÈNE; +Cc: linux-parisc

On Mon, Jun 07, 2010 at 08:27:44PM +0200, Thibaut VARÈNE wrote:
> On 7 Jun 2010, at 19:11, dann frazier wrote:
>
>> btw, I thought it would be useful to edit that page and tag each patch
>> with its status in Debian (in-official-kernel, installed-on-buildds,
>> etc), but the page appears to be immutable.
>
>
> The wiki can only be edited by registered (and approved) users (we had a
> lot of defacing), see [1] for details.
>
> Please register, I'd be happy to approve your account ;-)

ah, ok. My username is DannFrazier

> Cheers,
> T-Bone
>
> [1] http://wiki.parisc-linux.org/WikiAccessPolicy



-- 
dann frazier


^ permalink raw reply	[flat|nested] 74+ messages in thread

end of thread, other threads:[~2010-06-07 23:33 UTC | newest]

Thread overview: 74+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <4BA43CE5.4020807@fsij.org>
     [not found] ` <87hbo4ek8l.fsf@thialfi.karme.de>
     [not found]   ` <4BB18B46.2070203@fsij.org>
     [not found]     ` <4BB53D26.60601@fsij.org>
2010-04-02  2:41       ` threads and fork on machine with VIPT-WB cache NIIBE Yutaka
2010-04-02  3:30         ` James Bottomley
2010-04-02  3:48           ` NIIBE Yutaka
2010-04-02  8:05             ` NIIBE Yutaka
2010-04-02 19:35               ` John David Anglin
2010-04-08 21:11                 ` Helge Deller
2010-04-08 21:54                   ` John David Anglin
2010-04-08 22:44                     ` John David Anglin
2010-04-09 14:14                       ` Carlos O'Donell
2010-04-09 15:13                         ` John David Anglin
2010-04-09 15:48                           ` James Bottomley
2010-04-09 16:22                             ` John David Anglin
2010-04-09 16:31                               ` James Bottomley
2010-04-10 20:46                           ` Helge Deller
2010-04-10 21:56                             ` John David Anglin
2010-04-10 22:53                             ` John David Anglin
2010-04-11 18:50                               ` Helge Deller
2010-04-11 22:25                                 ` John David Anglin
2010-04-12 21:02                                   ` Helge Deller
2010-04-12 21:41                                     ` John David Anglin
2010-04-13 11:55                                       ` Helge Deller
2010-04-13 14:03                                         ` John David Anglin
2010-04-15 22:35                                         ` John David Anglin
2010-04-19 16:26                                         ` John David Anglin
2010-04-20 17:59                                           ` Helge Deller
2010-04-20 18:52                                             ` John David Anglin
2010-05-09 12:43                                             ` John David Anglin
2010-05-09 14:14                                               ` Carlos O'Donell
2010-05-10  9:56                                               ` Helge Deller
2010-05-10 14:56                                                 ` John David Anglin
2010-05-10 19:20                                                   ` Helge Deller
2010-05-10 21:07                                                     ` John David Anglin
2010-05-11 16:37                                                       ` John David Anglin
2010-05-11 21:39                                                         ` John David Anglin
2010-05-11 20:44                                                       ` Helge Deller
2010-05-11 20:41                                                     ` Helge Deller
2010-05-11 21:26                                                       ` John David Anglin
2010-05-11 21:41                                                         ` Helge Deller
2010-05-15 21:02                                                           ` John David Anglin
2010-05-16 20:22                                                             ` Helge Deller
2010-05-16 21:38                                                               ` John David Anglin
2010-05-22 17:25                                                               ` John David Anglin
2010-05-23 13:11                                                             ` Carlos O'Donell
2010-05-23 14:43                                                               ` John David Anglin
2010-05-01 18:34                                           ` Thibaut VARENE
2010-05-01 20:17                                             ` John David Anglin
2010-05-02 10:53                                               ` Thibaut VARÈNE
2010-04-11 16:36                           ` [PATCH] Call pagefault_disable/pagefault_enable in kmap_atomic/kunmap_atomic John David Anglin
2010-04-11 17:03                         ` [PATCH] Remove unnecessary macros from entry.S John David Anglin
2010-04-11 17:08                         ` [PATCH] Delete unnecessary nop's in entry.S John David Anglin
2010-04-11 17:12                         ` [PATCH] Avoid interruption in critical region " John David Anglin
2010-04-11 18:24                           ` James Bottomley
2010-04-11 18:45                             ` John David Anglin
2010-04-11 18:53                               ` James Bottomley
2010-04-11 17:26                         ` [PATCH] LWS fixes for syscall.S John David Anglin
2010-06-02 15:33                       ` Bug#561203: threads and fork on machine with VIPT-WB cache Modestas Vainius
2010-06-02 17:16                         ` John David Anglin
2010-06-02 17:56                           ` Bug#561203: " dann frazier
2010-06-03  8:50                             ` Modestas Vainius
2010-06-04  1:03                               ` NIIBE Yutaka
2010-06-04  5:21                                 ` dann frazier
2010-06-04 10:44                                   ` Thibaut VARENE
2010-06-07 17:11                                     ` dann frazier
2010-06-07 18:27                                       ` Thibaut VARÈNE
2010-06-07 23:33                                         ` dann frazier
2010-06-06  1:01                                   ` Modestas Vainius
2010-04-02 12:22             ` James Bottomley
2010-04-05  0:39               ` NIIBE Yutaka
2010-04-05  2:51                 ` John David Anglin
2010-04-05  2:58                   ` John David Anglin
2010-04-05 16:18                   ` James Bottomley
2010-04-06  4:57                     ` NIIBE Yutaka
2010-04-06 13:37                       ` James Bottomley
2010-04-06 13:44                         ` James Bottomley
