[PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

Hi All,

This patch series attempts to update the book3s 64 Linux page table format to
make it more flexible. Our current pte format is very restrictive: we overload
multiple pte bits because no free bits are available in pte_t, which we use to
track the validity of 4K subpages. This patch series frees up 11 bits in pte_t
by tracking four 4K subpages with one bit.
The pte format is updated such that we have a better method for identifying a
pte entry at the pmd level. This will also enable us to implement hugetlb
migration (not yet done in this series). Before making the changes to the pte
format, I am splitting the pte header definitions so that we end up with the
layout below (a sketch of how these headers might be selected follows the
tree):

book3s
   32
     hash.h pgtable.h
   64
     hash.h  pgtable.h hash-4k.h hash-64k.h
booke
  32
     pgtable.h pte-40x.h pte-44x.h pte-8xx.h pte-fsl-booke.h
  64
    pgtable-4k.h  pgtable-64k.h  pgtable.h
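
To illustrate the split, the top-level hash header can pick the right variant
with a config guard. This is a minimal sketch; the guard and file names here
are my assumptions based on the layout above, not necessarily the series'
exact code:

/* asm/book3s/64/hash.h (sketch; names assumed from the layout above) */
#ifdef CONFIG_PPC_64K_PAGES
#include <asm/book3s/64/hash-64k.h>
#else
#include <asm/book3s/64/hash-4k.h>
#endif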

I have done the header split such that the booke headers are modified as
little as possible, to avoid causing breakage on booke.
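
As for the pte change itself, the idea is a single validity bit per 4K
subpage of a 64K page rather than overloading several pte bits. A minimal
sketch with hypothetical names, independent of the series' actual bit layout:

/* Sketch: one validity bit per 4K subpage (names are hypothetical) */
#define SUBPAGE_SHIFT		12	/* 4K subpages */
#define SUBPAGES_PER_PAGE	(1 << (16 - SUBPAGE_SHIFT))	/* 16 in a 64K page */

static inline int subpage_index(unsigned long addr)
{
	return (addr >> SUBPAGE_SHIFT) & (SUBPAGES_PER_PAGE - 1);
}

static inline int subpage_valid(unsigned long valid_bits, unsigned long addr)
{
	return (valid_bits >> subpage_index(addr)) & 1;
}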

The patch series can also be found at
https://github.com/kvaneesh/linux.git book3s-pte-format-v2
https://github.com/kvaneesh/linux/commits/book3s-pte-format-v2

Changes from V5:
* We need the minimum pte fragment size to be 4096 bytes; this is needed to
  track base page information with a THP config using 4K hash ptes. Hence drop
  the changes that freed up the second half of pgtable_t. This implies we make
  no change to the size of the pte fragment in this series (see the sizing
  sketch below).
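
For reference, the 4096-byte minimum falls out of the fragment arithmetic:
assuming 256 ptes per fragment (as on 64K-page configs) at 8 bytes each gives
2048 bytes, and keeping the second half for base page tracking doubles that.
A sketch with illustrative names, where PAGE_SIZE is assumed to be 64K:

/* Sketch: pte fragment sizing (illustrative, not the series' code) */
#define PTE_INDEX_SIZE		8	/* 2^8 = 256 ptes per fragment */
#define PTE_FRAG_SIZE_SHIFT	(PTE_INDEX_SIZE + 3 + 1)	/* x8 bytes, x2 halves */
#define PTE_FRAG_SIZE		(1UL << PTE_FRAG_SIZE_SHIFT)	/* 4096 bytes */
#define PTE_FRAG_NR		(PAGE_SIZE >> PTE_FRAG_SIZE_SHIFT)	/* 16 per 64K page */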

Changes from V4:
* Rebase onto the latest Linus tree
* Add mmtests numbers

Changes from V3:
* Add missing #define pgprot_*
* Add Acked-by

Changes from V2:
* Rebase onto the powerpc -next tree

Changes from V1:
1) Build fix with STRICT_MM_TYPECHECKS enabled
2) pte_mkwrite fix for nohash
3) Rebase onto the latest Linus tree


Performance numbers with and without the patch series.
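
For reading the mmtests tables below: Hmean is the harmonic mean across
iterations, CoeffVar is the standard deviation as a percentage of the mean,
and the bracketed percentages are relative to the first kernel's value. A
small standalone sketch of the two statistics (plain C, not mmtests' own
code):

#include <math.h>

/* Harmonic mean: n / sum(1/x_i); suited to rates such as faults/sec */
static double hmean(const double *x, int n)
{
	double s = 0.0;
	for (int i = 0; i < n; i++)
		s += 1.0 / x[i];
	return n / s;
}

/* Coefficient of variation: stddev as a percentage of the mean */
static double coeffvar(const double *x, int n)
{
	double mean = 0.0, var = 0.0;
	for (int i = 0; i < n; i++)
		mean += x[i];
	mean /= n;
	for (int i = 0; i < n; i++)
		var += (x[i] - mean) * (x[i] - mean);
	return 100.0 * sqrt(var / n) / mean;
}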

-------------- 4k hash pte guest config start--------------------
aim9
                                         4.3.0                       4.3.0
                          13672-gc669d35-dirty        13708-g8e488c2-dirty
Min      page_test   460586.67 (  0.00%)   435200.00 ( -5.51%)
Min      brk_test   4155429.71 (  0.00%)  4231312.46 (  1.83%)
Min      exec_test      960.69 (  0.00%)      904.33 ( -5.87%)
Min      fork_test     2142.38 (  0.00%)     2059.80 ( -3.85%)
Hmean    page_test   467906.57 (  0.00%)   445186.92 ( -4.86%)
Hmean    brk_test   4260526.41 (  0.00%)  4257237.07 ( -0.08%)
Hmean    exec_test     1013.71 (  0.00%)      966.27 ( -4.68%)
Hmean    fork_test     2292.90 (  0.00%)     2257.22 ( -1.56%)
Stddev   page_test     4416.83 (  0.00%)     4694.17 ( -6.28%)
Stddev   brk_test     72844.41 (  0.00%)    12520.81 ( 82.81%)
Stddev   exec_test       27.23 (  0.00%)       30.88 (-13.41%)
Stddev   fork_test       76.29 (  0.00%)      106.89 (-40.10%)
CoeffVar page_test        0.94 (  0.00%)        1.05 (-11.70%)
CoeffVar brk_test         1.71 (  0.00%)        0.29 ( 82.79%)
CoeffVar exec_test        2.68 (  0.00%)        3.19 (-18.94%)
CoeffVar fork_test        3.32 (  0.00%)        4.72 (-42.16%)
Max      page_test   473053.33 (  0.00%)   452426.67 ( -4.36%)
Max      brk_test   4366733.33 (  0.00%)  4272666.67 ( -2.15%)
Max      exec_test     1056.67 (  0.00%)     1014.67 ( -3.97%)
Max      fork_test     2396.80 (  0.00%)     2441.78 (  1.88%)

               4.3.0       4.3.0
        13672-gc669d35-dirty  13708-g8e488c2-dirty
User            0.24        0.23
System          0.47        0.51
Elapsed       723.43      723.04

                                 4.3.0       4.3.0
                          13672-gc669d35-dirty  13708-g8e488c2-dirty
Minor Faults                  37700464    36522147
Major Faults                        25          25
Swap Ins                             0           0
Swap Outs                            0           0
Allocation stalls                    0           0
DMA allocs                    12485233    11996182
DMA32 allocs                         0           0
Normal allocs                        0           0
Movable allocs                       0           0
Direct pages scanned                 0           0
Kswapd pages scanned                 0           0
Kswapd pages reclaimed               0           0
Direct pages reclaimed               0           0
Kswapd efficiency                 100%        100%
Kswapd velocity                  0.000       0.000
Direct efficiency                 100%        100%
Direct velocity                  0.000       0.000
Percentage direct scans             0%          0%
Zone normal velocity             0.000       0.000
Zone dma32 velocity              0.000       0.000
Zone dma velocity                0.000       0.000
Page writes by reclaim           0.000       0.000
Page writes file                     0           0
Page writes anon                     0           0
Page reclaim immediate               0           0
Sector Reads                      4224        4224
Sector Writes                    69944       69296
Page rescued immediate               0           0
Slabs scanned                        0           0
Direct inode steals                  0           0
Kswapd inode steals                  0           0
Kswapd skipped wait                  0           0
THP fault alloc                      0           0
THP collapse alloc                   0           0
THP splits                           0           0
THP fault fallback                   0           0
THP collapse fail                    0           0
Compaction stalls                    0           0
Compaction success                   0           0
Compaction failures                  0           0
Page migrate success                 0           0
Page migrate failure                 0           0
Compaction pages isolated            0           0
Compaction migrate scanned           0           0
Compaction free scanned              0           0
Compaction cost                      0           0
NUMA alloc hit                12482507    11993566
NUMA alloc miss                      0           0
NUMA interleave hit                  0           0
NUMA alloc local              12482507    11993566
NUMA base PTE updates              647         643
NUMA huge PMD updates                0           0
NUMA page range updates            647         643
NUMA hint faults                844060      873604
NUMA hint local faults          844060      873604
NUMA hint local percent            100         100
NUMA pages migrated                  0           0
AutoNUMA cost                    4220%       4368%

pft timings
                                         4.3.0                       4.3.0
                          13672-gc669d35-dirty        13708-g8e488c2-dirty
Min      system-1        0.0700 (  0.00%)       0.0700 (  0.00%)
Min      system-3        0.1400 (  0.00%)       0.1300 (  7.14%)
Min      system-5        0.2400 (  0.00%)       0.2300 (  4.17%)
Min      system-7        0.2500 (  0.00%)       0.2800 (-12.00%)
Min      system-8        0.2800 (  0.00%)       0.2900 ( -3.57%)
Min      elapsed-1       0.0900 (  0.00%)       0.0800 ( 11.11%)
Min      elapsed-3       0.0600 (  0.00%)       0.0600 (  0.00%)
Min      elapsed-5       0.0600 (  0.00%)       0.0600 (  0.00%)
Min      elapsed-7       0.0600 (  0.00%)       0.0600 (  0.00%)
Min      elapsed-8       0.0600 (  0.00%)       0.0600 (  0.00%)
Amean    system-1        0.1677 (  0.00%)       0.1574 (  6.18%)
Amean    system-3        0.1714 (  0.00%)       0.1575 (  8.10%)
Amean    system-5        0.3611 (  0.00%)       0.2919 ( 19.18%)
Amean    system-7        0.3364 (  0.00%)       0.3396 ( -0.97%)
Amean    system-8        0.4108 (  0.00%)       0.3704 (  9.83%)
Amean    elapsed-1       0.2040 (  0.00%)       0.1929 (  5.45%)
Amean    elapsed-3       0.0733 (  0.00%)       0.0686 (  6.31%)
Amean    elapsed-5       0.0968 (  0.00%)       0.0779 ( 19.51%)
Amean    elapsed-7       0.0656 (  0.00%)       0.0665 ( -1.33%)
Amean    elapsed-8       0.0690 (  0.00%)       0.0622 (  9.78%)
Stddev   system-1        0.3338 (  0.00%)       0.3156 (  5.45%)
Stddev   system-3        0.0203 (  0.00%)       0.0150 ( 26.29%)
Stddev   system-5        0.2636 (  0.00%)       0.0340 ( 87.11%)
Stddev   system-7        0.0343 (  0.00%)       0.0316 (  7.83%)
Stddev   system-8        0.2066 (  0.00%)       0.0386 ( 81.31%)
Stddev   elapsed-1       0.4417 (  0.00%)       0.4287 (  2.95%)
Stddev   elapsed-3       0.0085 (  0.00%)       0.0072 ( 15.09%)
Stddev   elapsed-5       0.0706 (  0.00%)       0.0089 ( 87.39%)
Stddev   elapsed-7       0.0063 (  0.00%)       0.0071 (-12.63%)
Stddev   elapsed-8       0.0335 (  0.00%)       0.0042 ( 87.54%)
CoeffVar system-1      198.9598 (  0.00%)     200.5134 ( -0.78%)
CoeffVar system-3       11.8421 (  0.00%)       9.4973 ( 19.80%)
CoeffVar system-5       72.9902 (  0.00%)      11.6450 ( 84.05%)
CoeffVar system-7       10.1931 (  0.00%)       9.3046 (  8.72%)
CoeffVar system-8       50.2929 (  0.00%)      10.4259 ( 79.27%)
CoeffVar elapsed-1     216.5224 (  0.00%)     222.2603 ( -2.65%)
CoeffVar elapsed-3      11.5790 (  0.00%)      10.4938 (  9.37%)
CoeffVar elapsed-5      72.9538 (  0.00%)      11.4303 ( 84.33%)
CoeffVar elapsed-7       9.5902 (  0.00%)      10.6597 (-11.15%)
CoeffVar elapsed-8      48.5886 (  0.00%)       6.7081 ( 86.19%)
Max      system-1        2.6900 (  0.00%)       2.7500 ( -2.23%)
Max      system-3        0.2600 (  0.00%)       0.2300 ( 11.54%)
Max      system-5        1.4800 (  0.00%)       0.3800 ( 74.32%)
Max      system-7        0.4300 (  0.00%)       0.4400 ( -2.33%)
Max      system-8        2.2000 (  0.00%)       0.4300 ( 80.45%)
Max      elapsed-1       3.4200 (  0.00%)       3.7200 ( -8.77%)
Max      elapsed-3       0.1100 (  0.00%)       0.0900 ( 18.18%)
Max      elapsed-5       0.4200 (  0.00%)       0.1000 ( 76.19%)
Max      elapsed-7       0.0800 (  0.00%)       0.0800 (  0.00%)
Max      elapsed-8       0.3600 (  0.00%)       0.0700 ( 80.56%)

pft faults
                                            4.3.0                       4.3.0
                             13672-gc669d35-dirty        13708-g8e488c2-dirty
Min      faults/cpu-1    4826.3370 (  0.00%)    4724.0920 ( -2.12%)
Min      faults/cpu-3   48054.3130 (  0.00%)   52057.8790 (  8.33%)
Min      faults/cpu-5    8657.3530 (  0.00%)   32656.8230 (277.21%)
Min      faults/cpu-7   28499.3350 (  0.00%)   28117.8470 ( -1.34%)
Min      faults/cpu-8    5785.6660 (  0.00%)   28468.9430 (392.06%)
Min      faults/sec-1    3818.0350 (  0.00%)    3503.7170 ( -8.23%)
Min      faults/sec-3  115163.0990 (  0.00%)  142861.2660 ( 24.05%)
Min      faults/sec-5   31307.4700 (  0.00%)  126644.8950 (304.52%)
Min      faults/sec-7  161275.0230 (  0.00%)  154987.8940 ( -3.90%)
Min      faults/sec-8   36175.2080 (  0.00%)  181500.8080 (401.73%)
Hmean    faults/cpu-1   73887.2577 (  0.00%)   78026.1722 (  5.60%)
Hmean    faults/cpu-3   71522.1756 (  0.00%)   77089.5360 (  7.78%)
Hmean    faults/cpu-5   34277.6489 (  0.00%)   42103.7728 ( 22.83%)
Hmean    faults/cpu-7   36418.2317 (  0.00%)   36054.2105 ( -1.00%)
Hmean    faults/cpu-8   29741.2065 (  0.00%)   33195.0525 ( 11.61%)
Hmean    faults/sec-1   63979.9276 (  0.00%)   67449.3892 (  5.42%)
Hmean    faults/sec-3  175991.8241 (  0.00%)  189556.4239 (  7.71%)
Hmean    faults/sec-5  134694.5971 (  0.00%)  167086.7980 ( 24.05%)
Hmean    faults/sec-7  196906.4585 (  0.00%)  194505.0606 ( -1.22%)
Hmean    faults/sec-8  190635.0992 (  0.00%)  210558.9780 ( 10.45%)
Stddev   faults/cpu-1   28421.3218 (  0.00%)   31119.8455 ( -9.49%)
Stddev   faults/cpu-3    7433.8195 (  0.00%)    7024.7950 (  5.50%)
Stddev   faults/cpu-5   10013.2881 (  0.00%)    4757.1716 ( 52.49%)
Stddev   faults/cpu-7    3856.6692 (  0.00%)    3128.6064 ( 18.88%)
Stddev   faults/cpu-8    4506.9641 (  0.00%)    3753.0081 ( 16.73%)
Stddev   faults/sec-1   27405.3207 (  0.00%)   29714.3764 ( -8.43%)
Stddev   faults/sec-3   16581.7614 (  0.00%)   16406.9053 (  1.05%)
Stddev   faults/sec-5   41735.1309 (  0.00%)   18896.6954 ( 54.72%)
Stddev   faults/sec-7   13126.2068 (  0.00%)   16315.6539 (-24.30%)
Stddev   faults/sec-8   26939.6466 (  0.00%)   11738.7089 ( 56.43%)
CoeffVar faults/cpu-1      25.6278 (  0.00%)      26.5666 ( -3.66%)
CoeffVar faults/cpu-3      10.2707 (  0.00%)       9.0358 ( 12.02%)
CoeffVar faults/cpu-5      24.5397 (  0.00%)      11.1592 ( 54.53%)
CoeffVar faults/cpu-7      10.4800 (  0.00%)       8.6127 ( 17.82%)
CoeffVar faults/cpu-8      14.3209 (  0.00%)      11.1754 ( 21.96%)
CoeffVar faults/sec-1      26.0287 (  0.00%)      26.7416 ( -2.74%)
CoeffVar faults/sec-3       9.3294 (  0.00%)       8.5905 (  7.92%)
CoeffVar faults/sec-5      25.7575 (  0.00%)      11.1747 ( 56.62%)
CoeffVar faults/sec-7       6.6347 (  0.00%)       8.3276 (-25.52%)
CoeffVar faults/sec-8      13.3650 (  0.00%)       5.5575 ( 58.42%)
Max      faults/cpu-1  169773.9820 (  0.00%)  173628.2170 (  2.27%)
Max      faults/cpu-3   87876.0920 (  0.00%)   95342.5620 (  8.50%)
Max      faults/cpu-5   52178.9930 (  0.00%)   53589.5550 (  2.70%)
Max      faults/cpu-7   49544.2420 (  0.00%)   43912.2830 (-11.37%)
Max      faults/cpu-8   42351.8250 (  0.00%)   42866.1120 (  1.21%)
Max      faults/sec-1  153006.2870 (  0.00%)  159219.3870 (  4.06%)
Max      faults/sec-3  213038.1300 (  0.00%)  222970.7520 (  4.66%)
Max      faults/sec-5  216193.7690 (  0.00%)  216955.0470 (  0.35%)
Max      faults/sec-7  219260.6940 (  0.00%)  229234.7670 (  4.55%)
Max      faults/sec-8  232367.7020 (  0.00%)  229220.0190 ( -1.35%)

               4.3.0       4.3.0
        13672-gc669d35-dirty  13708-g8e488c2-dirty
User           13.47       10.80
System        138.78      134.16
Elapsed        48.68       46.84

                                 4.3.0       4.3.0
                          13672-gc669d35-dirty  13708-g8e488c2-dirty
Minor Faults                   5478166     5478364
Major Faults                         0           0
Swap Ins                             0           0
Swap Outs                            0           0
Allocation stalls                    0           0
DMA allocs                     5269869     5270049
DMA32 allocs                         0           0
Normal allocs                        0           0
Movable allocs                       0           0
Direct pages scanned                 0           0
Kswapd pages scanned                 0           0
Kswapd pages reclaimed               0           0
Direct pages reclaimed               0           0
Kswapd efficiency                 100%        100%
Kswapd velocity                  0.000       0.000
Direct efficiency                 100%        100%
Direct velocity                  0.000       0.000
Percentage direct scans             0%          0%
Zone normal velocity             0.000       0.000
Zone dma32 velocity              0.000       0.000
Zone dma velocity                0.000       0.000
Page writes by reclaim           0.000       0.000
Page writes file                     0           0
Page writes anon                     0           0
Page reclaim immediate               0           0
Sector Reads                        72          72
Sector Writes                      560         500
Page rescued immediate               0           0
Slabs scanned                        0           0
Direct inode steals                  0           0
Kswapd inode steals                  0           0
Kswapd skipped wait                  0           0
THP fault alloc                      0           0
THP collapse alloc                   0           0
THP splits                           0           0
THP fault fallback                   0           0
THP collapse fail                    0           0
Compaction stalls                    0           0
Compaction success                   0           0
Compaction failures                  0           0
Page migrate success                 0           0
Page migrate failure                 0           0
Compaction pages isolated            0           0
Compaction migrate scanned           0           0
Compaction free scanned              0           0
Compaction cost                      0           0
NUMA alloc hit                 5269751     5269915
NUMA alloc miss                      0           0
NUMA interleave hit                  0           0
NUMA alloc local               5269751     5269915
NUMA base PTE updates            23119       11610
NUMA huge PMD updates                0           0
NUMA page range updates          23119       11610
NUMA hint faults                    16           7
NUMA hint local faults              16           7
NUMA hint local percent            100         100
NUMA pages migrated                  0           0
AutoNUMA cost                       0%          0%

ebizzy Overall Throughput
                                       4.3.0                       4.3.0
                        13672-gc669d35-dirty        13708-g8e488c2-dirty
Min      Rsec-1      6486.00 (  0.00%)     6153.00 ( -5.13%)
Min      Rsec-3      9821.00 (  0.00%)    10083.00 (  2.67%)
Min      Rsec-5     10688.00 (  0.00%)    10576.00 ( -1.05%)
Min      Rsec-7     11359.00 (  0.00%)    10870.00 ( -4.30%)
Min      Rsec-12    14925.00 (  0.00%)    13443.00 ( -9.93%)
Min      Rsec-18    12493.00 (  0.00%)    11553.00 ( -7.52%)
Min      Rsec-24    13696.00 (  0.00%)    12764.00 ( -6.80%)
Min      Rsec-30    13669.00 (  0.00%)    13299.00 ( -2.71%)
Min      Rsec-32    14852.00 (  0.00%)    13008.00 (-12.42%)
Hmean    Rsec-1      6574.40 (  0.00%)     6230.95 ( -5.22%)
Hmean    Rsec-3     10005.71 (  0.00%)    10403.25 (  3.97%)
Hmean    Rsec-5     11064.39 (  0.00%)    10999.12 ( -0.59%)
Hmean    Rsec-7     11793.27 (  0.00%)    11044.37 ( -6.35%)
Hmean    Rsec-12    15130.50 (  0.00%)    13929.85 ( -7.94%)
Hmean    Rsec-18    13396.79 (  0.00%)    12284.36 ( -8.30%)
Hmean    Rsec-24    13981.41 (  0.00%)    13493.87 ( -3.49%)
Hmean    Rsec-30    14127.08 (  0.00%)    13482.65 ( -4.56%)
Hmean    Rsec-32    15170.35 (  0.00%)    13493.89 (-11.05%)
Stddev   Rsec-1        62.81 (  0.00%)       52.93 ( 15.73%)
Stddev   Rsec-3       152.04 (  0.00%)      267.01 (-75.62%)
Stddev   Rsec-5       260.81 (  0.00%)      290.26 (-11.29%)
Stddev   Rsec-7       540.61 (  0.00%)      266.96 ( 50.62%)
Stddev   Rsec-12      203.42 (  0.00%)      314.95 (-54.83%)
Stddev   Rsec-18      607.51 (  0.00%)      456.19 ( 24.91%)
Stddev   Rsec-24      182.90 (  0.00%)      698.10 (-281.69%)
Stddev   Rsec-30      500.28 (  0.00%)      194.07 ( 61.21%)
Stddev   Rsec-32      247.90 (  0.00%)      405.56 (-63.60%)
CoeffVar Rsec-1         0.96 (  0.00%)        0.85 ( 11.08%)
CoeffVar Rsec-3         1.52 (  0.00%)        2.56 (-68.84%)
CoeffVar Rsec-5         2.36 (  0.00%)        2.64 (-11.94%)
CoeffVar Rsec-7         4.57 (  0.00%)        2.42 ( 47.19%)
CoeffVar Rsec-12        1.34 (  0.00%)        2.26 (-68.11%)
CoeffVar Rsec-18        4.53 (  0.00%)        3.71 ( 18.05%)
CoeffVar Rsec-24        1.31 (  0.00%)        5.16 (-294.54%)
CoeffVar Rsec-30        3.54 (  0.00%)        1.44 ( 59.31%)
CoeffVar Rsec-32        1.63 (  0.00%)        3.00 (-83.81%)
Max      Rsec-1      6679.00 (  0.00%)     6284.00 ( -5.91%)
Max      Rsec-3     10252.00 (  0.00%)    10865.00 (  5.98%)
Max      Rsec-5     11307.00 (  0.00%)    11337.00 (  0.27%)
Max      Rsec-7     12841.00 (  0.00%)    11582.00 ( -9.80%)
Max      Rsec-12    15509.00 (  0.00%)    14361.00 ( -7.40%)
Max      Rsec-18    14229.00 (  0.00%)    12792.00 (-10.10%)
Max      Rsec-24    14266.00 (  0.00%)    14826.00 (  3.93%)
Max      Rsec-30    15055.00 (  0.00%)    13840.00 ( -8.07%)
Max      Rsec-32    15508.00 (  0.00%)    14237.00 ( -8.20%)

ebizzy Per-thread
                                       4.3.0                       4.3.0
                        13672-gc669d35-dirty        13708-g8e488c2-dirty
Min      Rsec-1      6486.00 (  0.00%)     6153.00 ( -5.13%)
Min      Rsec-3      3100.00 (  0.00%)     3232.00 (  4.26%)
Min      Rsec-5      1940.00 (  0.00%)     2018.00 (  4.02%)
Min      Rsec-7      1532.00 (  0.00%)     1465.00 ( -4.37%)
Min      Rsec-12     1107.00 (  0.00%)     1012.00 ( -8.58%)
Min      Rsec-18      585.00 (  0.00%)      538.00 ( -8.03%)
Min      Rsec-24      459.00 (  0.00%)      431.00 ( -6.10%)
Min      Rsec-30      373.00 (  0.00%)      358.00 ( -4.02%)
Min      Rsec-32      372.00 (  0.00%)      332.00 (-10.75%)
Hmean    Rsec-1      6574.40 (  0.00%)     6230.95 ( -5.22%)
Hmean    Rsec-3      3328.16 (  0.00%)     3457.35 (  3.88%)
Hmean    Rsec-5      2209.26 (  0.00%)     2197.43 ( -0.54%)
Hmean    Rsec-7      1661.38 (  0.00%)     1576.23 ( -5.13%)
Hmean    Rsec-12     1248.27 (  0.00%)     1148.41 ( -8.00%)
Hmean    Rsec-18      692.79 (  0.00%)      639.55 ( -7.68%)
Hmean    Rsec-24      570.63 (  0.00%)      545.79 ( -4.35%)
Hmean    Rsec-30      461.30 (  0.00%)      439.45 ( -4.74%)
Hmean    Rsec-32      462.88 (  0.00%)      413.36 (-10.70%)
Stddev   Rsec-1        62.81 (  0.00%)       52.93 (-15.73%)
Stddev   Rsec-3       158.59 (  0.00%)      211.53 ( 33.38%)
Stddev   Rsec-5        96.83 (  0.00%)       88.33 ( -8.78%)
Stddev   Rsec-7       290.12 (  0.00%)       57.71 (-80.11%)
Stddev   Rsec-12      134.37 (  0.00%)      136.35 (  1.47%)
Stddev   Rsec-18      253.56 (  0.00%)      220.27 (-13.13%)
Stddev   Rsec-24       85.84 (  0.00%)      106.71 ( 24.32%)
Stddev   Rsec-30       76.73 (  0.00%)       72.81 ( -5.11%)
Stddev   Rsec-32       75.83 (  0.00%)       62.71 (-17.30%)
CoeffVar Rsec-1         0.96 (  0.00%)        0.85 ( 11.08%)
CoeffVar Rsec-3         4.75 (  0.00%)        6.10 (-28.23%)
CoeffVar Rsec-5         4.37 (  0.00%)        4.01 (  8.25%)
CoeffVar Rsec-7        17.19 (  0.00%)        3.66 ( 78.73%)
CoeffVar Rsec-12       10.66 (  0.00%)       11.75 (-10.19%)
CoeffVar Rsec-18       34.02 (  0.00%)       32.25 (  5.19%)
CoeffVar Rsec-24       14.74 (  0.00%)       18.95 (-28.52%)
CoeffVar Rsec-30       16.29 (  0.00%)       16.22 (  0.45%)
CoeffVar Rsec-32       16.01 (  0.00%)       14.88 (  7.07%)
Max      Rsec-1      6679.00 (  0.00%)     6284.00 ( -5.91%)
Max      Rsec-3      3602.00 (  0.00%)     3967.00 ( 10.13%)
Max      Rsec-5      2351.00 (  0.00%)     2433.00 (  3.49%)
Max      Rsec-7      3355.00 (  0.00%)     1737.00 (-48.23%)
Max      Rsec-12     1890.00 (  0.00%)     1787.00 ( -5.45%)
Max      Rsec-18     1448.00 (  0.00%)     1295.00 (-10.57%)
Max      Rsec-24      818.00 (  0.00%)      905.00 ( 10.64%)
Max      Rsec-30      771.00 (  0.00%)      724.00 ( -6.10%)
Max      Rsec-32      676.00 (  0.00%)      648.00 ( -4.14%)

ebizzy Thread spread
                                         4.3.0                       4.3.0
                          13672-gc669d35-dirty        13708-g8e488c2-dirty
Min      spread-1         0.00 (  0.00%)        0.00 (  0.00%)
Min      spread-3       272.00 (  0.00%)      342.00 (-25.74%)
Min      spread-5       119.00 (  0.00%)      132.00 (-10.92%)
Min      spread-7        90.00 (  0.00%)      101.00 (-12.22%)
Min      spread-12      278.00 (  0.00%)      112.00 ( 59.71%)
Min      spread-18      751.00 (  0.00%)      638.00 ( 15.05%)
Min      spread-24      192.00 (  0.00%)      260.00 (-35.42%)
Min      spread-30      307.00 (  0.00%)      260.00 ( 15.31%)
Min      spread-32      240.00 (  0.00%)      179.00 ( 25.42%)
Hmean    spread-1         0.00 (  0.00%)        0.00 (  0.00%)
Hmean    spread-3       315.55 (  0.00%)      411.86 (-30.52%)
Hmean    spread-5       179.34 (  0.00%)      172.47 (  3.83%)
Hmean    spread-7       121.46 (  0.00%)      134.77 (-10.96%)
Hmean    spread-12      331.88 (  0.00%)      263.00 ( 20.75%)
Hmean    spread-18      775.27 (  0.00%)      692.94 ( 10.62%)
Hmean    spread-24      267.67 (  0.00%)      351.87 (-31.46%)
Hmean    spread-30      339.24 (  0.00%)      304.50 ( 10.24%)
Hmean    spread-32      267.37 (  0.00%)      220.17 ( 17.65%)
Stddev   spread-1         0.00 (  0.00%)        0.00 (  0.00%)
Stddev   spread-3        76.63 (  0.00%)       63.98 ( 16.50%)
Stddev   spread-5        77.12 (  0.00%)       42.96 ( 44.30%)
Stddev   spread-7       689.22 (  0.00%)       24.63 ( 96.43%)
Stddev   spread-12      192.78 (  0.00%)      258.17 (-33.92%)
Stddev   spread-18       27.29 (  0.00%)       37.57 (-37.67%)
Stddev   spread-24       56.34 (  0.00%)       69.16 (-22.75%)
Stddev   spread-30       34.27 (  0.00%)       40.12 (-17.07%)
Stddev   spread-32       24.82 (  0.00%)       41.70 (-67.98%)
CoeffVar spread-1         0.00 (  0.00%)        0.00 (  0.00%)
CoeffVar spread-3        23.26 (  0.00%)       15.20 (-34.64%)
CoeffVar spread-5        38.03 (  0.00%)       23.71 (-37.66%)
CoeffVar spread-7       154.95 (  0.00%)       17.67 (-88.60%)
CoeffVar spread-12       49.76 (  0.00%)       60.89 ( 22.36%)
CoeffVar spread-18        3.52 (  0.00%)        5.41 ( 53.76%)
CoeffVar spread-24       20.12 (  0.00%)       18.90 ( -6.09%)
CoeffVar spread-30       10.00 (  0.00%)       12.96 ( 29.55%)
CoeffVar spread-32        9.21 (  0.00%)       18.31 ( 98.81%)
Max      spread-1         0.00 (  0.00%)        0.00 (  0.00%)
Max      spread-3       478.00 (  0.00%)      535.00 (-11.92%)
Max      spread-5       345.00 (  0.00%)      260.00 ( 24.64%)
Max      spread-7      1823.00 (  0.00%)      175.00 ( 90.40%)
Max      spread-12      772.00 (  0.00%)      751.00 (  2.72%)
Max      spread-18      827.00 (  0.00%)      746.00 (  9.79%)
Max      spread-24      351.00 (  0.00%)      448.00 (-27.64%)
Max      spread-30      387.00 (  0.00%)      366.00 (  5.43%)
Max      spread-32      304.00 (  0.00%)      278.00 (  8.55%)

               4.3.0       4.3.0
        13672-gc669d35-dirty  13708-g8e488c2-dirty
User         1577.04     1520.68
System       6809.94     6864.08
Elapsed      1353.57     1353.80

                                 4.3.0       4.3.0
                          13672-gc669d35-dirty  13708-g8e488c2-dirty
Minor Faults                 100301324    95014933
Major Faults                         0           0
Swap Ins                             0           0
Swap Outs                            0           0
Allocation stalls                    0           0
DMA allocs                   100222214    94935232
DMA32 allocs                         0           0
Normal allocs                        0           0
Movable allocs                       0           0
Direct pages scanned                 0           0
Kswapd pages scanned                 0           0
Kswapd pages reclaimed               0           0
Direct pages reclaimed               0           0
Kswapd efficiency                 100%        100%
Kswapd velocity                  0.000       0.000
Direct efficiency                 100%        100%
Direct velocity                  0.000       0.000
Percentage direct scans             0%          0%
Zone normal velocity             0.000       0.000
Zone dma32 velocity              0.000       0.000
Zone dma velocity                0.000       0.000
Page writes by reclaim           0.000       0.000
Page writes file                     0           0
Page writes anon                     0           0
Page reclaim immediate               0           0
Sector Reads                       440         560
Sector Writes                     4324        4156
Page rescued immediate               0           0
Slabs scanned                        0           0
Direct inode steals                  0           0
Kswapd inode steals                  0           0
Kswapd skipped wait                  0           0
THP fault alloc                      0           0
THP collapse alloc                   0           0
THP splits                           0           0
THP fault fallback                   0           0
THP collapse fail                    0           0
Compaction stalls                    0           0
Compaction success                   0           0
Compaction failures                  0           0
Page migrate success                 0           0
Page migrate failure                 0           0
Compaction pages isolated            0           0
Compaction migrate scanned           0           0
Compaction free scanned              0           0
Compaction cost                      0           0
NUMA alloc hit               100222248    94935323
NUMA alloc miss                      0           0
NUMA interleave hit                  0           0
NUMA alloc local             100222248    94935323
NUMA base PTE updates            43936       45000
NUMA huge PMD updates                0           0
NUMA page range updates          43936       45000
NUMA hint faults                 35295       36155
NUMA hint local faults           35295       36155
NUMA hint local percent            100         100
NUMA pages migrated                  0           0
AutoNUMA cost                     176%        181%
-------------- 4k hash pte guest config stop--------------------
-------------- 64k hash pte powernv config start--------------------

aim9
                                         4.3.0                       4.3.0
                                13672-gc669d35              13708-g8e488c2
Min      page_test  2256353.33 (  0.00%)  2225073.33 ( -1.39%)
Min      brk_test   4131000.00 (  0.00%)  4153666.67 (  0.55%)
Min      exec_test     1639.67 (  0.00%)     1635.00 ( -0.28%)
Min      fork_test     4636.91 (  0.00%)     4633.82 ( -0.07%)
Hmean    page_test  2281108.70 (  0.00%)  2282854.75 (  0.08%)
Hmean    brk_test   4178161.02 (  0.00%)  4190571.57 (  0.30%)
Hmean    exec_test     1666.34 (  0.00%)     1663.18 ( -0.19%)
Hmean    fork_test     4659.91 (  0.00%)     4665.26 (  0.11%)
Stddev   page_test    19681.94 (  0.00%)    23269.98 (-18.23%)
Stddev   brk_test     62136.33 (  0.00%)    41482.83 ( 33.24%)
Stddev   exec_test       14.09 (  0.00%)       11.74 ( 16.69%)
Stddev   fork_test        9.66 (  0.00%)       18.39 (-90.33%)
CoeffVar page_test        0.86 (  0.00%)        1.02 (-18.14%)
CoeffVar brk_test         1.49 (  0.00%)        0.99 ( 33.43%)
CoeffVar exec_test        0.85 (  0.00%)        0.71 ( 16.53%)
CoeffVar fork_test        0.21 (  0.00%)        0.39 (-90.11%)
Max      page_test  2323106.67 (  0.00%)  2307466.67 ( -0.67%)
Max      brk_test   4285133.33 (  0.00%)  4321400.00 (  0.85%)
Max      exec_test     1684.33 (  0.00%)     1678.33 ( -0.36%)
Max      fork_test     4676.88 (  0.00%)     4690.21 (  0.29%)

               4.3.0       4.3.0
        13672-gc669d35  13708-g8e488c2
User            0.09        2.37
System          0.13        0.53
Elapsed       722.20      726.32

                                 4.3.0       4.3.0
                          13672-gc669d35  13708-g8e488c2
Minor Faults                  84853683    84777713
Major Faults                        25         328
Swap Ins                             0           0
Swap Outs                            0           0
Allocation stalls                    0           0
DMA allocs                    38237729    38213150
DMA32 allocs                         0           0
Normal allocs                        0           0
Movable allocs                       0           0
Direct pages scanned                 0           0
Kswapd pages scanned                 0           0
Kswapd pages reclaimed               0           0
Direct pages reclaimed               0           0
Kswapd efficiency                 100%        100%
Kswapd velocity                  0.000       0.000
Direct efficiency                 100%        100%
Direct velocity                  0.000       0.000
Percentage direct scans             0%          0%
Zone normal velocity             0.000       0.000
Zone dma32 velocity              0.000       0.000
Zone dma velocity                0.000       0.000
Page writes by reclaim           0.000       0.000
Page writes file                     0           0
Page writes anon                     0           0
Page reclaim immediate               0           0
Sector Reads                      4744       22336
Sector Writes                    71712       71744
Page rescued immediate               0           0
Slabs scanned                        0           0
Direct inode steals                  0           0
Kswapd inode steals                  0           0
Kswapd skipped wait                  0           0
THP fault alloc                      0           2
THP collapse alloc                   0           0
THP splits                           0           0
THP fault fallback                   0           3
THP collapse fail                    0           0
Compaction stalls                    0           0
Compaction success                   0           0
Compaction failures                  0           0
Page migrate success                 0           0
Page migrate failure                 0           0
Compaction pages isolated            0           0
Compaction migrate scanned           0           0
Compaction free scanned              0           0
Compaction cost                      0           0
NUMA alloc hit                38231124    38206624
NUMA alloc miss                      0           0
NUMA interleave hit                  0           0
NUMA alloc local              38229247    38204174
NUMA base PTE updates              646         637
NUMA huge PMD updates                0           0
NUMA page range updates            646         637
NUMA hint faults               1803811     1756586
NUMA hint local faults         1803625     1756366
NUMA hint local percent             99          99
NUMA pages migrated                  0           0
AutoNUMA cost                    9019%       8782%

pft timings
                                           4.3.0                       4.3.0
                                  13672-gc669d35              13708-g8e488c2
Min      system-1          1.8600 (  0.00%)       1.8700 ( -0.54%)
Min      system-4          4.1900 (  0.00%)       4.1900 (  0.00%)
Min      system-7          6.5800 (  0.00%)       5.8500 ( 11.09%)
Min      system-12         9.3600 (  0.00%)      11.0400 (-17.95%)
Min      system-21        15.4700 (  0.00%)      16.2900 ( -5.30%)
Min      system-30        22.2600 (  0.00%)      21.4400 (  3.68%)
Min      system-48        30.7000 (  0.00%)      31.4000 ( -2.28%)
Min      system-79        55.4600 (  0.00%)      54.2300 (  2.22%)
Min      system-110       63.6400 (  0.00%)      61.8200 (  2.86%)
Min      system-141       66.8900 (  0.00%)      67.3400 ( -0.67%)
Min      system-160       69.5500 (  0.00%)      70.4900 ( -1.35%)
Min      elapsed-1         2.0300 (  0.00%)       2.0400 ( -0.49%)
Min      elapsed-4         1.3000 (  0.00%)       1.3100 ( -0.77%)
Min      elapsed-7         1.2300 (  0.00%)       0.9900 ( 19.51%)
Min      elapsed-12        0.8700 (  0.00%)       1.1500 (-32.18%)
Min      elapsed-21        0.9100 (  0.00%)       1.0100 (-10.99%)
Min      elapsed-30        0.8700 (  0.00%)       0.8700 (  0.00%)
Min      elapsed-48        0.8700 (  0.00%)       0.8800 ( -1.15%)
Min      elapsed-79        0.9000 (  0.00%)       0.8800 (  2.22%)
Min      elapsed-110       0.9500 (  0.00%)       0.9000 (  5.26%)
Min      elapsed-141       0.8800 (  0.00%)       0.9300 ( -5.68%)
Min      elapsed-160       0.8800 (  0.00%)       0.9200 ( -4.55%)
Amean    system-1          1.8881 (  0.00%)       1.9090 ( -1.11%)
Amean    system-4          6.5492 (  0.00%)       6.5650 ( -0.24%)
Amean    system-7         10.0890 (  0.00%)      10.1636 ( -0.74%)
Amean    system-12        14.7406 (  0.00%)      15.3927 ( -4.42%)
Amean    system-21        22.2554 (  0.00%)      25.3304 (-13.82%)
Amean    system-30        35.2691 (  0.00%)      29.8990 ( 15.23%)
Amean    system-48        45.8848 (  0.00%)      42.5765 (  7.21%)
Amean    system-79        65.6720 (  0.00%)      63.4805 (  3.34%)
Amean    system-110       77.0405 (  0.00%)      75.3361 (  2.21%)
Amean    system-141       83.8607 (  0.00%)      83.1904 (  0.80%)
Amean    system-160       84.8629 (  0.00%)      83.5491 (  1.55%)
Amean    elapsed-1         2.0836 (  0.00%)       2.1036 ( -0.96%)
Amean    elapsed-4         1.7204 (  0.00%)       1.7264 ( -0.35%)
Amean    elapsed-7         1.6176 (  0.00%)       1.6254 ( -0.48%)
Amean    elapsed-12        1.5135 (  0.00%)       1.5618 ( -3.19%)
Amean    elapsed-21        1.3556 (  0.00%)       1.5172 (-11.92%)
Amean    elapsed-30        1.5014 (  0.00%)       1.2959 ( 13.69%)
Amean    elapsed-48        1.2863 (  0.00%)       1.2243 (  4.82%)
Amean    elapsed-79        1.2315 (  0.00%)       1.1953 (  2.94%)
Amean    elapsed-110       1.1055 (  0.00%)       1.0705 (  3.17%)
Amean    elapsed-141       1.0494 (  0.00%)       1.0268 (  2.16%)
Amean    elapsed-160       1.0396 (  0.00%)       1.0375 (  0.20%)
Stddev   system-1          0.0758 (  0.00%)       0.0766 ( -1.03%)
Stddev   system-4          0.3702 (  0.00%)       0.2734 ( 26.16%)
Stddev   system-7          1.4758 (  0.00%)       1.4623 (  0.91%)
Stddev   system-12         2.1752 (  0.00%)       1.7466 ( 19.70%)
Stddev   system-21         5.2401 (  0.00%)       3.9628 ( 24.38%)
Stddev   system-30         6.4944 (  0.00%)       7.3898 (-13.79%)
Stddev   system-48        11.4237 (  0.00%)       9.5028 ( 16.81%)
Stddev   system-79         9.4485 (  0.00%)       6.2530 ( 33.82%)
Stddev   system-110        7.8213 (  0.00%)       6.6034 ( 15.57%)
Stddev   system-141        7.2452 (  0.00%)       6.5108 ( 10.14%)
Stddev   system-160        6.9865 (  0.00%)       6.5068 (  6.87%)
Stddev   elapsed-1         0.0778 (  0.00%)       0.0819 ( -5.36%)
Stddev   elapsed-4         0.0666 (  0.00%)       0.0470 ( 29.39%)
Stddev   elapsed-7         0.1404 (  0.00%)       0.1481 ( -5.50%)
Stddev   elapsed-12        0.1691 (  0.00%)       0.1135 ( 32.90%)
Stddev   elapsed-21        0.2723 (  0.00%)       0.1641 ( 39.75%)
Stddev   elapsed-30        0.2001 (  0.00%)       0.2761 (-37.99%)
Stddev   elapsed-48        0.2472 (  0.00%)       0.2112 ( 14.55%)
Stddev   elapsed-79        0.1377 (  0.00%)       0.1097 ( 20.36%)
Stddev   elapsed-110       0.0812 (  0.00%)       0.0808 (  0.46%)
Stddev   elapsed-141       0.0544 (  0.00%)       0.0557 ( -2.27%)
Stddev   elapsed-160       0.0637 (  0.00%)       0.0577 (  9.37%)
CoeffVar system-1          4.0135 (  0.00%)       4.0105 (  0.08%)
CoeffVar system-4          5.6527 (  0.00%)       4.1640 ( 26.34%)
CoeffVar system-7         14.6274 (  0.00%)      14.3878 (  1.64%)
CoeffVar system-12        14.7568 (  0.00%)      11.3470 ( 23.11%)
CoeffVar system-21        23.5452 (  0.00%)      15.6444 ( 33.56%)
CoeffVar system-30        18.4137 (  0.00%)      24.7158 (-34.22%)
CoeffVar system-48        24.8964 (  0.00%)      22.3194 ( 10.35%)
CoeffVar system-79        14.3874 (  0.00%)       9.8502 ( 31.54%)
CoeffVar system-110       10.1521 (  0.00%)       8.7653 ( 13.66%)
CoeffVar system-141        8.6396 (  0.00%)       7.8264 (  9.41%)
CoeffVar system-160        8.2327 (  0.00%)       7.7880 (  5.40%)
CoeffVar elapsed-1         3.7316 (  0.00%)       3.8941 ( -4.35%)
CoeffVar elapsed-4         3.8704 (  0.00%)       2.7235 ( 29.63%)
CoeffVar elapsed-7         8.6807 (  0.00%)       9.1146 ( -5.00%)
CoeffVar elapsed-12       11.1714 (  0.00%)       7.2646 ( 34.97%)
CoeffVar elapsed-21       20.0897 (  0.00%)      10.8143 ( 46.17%)
CoeffVar elapsed-30       13.3289 (  0.00%)      21.3093 (-59.87%)
CoeffVar elapsed-48       19.2176 (  0.00%)      17.2536 ( 10.22%)
CoeffVar elapsed-79       11.1827 (  0.00%)       9.1764 ( 17.94%)
CoeffVar elapsed-110       7.3416 (  0.00%)       7.5471 ( -2.80%)
CoeffVar elapsed-141       5.1854 (  0.00%)       5.4200 ( -4.53%)
CoeffVar elapsed-160       6.1260 (  0.00%)       5.5631 (  9.19%)
Max      system-1          2.3000 (  0.00%)       2.2800 (  0.87%)
Max      system-4          6.6900 (  0.00%)       6.7000 ( -0.15%)
Max      system-7         11.6800 (  0.00%)      11.7500 ( -0.60%)
Max      system-12        17.9100 (  0.00%)      18.4500 ( -3.02%)
Max      system-21        31.9700 (  0.00%)      31.7300 (  0.75%)
Max      system-30        43.3700 (  0.00%)      43.2700 (  0.23%)
Max      system-48        70.3200 (  0.00%)      70.1900 (  0.18%)
Max      system-79       106.7500 (  0.00%)      90.1600 ( 15.54%)
Max      system-110      102.1600 (  0.00%)     101.2100 (  0.93%)
Max      system-141      103.0400 (  0.00%)      96.7000 (  6.15%)
Max      system-160       98.4800 (  0.00%)      97.5000 (  1.00%)
Max      elapsed-1         2.4900 (  0.00%)       2.4800 (  0.40%)
Max      elapsed-4         1.7400 (  0.00%)       1.7500 ( -0.57%)
Max      elapsed-7         1.7500 (  0.00%)       1.7400 (  0.57%)
Max      elapsed-12        1.7400 (  0.00%)       1.7500 ( -0.57%)
Max      elapsed-21        1.7500 (  0.00%)       1.7000 (  2.86%)
Max      elapsed-30        1.7300 (  0.00%)       1.7100 (  1.16%)
Max      elapsed-48        1.7100 (  0.00%)       1.7000 (  0.58%)
Max      elapsed-79        1.7000 (  0.00%)       1.5300 ( 10.00%)
Max      elapsed-110       1.3500 (  0.00%)       1.3600 ( -0.74%)
Max      elapsed-141       1.1900 (  0.00%)       1.1800 (  0.84%)
Max      elapsed-160       1.2300 (  0.00%)       1.2300 (  0.00%)

pft faults
                                              4.3.0                       4.3.0
                                     13672-gc669d35              13708-g8e488c2
Min      faults/cpu-1    167962.7340 (  0.00%)  169258.4000 (  0.77%)
Min      faults/cpu-4     60713.3780 (  0.00%)   60626.9530 ( -0.14%)
Min      faults/cpu-7     35103.6480 (  0.00%)   34927.7700 ( -0.50%)
Min      faults/cpu-12    22960.1440 (  0.00%)   22309.4670 ( -2.83%)
Min      faults/cpu-21    12900.6920 (  0.00%)   13012.9840 (  0.87%)
Min      faults/cpu-30     9525.1380 (  0.00%)    9555.3660 (  0.32%)
Min      faults/cpu-48     5893.4010 (  0.00%)    5906.2790 (  0.22%)
Min      faults/cpu-79     3888.9070 (  0.00%)    4603.1070 ( 18.37%)
Min      faults/cpu-110    4062.4060 (  0.00%)    4100.8300 (  0.95%)
Min      faults/cpu-141    4028.4030 (  0.00%)    4292.1770 (  6.55%)
Min      faults/cpu-160    4212.2100 (  0.00%)    4256.9010 (  1.06%)
Min      faults/sec-1    167817.6050 (  0.00%)  168734.8460 (  0.55%)
Min      faults/sec-4    239459.5830 (  0.00%)  238660.6900 ( -0.33%)
Min      faults/sec-7    238787.6430 (  0.00%)  239458.0500 (  0.28%)
Min      faults/sec-12   240464.0970 (  0.00%)  238214.1180 ( -0.94%)
Min      faults/sec-21   238517.9660 (  0.00%)  245078.0380 (  2.75%)
Min      faults/sec-30   241525.0900 (  0.00%)  244625.2980 (  1.28%)
Min      faults/sec-48   244673.7080 (  0.00%)  245524.4950 (  0.35%)
Min      faults/sec-79   246298.3920 (  0.00%)  273087.1550 ( 10.88%)
Min      faults/sec-110  309127.0460 (  0.00%)  306026.2640 ( -1.00%)
Min      faults/sec-141  351861.9470 (  0.00%)  353729.4440 (  0.53%)
Min      faults/sec-160  338721.8730 (  0.00%)  339156.8390 (  0.13%)
Hmean    faults/cpu-1    201043.7845 (  0.00%)  198997.0168 ( -1.02%)
Hmean    faults/cpu-4     61948.9837 (  0.00%)   61760.6204 ( -0.30%)
Hmean    faults/cpu-7     40535.5843 (  0.00%)   40239.4941 ( -0.73%)
Hmean    faults/cpu-12    27836.2410 (  0.00%)   26689.3711 ( -4.12%)
Hmean    faults/cpu-21    18488.2448 (  0.00%)   16271.0796 (-11.99%)
Hmean    faults/cpu-30    11692.2366 (  0.00%)   13790.2478 ( 17.94%)
Hmean    faults/cpu-48     8998.8424 (  0.00%)    9697.2403 (  7.76%)
Hmean    faults/cpu-79     6298.5336 (  0.00%)    6516.3831 (  3.46%)
Hmean    faults/cpu-110    5375.1056 (  0.00%)    5497.3005 (  2.27%)
Hmean    faults/cpu-141    4942.4573 (  0.00%)    4983.7162 (  0.83%)
Hmean    faults/cpu-160    4884.7553 (  0.00%)    4962.4938 (  1.59%)
Hmean    faults/sec-1    200497.7126 (  0.00%)  198516.6748 ( -0.99%)
Hmean    faults/sec-4    242883.1475 (  0.00%)  241973.6927 ( -0.37%)
Hmean    faults/sec-7    258232.4571 (  0.00%)  256973.3980 ( -0.49%)
Hmean    faults/sec-12   275982.6468 (  0.00%)  267524.8543 ( -3.06%)
Hmean    faults/sec-21   308067.0715 (  0.00%)  275282.8535 (-10.64%)
Hmean    faults/sec-30   278272.5295 (  0.00%)  322318.2277 ( 15.83%)
Hmean    faults/sec-48   324712.2770 (  0.00%)  341047.4148 (  5.03%)
Hmean    faults/sec-79   339270.7201 (  0.00%)  349595.9658 (  3.04%)
Hmean    faults/sec-110  377555.4863 (  0.00%)  389982.2628 (  3.29%)
Hmean    faults/sec-141  398205.3191 (  0.00%)  407199.2710 (  2.26%)
Hmean    faults/sec-160  402260.8158 (  0.00%)  402662.2649 (  0.10%)
Stddev   faults/cpu-1      6579.0174 (  0.00%)    6939.5357 ( -5.48%)
Stddev   faults/cpu-4      5081.5381 (  0.00%)    3729.6536 ( 26.60%)
Stddev   faults/cpu-7      6863.0849 (  0.00%)    6915.7844 ( -0.77%)
Stddev   faults/cpu-12     4949.4091 (  0.00%)    3301.5541 ( 33.29%)
Stddev   faults/cpu-21     4556.4181 (  0.00%)    3091.1372 ( 32.16%)
Stddev   faults/cpu-30     2605.7291 (  0.00%)    3308.9108 (-26.99%)
Stddev   faults/cpu-48     2074.9654 (  0.00%)    1755.5029 ( 15.40%)
Stddev   faults/cpu-79      755.4743 (  0.00%)     570.7272 ( 24.45%)
Stddev   faults/cpu-110     518.6887 (  0.00%)     457.2601 ( 11.84%)
Stddev   faults/cpu-141     420.8814 (  0.00%)     399.1634 (  5.16%)
Stddev   faults/cpu-160     409.1068 (  0.00%)     389.8045 (  4.72%)
Stddev   faults/sec-1      6547.5672 (  0.00%)    6924.4985 ( -5.76%)
Stddev   faults/sec-4     12454.6505 (  0.00%)    8746.6168 ( 29.77%)
Stddev   faults/sec-7     25389.6889 (  0.00%)   29485.1401 (-16.13%)
Stddev   faults/sec-12    40255.2835 (  0.00%)   21610.2497 ( 46.32%)
Stddev   faults/sec-21    71662.3000 (  0.00%)   36438.3667 ( 49.15%)
Stddev   faults/sec-30    47102.8263 (  0.00%)   74717.8822 (-58.63%)
Stddev   faults/sec-48    66397.4011 (  0.00%)   58784.5253 ( 11.47%)
Stddev   faults/sec-79    37604.4314 (  0.00%)   33859.6768 (  9.96%)
Stddev   faults/sec-110   27245.1710 (  0.00%)   28753.8621 ( -5.54%)
Stddev   faults/sec-141   20904.0801 (  0.00%)   21653.4136 ( -3.58%)
Stddev   faults/sec-160   25264.6583 (  0.00%)   22119.1868 ( 12.45%)
CoeffVar faults/cpu-1         3.2684 (  0.00%)       3.4826 ( -6.55%)
CoeffVar faults/cpu-4         8.1660 (  0.00%)       6.0243 ( 26.23%)
CoeffVar faults/cpu-7        16.5350 (  0.00%)      16.7871 ( -1.52%)
CoeffVar faults/cpu-12       17.3398 (  0.00%)      12.2030 ( 29.62%)
CoeffVar faults/cpu-21       23.3188 (  0.00%)      18.4598 ( 20.84%)
CoeffVar faults/cpu-30       21.4245 (  0.00%)      22.6766 ( -5.84%)
CoeffVar faults/cpu-48       21.8405 (  0.00%)      17.4212 ( 20.23%)
CoeffVar faults/cpu-79       11.7968 (  0.00%)       8.6851 ( 26.38%)
CoeffVar faults/cpu-110       9.5579 (  0.00%)       8.2588 ( 13.59%)
CoeffVar faults/cpu-141       8.4542 (  0.00%)       7.9601 (  5.85%)
CoeffVar faults/cpu-160       8.3184 (  0.00%)       7.8078 (  6.14%)
CoeffVar faults/sec-1         3.2617 (  0.00%)       3.4834 ( -6.80%)
CoeffVar faults/sec-4         5.1176 (  0.00%)       3.6111 ( 29.44%)
CoeffVar faults/sec-7         9.7495 (  0.00%)      11.3571 (-16.49%)
CoeffVar faults/sec-12       14.3571 (  0.00%)       8.0310 ( 44.06%)
CoeffVar faults/sec-21       22.2373 (  0.00%)      13.0512 ( 41.31%)
CoeffVar faults/sec-30       16.5586 (  0.00%)      22.1067 (-33.51%)
CoeffVar faults/sec-48       19.6920 (  0.00%)      16.7508 ( 14.94%)
CoeffVar faults/sec-79       10.9508 (  0.00%)       9.6012 ( 12.32%)
CoeffVar faults/sec-110       7.1785 (  0.00%)       7.3326 ( -2.15%)
CoeffVar faults/sec-141       5.2355 (  0.00%)       5.3024 ( -1.28%)
CoeffVar faults/sec-160       6.2566 (  0.00%)       5.4766 ( 12.47%)
Max      faults/cpu-1    206718.3130 (  0.00%)  205567.6440 ( -0.56%)
Max      faults/cpu-4     95165.9670 (  0.00%)   94731.2700 ( -0.46%)
Max      faults/cpu-7     61266.2550 (  0.00%)   68833.4340 ( 12.35%)
Max      faults/cpu-12    43476.9210 (  0.00%)   37082.6580 (-14.71%)
Max      faults/cpu-21    26492.9770 (  0.00%)   25181.4640 ( -4.95%)
Max      faults/cpu-30    18445.4910 (  0.00%)   19158.5420 (  3.87%)
Max      faults/cpu-48    13377.7470 (  0.00%)   13082.6810 ( -2.21%)
Max      faults/cpu-79     7442.6860 (  0.00%)    7615.1420 (  2.32%)
Max      faults/cpu-110    6503.6080 (  0.00%)    6687.2100 (  2.82%)
Max      faults/cpu-141    6188.6650 (  0.00%)    6146.9070 ( -0.67%)
Max      faults/cpu-160    5950.8970 (  0.00%)    5871.3010 ( -1.34%)
Max      faults/sec-1    205979.4530 (  0.00%)  204798.6030 ( -0.57%)
Max      faults/sec-4    322267.9360 (  0.00%)  319804.0130 ( -0.76%)
Max      faults/sec-7    338997.6600 (  0.00%)  422940.9730 ( 24.76%)
Max      faults/sec-12   477630.1990 (  0.00%)  361992.2860 (-24.21%)
Max      faults/sec-21   460652.0560 (  0.00%)  414704.3010 ( -9.97%)
Max      faults/sec-30   481274.4730 (  0.00%)  479694.9450 ( -0.33%)
Max      faults/sec-48   478355.8440 (  0.00%)  475307.0260 ( -0.64%)
Max      faults/sec-79   461884.5660 (  0.00%)  473182.5810 (  2.45%)
Max      faults/sec-110  441502.3200 (  0.00%)  462415.9300 (  4.74%)
Max      faults/sec-141  473293.5230 (  0.00%)  447330.8000 ( -5.49%)
Max      faults/sec-160  475931.1320 (  0.00%)  453387.9520 ( -4.74%)

               4.3.0       4.3.0
        13672-gc669d35  13708-g8e488c2
User         2202.96     2188.14
System      43841.77    43231.29
Elapsed      1363.37     1355.71

                                 4.3.0       4.3.0
                          13672-gc669d35  13708-g8e488c2
Minor Faults                 370404167   370419290
Major Faults                         0          32
Swap Ins                             0           0
Swap Outs                            0           0
Allocation stalls                    0           0
DMA allocs                   368416400   368424912
DMA32 allocs                         0           0
Normal allocs                        0           0
Movable allocs                       0           0
Direct pages scanned                 0           0
Kswapd pages scanned                 0           0
Kswapd pages reclaimed               0           0
Direct pages reclaimed               0           0
Kswapd efficiency                 100%        100%
Kswapd velocity                  0.000       0.000
Direct efficiency                 100%        100%
Direct velocity                  0.000       0.000
Percentage direct scans             0%          0%
Zone normal velocity             0.000       0.000
Zone dma32 velocity              0.000       0.000
Zone dma velocity                0.000       0.000
Page writes by reclaim           0.000       0.000
Page writes file                     0           0
Page writes anon                     0           0
Page reclaim immediate               0           0
Sector Reads                       168        3696
Sector Writes                     7396        9136
Page rescued immediate               0           0
Slabs scanned                        0           0
Direct inode steals                  0           0
Kswapd inode steals                  0           0
Kswapd skipped wait                  0           0
THP fault alloc                      0           0
THP collapse alloc                   0           0
THP splits                           0           0
THP fault fallback                   0           0
THP collapse fail                    0           0
Compaction stalls                    0           0
Compaction success                   0           0
Compaction failures                  0           0
Page migrate success               452         442
Page migrate failure                 0           0
Compaction pages isolated            0           0
Compaction migrate scanned           0           0
Compaction free scanned              0           0
Compaction cost                      0           0
NUMA alloc hit               368352412   368362898
NUMA alloc miss                      0           0
NUMA interleave hit                  0           0
NUMA alloc local             249887344   239441708
NUMA base PTE updates         30421298    30252358
NUMA huge PMD updates                0           0
NUMA page range updates       30421298    30252358
NUMA hint faults                 39412       34328
NUMA hint local faults           28543       25263
NUMA hint local percent             72          73
NUMA pages migrated                452         442
AutoNUMA cost                     410%        383%

ebizzy Overall Throughput
                                        4.3.0                       4.3.0
                               13672-gc669d35              13708-g8e488c2
Min      Rsec-1      30631.00 (  0.00%)    27088.00 (-11.57%)
Min      Rsec-4      57821.00 (  0.00%)    47893.00 (-17.17%)
Min      Rsec-7      64229.00 (  0.00%)    62025.00 ( -3.43%)
Min      Rsec-12     75825.00 (  0.00%)    83296.00 (  9.85%)
Min      Rsec-21     81774.00 (  0.00%)    88602.00 (  8.35%)
Min      Rsec-30     92366.00 (  0.00%)    96488.00 (  4.46%)
Min      Rsec-48     98346.00 (  0.00%)   102081.00 (  3.80%)
Min      Rsec-79    107800.00 (  0.00%)   107806.00 (  0.01%)
Min      Rsec-110   115001.00 (  0.00%)   114767.00 ( -0.20%)
Min      Rsec-141   118423.00 (  0.00%)   120113.00 (  1.43%)
Min      Rsec-172   128361.00 (  0.00%)   126839.00 ( -1.19%)
Min      Rsec-203   128782.00 (  0.00%)   124506.00 ( -3.32%)
Min      Rsec-234   129973.00 (  0.00%)   125048.00 ( -3.79%)
Min      Rsec-265   129718.00 (  0.00%)   124645.00 ( -3.91%)
Min      Rsec-296   131203.00 (  0.00%)   124085.00 ( -5.43%)
Min      Rsec-327   136167.00 (  0.00%)    93782.00 (-31.13%)
Min      Rsec-358   132593.00 (  0.00%)   126162.00 ( -4.85%)
Min      Rsec-389   133791.00 (  0.00%)   125713.00 ( -6.04%)
Min      Rsec-420   132244.00 (  0.00%)   125662.00 ( -4.98%)
Min      Rsec-451   130741.00 (  0.00%)   131577.00 (  0.64%)
Min      Rsec-482   132991.00 (  0.00%)   133038.00 (  0.04%)
Min      Rsec-513   135247.00 (  0.00%)   132752.00 ( -1.84%)
Min      Rsec-544   133541.00 (  0.00%)   138052.00 (  3.38%)
Min      Rsec-575   134361.00 (  0.00%)   140407.00 (  4.50%)
Min      Rsec-606   136566.00 (  0.00%)   140320.00 (  2.75%)
Min      Rsec-637   136596.00 (  0.00%)    99349.00 (-27.27%)
Min      Rsec-640   101422.00 (  0.00%)    99561.00 ( -1.83%)
Hmean    Rsec-1      30824.32 (  0.00%)    28676.70 ( -6.97%)
Hmean    Rsec-4      64157.06 (  0.00%)    62294.92 ( -2.90%)
Hmean    Rsec-7      65651.01 (  0.00%)    69718.38 (  6.20%)
Hmean    Rsec-12     78873.91 (  0.00%)    91860.80 ( 16.47%)
Hmean    Rsec-21     86468.13 (  0.00%)    96639.75 ( 11.76%)
Hmean    Rsec-30     95530.97 (  0.00%)   103998.21 (  8.86%)
Hmean    Rsec-48    103374.48 (  0.00%)   108060.90 (  4.53%)
Hmean    Rsec-79    110468.40 (  0.00%)   111625.07 (  1.05%)
Hmean    Rsec-110   117068.12 (  0.00%)   117277.05 (  0.18%)
Hmean    Rsec-141   122303.78 (  0.00%)   123704.79 (  1.15%)
Hmean    Rsec-172   129919.04 (  0.00%)   127485.39 ( -1.87%)
Hmean    Rsec-203   129848.20 (  0.00%)   126836.53 ( -2.32%)
Hmean    Rsec-234   131271.19 (  0.00%)   126981.96 ( -3.27%)
Hmean    Rsec-265   130786.79 (  0.00%)   126195.37 ( -3.51%)
Hmean    Rsec-296   136679.24 (  0.00%)   125519.14 ( -8.17%)
Hmean    Rsec-327   136987.94 (  0.00%)   118013.15 (-13.85%)
Hmean    Rsec-358   134069.52 (  0.00%)   126610.36 ( -5.56%)
Hmean    Rsec-389   134492.50 (  0.00%)   126879.43 ( -5.66%)
Hmean    Rsec-420   134001.91 (  0.00%)   127656.60 ( -4.74%)
Hmean    Rsec-451   133772.89 (  0.00%)   133757.64 ( -0.01%)
Hmean    Rsec-482   135542.90 (  0.00%)   135620.98 (  0.06%)
Hmean    Rsec-513   137292.43 (  0.00%)   135231.99 ( -1.50%)
Hmean    Rsec-544   135299.06 (  0.00%)   141228.12 (  4.38%)
Hmean    Rsec-575   137473.02 (  0.00%)   140835.74 (  2.45%)
Hmean    Rsec-606   137456.90 (  0.00%)   143092.13 (  4.10%)
Hmean    Rsec-637   137146.14 (  0.00%)   129751.36 ( -5.39%)
Hmean    Rsec-640   131512.67 (  0.00%)   129329.67 ( -1.66%)
Stddev   Rsec-1        144.49 (  0.00%)     1292.15 (-794.30%)
Stddev   Rsec-4       4005.88 (  0.00%)     8292.35 (-107.00%)
Stddev   Rsec-7        959.09 (  0.00%)     5545.25 (-478.18%)
Stddev   Rsec-12      2728.12 (  0.00%)     7819.40 (-186.62%)
Stddev   Rsec-21      5449.55 (  0.00%)     8175.21 (-50.02%)
Stddev   Rsec-30      2577.87 (  0.00%)     4946.96 (-91.90%)
Stddev   Rsec-48      4972.04 (  0.00%)     4285.92 ( 13.80%)
Stddev   Rsec-79      1860.85 (  0.00%)     3294.02 (-77.02%)
Stddev   Rsec-110     2042.24 (  0.00%)     1385.94 ( 32.14%)
Stddev   Rsec-141     2275.70 (  0.00%)     2258.88 (  0.74%)
Stddev   Rsec-172     1003.21 (  0.00%)      679.57 ( 32.26%)
Stddev   Rsec-203      623.13 (  0.00%)     1220.15 (-95.81%)
Stddev   Rsec-234      723.22 (  0.00%)     1138.08 (-57.36%)
Stddev   Rsec-265      742.86 (  0.00%)     1321.18 (-77.85%)
Stddev   Rsec-296     2939.87 (  0.00%)      993.39 ( 66.21%)
Stddev   Rsec-327      604.06 (  0.00%)    12956.36 (-2044.87%)
Stddev   Rsec-358     1390.12 (  0.00%)      285.62 ( 79.45%)
Stddev   Rsec-389      847.67 (  0.00%)      855.57 ( -0.93%)
Stddev   Rsec-420     1151.99 (  0.00%)     1619.63 (-40.59%)
Stddev   Rsec-451     1570.06 (  0.00%)     1573.54 ( -0.22%)
Stddev   Rsec-482     1509.33 (  0.00%)     1984.46 (-31.48%)
Stddev   Rsec-513     1480.83 (  0.00%)     1873.63 (-26.53%)
Stddev   Rsec-544     1470.26 (  0.00%)     3914.73 (-166.26%)
Stddev   Rsec-575     1898.92 (  0.00%)      613.86 ( 67.67%)
Stddev   Rsec-606      694.98 (  0.00%)     2642.72 (-280.26%)
Stddev   Rsec-637      670.80 (  0.00%)    16535.67 (-2365.07%)
Stddev   Rsec-640    16352.20 (  0.00%)    16160.95 (  1.17%)
CoeffVar Rsec-1          0.47 (  0.00%)        4.50 (-859.37%)
CoeffVar Rsec-4          6.22 (  0.00%)       13.04 (-109.75%)
CoeffVar Rsec-7          1.46 (  0.00%)        7.90 (-441.04%)
CoeffVar Rsec-12         3.45 (  0.00%)        8.45 (-144.72%)
CoeffVar Rsec-21         6.28 (  0.00%)        8.40 (-33.83%)
CoeffVar Rsec-30         2.70 (  0.00%)        4.75 (-76.00%)
CoeffVar Rsec-48         4.80 (  0.00%)        3.96 ( 17.49%)
CoeffVar Rsec-79         1.68 (  0.00%)        2.95 (-75.08%)
CoeffVar Rsec-110        1.74 (  0.00%)        1.18 ( 32.25%)
CoeffVar Rsec-141        1.86 (  0.00%)        1.83 (  1.86%)
CoeffVar Rsec-172        0.77 (  0.00%)        0.53 ( 30.97%)
CoeffVar Rsec-203        0.48 (  0.00%)        0.96 (-100.45%)
CoeffVar Rsec-234        0.55 (  0.00%)        0.90 (-62.67%)
CoeffVar Rsec-265        0.57 (  0.00%)        1.05 (-84.31%)
CoeffVar Rsec-296        2.15 (  0.00%)        0.79 ( 63.19%)
CoeffVar Rsec-327        0.44 (  0.00%)       10.83 (-2354.96%)
CoeffVar Rsec-358        1.04 (  0.00%)        0.23 ( 78.24%)
CoeffVar Rsec-389        0.63 (  0.00%)        0.67 ( -6.99%)
CoeffVar Rsec-420        0.86 (  0.00%)        1.27 (-47.57%)
CoeffVar Rsec-451        1.17 (  0.00%)        1.18 ( -0.23%)
CoeffVar Rsec-482        1.11 (  0.00%)        1.46 (-31.39%)
CoeffVar Rsec-513        1.08 (  0.00%)        1.39 (-28.44%)
CoeffVar Rsec-544        1.09 (  0.00%)        2.77 (-154.92%)
CoeffVar Rsec-575        1.38 (  0.00%)        0.44 ( 68.44%)
CoeffVar Rsec-606        0.51 (  0.00%)        1.85 (-265.17%)
CoeffVar Rsec-637        0.49 (  0.00%)       12.50 (-2455.70%)
CoeffVar Rsec-640       12.21 (  0.00%)       12.27 ( -0.48%)
Max      Rsec-1      31000.00 (  0.00%)    30276.00 ( -2.34%)
Max      Rsec-4      69454.00 (  0.00%)    71739.00 (  3.29%)
Max      Rsec-7      67182.00 (  0.00%)    75573.00 ( 12.49%)
Max      Rsec-12     82739.00 (  0.00%)   106341.00 ( 28.53%)
Max      Rsec-21     95156.00 (  0.00%)   110486.00 ( 16.11%)
Max      Rsec-30     98859.00 (  0.00%)   109381.00 ( 10.64%)
Max      Rsec-48    113061.00 (  0.00%)   114346.00 (  1.14%)
Max      Rsec-79    113578.00 (  0.00%)   117539.00 (  3.49%)
Max      Rsec-110   119618.00 (  0.00%)   118806.00 ( -0.68%)
Max      Rsec-141   125009.00 (  0.00%)   126417.00 (  1.13%)
Max      Rsec-172   131250.00 (  0.00%)   128709.00 ( -1.94%)
Max      Rsec-203   130718.00 (  0.00%)   128025.00 ( -2.06%)
Max      Rsec-234   131856.00 (  0.00%)   128528.00 ( -2.52%)
Max      Rsec-265   132036.00 (  0.00%)   128059.00 ( -3.01%)
Max      Rsec-296   139572.00 (  0.00%)   126819.00 ( -9.14%)
Max      Rsec-327   138043.00 (  0.00%)   126581.00 ( -8.30%)
Max      Rsec-358   136616.00 (  0.00%)   126986.00 ( -7.05%)
Max      Rsec-389   136122.00 (  0.00%)   128050.00 ( -5.93%)
Max      Rsec-420   135873.00 (  0.00%)   129725.00 ( -4.52%)
Max      Rsec-451   134918.00 (  0.00%)   135687.00 (  0.57%)
Max      Rsec-482   137574.00 (  0.00%)   138546.00 (  0.71%)
Max      Rsec-513   139144.00 (  0.00%)   138358.00 ( -0.56%)
Max      Rsec-544   137415.00 (  0.00%)   148411.00 (  8.00%)
Max      Rsec-575   140015.00 (  0.00%)   142036.00 (  1.44%)
Max      Rsec-606   138658.00 (  0.00%)   148103.00 (  6.81%)
Max      Rsec-637   138429.00 (  0.00%)   143293.00 (  3.51%)
Max      Rsec-640   144351.00 (  0.00%)   141507.00 ( -1.97%)

ebizzy Per-thread
                                        4.3.0                       4.3.0
                               13672-gc669d35              13708-g8e488c2
Min      Rsec-1      30631.00 (  0.00%)    27088.00 (-11.57%)
Min      Rsec-4      12735.00 (  0.00%)    11127.00 (-12.63%)
Min      Rsec-7       7954.00 (  0.00%)     7116.00 (-10.54%)
Min      Rsec-12      4721.00 (  0.00%)     5315.00 ( 12.58%)
Min      Rsec-21      2410.00 (  0.00%)     2304.00 ( -4.40%)
Min      Rsec-30      1931.00 (  0.00%)     2093.00 (  8.39%)
Min      Rsec-48      1451.00 (  0.00%)     1403.00 ( -3.31%)
Min      Rsec-79       821.00 (  0.00%)      715.00 (-12.91%)
Min      Rsec-110      647.00 (  0.00%)      668.00 (  3.25%)
Min      Rsec-141      477.00 (  0.00%)      495.00 (  3.77%)
Min      Rsec-172      463.00 (  0.00%)      452.00 ( -2.38%)
Min      Rsec-203      367.00 (  0.00%)      314.00 (-14.44%)
Min      Rsec-234      304.00 (  0.00%)      290.00 ( -4.61%)
Min      Rsec-265      260.00 (  0.00%)      244.00 ( -6.15%)
Min      Rsec-296      211.00 (  0.00%)      205.00 ( -2.84%)
Min      Rsec-327      185.00 (  0.00%)      152.00 (-17.84%)
Min      Rsec-358      170.00 (  0.00%)      171.00 (  0.59%)
Min      Rsec-389      134.00 (  0.00%)      139.00 (  3.73%)
Min      Rsec-420      153.00 (  0.00%)      159.00 (  3.92%)
Min      Rsec-451      133.00 (  0.00%)      135.00 (  1.50%)
Min      Rsec-482      127.00 (  0.00%)      127.00 (  0.00%)
Min      Rsec-513      120.00 (  0.00%)      130.00 (  8.33%)
Min      Rsec-544      107.00 (  0.00%)      101.00 ( -5.61%)
Min      Rsec-575       94.00 (  0.00%)      114.00 ( 21.28%)
Min      Rsec-606       94.00 (  0.00%)      127.00 ( 35.11%)
Min      Rsec-637      101.00 (  0.00%)      101.00 (  0.00%)
Min      Rsec-640       81.00 (  0.00%)       90.00 ( 11.11%)
Hmean    Rsec-1      30824.32 (  0.00%)    28676.70 ( -6.97%)
Hmean    Rsec-4      15947.39 (  0.00%)    15456.86 ( -3.08%)
Hmean    Rsec-7       9303.67 (  0.00%)     9820.37 (  5.55%)
Hmean    Rsec-12      6388.57 (  0.00%)     7416.20 ( 16.09%)
Hmean    Rsec-21      3321.63 (  0.00%)     3634.83 (  9.43%)
Hmean    Rsec-30      2815.97 (  0.00%)     3019.42 (  7.22%)
Hmean    Rsec-48      1981.55 (  0.00%)     1982.90 (  0.07%)
Hmean    Rsec-79      1276.93 (  0.00%)     1287.34 (  0.82%)
Hmean    Rsec-110      956.47 (  0.00%)      985.88 (  3.08%)
Hmean    Rsec-141      783.13 (  0.00%)      783.22 (  0.01%)
Hmean    Rsec-172      689.92 (  0.00%)      667.13 ( -3.30%)
Hmean    Rsec-203      587.77 (  0.00%)      556.88 ( -5.26%)
Hmean    Rsec-234      506.71 (  0.00%)      504.36 ( -0.46%)
Hmean    Rsec-265      446.66 (  0.00%)      432.97 ( -3.06%)
Hmean    Rsec-296      412.77 (  0.00%)      378.81 ( -8.23%)
Hmean    Rsec-327      359.13 (  0.00%)      315.30 (-12.20%)
Hmean    Rsec-358      335.46 (  0.00%)      312.77 ( -6.76%)
Hmean    Rsec-389      291.33 (  0.00%)      281.59 ( -3.34%)
Hmean    Rsec-420      289.74 (  0.00%)      278.40 ( -3.91%)
Hmean    Rsec-451      255.18 (  0.00%)      263.41 (  3.22%)
Hmean    Rsec-482      241.82 (  0.00%)      253.38 (  4.78%)
Hmean    Rsec-513      237.84 (  0.00%)      233.55 ( -1.81%)
Hmean    Rsec-544      219.99 (  0.00%)      216.96 ( -1.38%)
Hmean    Rsec-575      205.77 (  0.00%)      216.15 (  5.04%)
Hmean    Rsec-606      199.08 (  0.00%)      215.75 (  8.37%)
Hmean    Rsec-637      183.93 (  0.00%)      190.02 (  3.31%)
Hmean    Rsec-640      183.38 (  0.00%)      184.17 (  0.43%)
Stddev   Rsec-1        144.49 (  0.00%)     1292.15 (794.30%)
Stddev   Rsec-4       1525.43 (  0.00%)     2515.03 ( 64.87%)
Stddev   Rsec-7        853.17 (  0.00%)     1373.82 ( 61.02%)
Stddev   Rsec-12      1178.82 (  0.00%)     1543.19 ( 30.91%)
Stddev   Rsec-21      2459.94 (  0.00%)     2892.79 ( 17.60%)
Stddev   Rsec-30      1585.28 (  0.00%)     1903.96 ( 20.10%)
Stddev   Rsec-48       972.90 (  0.00%)     1219.25 ( 25.32%)
Stddev   Rsec-79       629.60 (  0.00%)      637.97 (  1.33%)
Stddev   Rsec-110      529.71 (  0.00%)      417.64 (-21.16%)
Stddev   Rsec-141      421.55 (  0.00%)      452.80 (  7.41%)
Stddev   Rsec-172      341.05 (  0.00%)      375.21 ( 10.02%)
Stddev   Rsec-203      249.78 (  0.00%)      281.78 ( 12.81%)
Stddev   Rsec-234      244.43 (  0.00%)      193.14 (-20.98%)
Stddev   Rsec-265      186.43 (  0.00%)      196.87 (  5.60%)
Stddev   Rsec-296      192.08 (  0.00%)      184.98 ( -3.70%)
Stddev   Rsec-327      186.84 (  0.00%)      161.37 (-13.63%)
Stddev   Rsec-358      150.51 (  0.00%)      147.22 ( -2.19%)
Stddev   Rsec-389      163.74 (  0.00%)      139.43 (-14.85%)
Stddev   Rsec-420      123.57 (  0.00%)      112.75 ( -8.76%)
Stddev   Rsec-451      130.29 (  0.00%)      127.71 ( -1.98%)
Stddev   Rsec-482      130.08 (  0.00%)      109.39 (-15.90%)
Stddev   Rsec-513      109.04 (  0.00%)      105.78 ( -2.99%)
Stddev   Rsec-544      100.31 (  0.00%)      126.31 ( 25.92%)
Stddev   Rsec-575      102.79 (  0.00%)       98.62 ( -4.06%)
Stddev   Rsec-606       93.57 (  0.00%)       82.75 (-11.57%)
Stddev   Rsec-637      102.32 (  0.00%)       67.40 (-34.13%)
Stddev   Rsec-640       78.48 (  0.00%)       89.02 ( 13.43%)
CoeffVar Rsec-1          0.47 (  0.00%)        4.50 (-859.37%)
CoeffVar Rsec-4          9.47 (  0.00%)       15.82 (-67.06%)
CoeffVar Rsec-7          9.10 (  0.00%)       13.71 (-50.68%)
CoeffVar Rsec-12        17.91 (  0.00%)       20.02 (-11.77%)
CoeffVar Rsec-21        59.52 (  0.00%)       62.44 ( -4.91%)
CoeffVar Rsec-30        49.75 (  0.00%)       54.80 (-10.15%)
CoeffVar Rsec-48        45.09 (  0.00%)       54.08 (-19.96%)
CoeffVar Rsec-79        45.03 (  0.00%)       45.13 ( -0.22%)
CoeffVar Rsec-110       49.78 (  0.00%)       39.18 ( 21.28%)
CoeffVar Rsec-141       48.61 (  0.00%)       51.62 ( -6.20%)
CoeffVar Rsec-172       45.18 (  0.00%)       50.65 (-12.12%)
CoeffVar Rsec-203       39.08 (  0.00%)       45.13 (-15.48%)
CoeffVar Rsec-234       43.61 (  0.00%)       35.62 ( 18.32%)
CoeffVar Rsec-265       37.81 (  0.00%)       41.38 ( -9.44%)
CoeffVar Rsec-296       41.62 (  0.00%)       43.67 ( -4.92%)
CoeffVar Rsec-327       44.65 (  0.00%)       44.15 (  1.13%)
CoeffVar Rsec-358       40.24 (  0.00%)       41.68 ( -3.59%)
CoeffVar Rsec-389       47.42 (  0.00%)       42.81 (  9.73%)
CoeffVar Rsec-420       38.79 (  0.00%)       37.15 (  4.23%)
CoeffVar Rsec-451       43.99 (  0.00%)       43.13 (  1.96%)
CoeffVar Rsec-482       46.33 (  0.00%)       38.94 ( 15.96%)
CoeffVar Rsec-513       40.81 (  0.00%)       40.19 (  1.52%)
CoeffVar Rsec-544       40.41 (  0.00%)       48.71 (-20.54%)
CoeffVar Rsec-575       43.07 (  0.00%)       40.34 (  6.33%)
CoeffVar Rsec-606       41.34 (  0.00%)       35.10 ( 15.08%)
CoeffVar Rsec-637       47.63 (  0.00%)       32.53 ( 31.70%)
CoeffVar Rsec-640       37.58 (  0.00%)       43.34 (-15.33%)
Max      Rsec-1      31000.00 (  0.00%)    30276.00 ( -2.34%)
Max      Rsec-4      18406.00 (  0.00%)    19844.00 (  7.81%)
Max      Rsec-7      11200.00 (  0.00%)    12913.00 ( 15.29%)
Max      Rsec-12     10612.00 (  0.00%)    11465.00 (  8.04%)
Max      Rsec-21     15254.00 (  0.00%)    15490.00 (  1.55%)
Max      Rsec-30     10529.00 (  0.00%)    11822.00 ( 12.28%)
Max      Rsec-48      8134.00 (  0.00%)     8236.00 (  1.25%)
Max      Rsec-79      5105.00 (  0.00%)     5945.00 ( 16.45%)
Max      Rsec-110     4728.00 (  0.00%)     4371.00 ( -7.55%)
Max      Rsec-141     3628.00 (  0.00%)     3997.00 ( 10.17%)
Max      Rsec-172     3261.00 (  0.00%)     3138.00 ( -3.77%)
Max      Rsec-203     2453.00 (  0.00%)     2995.00 ( 22.10%)
Max      Rsec-234     2363.00 (  0.00%)     2194.00 ( -7.15%)
Max      Rsec-265     2038.00 (  0.00%)     2235.00 (  9.67%)
Max      Rsec-296     2174.00 (  0.00%)     1942.00 (-10.67%)
Max      Rsec-327     1791.00 (  0.00%)     1826.00 (  1.95%)
Max      Rsec-358     1389.00 (  0.00%)     1538.00 ( 10.73%)
Max      Rsec-389     1630.00 (  0.00%)     1273.00 (-21.90%)
Max      Rsec-420     1249.00 (  0.00%)     1047.00 (-16.17%)
Max      Rsec-451     1179.00 (  0.00%)     1316.00 ( 11.62%)
Max      Rsec-482     1187.00 (  0.00%)     1211.00 (  2.02%)
Max      Rsec-513     1257.00 (  0.00%)     1018.00 (-19.01%)
Max      Rsec-544     1139.00 (  0.00%)     1216.00 (  6.76%)
Max      Rsec-575     1089.00 (  0.00%)      912.00 (-16.25%)
Max      Rsec-606      903.00 (  0.00%)     1014.00 ( 12.29%)
Max      Rsec-637     1053.00 (  0.00%)      660.00 (-37.32%)
Max      Rsec-640      749.00 (  0.00%)     1081.00 ( 44.33%)

ebizzy Thread spread
                                          4.3.0                       4.3.0
                                 13672-gc669d35              13708-g8e488c2
Min      spread-1          0.00 (  0.00%)        0.00 (  0.00%)
Min      spread-4       1122.00 (  0.00%)     1464.00 (-30.48%)
Min      spread-7       1537.00 (  0.00%)     1949.00 (-26.81%)
Min      spread-12      3466.00 (  0.00%)     3667.00 ( -5.80%)
Min      spread-21      6507.00 (  0.00%)     6406.00 (  1.55%)
Min      spread-30      5648.00 (  0.00%)     7543.00 (-33.55%)
Min      spread-48      3531.00 (  0.00%)     4244.00 (-20.19%)
Min      spread-79      2933.00 (  0.00%)     3184.00 ( -8.56%)
Min      spread-110     2341.00 (  0.00%)     1892.00 ( 19.18%)
Min      spread-141     2084.00 (  0.00%)     2641.00 (-26.73%)
Min      spread-172     1447.00 (  0.00%)     2001.00 (-38.29%)
Min      spread-203     1268.00 (  0.00%)     1686.00 (-32.97%)
Min      spread-234     1703.00 (  0.00%)     1157.00 ( 32.06%)
Min      spread-265     1385.00 (  0.00%)     1370.00 (  1.08%)
Min      spread-296     1139.00 (  0.00%)     1263.00 (-10.89%)
Min      spread-327     1259.00 (  0.00%)      328.00 ( 73.95%)
Min      spread-358      781.00 (  0.00%)     1002.00 (-28.30%)
Min      spread-389      746.00 (  0.00%)      767.00 ( -2.82%)
Min      spread-420      703.00 (  0.00%)      673.00 (  4.27%)
Min      spread-451      812.00 (  0.00%)      603.00 ( 25.74%)
Min      spread-482      530.00 (  0.00%)      685.00 (-29.25%)
Min      spread-513      564.00 (  0.00%)      456.00 ( 19.15%)
Min      spread-544      530.00 (  0.00%)      664.00 (-25.28%)
Min      spread-575      497.00 (  0.00%)      443.00 ( 10.87%)
Min      spread-606      372.00 (  0.00%)      460.00 (-23.66%)
Min      spread-637      377.00 (  0.00%)      185.00 ( 50.93%)
Min      spread-640      295.00 (  0.00%)       82.00 ( 72.20%)
Hmean    spread-1          0.00 (  0.00%)        0.00 (  0.00%)
Hmean    spread-4       2198.84 (  0.00%)     2371.33 ( -7.84%)
Hmean    spread-7       2165.67 (  0.00%)     2771.13 (-27.96%)
Hmean    spread-12      4198.82 (  0.00%)     4507.36 ( -7.35%)
Hmean    spread-21      7559.98 (  0.00%)     8509.79 (-12.56%)
Hmean    spread-30      6617.29 (  0.00%)     8461.26 (-27.87%)
Hmean    spread-48      4726.30 (  0.00%)     5794.29 (-22.60%)
Hmean    spread-79      3596.75 (  0.00%)     3913.30 ( -8.80%)
Hmean    spread-110     3210.81 (  0.00%)     2530.22 ( 21.20%)
Hmean    spread-141     2642.11 (  0.00%)     2904.53 ( -9.93%)
Hmean    spread-172     2147.18 (  0.00%)     2374.37 (-10.58%)
Hmean    spread-203     1778.54 (  0.00%)     2008.80 (-12.95%)
Hmean    spread-234     1843.86 (  0.00%)     1385.32 ( 24.87%)
Hmean    spread-265     1516.18 (  0.00%)     1641.22 ( -8.25%)
Hmean    spread-296     1396.98 (  0.00%)     1441.88 ( -3.21%)
Hmean    spread-327     1418.21 (  0.00%)      789.92 ( 44.30%)
Hmean    spread-358      940.59 (  0.00%)     1187.35 (-26.23%)
Hmean    spread-389     1050.33 (  0.00%)      922.54 ( 12.17%)
Hmean    spread-420      837.76 (  0.00%)      772.29 (  7.82%)
Hmean    spread-451      874.68 (  0.00%)      821.89 (  6.04%)
Hmean    spread-482      755.71 (  0.00%)      782.04 ( -3.49%)
Hmean    spread-513      716.35 (  0.00%)      682.52 (  4.72%)
Hmean    spread-544      659.82 (  0.00%)      807.36 (-22.36%)
Hmean    spread-575      667.82 (  0.00%)      584.47 ( 12.48%)
Hmean    spread-606      547.04 (  0.00%)      556.28 ( -1.69%)
Hmean    spread-637      571.10 (  0.00%)      340.03 ( 40.46%)
Hmean    spread-640      413.62 (  0.00%)      247.08 ( 40.26%)
Stddev   spread-1          0.00 (  0.00%)        0.00 (  0.00%)
Stddev   spread-4       1055.03 (  0.00%)     1918.21 (-81.81%)
Stddev   spread-7        675.30 (  0.00%)      998.13 (-47.80%)
Stddev   spread-12       645.46 (  0.00%)      726.78 (-12.60%)
Stddev   spread-21      2351.57 (  0.00%)     2788.71 (-18.59%)
Stddev   spread-30      1044.36 (  0.00%)      717.65 ( 31.28%)
Stddev   spread-48      1259.45 (  0.00%)      912.70 ( 27.53%)
Stddev   spread-79       445.51 (  0.00%)      833.11 (-87.00%)
Stddev   spread-110      595.82 (  0.00%)      624.63 ( -4.84%)
Stddev   spread-141      370.68 (  0.00%)      285.28 ( 23.04%)
Stddev   spread-172      472.11 (  0.00%)      250.80 ( 46.88%)
Stddev   spread-203      288.31 (  0.00%)      393.40 (-36.45%)
Stddev   spread-234      130.86 (  0.00%)      329.63 (-151.90%)
Stddev   spread-265      129.24 (  0.00%)      196.98 (-52.41%)
Stddev   spread-296      292.03 (  0.00%)      157.32 ( 46.13%)
Stddev   spread-327      136.07 (  0.00%)      443.66 (-226.05%)
Stddev   spread-358      168.12 (  0.00%)      116.22 ( 30.87%)
Stddev   spread-389      273.41 (  0.00%)      120.46 ( 55.94%)
Stddev   spread-420      124.12 (  0.00%)       89.51 ( 27.88%)
Stddev   spread-451       80.65 (  0.00%)      220.33 (-173.18%)
Stddev   spread-482      211.68 (  0.00%)      146.76 ( 30.67%)
Stddev   spread-513      202.23 (  0.00%)      156.38 ( 22.67%)
Stddev   spread-544      183.05 (  0.00%)      149.90 ( 18.11%)
Stddev   spread-575      170.16 (  0.00%)      134.70 ( 20.84%)
Stddev   spread-606      145.56 (  0.00%)      161.55 (-10.99%)
Stddev   spread-637      240.03 (  0.00%)      109.43 ( 54.41%)
Stddev   spread-640      119.87 (  0.00%)      295.05 (-146.15%)
CoeffVar spread-1          0.00 (  0.00%)        0.00 (  0.00%)
CoeffVar spread-4         39.42 (  0.00%)       60.93 ( 54.57%)
CoeffVar spread-7         28.47 (  0.00%)       32.47 ( 14.07%)
CoeffVar spread-12        15.04 (  0.00%)       15.75 (  4.74%)
CoeffVar spread-21        29.16 (  0.00%)       29.98 (  2.80%)
CoeffVar spread-30        15.43 (  0.00%)        8.42 (-45.42%)
CoeffVar spread-48        25.02 (  0.00%)       15.30 (-38.85%)
CoeffVar spread-79        12.19 (  0.00%)       20.43 ( 67.55%)
CoeffVar spread-110       17.88 (  0.00%)       23.42 ( 30.99%)
CoeffVar spread-141       13.74 (  0.00%)        9.73 (-29.15%)
CoeffVar spread-172       20.83 (  0.00%)       10.44 (-49.89%)
CoeffVar spread-203       15.70 (  0.00%)       18.94 ( 20.66%)
CoeffVar spread-234        7.06 (  0.00%)       22.68 (221.11%)
CoeffVar spread-265        8.47 (  0.00%)       11.83 ( 39.72%)
CoeffVar spread-296       20.16 (  0.00%)       10.79 (-46.46%)
CoeffVar spread-327        9.51 (  0.00%)       41.00 (331.14%)
CoeffVar spread-358       17.36 (  0.00%)        9.69 (-44.18%)
CoeffVar spread-389       24.42 (  0.00%)       12.84 (-47.41%)
CoeffVar spread-420       14.52 (  0.00%)       11.43 (-21.27%)
CoeffVar spread-451        9.15 (  0.00%)       25.12 (174.49%)
CoeffVar spread-482       26.01 (  0.00%)       18.24 (-29.86%)
CoeffVar spread-513       26.62 (  0.00%)       21.60 (-18.85%)
CoeffVar spread-544       26.14 (  0.00%)       18.03 (-31.02%)
CoeffVar spread-575       24.10 (  0.00%)       21.88 ( -9.21%)
CoeffVar spread-606       24.89 (  0.00%)       27.30 (  9.67%)
CoeffVar spread-637       36.24 (  0.00%)       28.41 (-21.60%)
CoeffVar spread-640       26.97 (  0.00%)       62.12 (130.29%)
Max      spread-1          0.00 (  0.00%)        0.00 (  0.00%)
Max      spread-4       4390.00 (  0.00%)     6802.00 (-54.94%)
Max      spread-7       3165.00 (  0.00%)     4721.00 (-49.16%)
Max      spread-12      5377.00 (  0.00%)     5913.00 ( -9.97%)
Max      spread-21     12629.00 (  0.00%)    12796.00 ( -1.32%)
Max      spread-30      8385.00 (  0.00%)     9572.00 (-14.16%)
Max      spread-48      6557.00 (  0.00%)     6670.00 ( -1.72%)
Max      spread-79      4169.00 (  0.00%)     5230.00 (-25.45%)
Max      spread-110     3997.00 (  0.00%)     3688.00 (  7.73%)
Max      spread-141     3151.00 (  0.00%)     3416.00 ( -8.41%)
Max      spread-172     2798.00 (  0.00%)     2686.00 (  4.00%)
Max      spread-203     2061.00 (  0.00%)     2681.00 (-30.08%)
Max      spread-234     2049.00 (  0.00%)     1904.00 (  7.08%)
Max      spread-265     1762.00 (  0.00%)     1952.00 (-10.78%)
Max      spread-296     1963.00 (  0.00%)     1737.00 ( 11.51%)
Max      spread-327     1592.00 (  0.00%)     1670.00 ( -4.90%)
Max      spread-358     1197.00 (  0.00%)     1366.00 (-14.12%)
Max      spread-389     1486.00 (  0.00%)     1116.00 ( 24.90%)
Max      spread-420     1067.00 (  0.00%)      879.00 ( 17.62%)
Max      spread-451     1033.00 (  0.00%)     1181.00 (-14.33%)
Max      spread-482     1060.00 (  0.00%)     1084.00 ( -2.26%)
Max      spread-513     1137.00 (  0.00%)      885.00 ( 22.16%)
Max      spread-544     1032.00 (  0.00%)     1103.00 ( -6.88%)
Max      spread-575      995.00 (  0.00%)      789.00 ( 20.70%)
Max      spread-606      809.00 (  0.00%)      887.00 ( -9.64%)
Max      spread-637      951.00 (  0.00%)      518.00 ( 45.53%)
Max      spread-640      625.00 (  0.00%)      991.00 (-58.56%)

               4.3.0       4.3.0
        13672-gc669d35  13708-g8e488c2
User        69638.44    67856.26
System     403739.55   405630.82
Elapsed      4055.69     4058.90

                                 4.3.0       4.3.0
                          13672-gc669d35  13708-g8e488c2
Minor Faults                2846674710  2831764228
Major Faults                         0           2
Swap Ins                             0           0
Swap Outs                            0           0
Allocation stalls                    0           0
DMA allocs                  2846385041  2831501075
DMA32 allocs                         0           0
Normal allocs                        0           0
Movable allocs                       0           0
Direct pages scanned                 0           0
Kswapd pages scanned                 0           0
Kswapd pages reclaimed               0           0
Direct pages reclaimed               0           0
Kswapd efficiency                 100%        100%
Kswapd velocity                  0.000       0.000
Direct efficiency                 100%        100%
Direct velocity                  0.000       0.000
Percentage direct scans             0%          0%
Zone normal velocity             0.000       0.000
Zone dma32 velocity              0.000       0.000
Zone dma velocity                0.000       0.000
Page writes by reclaim           0.000       0.000
Page writes file                     0           0
Page writes anon                     0           0
Page reclaim immediate               0           0
Sector Reads                       444         956
Sector Writes                    14220       14232
Page rescued immediate               0           0
Slabs scanned                        0           0
Direct inode steals                  0           0
Kswapd inode steals                  0           0
Kswapd skipped wait                  0           0
THP fault alloc                     39          70
THP collapse alloc                   3           3
THP splits                          40          69
THP fault fallback                 209         246
THP collapse fail                    0           0
Compaction stalls                    0           0
Compaction success                   0           0
Compaction failures                  0           0
Page migrate success             23141       24465
Page migrate failure                 0           0
Compaction pages isolated            0           0
Compaction migrate scanned           0           0
Compaction free scanned              0           0
Compaction cost                     24          25
NUMA alloc hit              2846238192  2831325148
NUMA alloc miss                      0           0
NUMA interleave hit                  0           0
NUMA alloc local            1507528120  1466361674
NUMA base PTE updates           572055      572474
NUMA huge PMD updates                0           0
NUMA page range updates         572055      572474
NUMA hint faults                459654      460212
NUMA hint local faults          229186      224799
NUMA hint local percent             49          48
NUMA pages migrated              23141       24465
AutoNUMA cost                    2302%       2305%

-------------- 64k hash pte powernv config start--------------------




Aneesh Kumar K.V (35):
  powerpc/mm: move pte headers to book3s directory
  powerpc/mm: move pte headers to book3s directory (part 2)
  powerpc/mm: make a separate copy for book3s
  powerpc/mm: make a separate copy for book3s (part 2)
  powerpc/mm: Move hash specific pte width and other defines to book3s
  powerpc/mm: Delete booke bits from book3s
  powerpc/mm: Don't have generic headers introduce functions touching
    pte bits
  powerpc/mm: Drop pte-common.h from BOOK3S 64
  powerpc/mm: Don't use pte_val as lvalue
  powerpc/mm: Don't use pmd_val,pud_val and pgd_val as lvalue
  powerpc/mm: Move hash64 PTE bits from book3s/64/pgtable.h to hash.h
  powerpc/mm: Move PTE bits from generic functions to hash64 functions.
  powerpc/booke: Move nohash headers (part 1)
  powerpc/booke: Move nohash headers (part 2)
  powerpc/booke: Move nohash headers (part 3)
  powerpc/booke: Move nohash headers (part 4)
  powerpc/booke: Move nohash headers (part 5)
  powerpc/mm: Convert 4k hash insert to C
  powerpc/mm: Remove the dependency on pte bit position in asm code
  powerpc/mm: Don't track subpage valid bit in pte_t
  powerpc/mm: Remove pte_val usage for the second half of pgtable_t
  powerpc/mm: Increase the width of #define
  powerpc/mm: Convert __hash_page_64K to C
  powerpc/mm: Convert 4k insert from asm to C
  powerpc/mm: Add helper for converting pte bit to hpte bits
  powerpc/mm: Move WIMG update to helper.
  powerpc/mm: Move hugetlb related headers
  powerpc/mm: Move THP headers around
  powerpc/mm: Add a _PAGE_PTE bit
  powerpc/mm: Don't hardcode page table size
  powerpc/mm: Don't hardcode the hash pte slot shift
  powerpc/nohash: Update 64K nohash config to have 32 pte fragement
  powerpc/nohash: we don't use real_pte_t for nohash
  powerpc/mm: Use H_READ with H_READ_4
  powerpc/mm: Don't open code pgtable_t size

 .../include/asm/{pte-hash32.h => book3s/32/hash.h} |    6 +-
 .../asm/{pgtable-ppc32.h => book3s/32/pgtable.h}   |  288 ++++--
 .../{pgtable-ppc64-4k.h => book3s/64/hash-4k.h}    |   58 +-
 arch/powerpc/include/asm/book3s/64/hash-64k.h      |  312 ++++++
 arch/powerpc/include/asm/book3s/64/hash.h          |  526 ++++++++++
 arch/powerpc/include/asm/book3s/64/pgtable.h       |  266 ++++++
 arch/powerpc/include/asm/book3s/pgtable.h          |   29 +
 arch/powerpc/include/asm/mmu-hash64.h              |    2 +-
 .../asm/{pgtable-ppc32.h => nohash/32/pgtable.h}   |   25 +-
 arch/powerpc/include/asm/{ => nohash/32}/pte-40x.h |    6 +-
 arch/powerpc/include/asm/{ => nohash/32}/pte-44x.h |    6 +-
 arch/powerpc/include/asm/{ => nohash/32}/pte-8xx.h |    6 +-
 .../include/asm/{ => nohash/32}/pte-fsl-booke.h    |    6 +-
 .../{pgtable-ppc64-4k.h => nohash/64/pgtable-4k.h} |   12 +-
 .../64/pgtable-64k.h}                              |   27 +-
 .../asm/{pgtable-ppc64.h => nohash/64/pgtable.h}   |  336 +------
 arch/powerpc/include/asm/{ => nohash}/pgtable.h    |  175 ++--
 arch/powerpc/include/asm/{ => nohash}/pte-book3e.h |    6 +-
 arch/powerpc/include/asm/page.h                    |   86 +-
 arch/powerpc/include/asm/pgalloc-32.h              |   34 +-
 arch/powerpc/include/asm/pgalloc-64.h              |   27 +-
 arch/powerpc/include/asm/pgtable.h                 |  200 +---
 arch/powerpc/include/asm/plpar_wrappers.h          |   17 +
 arch/powerpc/include/asm/pte-common.h              |    5 +
 arch/powerpc/include/asm/pte-hash64-4k.h           |   17 -
 arch/powerpc/include/asm/pte-hash64-64k.h          |  102 --
 arch/powerpc/include/asm/pte-hash64.h              |   54 --
 arch/powerpc/kernel/exceptions-64s.S               |   18 +-
 arch/powerpc/mm/40x_mmu.c                          |   10 +-
 arch/powerpc/mm/Makefile                           |    9 +-
 arch/powerpc/mm/hash64_4k.c                        |  123 +++
 arch/powerpc/mm/hash64_64k.c                       |  322 +++++++
 arch/powerpc/mm/hash_low_64.S                      | 1003 --------------------
 arch/powerpc/mm/hash_native_64.c                   |   10 +
 arch/powerpc/mm/hash_utils_64.c                    |  105 +-
 arch/powerpc/mm/hugepage-hash64.c                  |   22 +-
 arch/powerpc/mm/hugetlbpage-hash64.c               |   33 +-
 arch/powerpc/mm/hugetlbpage.c                      |   76 +-
 arch/powerpc/mm/init_64.c                          |    4 -
 arch/powerpc/mm/pgtable.c                          |    4 +
 arch/powerpc/mm/pgtable_64.c                       |   28 +-
 arch/powerpc/platforms/pseries/lpar.c              |   64 +-
 42 files changed, 2276 insertions(+), 2189 deletions(-)
 rename arch/powerpc/include/asm/{pte-hash32.h => book3s/32/hash.h} (93%)
 copy arch/powerpc/include/asm/{pgtable-ppc32.h => book3s/32/pgtable.h} (62%)
 copy arch/powerpc/include/asm/{pgtable-ppc64-4k.h => book3s/64/hash-4k.h} (71%)
 create mode 100644 arch/powerpc/include/asm/book3s/64/hash-64k.h
 create mode 100644 arch/powerpc/include/asm/book3s/64/hash.h
 create mode 100644 arch/powerpc/include/asm/book3s/64/pgtable.h
 create mode 100644 arch/powerpc/include/asm/book3s/pgtable.h
 rename arch/powerpc/include/asm/{pgtable-ppc32.h => nohash/32/pgtable.h} (96%)
 rename arch/powerpc/include/asm/{ => nohash/32}/pte-40x.h (95%)
 rename arch/powerpc/include/asm/{ => nohash/32}/pte-44x.h (96%)
 rename arch/powerpc/include/asm/{ => nohash/32}/pte-8xx.h (95%)
 rename arch/powerpc/include/asm/{ => nohash/32}/pte-fsl-booke.h (88%)
 rename arch/powerpc/include/asm/{pgtable-ppc64-4k.h => nohash/64/pgtable-4k.h} (92%)
 rename arch/powerpc/include/asm/{pgtable-ppc64-64k.h => nohash/64/pgtable-64k.h} (64%)
 rename arch/powerpc/include/asm/{pgtable-ppc64.h => nohash/64/pgtable.h} (51%)
 copy arch/powerpc/include/asm/{ => nohash}/pgtable.h (62%)
 rename arch/powerpc/include/asm/{ => nohash}/pte-book3e.h (95%)
 delete mode 100644 arch/powerpc/include/asm/pte-hash64-4k.h
 delete mode 100644 arch/powerpc/include/asm/pte-hash64-64k.h
 delete mode 100644 arch/powerpc/include/asm/pte-hash64.h
 create mode 100644 arch/powerpc/mm/hash64_4k.c
 create mode 100644 arch/powerpc/mm/hash64_64k.c
 delete mode 100644 arch/powerpc/mm/hash_low_64.S

-- 
2.5.0


* [PATCH V6 01/35] powerpc/mm: move pte headers to book3s directory
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-14 10:46   ` [V6,01/35] " Michael Ellerman
  2016-02-18  2:52   ` [PATCH V6 01/35] " Balbir Singh
  2015-12-01  3:36 ` [PATCH V6 02/35] powerpc/mm: move pte headers to book3s directory (part 2) Aneesh Kumar K.V
                   ` (33 subsequent siblings)
  34 siblings, 2 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/{pte-hash32.h => book3s/32/hash.h} | 0
 arch/powerpc/include/asm/{pte-hash64.h => book3s/64/hash.h} | 0
 arch/powerpc/include/asm/pgtable-ppc32.h                    | 2 +-
 arch/powerpc/include/asm/pgtable-ppc64.h                    | 2 +-
 4 files changed, 2 insertions(+), 2 deletions(-)
 rename arch/powerpc/include/asm/{pte-hash32.h => book3s/32/hash.h} (100%)
 rename arch/powerpc/include/asm/{pte-hash64.h => book3s/64/hash.h} (100%)

diff --git a/arch/powerpc/include/asm/pte-hash32.h b/arch/powerpc/include/asm/book3s/32/hash.h
similarity index 100%
rename from arch/powerpc/include/asm/pte-hash32.h
rename to arch/powerpc/include/asm/book3s/32/hash.h
diff --git a/arch/powerpc/include/asm/pte-hash64.h b/arch/powerpc/include/asm/book3s/64/hash.h
similarity index 100%
rename from arch/powerpc/include/asm/pte-hash64.h
rename to arch/powerpc/include/asm/book3s/64/hash.h
diff --git a/arch/powerpc/include/asm/pgtable-ppc32.h b/arch/powerpc/include/asm/pgtable-ppc32.h
index 9c326565d498..1a58a05be99c 100644
--- a/arch/powerpc/include/asm/pgtable-ppc32.h
+++ b/arch/powerpc/include/asm/pgtable-ppc32.h
@@ -116,7 +116,7 @@ extern int icache_44x_need_flush;
 #elif defined(CONFIG_8xx)
 #include <asm/pte-8xx.h>
 #else /* CONFIG_6xx */
-#include <asm/pte-hash32.h>
+#include <asm/book3s/32/hash.h>
 #endif
 
 /* And here we include common definitions */
diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index 3245f2d96d4f..b36a932abdfb 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -98,7 +98,7 @@
  * Include the PTE bits definitions
  */
 #ifdef CONFIG_PPC_BOOK3S
-#include <asm/pte-hash64.h>
+#include <asm/book3s/64/hash.h>
 #else
 #include <asm/pte-book3e.h>
 #endif
-- 
2.5.0


* [PATCH V6 02/35] powerpc/mm: move pte headers to book3s directory (part 2)
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 01/35] powerpc/mm: move pte headers to book3s directory Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 03/35] powerpc/mm: make a separate copy for book3s Aneesh Kumar K.V
                   ` (32 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

Splitting this out so that git's rename detection can track changes to
each file. We will fold this back before merging.
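
(As a sketch, the rename summaries in this and the later patches can be
reproduced by turning on git's rename/copy detection when generating the
diffs, e.g.:

  git format-patch -M -C -2    # -M/-C make git report renames and copies
  git log --follow -- arch/powerpc/include/asm/book3s/64/hash-4k.h

the path here is only an example; --follow tracks one file across its
rename.)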

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/32/hash.h                      |  6 +++---
 .../include/asm/{pte-hash64-4k.h => book3s/64/hash-4k.h}       |  0
 .../include/asm/{pte-hash64-64k.h => book3s/64/hash-64k.h}     |  0
 arch/powerpc/include/asm/book3s/64/hash.h                      | 10 +++++-----
 4 files changed, 8 insertions(+), 8 deletions(-)
 rename arch/powerpc/include/asm/{pte-hash64-4k.h => book3s/64/hash-4k.h} (100%)
 rename arch/powerpc/include/asm/{pte-hash64-64k.h => book3s/64/hash-64k.h} (100%)

diff --git a/arch/powerpc/include/asm/book3s/32/hash.h b/arch/powerpc/include/asm/book3s/32/hash.h
index 62cfb0c663bb..264b754d65b0 100644
--- a/arch/powerpc/include/asm/book3s/32/hash.h
+++ b/arch/powerpc/include/asm/book3s/32/hash.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PTE_HASH32_H
-#define _ASM_POWERPC_PTE_HASH32_H
+#ifndef _ASM_POWERPC_BOOK3S_32_HASH_H
+#define _ASM_POWERPC_BOOK3S_32_HASH_H
 #ifdef __KERNEL__
 
 /*
@@ -43,4 +43,4 @@
 #define PTE_ATOMIC_UPDATES	1
 
 #endif /* __KERNEL__ */
-#endif /*  _ASM_POWERPC_PTE_HASH32_H */
+#endif /* _ASM_POWERPC_BOOK3S_32_HASH_H */
diff --git a/arch/powerpc/include/asm/pte-hash64-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
similarity index 100%
rename from arch/powerpc/include/asm/pte-hash64-4k.h
rename to arch/powerpc/include/asm/book3s/64/hash-4k.h
diff --git a/arch/powerpc/include/asm/pte-hash64-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
similarity index 100%
rename from arch/powerpc/include/asm/pte-hash64-64k.h
rename to arch/powerpc/include/asm/book3s/64/hash-64k.h
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index ef612c160da7..8e60d4fa434d 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PTE_HASH64_H
-#define _ASM_POWERPC_PTE_HASH64_H
+#ifndef _ASM_POWERPC_BOOK3S_64_HASH_H
+#define _ASM_POWERPC_BOOK3S_64_HASH_H
 #ifdef __KERNEL__
 
 /*
@@ -45,10 +45,10 @@
 #define PTE_ATOMIC_UPDATES	1
 
 #ifdef CONFIG_PPC_64K_PAGES
-#include <asm/pte-hash64-64k.h>
+#include <asm/book3s/64/hash-64k.h>
 #else
-#include <asm/pte-hash64-4k.h>
+#include <asm/book3s/64/hash-4k.h>
 #endif
 
 #endif /* __KERNEL__ */
-#endif /*  _ASM_POWERPC_PTE_HASH64_H */
+#endif /* _ASM_POWERPC_BOOK3S_64_HASH_H */
-- 
2.5.0


* [PATCH V6 03/35] powerpc/mm: make a separate copy for book3s
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 01/35] powerpc/mm: move pte headers to book3s directory Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 02/35] powerpc/mm: move pte headers to book3s directory (part 2) Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 04/35] powerpc/mm: make a separate copy for book3s (part 2) Aneesh Kumar K.V
                   ` (31 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

In this patch we do:
cp pgtable-ppc32.h book3s/32/pgtable.h
cp pgtable-ppc64.h book3s/64/pgtable.h

This enables us to make further changes to the hash-specific config.
We will change the page table format for 64-bit hash in later patches.
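
(The new headers start out as verbatim copies, so the split is easy to
sanity-check with a plain diff; a sketch, run from
arch/powerpc/include/asm/ in the tree:

  diff -u pgtable-ppc32.h book3s/32/pgtable.h
  diff -u pgtable-ppc64.h book3s/64/pgtable.h

at this point in the series both should come back empty.)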

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 340 +++++++++++++++
 arch/powerpc/include/asm/book3s/64/pgtable.h | 626 +++++++++++++++++++++++++++
 arch/powerpc/include/asm/book3s/pgtable.h    |  10 +
 arch/powerpc/include/asm/mmu-hash64.h        |   2 +-
 arch/powerpc/include/asm/pgtable.h           |   4 +
 5 files changed, 981 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/include/asm/book3s/32/pgtable.h
 create mode 100644 arch/powerpc/include/asm/book3s/64/pgtable.h
 create mode 100644 arch/powerpc/include/asm/book3s/pgtable.h

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
new file mode 100644
index 000000000000..1a58a05be99c
--- /dev/null
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -0,0 +1,340 @@
+#ifndef _ASM_POWERPC_PGTABLE_PPC32_H
+#define _ASM_POWERPC_PGTABLE_PPC32_H
+
+#include <asm-generic/pgtable-nopmd.h>
+
+#ifndef __ASSEMBLY__
+#include <linux/sched.h>
+#include <linux/threads.h>
+#include <asm/io.h>			/* For sub-arch specific PPC_PIN_SIZE */
+
+extern unsigned long ioremap_bot;
+
+#ifdef CONFIG_44x
+extern int icache_44x_need_flush;
+#endif
+
+#endif /* __ASSEMBLY__ */
+
+/*
+ * The normal case is that PTEs are 32-bits and we have a 1-page
+ * 1024-entry pgdir pointing to 1-page 1024-entry PTE pages.  -- paulus
+ *
+ * For any >32-bit physical address platform, we can use the following
+ * two level page table layout where the pgdir is 8KB and the MS 13 bits
+ * are an index to the second level table.  The combined pgdir/pmd first
+ * level has 2048 entries and the second level has 512 64-bit PTE entries.
+ * -Matt
+ */
+/* PGDIR_SHIFT determines what a top-level page table entry can map */
+#define PGDIR_SHIFT	(PAGE_SHIFT + PTE_SHIFT)
+#define PGDIR_SIZE	(1UL << PGDIR_SHIFT)
+#define PGDIR_MASK	(~(PGDIR_SIZE-1))
+
+/*
+ * entries per page directory level: our page-table tree is two-level, so
+ * we don't really have any PMD directory.
+ */
+#ifndef __ASSEMBLY__
+#define PTE_TABLE_SIZE	(sizeof(pte_t) << PTE_SHIFT)
+#define PGD_TABLE_SIZE	(sizeof(pgd_t) << (32 - PGDIR_SHIFT))
+#endif	/* __ASSEMBLY__ */
+
+#define PTRS_PER_PTE	(1 << PTE_SHIFT)
+#define PTRS_PER_PMD	1
+#define PTRS_PER_PGD	(1 << (32 - PGDIR_SHIFT))
+
+#define USER_PTRS_PER_PGD	(TASK_SIZE / PGDIR_SIZE)
+#define FIRST_USER_ADDRESS	0UL
+
+#define pte_ERROR(e) \
+	pr_err("%s:%d: bad pte %llx.\n", __FILE__, __LINE__, \
+		(unsigned long long)pte_val(e))
+#define pgd_ERROR(e) \
+	pr_err("%s:%d: bad pgd %08lx.\n", __FILE__, __LINE__, pgd_val(e))
+
+/*
+ * This is the bottom of the PKMAP area with HIGHMEM or an arbitrary
+ * value (for now) on others, from where we can start layout kernel
+ * virtual space that goes below PKMAP and FIXMAP
+ */
+#ifdef CONFIG_HIGHMEM
+#define KVIRT_TOP	PKMAP_BASE
+#else
+#define KVIRT_TOP	(0xfe000000UL)	/* for now, could be FIXMAP_BASE ? */
+#endif
+
+/*
+ * ioremap_bot starts at that address. Early ioremaps move down from there,
+ * until mem_init() at which point this becomes the top of the vmalloc
+ * and ioremap space
+ */
+#ifdef CONFIG_NOT_COHERENT_CACHE
+#define IOREMAP_TOP	((KVIRT_TOP - CONFIG_CONSISTENT_SIZE) & PAGE_MASK)
+#else
+#define IOREMAP_TOP	KVIRT_TOP
+#endif
+
+/*
+ * Just any arbitrary offset to the start of the vmalloc VM area: the
+ * current 16MB value just means that there will be a 64MB "hole" after the
+ * physical memory until the kernel virtual memory starts.  That means that
+ * any out-of-bounds memory accesses will hopefully be caught.
+ * The vmalloc() routines leaves a hole of 4kB between each vmalloced
+ * area for the same reason. ;)
+ *
+ * We no longer map larger than phys RAM with the BATs so we don't have
+ * to worry about the VMALLOC_OFFSET causing problems.  We do have to worry
+ * about clashes between our early calls to ioremap() that start growing down
+ * from ioremap_base being run into the VM area allocations (growing upwards
+ * from VMALLOC_START).  For this reason we have ioremap_bot to check when
+ * we actually run into our mappings setup in the early boot with the VM
+ * system.  This really does become a problem for machines with good amounts
+ * of RAM.  -- Cort
+ */
+#define VMALLOC_OFFSET (0x1000000) /* 16M */
+#ifdef PPC_PIN_SIZE
+#define VMALLOC_START (((_ALIGN((long)high_memory, PPC_PIN_SIZE) + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1)))
+#else
+#define VMALLOC_START ((((long)high_memory + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1)))
+#endif
+#define VMALLOC_END	ioremap_bot
+
+/*
+ * Bits in a linux-style PTE.  These match the bits in the
+ * (hardware-defined) PowerPC PTE as closely as possible.
+ */
+
+#if defined(CONFIG_40x)
+#include <asm/pte-40x.h>
+#elif defined(CONFIG_44x)
+#include <asm/pte-44x.h>
+#elif defined(CONFIG_FSL_BOOKE) && defined(CONFIG_PTE_64BIT)
+#include <asm/pte-book3e.h>
+#elif defined(CONFIG_FSL_BOOKE)
+#include <asm/pte-fsl-booke.h>
+#elif defined(CONFIG_8xx)
+#include <asm/pte-8xx.h>
+#else /* CONFIG_6xx */
+#include <asm/book3s/32/hash.h>
+#endif
+
+/* And here we include common definitions */
+#include <asm/pte-common.h>
+
+#ifndef __ASSEMBLY__
+
+#define pte_clear(mm, addr, ptep) \
+	do { pte_update(ptep, ~_PAGE_HASHPTE, 0); } while (0)
+
+#define pmd_none(pmd)		(!pmd_val(pmd))
+#define	pmd_bad(pmd)		(pmd_val(pmd) & _PMD_BAD)
+#define	pmd_present(pmd)	(pmd_val(pmd) & _PMD_PRESENT_MASK)
+#define	pmd_clear(pmdp)		do { pmd_val(*(pmdp)) = 0; } while (0)
+
+/*
+ * When flushing the tlb entry for a page, we also need to flush the hash
+ * table entry.  flush_hash_pages is assembler (for speed) in hashtable.S.
+ */
+extern int flush_hash_pages(unsigned context, unsigned long va,
+			    unsigned long pmdval, int count);
+
+/* Add an HPTE to the hash table */
+extern void add_hash_page(unsigned context, unsigned long va,
+			  unsigned long pmdval);
+
+/* Flush an entry from the TLB/hash table */
+extern void flush_hash_entry(struct mm_struct *mm, pte_t *ptep,
+			     unsigned long address);
+
+/*
+ * PTE updates. This function is called whenever an existing
+ * valid PTE is updated. This does -not- include set_pte_at()
+ * which nowadays only sets a new PTE.
+ *
+ * Depending on the type of MMU, we may need to use atomic updates
+ * and the PTE may be either 32 or 64 bit wide. In the later case,
+ * when using atomic updates, only the low part of the PTE is
+ * accessed atomically.
+ *
+ * In addition, on 44x, we also maintain a global flag indicating
+ * that an executable user mapping was modified, which is needed
+ * to properly flush the virtually tagged instruction cache of
+ * those implementations.
+ */
+#ifndef CONFIG_PTE_64BIT
+static inline unsigned long pte_update(pte_t *p,
+				       unsigned long clr,
+				       unsigned long set)
+{
+#ifdef PTE_ATOMIC_UPDATES
+	unsigned long old, tmp;
+
+	__asm__ __volatile__("\
+1:	lwarx	%0,0,%3\n\
+	andc	%1,%0,%4\n\
+	or	%1,%1,%5\n"
+	PPC405_ERR77(0,%3)
+"	stwcx.	%1,0,%3\n\
+	bne-	1b"
+	: "=&r" (old), "=&r" (tmp), "=m" (*p)
+	: "r" (p), "r" (clr), "r" (set), "m" (*p)
+	: "cc" );
+#else /* PTE_ATOMIC_UPDATES */
+	unsigned long old = pte_val(*p);
+	*p = __pte((old & ~clr) | set);
+#endif /* !PTE_ATOMIC_UPDATES */
+
+#ifdef CONFIG_44x
+	if ((old & _PAGE_USER) && (old & _PAGE_EXEC))
+		icache_44x_need_flush = 1;
+#endif
+	return old;
+}
+#else /* CONFIG_PTE_64BIT */
+static inline unsigned long long pte_update(pte_t *p,
+					    unsigned long clr,
+					    unsigned long set)
+{
+#ifdef PTE_ATOMIC_UPDATES
+	unsigned long long old;
+	unsigned long tmp;
+
+	__asm__ __volatile__("\
+1:	lwarx	%L0,0,%4\n\
+	lwzx	%0,0,%3\n\
+	andc	%1,%L0,%5\n\
+	or	%1,%1,%6\n"
+	PPC405_ERR77(0,%3)
+"	stwcx.	%1,0,%4\n\
+	bne-	1b"
+	: "=&r" (old), "=&r" (tmp), "=m" (*p)
+	: "r" (p), "r" ((unsigned long)(p) + 4), "r" (clr), "r" (set), "m" (*p)
+	: "cc" );
+#else /* PTE_ATOMIC_UPDATES */
+	unsigned long long old = pte_val(*p);
+	*p = __pte((old & ~(unsigned long long)clr) | set);
+#endif /* !PTE_ATOMIC_UPDATES */
+
+#ifdef CONFIG_44x
+	if ((old & _PAGE_USER) && (old & _PAGE_EXEC))
+		icache_44x_need_flush = 1;
+#endif
+	return old;
+}
+#endif /* CONFIG_PTE_64BIT */
+
+/*
+ * 2.6 calls this without flushing the TLB entry; this is wrong
+ * for our hash-based implementation, we fix that up here.
+ */
+#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
+static inline int __ptep_test_and_clear_young(unsigned int context, unsigned long addr, pte_t *ptep)
+{
+	unsigned long old;
+	old = pte_update(ptep, _PAGE_ACCESSED, 0);
+#if _PAGE_HASHPTE != 0
+	if (old & _PAGE_HASHPTE) {
+		unsigned long ptephys = __pa(ptep) & PAGE_MASK;
+		flush_hash_pages(context, addr, ptephys, 1);
+	}
+#endif
+	return (old & _PAGE_ACCESSED) != 0;
+}
+#define ptep_test_and_clear_young(__vma, __addr, __ptep) \
+	__ptep_test_and_clear_young((__vma)->vm_mm->context.id, __addr, __ptep)
+
+#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
+static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
+				       pte_t *ptep)
+{
+	return __pte(pte_update(ptep, ~_PAGE_HASHPTE, 0));
+}
+
+#define __HAVE_ARCH_PTEP_SET_WRPROTECT
+static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
+				      pte_t *ptep)
+{
+	pte_update(ptep, (_PAGE_RW | _PAGE_HWWRITE), _PAGE_RO);
+}
+static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
+					   unsigned long addr, pte_t *ptep)
+{
+	ptep_set_wrprotect(mm, addr, ptep);
+}
+
+
+static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
+{
+	unsigned long set = pte_val(entry) &
+		(_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC);
+	unsigned long clr = ~pte_val(entry) & _PAGE_RO;
+
+	pte_update(ptep, clr, set);
+}
+
+#define __HAVE_ARCH_PTE_SAME
+#define pte_same(A,B)	(((pte_val(A) ^ pte_val(B)) & ~_PAGE_HASHPTE) == 0)
+
+/*
+ * Note that on Book E processors, the pmd contains the kernel virtual
+ * (lowmem) address of the pte page.  The physical address is less useful
+ * because everything runs with translation enabled (even the TLB miss
+ * handler).  On everything else the pmd contains the physical address
+ * of the pte page.  -- paulus
+ */
+#ifndef CONFIG_BOOKE
+#define pmd_page_vaddr(pmd)	\
+	((unsigned long) __va(pmd_val(pmd) & PAGE_MASK))
+#define pmd_page(pmd)		\
+	pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT)
+#else
+#define pmd_page_vaddr(pmd)	\
+	((unsigned long) (pmd_val(pmd) & PAGE_MASK))
+#define pmd_page(pmd)		\
+	pfn_to_page((__pa(pmd_val(pmd)) >> PAGE_SHIFT))
+#endif
+
+/* to find an entry in a kernel page-table-directory */
+#define pgd_offset_k(address) pgd_offset(&init_mm, address)
+
+/* to find an entry in a page-table-directory */
+#define pgd_index(address)	 ((address) >> PGDIR_SHIFT)
+#define pgd_offset(mm, address)	 ((mm)->pgd + pgd_index(address))
+
+/* Find an entry in the third-level page table.. */
+#define pte_index(address)		\
+	(((address) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))
+#define pte_offset_kernel(dir, addr)	\
+	((pte_t *) pmd_page_vaddr(*(dir)) + pte_index(addr))
+#define pte_offset_map(dir, addr)		\
+	((pte_t *) kmap_atomic(pmd_page(*(dir))) + pte_index(addr))
+#define pte_unmap(pte)		kunmap_atomic(pte)
+
+/*
+ * Encode and decode a swap entry.
+ * Note that the bits we use in a PTE for representing a swap entry
+ * must not include the _PAGE_PRESENT bit or the _PAGE_HASHPTE bit (if used).
+ *   -- paulus
+ */
+#define __swp_type(entry)		((entry).val & 0x1f)
+#define __swp_offset(entry)		((entry).val >> 5)
+#define __swp_entry(type, offset)	((swp_entry_t) { (type) | ((offset) << 5) })
+#define __pte_to_swp_entry(pte)		((swp_entry_t) { pte_val(pte) >> 3 })
+#define __swp_entry_to_pte(x)		((pte_t) { (x).val << 3 })
+
+#ifndef CONFIG_PPC_4K_PAGES
+void pgtable_cache_init(void);
+#else
+/*
+ * No page table caches to initialise
+ */
+#define pgtable_cache_init()	do { } while (0)
+#endif
+
+extern int get_pteptr(struct mm_struct *mm, unsigned long addr, pte_t **ptep,
+		      pmd_t **pmdp);
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* _ASM_POWERPC_PGTABLE_PPC32_H */
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
new file mode 100644
index 000000000000..4c61db6adcde
--- /dev/null
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -0,0 +1,626 @@
+#ifndef _ASM_POWERPC_PGTABLE_PPC64_H_
+#define _ASM_POWERPC_PGTABLE_PPC64_H_
+/*
+ * This file contains the functions and defines necessary to modify and use
+ * the ppc64 hashed page table.
+ */
+
+#ifdef CONFIG_PPC_64K_PAGES
+#include <asm/pgtable-ppc64-64k.h>
+#else
+#include <asm/pgtable-ppc64-4k.h>
+#endif
+#include <asm/barrier.h>
+
+#define FIRST_USER_ADDRESS	0UL
+
+/*
+ * Size of EA range mapped by our pagetables.
+ */
+#define PGTABLE_EADDR_SIZE (PTE_INDEX_SIZE + PMD_INDEX_SIZE + \
+			    PUD_INDEX_SIZE + PGD_INDEX_SIZE + PAGE_SHIFT)
+#define PGTABLE_RANGE (ASM_CONST(1) << PGTABLE_EADDR_SIZE)
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#define PMD_CACHE_INDEX	(PMD_INDEX_SIZE + 1)
+#else
+#define PMD_CACHE_INDEX	PMD_INDEX_SIZE
+#endif
+/*
+ * Define the address range of the kernel non-linear virtual area
+ */
+
+#ifdef CONFIG_PPC_BOOK3E
+#define KERN_VIRT_START ASM_CONST(0x8000000000000000)
+#else
+#define KERN_VIRT_START ASM_CONST(0xD000000000000000)
+#endif
+#define KERN_VIRT_SIZE	ASM_CONST(0x0000100000000000)
+
+/*
+ * The vmalloc space starts at the beginning of that region, and
+ * occupies half of it on hash CPUs and a quarter of it on Book3E
+ * (we keep a quarter for the virtual memmap)
+ */
+#define VMALLOC_START	KERN_VIRT_START
+#ifdef CONFIG_PPC_BOOK3E
+#define VMALLOC_SIZE	(KERN_VIRT_SIZE >> 2)
+#else
+#define VMALLOC_SIZE	(KERN_VIRT_SIZE >> 1)
+#endif
+#define VMALLOC_END	(VMALLOC_START + VMALLOC_SIZE)
+
+/*
+ * The second half of the kernel virtual space is used for IO mappings,
+ * it's itself carved into the PIO region (ISA and PHB IO space) and
+ * the ioremap space
+ *
+ *  ISA_IO_BASE = KERN_IO_START, 64K reserved area
+ *  PHB_IO_BASE = ISA_IO_BASE + 64K to ISA_IO_BASE + 2G, PHB IO spaces
+ * IOREMAP_BASE = ISA_IO_BASE + 2G to VMALLOC_START + PGTABLE_RANGE
+ */
+#define KERN_IO_START	(KERN_VIRT_START + (KERN_VIRT_SIZE >> 1))
+#define FULL_IO_SIZE	0x80000000ul
+#define  ISA_IO_BASE	(KERN_IO_START)
+#define  ISA_IO_END	(KERN_IO_START + 0x10000ul)
+#define  PHB_IO_BASE	(ISA_IO_END)
+#define  PHB_IO_END	(KERN_IO_START + FULL_IO_SIZE)
+#define IOREMAP_BASE	(PHB_IO_END)
+#define IOREMAP_END	(KERN_VIRT_START + KERN_VIRT_SIZE)
+
+
+/*
+ * Region IDs
+ */
+#define REGION_SHIFT		60UL
+#define REGION_MASK		(0xfUL << REGION_SHIFT)
+#define REGION_ID(ea)		(((unsigned long)(ea)) >> REGION_SHIFT)
+
+#define VMALLOC_REGION_ID	(REGION_ID(VMALLOC_START))
+#define KERNEL_REGION_ID	(REGION_ID(PAGE_OFFSET))
+#define VMEMMAP_REGION_ID	(0xfUL)	/* Server only */
+#define USER_REGION_ID		(0UL)
+
+/*
+ * Defines the address of the vmemap area, in its own region on
+ * hash table CPUs and after the vmalloc space on Book3E
+ */
+#ifdef CONFIG_PPC_BOOK3E
+#define VMEMMAP_BASE		VMALLOC_END
+#define VMEMMAP_END		KERN_IO_START
+#else
+#define VMEMMAP_BASE		(VMEMMAP_REGION_ID << REGION_SHIFT)
+#endif
+#define vmemmap			((struct page *)VMEMMAP_BASE)
+
+
+/*
+ * Include the PTE bits definitions
+ */
+#ifdef CONFIG_PPC_BOOK3S
+#include <asm/book3s/64/hash.h>
+#else
+#include <asm/pte-book3e.h>
+#endif
+#include <asm/pte-common.h>
+
+#ifdef CONFIG_PPC_MM_SLICES
+#define HAVE_ARCH_UNMAPPED_AREA
+#define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
+#endif /* CONFIG_PPC_MM_SLICES */
+
+#ifndef __ASSEMBLY__
+
+/*
+ * This is the default implementation of various PTE accessors, it's
+ * used in all cases except Book3S with 64K pages where we have a
+ * concept of sub-pages
+ */
+#ifndef __real_pte
+
+#ifdef CONFIG_STRICT_MM_TYPECHECKS
+#define __real_pte(e,p)		((real_pte_t){(e)})
+#define __rpte_to_pte(r)	((r).pte)
+#else
+#define __real_pte(e,p)		(e)
+#define __rpte_to_pte(r)	(__pte(r))
+#endif
+#define __rpte_to_hidx(r,index)	(pte_val(__rpte_to_pte(r)) >> 12)
+
+#define pte_iterate_hashed_subpages(rpte, psize, va, index, shift)       \
+	do {							         \
+		index = 0;					         \
+		shift = mmu_psize_defs[psize].shift;		         \
+
+#define pte_iterate_hashed_end() } while(0)
+
+/*
+ * We expect this to be called only for user addresses or kernel virtual
+ * addresses other than the linear mapping.
+ */
+#define pte_pagesize_index(mm, addr, pte)	MMU_PAGE_4K
+
+#endif /* __real_pte */
+
+
+/* pte_clear moved to later in this file */
+
+#define PMD_BAD_BITS		(PTE_TABLE_SIZE-1)
+#define PUD_BAD_BITS		(PMD_TABLE_SIZE-1)
+
+#define pmd_set(pmdp, pmdval) 	(pmd_val(*(pmdp)) = (pmdval))
+#define pmd_none(pmd)		(!pmd_val(pmd))
+#define	pmd_bad(pmd)		(!is_kernel_addr(pmd_val(pmd)) \
+				 || (pmd_val(pmd) & PMD_BAD_BITS))
+#define	pmd_present(pmd)	(!pmd_none(pmd))
+#define	pmd_clear(pmdp)		(pmd_val(*(pmdp)) = 0)
+#define pmd_page_vaddr(pmd)	(pmd_val(pmd) & ~PMD_MASKED_BITS)
+extern struct page *pmd_page(pmd_t pmd);
+
+#define pud_set(pudp, pudval)	(pud_val(*(pudp)) = (pudval))
+#define pud_none(pud)		(!pud_val(pud))
+#define	pud_bad(pud)		(!is_kernel_addr(pud_val(pud)) \
+				 || (pud_val(pud) & PUD_BAD_BITS))
+#define pud_present(pud)	(pud_val(pud) != 0)
+#define pud_clear(pudp)		(pud_val(*(pudp)) = 0)
+#define pud_page_vaddr(pud)	(pud_val(pud) & ~PUD_MASKED_BITS)
+
+extern struct page *pud_page(pud_t pud);
+
+static inline pte_t pud_pte(pud_t pud)
+{
+	return __pte(pud_val(pud));
+}
+
+static inline pud_t pte_pud(pte_t pte)
+{
+	return __pud(pte_val(pte));
+}
+#define pud_write(pud)		pte_write(pud_pte(pud))
+#define pgd_set(pgdp, pudp)	({pgd_val(*(pgdp)) = (unsigned long)(pudp);})
+#define pgd_write(pgd)		pte_write(pgd_pte(pgd))
+
+/*
+ * Find an entry in a page-table-directory.  We combine the address region
+ * (the high order N bits) and the pgd portion of the address.
+ */
+#define pgd_index(address) (((address) >> (PGDIR_SHIFT)) & (PTRS_PER_PGD - 1))
+
+#define pgd_offset(mm, address)	 ((mm)->pgd + pgd_index(address))
+
+#define pmd_offset(pudp,addr) \
+  (((pmd_t *) pud_page_vaddr(*(pudp))) + (((addr) >> PMD_SHIFT) & (PTRS_PER_PMD - 1)))
+
+#define pte_offset_kernel(dir,addr) \
+  (((pte_t *) pmd_page_vaddr(*(dir))) + (((addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)))
+
+#define pte_offset_map(dir,addr)	pte_offset_kernel((dir), (addr))
+#define pte_unmap(pte)			do { } while(0)
+
+/* to find an entry in a kernel page-table-directory */
+/* This now only contains the vmalloc pages */
+#define pgd_offset_k(address) pgd_offset(&init_mm, address)
+extern void hpte_need_flush(struct mm_struct *mm, unsigned long addr,
+			    pte_t *ptep, unsigned long pte, int huge);
+
+/* Atomic PTE updates */
+static inline unsigned long pte_update(struct mm_struct *mm,
+				       unsigned long addr,
+				       pte_t *ptep, unsigned long clr,
+				       unsigned long set,
+				       int huge)
+{
+#ifdef PTE_ATOMIC_UPDATES
+	unsigned long old, tmp;
+
+	__asm__ __volatile__(
+	"1:	ldarx	%0,0,%3		# pte_update\n\
+	andi.	%1,%0,%6\n\
+	bne-	1b \n\
+	andc	%1,%0,%4 \n\
+	or	%1,%1,%7\n\
+	stdcx.	%1,0,%3 \n\
+	bne-	1b"
+	: "=&r" (old), "=&r" (tmp), "=m" (*ptep)
+	: "r" (ptep), "r" (clr), "m" (*ptep), "i" (_PAGE_BUSY), "r" (set)
+	: "cc" );
+#else
+	unsigned long old = pte_val(*ptep);
+	*ptep = __pte((old & ~clr) | set);
+#endif
+	/* huge pages use the old page table lock */
+	if (!huge)
+		assert_pte_locked(mm, addr);
+
+#ifdef CONFIG_PPC_STD_MMU_64
+	if (old & _PAGE_HASHPTE)
+		hpte_need_flush(mm, addr, ptep, old, huge);
+#endif
+
+	return old;
+}
+
+static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
+					      unsigned long addr, pte_t *ptep)
+{
+	unsigned long old;
+
+	if ((pte_val(*ptep) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
+		return 0;
+	old = pte_update(mm, addr, ptep, _PAGE_ACCESSED, 0, 0);
+	return (old & _PAGE_ACCESSED) != 0;
+}
+#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
+#define ptep_test_and_clear_young(__vma, __addr, __ptep)		   \
+({									   \
+	int __r;							   \
+	__r = __ptep_test_and_clear_young((__vma)->vm_mm, __addr, __ptep); \
+	__r;								   \
+})
+
+#define __HAVE_ARCH_PTEP_SET_WRPROTECT
+static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
+				      pte_t *ptep)
+{
+
+	if ((pte_val(*ptep) & _PAGE_RW) == 0)
+		return;
+
+	pte_update(mm, addr, ptep, _PAGE_RW, 0, 0);
+}
+
+static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
+					   unsigned long addr, pte_t *ptep)
+{
+	if ((pte_val(*ptep) & _PAGE_RW) == 0)
+		return;
+
+	pte_update(mm, addr, ptep, _PAGE_RW, 0, 1);
+}
+
+/*
+ * We currently remove entries from the hashtable regardless of whether
+ * the entry was young or dirty. The generic routines only flush if the
+ * entry was young or dirty which is not good enough.
+ *
+ * We should be more intelligent about this but for the moment we override
+ * these functions and force a tlb flush unconditionally
+ */
+#define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
+#define ptep_clear_flush_young(__vma, __address, __ptep)		\
+({									\
+	int __young = __ptep_test_and_clear_young((__vma)->vm_mm, __address, \
+						  __ptep);		\
+	__young;							\
+})
+
+#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
+static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
+				       unsigned long addr, pte_t *ptep)
+{
+	unsigned long old = pte_update(mm, addr, ptep, ~0UL, 0, 0);
+	return __pte(old);
+}
+
+static inline void pte_clear(struct mm_struct *mm, unsigned long addr,
+			     pte_t * ptep)
+{
+	pte_update(mm, addr, ptep, ~0UL, 0, 0);
+}
+
+
+/* Set the dirty and/or accessed bits atomically in a linux PTE, this
+ * function doesn't need to flush the hash entry
+ */
+static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
+{
+	unsigned long bits = pte_val(entry) &
+		(_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC);
+
+#ifdef PTE_ATOMIC_UPDATES
+	unsigned long old, tmp;
+
+	__asm__ __volatile__(
+	"1:	ldarx	%0,0,%4\n\
+		andi.	%1,%0,%6\n\
+		bne-	1b \n\
+		or	%0,%3,%0\n\
+		stdcx.	%0,0,%4\n\
+		bne-	1b"
+	:"=&r" (old), "=&r" (tmp), "=m" (*ptep)
+	:"r" (bits), "r" (ptep), "m" (*ptep), "i" (_PAGE_BUSY)
+	:"cc");
+#else
+	unsigned long old = pte_val(*ptep);
+	*ptep = __pte(old | bits);
+#endif
+}
+
+#define __HAVE_ARCH_PTE_SAME
+#define pte_same(A,B)	(((pte_val(A) ^ pte_val(B)) & ~_PAGE_HPTEFLAGS) == 0)
+
+#define pte_ERROR(e) \
+	pr_err("%s:%d: bad pte %08lx.\n", __FILE__, __LINE__, pte_val(e))
+#define pmd_ERROR(e) \
+	pr_err("%s:%d: bad pmd %08lx.\n", __FILE__, __LINE__, pmd_val(e))
+#define pgd_ERROR(e) \
+	pr_err("%s:%d: bad pgd %08lx.\n", __FILE__, __LINE__, pgd_val(e))
+
+/* Encode and de-code a swap entry */
+#define MAX_SWAPFILES_CHECK() do { \
+	BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS); \
+	/*							\
+	 * Don't have overlapping bits with _PAGE_HPTEFLAGS	\
+	 * We filter HPTEFLAGS on set_pte.			\
+	 */							\
+	BUILD_BUG_ON(_PAGE_HPTEFLAGS & (0x1f << _PAGE_BIT_SWAP_TYPE)); \
+	} while (0)
+/*
+ * On pte we don't need to handle RADIX_TREE_EXCEPTIONAL_SHIFT.
+ */
+#define SWP_TYPE_BITS 5
+#define __swp_type(x)		(((x).val >> _PAGE_BIT_SWAP_TYPE) \
+				& ((1UL << SWP_TYPE_BITS) - 1))
+#define __swp_offset(x)		((x).val >> PTE_RPN_SHIFT)
+#define __swp_entry(type, offset)	((swp_entry_t) { \
+					((type) << _PAGE_BIT_SWAP_TYPE) \
+					| ((offset) << PTE_RPN_SHIFT) })
+
+#define __pte_to_swp_entry(pte)		((swp_entry_t) { pte_val((pte)) })
+#define __swp_entry_to_pte(x)		__pte((x).val)
+
+void pgtable_cache_add(unsigned shift, void (*ctor)(void *));
+void pgtable_cache_init(void);
+#endif /* __ASSEMBLY__ */
+
+/*
+ * THP pages can't be special. So use the _PAGE_SPECIAL
+ */
+#define _PAGE_SPLITTING _PAGE_SPECIAL
+
+/*
+ * We need to differentiate between explicit huge page and THP huge
+ * page, since THP huge page also need to track real subpage details
+ */
+#define _PAGE_THP_HUGE  _PAGE_4K_PFN
+
+/*
+ * set of bits not changed in pmd_modify.
+ */
+#define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS |		\
+			 _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPLITTING | \
+			 _PAGE_THP_HUGE)
+
+#ifndef __ASSEMBLY__
+/*
+ * The linux hugepage PMD now includes the pmd entries followed by the address
+ * of the stashed pgtable_t. The stashed pgtable_t contains the hpte bits:
+ * [ 1 bit secondary | 3 bit hidx | 1 bit valid | 000]. We use one byte for
+ * each HPTE entry. With a 16MB hugepage and 64K HPTEs we need 256 entries and
+ * with 4K HPTEs we need 4096 entries. Both will fit in a 4K pgtable_t.
+ *
+ * The last three bits are intentionally left as zero. These memory locations
+ * are also used as normal page PTE pointers. So if we have any pointers
+ * left around while we collapse a hugepage, we need to make sure the
+ * _PAGE_PRESENT bit of those is zero when we look at them.
+ */
+static inline unsigned int hpte_valid(unsigned char *hpte_slot_array, int index)
+{
+	return (hpte_slot_array[index] >> 3) & 0x1;
+}
+
+static inline unsigned int hpte_hash_index(unsigned char *hpte_slot_array,
+					   int index)
+{
+	return hpte_slot_array[index] >> 4;
+}
+
+static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
+					unsigned int index, unsigned int hidx)
+{
+	hpte_slot_array[index] = hidx << 4 | 0x1 << 3;
+}
+
+struct page *realmode_pfn_to_page(unsigned long pfn);
+
+static inline char *get_hpte_slot_array(pmd_t *pmdp)
+{
+	/*
+	 * The hpte hindex is stored in the pgtable whose address is in the
+	 * second half of the PMD
+	 *
+	 * Order this load with the test for pmd_trans_huge in the caller
+	 */
+	smp_rmb();
+	return *(char **)(pmdp + PTRS_PER_PMD);
+
+
+}
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+extern void hpte_do_hugepage_flush(struct mm_struct *mm, unsigned long addr,
+				   pmd_t *pmdp, unsigned long old_pmd);
+extern pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot);
+extern pmd_t mk_pmd(struct page *page, pgprot_t pgprot);
+extern pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot);
+extern void set_pmd_at(struct mm_struct *mm, unsigned long addr,
+		       pmd_t *pmdp, pmd_t pmd);
+extern void update_mmu_cache_pmd(struct vm_area_struct *vma, unsigned long addr,
+				 pmd_t *pmd);
+/*
+ *
+ * For core kernel code, by design pmd_trans_huge is never run on any hugetlbfs
+ * page. The hugetlbfs page table walking and mangling paths are totally
+ * separated from the core VM paths and they're differentiated by
+ * VM_HUGETLB being set on vm_flags well before any pmd_trans_huge could run.
+ *
+ * pmd_trans_huge() is defined as false at build time if
+ * CONFIG_TRANSPARENT_HUGEPAGE=n, to optimize away code blocks at build
+ * time in that case.
+ *
+ * For ppc64 we need to differentiate explicit hugepages from THP, because
+ * for THP we also track the subpage details at the pmd level. We don't do
+ * that for explicit huge pages.
+ *
+ */
+static inline int pmd_trans_huge(pmd_t pmd)
+{
+	/*
+	 * leaf pte for huge page, bottom two bits != 00
+	 */
+	return (pmd_val(pmd) & 0x3) && (pmd_val(pmd) & _PAGE_THP_HUGE);
+}
+
+static inline int pmd_trans_splitting(pmd_t pmd)
+{
+	if (pmd_trans_huge(pmd))
+		return pmd_val(pmd) & _PAGE_SPLITTING;
+	return 0;
+}
+
+extern int has_transparent_hugepage(void);
+#else
+static inline void hpte_do_hugepage_flush(struct mm_struct *mm,
+					  unsigned long addr, pmd_t *pmdp,
+					  unsigned long old_pmd)
+{
+
+	WARN(1, "%s called with THP disabled\n", __func__);
+}
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
+static inline int pmd_large(pmd_t pmd)
+{
+	/*
+	 * leaf pte for huge page, bottom two bits != 00
+	 */
+	return ((pmd_val(pmd) & 0x3) != 0x0);
+}
+
+static inline pte_t pmd_pte(pmd_t pmd)
+{
+	return __pte(pmd_val(pmd));
+}
+
+static inline pmd_t pte_pmd(pte_t pte)
+{
+	return __pmd(pte_val(pte));
+}
+
+static inline pte_t *pmdp_ptep(pmd_t *pmd)
+{
+	return (pte_t *)pmd;
+}
+
+#define pmd_pfn(pmd)		pte_pfn(pmd_pte(pmd))
+#define pmd_dirty(pmd)		pte_dirty(pmd_pte(pmd))
+#define pmd_young(pmd)		pte_young(pmd_pte(pmd))
+#define pmd_mkold(pmd)		pte_pmd(pte_mkold(pmd_pte(pmd)))
+#define pmd_wrprotect(pmd)	pte_pmd(pte_wrprotect(pmd_pte(pmd)))
+#define pmd_mkdirty(pmd)	pte_pmd(pte_mkdirty(pmd_pte(pmd)))
+#define pmd_mkyoung(pmd)	pte_pmd(pte_mkyoung(pmd_pte(pmd)))
+#define pmd_mkwrite(pmd)	pte_pmd(pte_mkwrite(pmd_pte(pmd)))
+
+#define __HAVE_ARCH_PMD_WRITE
+#define pmd_write(pmd)		pte_write(pmd_pte(pmd))
+
+static inline pmd_t pmd_mkhuge(pmd_t pmd)
+{
+	/* Do nothing, mk_pmd() does this part.  */
+	return pmd;
+}
+
+static inline pmd_t pmd_mknotpresent(pmd_t pmd)
+{
+	pmd_val(pmd) &= ~_PAGE_PRESENT;
+	return pmd;
+}
+
+static inline pmd_t pmd_mksplitting(pmd_t pmd)
+{
+	pmd_val(pmd) |= _PAGE_SPLITTING;
+	return pmd;
+}
+
+#define __HAVE_ARCH_PMD_SAME
+static inline int pmd_same(pmd_t pmd_a, pmd_t pmd_b)
+{
+	return (((pmd_val(pmd_a) ^ pmd_val(pmd_b)) & ~_PAGE_HPTEFLAGS) == 0);
+}
+
+#define __HAVE_ARCH_PMDP_SET_ACCESS_FLAGS
+extern int pmdp_set_access_flags(struct vm_area_struct *vma,
+				 unsigned long address, pmd_t *pmdp,
+				 pmd_t entry, int dirty);
+
+extern unsigned long pmd_hugepage_update(struct mm_struct *mm,
+					 unsigned long addr,
+					 pmd_t *pmdp,
+					 unsigned long clr,
+					 unsigned long set);
+
+static inline int __pmdp_test_and_clear_young(struct mm_struct *mm,
+					      unsigned long addr, pmd_t *pmdp)
+{
+	unsigned long old;
+
+	if ((pmd_val(*pmdp) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
+		return 0;
+	old = pmd_hugepage_update(mm, addr, pmdp, _PAGE_ACCESSED, 0);
+	return ((old & _PAGE_ACCESSED) != 0);
+}
+
+#define __HAVE_ARCH_PMDP_TEST_AND_CLEAR_YOUNG
+extern int pmdp_test_and_clear_young(struct vm_area_struct *vma,
+				     unsigned long address, pmd_t *pmdp);
+#define __HAVE_ARCH_PMDP_CLEAR_YOUNG_FLUSH
+extern int pmdp_clear_flush_young(struct vm_area_struct *vma,
+				  unsigned long address, pmd_t *pmdp);
+
+#define __HAVE_ARCH_PMDP_HUGE_GET_AND_CLEAR
+extern pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
+				     unsigned long addr, pmd_t *pmdp);
+
+#define __HAVE_ARCH_PMDP_SET_WRPROTECT
+static inline void pmdp_set_wrprotect(struct mm_struct *mm, unsigned long addr,
+				      pmd_t *pmdp)
+{
+
+	if ((pmd_val(*pmdp) & _PAGE_RW) == 0)
+		return;
+
+	pmd_hugepage_update(mm, addr, pmdp, _PAGE_RW, 0);
+}
+
+#define __HAVE_ARCH_PMDP_SPLITTING_FLUSH
+extern void pmdp_splitting_flush(struct vm_area_struct *vma,
+				 unsigned long address, pmd_t *pmdp);
+
+extern pmd_t pmdp_collapse_flush(struct vm_area_struct *vma,
+				 unsigned long address, pmd_t *pmdp);
+#define pmdp_collapse_flush pmdp_collapse_flush
+
+#define __HAVE_ARCH_PGTABLE_DEPOSIT
+extern void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
+				       pgtable_t pgtable);
+#define __HAVE_ARCH_PGTABLE_WITHDRAW
+extern pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp);
+
+#define __HAVE_ARCH_PMDP_INVALIDATE
+extern void pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
+			    pmd_t *pmdp);
+
+#define pmd_move_must_withdraw pmd_move_must_withdraw
+struct spinlock;
+static inline int pmd_move_must_withdraw(struct spinlock *new_pmd_ptl,
+					 struct spinlock *old_pmd_ptl)
+{
+	/*
+	 * Archs like ppc64 use pgtable to store per pmd
+	 * specific information. So when we switch the pmd,
+	 * we should also withdraw and deposit the pgtable
+	 */
+	return true;
+}
+#endif /* __ASSEMBLY__ */
+#endif /* _ASM_POWERPC_PGTABLE_PPC64_H_ */
diff --git a/arch/powerpc/include/asm/book3s/pgtable.h b/arch/powerpc/include/asm/book3s/pgtable.h
new file mode 100644
index 000000000000..a8d8e5152bd4
--- /dev/null
+++ b/arch/powerpc/include/asm/book3s/pgtable.h
@@ -0,0 +1,10 @@
+#ifndef _ASM_POWERPC_BOOK3S_PGTABLE_H
+#define _ASM_POWERPC_BOOK3S_PGTABLE_H
+
+#ifdef CONFIG_PPC64
+#include <asm/book3s/64/pgtable.h>
+#else
+#include <asm/book3s/32/pgtable.h>
+#endif
+
+#endif
diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index ba3342bbdbda..7352d3f212df 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -21,7 +21,7 @@
  * need for various slices related matters. Note that this isn't the
  * complete pgtable.h but only a portion of it.
  */
-#include <asm/pgtable-ppc64.h>
+#include <asm/book3s/64/pgtable.h>
 #include <asm/bug.h>
 #include <asm/processor.h>
 
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index b64b4212b71f..c304d0767919 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -13,11 +13,15 @@ struct mm_struct;
 
 #endif /* !__ASSEMBLY__ */
 
+#ifdef CONFIG_PPC_BOOK3S
+#include <asm/book3s/pgtable.h>
+#else
 #if defined(CONFIG_PPC64)
 #  include <asm/pgtable-ppc64.h>
 #else
 #  include <asm/pgtable-ppc32.h>
 #endif
+#endif /* !CONFIG_PPC_BOOK3S */
 
 /*
  * We save the slot number & secondary bit in the second half of the
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread
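
A minimal, self-contained sketch of the one-byte HPTE slot encoding used
by the hpte_slot_array helpers in the patch above. Only the bit layout
([ 1 bit secondary | 3 bit hidx | 1 bit valid | 000]) comes from the
patch; the slot_* names are invented for illustration:

#include <assert.h>

static unsigned char slot_encode(unsigned int hidx)
{
	/* bits 7..4: secondary + hidx, bit 3: valid, bits 2..0: zero */
	return (unsigned char)(hidx << 4 | 0x1 << 3);
}

static unsigned int slot_valid(unsigned char s) { return (s >> 3) & 0x1; }
static unsigned int slot_hidx(unsigned char s)  { return s >> 4; }

int main(void)
{
	unsigned char s = slot_encode(0xb);	/* secondary=1, hidx=3 */

	assert(slot_valid(s) == 1);
	assert(slot_hidx(s) == 0xb);
	/* the low three bits stay clear, so the slot byte can never be
	 * mistaken for a present PTE while a collapse is in flight */
	assert((s & 0x7) == 0);
	return 0;
}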

* [PATCH V6 04/35] powerpc/mm: make a separate copy for book3s (part 2)
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (2 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 03/35] powerpc/mm: make a separate copy for book3s Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2016-02-18  2:54   ` Balbir Singh
  2015-12-01  3:36 ` [PATCH V6 05/35] powerpc/mm: Move hash specific pte width and other defines to book3s Aneesh Kumar K.V
                   ` (30 subsequent siblings)
  34 siblings, 1 reply; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

Keep it separate to make rebasing easier

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 6 +++---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 6 +++---
 arch/powerpc/include/asm/pgtable-ppc32.h     | 2 --
 arch/powerpc/include/asm/pgtable-ppc64.h     | 4 ----
 4 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 1a58a05be99c..418d2fa3ac7d 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PGTABLE_PPC32_H
-#define _ASM_POWERPC_PGTABLE_PPC32_H
+#ifndef _ASM_POWERPC_BOOK3S_32_PGTABLE_H
+#define _ASM_POWERPC_BOOK3S_32_PGTABLE_H
 
 #include <asm-generic/pgtable-nopmd.h>
 
@@ -337,4 +337,4 @@ extern int get_pteptr(struct mm_struct *mm, unsigned long addr, pte_t **ptep,
 
 #endif /* !__ASSEMBLY__ */
 
-#endif /* _ASM_POWERPC_PGTABLE_PPC32_H */
+#endif /*  _ASM_POWERPC_BOOK3S_32_PGTABLE_H */
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 4c61db6adcde..cdd5284d9eaa 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PGTABLE_PPC64_H_
-#define _ASM_POWERPC_PGTABLE_PPC64_H_
+#ifndef _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
+#define _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
 /*
  * This file contains the functions and defines necessary to modify and use
  * the ppc64 hashed page table.
@@ -623,4 +623,4 @@ static inline int pmd_move_must_withdraw(struct spinlock *new_pmd_ptl,
 	return true;
 }
 #endif /* __ASSEMBLY__ */
-#endif /* _ASM_POWERPC_PGTABLE_PPC64_H_ */
+#endif /* _ASM_POWERPC_BOOK3S_64_PGTABLE_H_ */
diff --git a/arch/powerpc/include/asm/pgtable-ppc32.h b/arch/powerpc/include/asm/pgtable-ppc32.h
index 1a58a05be99c..aac6547b0823 100644
--- a/arch/powerpc/include/asm/pgtable-ppc32.h
+++ b/arch/powerpc/include/asm/pgtable-ppc32.h
@@ -115,8 +115,6 @@ extern int icache_44x_need_flush;
 #include <asm/pte-fsl-booke.h>
 #elif defined(CONFIG_8xx)
 #include <asm/pte-8xx.h>
-#else /* CONFIG_6xx */
-#include <asm/book3s/32/hash.h>
 #endif
 
 /* And here we include common definitions */
diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index b36a932abdfb..1ef0fea32e1e 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -97,11 +97,7 @@
 /*
  * Include the PTE bits definitions
  */
-#ifdef CONFIG_PPC_BOOK3S
-#include <asm/book3s/64/hash.h>
-#else
 #include <asm/pte-book3e.h>
-#endif
 #include <asm/pte-common.h>
 
 #ifdef CONFIG_PPC_MM_SLICES
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 05/35] powerpc/mm: Move hash specific pte width and other defines to book3s
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (3 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 04/35] powerpc/mm: make a separate copy for book3s (part 2) Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2016-02-18  4:45   ` Balbir Singh
  2015-12-01  3:36 ` [PATCH V6 06/35] powerpc/mm: Delete booke bits from book3s Aneesh Kumar K.V
                   ` (29 subsequent siblings)
  34 siblings, 1 reply; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

This further makes a copy of the pte defines to book3s/64/hash*.h. This
removes the dependency on pgtable-ppc64-4k.h and pgtable-ppc64-64k.h

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  | 86 ++++++++++++++++++++++++++-
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 46 +++++++++++++-
 arch/powerpc/include/asm/book3s/64/pgtable.h  |  6 +-
 3 files changed, 129 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index c134e809aac3..f2c51cd61f69 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -1,4 +1,51 @@
-/* To be include by pgtable-hash64.h only */
+#ifndef _ASM_POWERPC_BOOK3S_64_HASH_4K_H
+#define _ASM_POWERPC_BOOK3S_64_HASH_4K_H
+/*
+ * Entries per page directory level.  The PTE level must use a 64b record
+ * for each page table entry.  The PMD and PGD level use a 32b record for
+ * each entry by assuming that each entry is page aligned.
+ */
+#define PTE_INDEX_SIZE  9
+#define PMD_INDEX_SIZE  7
+#define PUD_INDEX_SIZE  9
+#define PGD_INDEX_SIZE  9
+
+#ifndef __ASSEMBLY__
+#define PTE_TABLE_SIZE	(sizeof(pte_t) << PTE_INDEX_SIZE)
+#define PMD_TABLE_SIZE	(sizeof(pmd_t) << PMD_INDEX_SIZE)
+#define PUD_TABLE_SIZE	(sizeof(pud_t) << PUD_INDEX_SIZE)
+#define PGD_TABLE_SIZE	(sizeof(pgd_t) << PGD_INDEX_SIZE)
+#endif	/* __ASSEMBLY__ */
+
+#define PTRS_PER_PTE	(1 << PTE_INDEX_SIZE)
+#define PTRS_PER_PMD	(1 << PMD_INDEX_SIZE)
+#define PTRS_PER_PUD	(1 << PUD_INDEX_SIZE)
+#define PTRS_PER_PGD	(1 << PGD_INDEX_SIZE)
+
+/* PMD_SHIFT determines what a second-level page table entry can map */
+#define PMD_SHIFT	(PAGE_SHIFT + PTE_INDEX_SIZE)
+#define PMD_SIZE	(1UL << PMD_SHIFT)
+#define PMD_MASK	(~(PMD_SIZE-1))
+
+/* With 4k base page size, hugepage PTEs go at the PMD level */
+#define MIN_HUGEPTE_SHIFT	PMD_SHIFT
+
+/* PUD_SHIFT determines what a third-level page table entry can map */
+#define PUD_SHIFT	(PMD_SHIFT + PMD_INDEX_SIZE)
+#define PUD_SIZE	(1UL << PUD_SHIFT)
+#define PUD_MASK	(~(PUD_SIZE-1))
+
+/* PGDIR_SHIFT determines what a fourth-level page table entry can map */
+#define PGDIR_SHIFT	(PUD_SHIFT + PUD_INDEX_SIZE)
+#define PGDIR_SIZE	(1UL << PGDIR_SHIFT)
+#define PGDIR_MASK	(~(PGDIR_SIZE-1))
+
+/* Bits to mask out from a PMD to get to the PTE page */
+#define PMD_MASKED_BITS		0
+/* Bits to mask out from a PUD to get to the PMD page */
+#define PUD_MASKED_BITS		0
+/* Bits to mask out from a PGD to get to the PUD page */
+#define PGD_MASKED_BITS		0
 
 /* PTE bits */
 #define _PAGE_HASHPTE	0x0400 /* software: pte has an associated HPTE */
@@ -15,3 +62,40 @@
 /* shift to put page number into pte */
 #define PTE_RPN_SHIFT	(17)
 
+#ifndef __ASSEMBLY__
+/*
+ * 4-level page tables related bits
+ */
+
+#define pgd_none(pgd)		(!pgd_val(pgd))
+#define pgd_bad(pgd)		(pgd_val(pgd) == 0)
+#define pgd_present(pgd)	(pgd_val(pgd) != 0)
+#define pgd_clear(pgdp)		(pgd_val(*(pgdp)) = 0)
+#define pgd_page_vaddr(pgd)	(pgd_val(pgd) & ~PGD_MASKED_BITS)
+
+static inline pte_t pgd_pte(pgd_t pgd)
+{
+	return __pte(pgd_val(pgd));
+}
+
+static inline pgd_t pte_pgd(pte_t pte)
+{
+	return __pgd(pte_val(pte));
+}
+extern struct page *pgd_page(pgd_t pgd);
+
+#define pud_offset(pgdp, addr)	\
+  (((pud_t *) pgd_page_vaddr(*(pgdp))) + \
+    (((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1)))
+
+#define pud_ERROR(e) \
+	pr_err("%s:%d: bad pud %08lx.\n", __FILE__, __LINE__, pud_val(e))
+
+/*
+ * On all 4K setups, remap_4k_pfn() equates to remap_pfn_range() */
+#define remap_4k_pfn(vma, addr, pfn, prot)	\
+	remap_pfn_range((vma), (addr), (pfn), PAGE_SIZE, (prot))
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* _ASM_POWERPC_BOOK3S_64_HASH_4K_H */
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 4f4ec2ab45c9..ee073822145d 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -1,4 +1,35 @@
-/* To be include by pgtable-hash64.h only */
+#ifndef _ASM_POWERPC_BOOK3S_64_HASH_64K_H
+#define _ASM_POWERPC_BOOK3S_64_HASH_64K_H
+
+#include <asm-generic/pgtable-nopud.h>
+
+#define PTE_INDEX_SIZE  8
+#define PMD_INDEX_SIZE  10
+#define PUD_INDEX_SIZE	0
+#define PGD_INDEX_SIZE  12
+
+#define PTRS_PER_PTE	(1 << PTE_INDEX_SIZE)
+#define PTRS_PER_PMD	(1 << PMD_INDEX_SIZE)
+#define PTRS_PER_PGD	(1 << PGD_INDEX_SIZE)
+
+/* With 64k base page size, hugepage PTEs go at the PMD level */
+#define MIN_HUGEPTE_SHIFT	PAGE_SHIFT
+
+/* PMD_SHIFT determines what a second-level page table entry can map */
+#define PMD_SHIFT	(PAGE_SHIFT + PTE_INDEX_SIZE)
+#define PMD_SIZE	(1UL << PMD_SHIFT)
+#define PMD_MASK	(~(PMD_SIZE-1))
+
+/* PGDIR_SHIFT determines what a third-level page table entry can map */
+#define PGDIR_SHIFT	(PMD_SHIFT + PMD_INDEX_SIZE)
+#define PGDIR_SIZE	(1UL << PGDIR_SHIFT)
+#define PGDIR_MASK	(~(PGDIR_SIZE-1))
+
+/* Bits to mask out from a PMD to get to the PTE page */
+/* PMDs point to PTE table fragments which are 4K aligned.  */
+#define PMD_MASKED_BITS		0xfff
+/* Bits to mask out from a PGD/PUD to get to the PMD page */
+#define PUD_MASKED_BITS		0x1ff
 
 /* Additional PTE bits (don't change without checking asm in hash_low.S) */
 #define _PAGE_SPECIAL	0x00000400 /* software: special page */
@@ -74,8 +105,8 @@ static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
 #define __rpte_to_pte(r)	((r).pte)
 #define __rpte_sub_valid(rpte, index) \
 	(pte_val(rpte.pte) & (_PAGE_HPTE_SUB0 >> (index)))
-
-/* Trick: we set __end to va + 64k, which happens works for
+/*
+ * Trick: we set __end to va + 64k, which happens to work for
  * a 16M page as well as we want only one iteration
  */
 #define pte_iterate_hashed_subpages(rpte, psize, vpn, index, shift)	\
@@ -99,4 +130,13 @@ static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
 		remap_pfn_range((vma), (addr), (pfn), PAGE_SIZE,	\
 			__pgprot(pgprot_val((prot)) | _PAGE_4K_PFN)))
 
+#define PTE_TABLE_SIZE	(sizeof(real_pte_t) << PTE_INDEX_SIZE)
+#define PMD_TABLE_SIZE	(sizeof(pmd_t) << PMD_INDEX_SIZE)
+#define PGD_TABLE_SIZE	(sizeof(pgd_t) << PGD_INDEX_SIZE)
+
+#define pgd_pte(pgd)	(pud_pte(((pud_t){ pgd })))
+#define pte_pgd(pte)	((pgd_t)pte_pud(pte))
+
 #endif	/* __ASSEMBLY__ */
+
+#endif /* _ASM_POWERPC_BOOK3S_64_HASH_64K_H */
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index cdd5284d9eaa..2741ac6fbd3d 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -5,11 +5,7 @@
  * the ppc64 hashed page table.
  */
 
-#ifdef CONFIG_PPC_64K_PAGES
-#include <asm/pgtable-ppc64-64k.h>
-#else
-#include <asm/pgtable-ppc64-4k.h>
-#endif
+#include <asm/book3s/64/hash.h>
 #include <asm/barrier.h>
 
 #define FIRST_USER_ADDRESS	0UL
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread
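
For readers checking the arithmetic on the index sizes above, here is a
short sketch, assuming 4K base pages and the hash-4k.h values from this
patch, of how the per-level shifts compose. This is not kernel code,
just the math spelled out:

#include <stdio.h>

#define PAGE_SHIFT	12	/* 4K base pages */
#define PTE_INDEX_SIZE	9
#define PMD_INDEX_SIZE	7
#define PUD_INDEX_SIZE	9
#define PGD_INDEX_SIZE	9

int main(void)
{
	int pmd_shift = PAGE_SHIFT + PTE_INDEX_SIZE;	/* 21: 2M per PMD entry */
	int pud_shift = pmd_shift + PMD_INDEX_SIZE;	/* 28: 256M per PUD entry */
	int pgd_shift = pud_shift + PUD_INDEX_SIZE;	/* 37: 128G per PGD entry */
	int ea_bits   = pgd_shift + PGD_INDEX_SIZE;	/* 46 bits of EA mapped */

	printf("PMD_SHIFT=%d PUD_SHIFT=%d PGDIR_SHIFT=%d EA bits=%d\n",
	       pmd_shift, pud_shift, pgd_shift, ea_bits);
	return 0;
}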

* [PATCH V6 06/35] powerpc/mm: Delete booke bits from book3s
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (4 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 05/35] powerpc/mm: Move hash specific pte width and other defines to book3s Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 07/35] powerpc/mm: Don't have generic headers introduce functions touching pte bits Aneesh Kumar K.V
                   ` (28 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

We also move __ASSEMBLY__ towards the end of the header. This avoids
having #ifndef __ASSEMBLY__ all over the header

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 93 +++++++---------------------
 arch/powerpc/include/asm/book3s/64/pgtable.h | 86 +++++++------------------
 arch/powerpc/include/asm/book3s/pgtable.h    |  1 +
 3 files changed, 49 insertions(+), 131 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 418d2fa3ac7d..5438c0b6aeec 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -3,18 +3,10 @@
 
 #include <asm-generic/pgtable-nopmd.h>
 
-#ifndef __ASSEMBLY__
-#include <linux/sched.h>
-#include <linux/threads.h>
-#include <asm/io.h>			/* For sub-arch specific PPC_PIN_SIZE */
-
-extern unsigned long ioremap_bot;
-
-#ifdef CONFIG_44x
-extern int icache_44x_need_flush;
-#endif
+#include <asm/book3s/32/hash.h>
 
-#endif /* __ASSEMBLY__ */
+/* And here we include common definitions */
+#include <asm/pte-common.h>
 
 /*
  * The normal case is that PTEs are 32-bits and we have a 1-page
@@ -31,28 +23,11 @@ extern int icache_44x_need_flush;
 #define PGDIR_SIZE	(1UL << PGDIR_SHIFT)
 #define PGDIR_MASK	(~(PGDIR_SIZE-1))
 
-/*
- * entries per page directory level: our page-table tree is two-level, so
- * we don't really have any PMD directory.
- */
-#ifndef __ASSEMBLY__
-#define PTE_TABLE_SIZE	(sizeof(pte_t) << PTE_SHIFT)
-#define PGD_TABLE_SIZE	(sizeof(pgd_t) << (32 - PGDIR_SHIFT))
-#endif	/* __ASSEMBLY__ */
-
 #define PTRS_PER_PTE	(1 << PTE_SHIFT)
 #define PTRS_PER_PMD	1
 #define PTRS_PER_PGD	(1 << (32 - PGDIR_SHIFT))
 
 #define USER_PTRS_PER_PGD	(TASK_SIZE / PGDIR_SIZE)
-#define FIRST_USER_ADDRESS	0UL
-
-#define pte_ERROR(e) \
-	pr_err("%s:%d: bad pte %llx.\n", __FILE__, __LINE__, \
-		(unsigned long long)pte_val(e))
-#define pgd_ERROR(e) \
-	pr_err("%s:%d: bad pgd %08lx.\n", __FILE__, __LINE__, pgd_val(e))
-
 /*
  * This is the bottom of the PKMAP area with HIGHMEM or an arbitrary
  * value (for now) on others, from where we can start layout kernel
@@ -100,30 +75,30 @@ extern int icache_44x_need_flush;
 #endif
 #define VMALLOC_END	ioremap_bot
 
+#ifndef __ASSEMBLY__
+#include <linux/sched.h>
+#include <linux/threads.h>
+#include <asm/io.h>			/* For sub-arch specific PPC_PIN_SIZE */
+
+extern unsigned long ioremap_bot;
+
+/*
+ * entries per page directory level: our page-table tree is two-level, so
+ * we don't really have any PMD directory.
+ */
+#define PTE_TABLE_SIZE	(sizeof(pte_t) << PTE_SHIFT)
+#define PGD_TABLE_SIZE	(sizeof(pgd_t) << (32 - PGDIR_SHIFT))
+
+#define pte_ERROR(e) \
+	pr_err("%s:%d: bad pte %llx.\n", __FILE__, __LINE__, \
+		(unsigned long long)pte_val(e))
+#define pgd_ERROR(e) \
+	pr_err("%s:%d: bad pgd %08lx.\n", __FILE__, __LINE__, pgd_val(e))
 /*
  * Bits in a linux-style PTE.  These match the bits in the
  * (hardware-defined) PowerPC PTE as closely as possible.
  */
 
-#if defined(CONFIG_40x)
-#include <asm/pte-40x.h>
-#elif defined(CONFIG_44x)
-#include <asm/pte-44x.h>
-#elif defined(CONFIG_FSL_BOOKE) && defined(CONFIG_PTE_64BIT)
-#include <asm/pte-book3e.h>
-#elif defined(CONFIG_FSL_BOOKE)
-#include <asm/pte-fsl-booke.h>
-#elif defined(CONFIG_8xx)
-#include <asm/pte-8xx.h>
-#else /* CONFIG_6xx */
-#include <asm/book3s/32/hash.h>
-#endif
-
-/* And here we include common definitions */
-#include <asm/pte-common.h>
-
-#ifndef __ASSEMBLY__
-
 #define pte_clear(mm, addr, ptep) \
 	do { pte_update(ptep, ~_PAGE_HASHPTE, 0); } while (0)
 
@@ -167,7 +142,6 @@ static inline unsigned long pte_update(pte_t *p,
 				       unsigned long clr,
 				       unsigned long set)
 {
-#ifdef PTE_ATOMIC_UPDATES
 	unsigned long old, tmp;
 
 	__asm__ __volatile__("\
@@ -180,15 +154,7 @@ static inline unsigned long pte_update(pte_t *p,
 	: "=&r" (old), "=&r" (tmp), "=m" (*p)
 	: "r" (p), "r" (clr), "r" (set), "m" (*p)
 	: "cc" );
-#else /* PTE_ATOMIC_UPDATES */
-	unsigned long old = pte_val(*p);
-	*p = __pte((old & ~clr) | set);
-#endif /* !PTE_ATOMIC_UPDATES */
-
-#ifdef CONFIG_44x
-	if ((old & _PAGE_USER) && (old & _PAGE_EXEC))
-		icache_44x_need_flush = 1;
-#endif
+
 	return old;
 }
 #else /* CONFIG_PTE_64BIT */
@@ -196,7 +162,6 @@ static inline unsigned long long pte_update(pte_t *p,
 					    unsigned long clr,
 					    unsigned long set)
 {
-#ifdef PTE_ATOMIC_UPDATES
 	unsigned long long old;
 	unsigned long tmp;
 
@@ -211,15 +176,7 @@ static inline unsigned long long pte_update(pte_t *p,
 	: "=&r" (old), "=&r" (tmp), "=m" (*p)
 	: "r" (p), "r" ((unsigned long)(p) + 4), "r" (clr), "r" (set), "m" (*p)
 	: "cc" );
-#else /* PTE_ATOMIC_UPDATES */
-	unsigned long long old = pte_val(*p);
-	*p = __pte((old & ~(unsigned long long)clr) | set);
-#endif /* !PTE_ATOMIC_UPDATES */
-
-#ifdef CONFIG_44x
-	if ((old & _PAGE_USER) && (old & _PAGE_EXEC))
-		icache_44x_need_flush = 1;
-#endif
+
 	return old;
 }
 #endif /* CONFIG_PTE_64BIT */
@@ -233,12 +190,10 @@ static inline int __ptep_test_and_clear_young(unsigned int context, unsigned lon
 {
 	unsigned long old;
 	old = pte_update(ptep, _PAGE_ACCESSED, 0);
-#if _PAGE_HASHPTE != 0
 	if (old & _PAGE_HASHPTE) {
 		unsigned long ptephys = __pa(ptep) & PAGE_MASK;
 		flush_hash_pages(context, addr, ptephys, 1);
 	}
-#endif
 	return (old & _PAGE_ACCESSED) != 0;
 }
 #define ptep_test_and_clear_young(__vma, __addr, __ptep) \
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 2741ac6fbd3d..ddc08bf22709 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -8,7 +8,6 @@
 #include <asm/book3s/64/hash.h>
 #include <asm/barrier.h>
 
-#define FIRST_USER_ADDRESS	0UL
 
 /*
  * Size of EA range mapped by our pagetables.
@@ -25,27 +24,16 @@
 /*
  * Define the address range of the kernel non-linear virtual area
  */
-
-#ifdef CONFIG_PPC_BOOK3E
-#define KERN_VIRT_START ASM_CONST(0x8000000000000000)
-#else
 #define KERN_VIRT_START ASM_CONST(0xD000000000000000)
-#endif
 #define KERN_VIRT_SIZE	ASM_CONST(0x0000100000000000)
-
 /*
  * The vmalloc space starts at the beginning of that region, and
  * occupies half of it on hash CPUs and a quarter of it on Book3E
  * (we keep a quarter for the virtual memmap)
  */
 #define VMALLOC_START	KERN_VIRT_START
-#ifdef CONFIG_PPC_BOOK3E
-#define VMALLOC_SIZE	(KERN_VIRT_SIZE >> 2)
-#else
 #define VMALLOC_SIZE	(KERN_VIRT_SIZE >> 1)
-#endif
 #define VMALLOC_END	(VMALLOC_START + VMALLOC_SIZE)
-
 /*
  * The second half of the kernel virtual space is used for IO mappings,
  * it's itself carved into the PIO region (ISA and PHB IO space) and
@@ -64,7 +52,6 @@
 #define IOREMAP_BASE	(PHB_IO_END)
 #define IOREMAP_END	(KERN_VIRT_START + KERN_VIRT_SIZE)
 
-
 /*
  * Region IDs
  */
@@ -79,32 +66,39 @@
 
 /*
  * Defines the address of the vmemap area, in its own region on
- * hash table CPUs and after the vmalloc space on Book3E
+ * hash table CPUs.
  */
-#ifdef CONFIG_PPC_BOOK3E
-#define VMEMMAP_BASE		VMALLOC_END
-#define VMEMMAP_END		KERN_IO_START
-#else
 #define VMEMMAP_BASE		(VMEMMAP_REGION_ID << REGION_SHIFT)
-#endif
 #define vmemmap			((struct page *)VMEMMAP_BASE)
 
 
-/*
- * Include the PTE bits definitions
- */
-#ifdef CONFIG_PPC_BOOK3S
-#include <asm/book3s/64/hash.h>
-#else
-#include <asm/pte-book3e.h>
-#endif
-#include <asm/pte-common.h>
-
 #ifdef CONFIG_PPC_MM_SLICES
 #define HAVE_ARCH_UNMAPPED_AREA
 #define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
 #endif /* CONFIG_PPC_MM_SLICES */
 
+/*
+ * THP pages can't be special. So use the _PAGE_SPECIAL
+ */
+#define _PAGE_SPLITTING _PAGE_SPECIAL
+
+/*
+ * We need to differentiate between explicit huge page and THP huge
+ * page, since THP huge page also need to track real subpage details
+ */
+#define _PAGE_THP_HUGE  _PAGE_4K_PFN
+
+/*
+ * set of bits not changed in pmd_modify.
+ */
+#define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS |		\
+			 _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPLITTING | \
+			 _PAGE_THP_HUGE)
+/*
+ * Default defines for things which we don't use.
+ * We should get this removed.
+ */
+#include <asm/pte-common.h>
 #ifndef __ASSEMBLY__
 
 /*
@@ -144,7 +138,7 @@
 #define PMD_BAD_BITS		(PTE_TABLE_SIZE-1)
 #define PUD_BAD_BITS		(PMD_TABLE_SIZE-1)
 
-#define pmd_set(pmdp, pmdval) 	(pmd_val(*(pmdp)) = (pmdval))
+#define pmd_set(pmdp, pmdval)	(pmd_val(*(pmdp)) = (pmdval))
 #define pmd_none(pmd)		(!pmd_val(pmd))
 #define	pmd_bad(pmd)		(!is_kernel_addr(pmd_val(pmd)) \
 				 || (pmd_val(pmd) & PMD_BAD_BITS))
@@ -206,7 +200,6 @@ static inline unsigned long pte_update(struct mm_struct *mm,
 				       unsigned long set,
 				       int huge)
 {
-#ifdef PTE_ATOMIC_UPDATES
 	unsigned long old, tmp;
 
 	__asm__ __volatile__(
@@ -220,18 +213,12 @@ static inline unsigned long pte_update(struct mm_struct *mm,
 	: "=&r" (old), "=&r" (tmp), "=m" (*ptep)
 	: "r" (ptep), "r" (clr), "m" (*ptep), "i" (_PAGE_BUSY), "r" (set)
 	: "cc" );
-#else
-	unsigned long old = pte_val(*ptep);
-	*ptep = __pte((old & ~clr) | set);
-#endif
 	/* huge pages use the old page table lock */
 	if (!huge)
 		assert_pte_locked(mm, addr);
 
-#ifdef CONFIG_PPC_STD_MMU_64
 	if (old & _PAGE_HASHPTE)
 		hpte_need_flush(mm, addr, ptep, old, huge);
-#endif
 
 	return old;
 }
@@ -313,7 +300,6 @@ static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
 	unsigned long bits = pte_val(entry) &
 		(_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC);
 
-#ifdef PTE_ATOMIC_UPDATES
 	unsigned long old, tmp;
 
 	__asm__ __volatile__(
@@ -326,10 +312,6 @@ static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
 	:"=&r" (old), "=&r" (tmp), "=m" (*ptep)
 	:"r" (bits), "r" (ptep), "m" (*ptep), "i" (_PAGE_BUSY)
 	:"cc");
-#else
-	unsigned long old = pte_val(*ptep);
-	*ptep = __pte(old | bits);
-#endif
 }
 
 #define __HAVE_ARCH_PTE_SAME
@@ -367,28 +349,8 @@ static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
 
 void pgtable_cache_add(unsigned shift, void (*ctor)(void *));
 void pgtable_cache_init(void);
-#endif /* __ASSEMBLY__ */
 
 /*
- * THP pages can't be special. So use the _PAGE_SPECIAL
- */
-#define _PAGE_SPLITTING _PAGE_SPECIAL
-
-/*
- * We need to differentiate between explicit huge page and THP huge
- * page, since THP huge page also need to track real subpage details
- */
-#define _PAGE_THP_HUGE  _PAGE_4K_PFN
-
-/*
- * set of bits not changed in pmd_modify.
- */
-#define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS |		\
-			 _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPLITTING | \
-			 _PAGE_THP_HUGE)
-
-#ifndef __ASSEMBLY__
-/*
  * The linux hugepage PMD now include the pmd entries followed by the address
  * to the stashed pgtable_t. The stashed pgtable_t contains the hpte bits.
  * [ 1 bit secondary | 3 bit hidx | 1 bit valid | 000]. We use one byte per
diff --git a/arch/powerpc/include/asm/book3s/pgtable.h b/arch/powerpc/include/asm/book3s/pgtable.h
index a8d8e5152bd4..3818cc7bc9b7 100644
--- a/arch/powerpc/include/asm/book3s/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/pgtable.h
@@ -7,4 +7,5 @@
 #include <asm/book3s/32/pgtable.h>
 #endif
 
+#define FIRST_USER_ADDRESS	0UL
 #endif
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread
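
The ldarx/stdcx. loop that this patch keeps as the only pte_update
implementation is easier to follow as plain C. Below is a rough
equivalent assuming C11 atomics; the bit values are illustrative only,
the real ones live in the hash headers:

#include <assert.h>
#include <stdatomic.h>

#define _PAGE_BUSY	0x0800UL	/* illustrative value only */
#define _PAGE_RW	0x0004UL	/* illustrative value only */

/* Spin while the hash code holds _PAGE_BUSY, then atomically install
 * (old & ~clr) | set and hand back the previous PTE value. */
static unsigned long pte_update_sketch(_Atomic unsigned long *ptep,
				       unsigned long clr, unsigned long set)
{
	unsigned long old;

	for (;;) {
		old = atomic_load(ptep);
		if (old & _PAGE_BUSY)
			continue;	/* busy: retry, like bne- 1b */
		if (atomic_compare_exchange_weak(ptep, &old,
						 (old & ~clr) | set))
			break;		/* store succeeded */
	}
	return old;
}

int main(void)
{
	_Atomic unsigned long pte = 0x1000UL | _PAGE_RW;
	unsigned long old = pte_update_sketch(&pte, _PAGE_RW, 0);

	assert(old == (0x1000UL | _PAGE_RW));
	assert(atomic_load(&pte) == 0x1000UL);	/* write bit cleared */
	return 0;
}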

* [PATCH V6 07/35] powerpc/mm: Don't have generic headers introduce functions touching pte bits
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (5 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 06/35] powerpc/mm: Delete booke bits from book3s Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2016-02-18  4:58   ` Balbir Singh
  2015-12-01  3:36 ` [PATCH V6 08/35] powerpc/mm: Drop pte-common.h from BOOK3S 64 Aneesh Kumar K.V
                   ` (27 subsequent siblings)
  34 siblings, 1 reply; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

We are going to drop pte-common.h in a later patch. The idea is that the
hash code should not be required to define all PTE bits. Having the PTE
bits defined in pte-common.h made the code unnecessarily complex.

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/pgtable.h          | 176 +++++++++++++++++++
 .../include/asm/{pgtable.h => pgtable-book3e.h}    |  96 +----------
 arch/powerpc/include/asm/pgtable.h                 | 192 +--------------------
 3 files changed, 185 insertions(+), 279 deletions(-)
 copy arch/powerpc/include/asm/{pgtable.h => pgtable-book3e.h} (70%)

diff --git a/arch/powerpc/include/asm/book3s/pgtable.h b/arch/powerpc/include/asm/book3s/pgtable.h
index 3818cc7bc9b7..fa270cfcf30a 100644
--- a/arch/powerpc/include/asm/book3s/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/pgtable.h
@@ -8,4 +8,180 @@
 #endif
 
 #define FIRST_USER_ADDRESS	0UL
+#ifndef __ASSEMBLY__
+
+/* Generic accessors to PTE bits */
+static inline int pte_write(pte_t pte)
+{
+	return (pte_val(pte) & (_PAGE_RW | _PAGE_RO)) != _PAGE_RO;
+}
+static inline int pte_dirty(pte_t pte)		{ return pte_val(pte) & _PAGE_DIRTY; }
+static inline int pte_young(pte_t pte)		{ return pte_val(pte) & _PAGE_ACCESSED; }
+static inline int pte_special(pte_t pte)	{ return pte_val(pte) & _PAGE_SPECIAL; }
+static inline int pte_none(pte_t pte)		{ return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; }
+static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PAGE_PROT_BITS); }
+
+#ifdef CONFIG_NUMA_BALANCING
+/*
+ * These work without NUMA balancing but the kernel does not care. See the
+ * comment in include/asm-generic/pgtable.h . On powerpc, this will only
+ * work for user pages and always return true for kernel pages.
+ */
+static inline int pte_protnone(pte_t pte)
+{
+	return (pte_val(pte) &
+		(_PAGE_PRESENT | _PAGE_USER)) == _PAGE_PRESENT;
+}
+
+static inline int pmd_protnone(pmd_t pmd)
+{
+	return pte_protnone(pmd_pte(pmd));
+}
+#endif /* CONFIG_NUMA_BALANCING */
+
+static inline int pte_present(pte_t pte)
+{
+	return pte_val(pte) & _PAGE_PRESENT;
+}
+
+/* Conversion functions: convert a page and protection to a page entry,
+ * and a page entry and page directory to the page they refer to.
+ *
+ * Even if PTEs can be unsigned long long, a PFN is always an unsigned
+ * long for now.
+ */
+static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot) {
+	return __pte(((pte_basic_t)(pfn) << PTE_RPN_SHIFT) |
+		     pgprot_val(pgprot)); }
+static inline unsigned long pte_pfn(pte_t pte)	{
+	return pte_val(pte) >> PTE_RPN_SHIFT; }
+
+/* Generic modifiers for PTE bits */
+static inline pte_t pte_wrprotect(pte_t pte) {
+	pte_val(pte) &= ~(_PAGE_RW | _PAGE_HWWRITE);
+	pte_val(pte) |= _PAGE_RO; return pte; }
+static inline pte_t pte_mkclean(pte_t pte) {
+	pte_val(pte) &= ~(_PAGE_DIRTY | _PAGE_HWWRITE); return pte; }
+static inline pte_t pte_mkold(pte_t pte) {
+	pte_val(pte) &= ~_PAGE_ACCESSED; return pte; }
+static inline pte_t pte_mkwrite(pte_t pte) {
+	pte_val(pte) &= ~_PAGE_RO;
+	pte_val(pte) |= _PAGE_RW; return pte; }
+static inline pte_t pte_mkdirty(pte_t pte) {
+	pte_val(pte) |= _PAGE_DIRTY; return pte; }
+static inline pte_t pte_mkyoung(pte_t pte) {
+	pte_val(pte) |= _PAGE_ACCESSED; return pte; }
+static inline pte_t pte_mkspecial(pte_t pte) {
+	pte_val(pte) |= _PAGE_SPECIAL; return pte; }
+static inline pte_t pte_mkhuge(pte_t pte) {
+	return pte; }
+static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
+{
+	pte_val(pte) = (pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot);
+	return pte;
+}
+
+
+/* Insert a PTE, top-level function is out of line. It uses an inline
+ * low level function in the respective pgtable-* files
+ */
+extern void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
+		       pte_t pte);
+
+/* This low level function performs the actual PTE insertion
+ * Setting the PTE depends on the MMU type and other factors. It's
+ * an horrible mess that I'm not going to try to clean up now but
+ * I'm keeping it in one place rather than spread around
+ */
+static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
+				pte_t *ptep, pte_t pte, int percpu)
+{
+#if defined(CONFIG_PPC_STD_MMU_32) && defined(CONFIG_SMP) && !defined(CONFIG_PTE_64BIT)
+	/* First case is 32-bit Hash MMU in SMP mode with 32-bit PTEs. We use the
+	 * helper pte_update() which does an atomic update. We need to do that
+	 * because a concurrent invalidation can clear _PAGE_HASHPTE. If it's a
+	 * per-CPU PTE such as a kmap_atomic, we do a simple update preserving
+	 * the hash bits instead (ie, same as the non-SMP case)
+	 */
+	if (percpu)
+		*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
+			      | (pte_val(pte) & ~_PAGE_HASHPTE));
+	else
+		pte_update(ptep, ~_PAGE_HASHPTE, pte_val(pte));
+
+#elif defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT)
+	/* Second case is 32-bit with 64-bit PTE.  In this case, we
+	 * can just store as long as we do the two halves in the right order
+	 * with a barrier in between. This is possible because we take care,
+	 * in the hash code, to pre-invalidate if the PTE was already hashed,
+	 * which synchronizes us with any concurrent invalidation.
+	 * In the percpu case, we also fallback to the simple update preserving
+	 * the hash bits
+	 */
+	if (percpu) {
+		*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
+			      | (pte_val(pte) & ~_PAGE_HASHPTE));
+		return;
+	}
+	if (pte_val(*ptep) & _PAGE_HASHPTE)
+		flush_hash_entry(mm, ptep, addr);
+	__asm__ __volatile__("\
+		stw%U0%X0 %2,%0\n\
+		eieio\n\
+		stw%U0%X0 %L2,%1"
+	: "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
+	: "r" (pte) : "memory");
+
+#elif defined(CONFIG_PPC_STD_MMU_32)
+	/* Third case is 32-bit hash table in UP mode, we need to preserve
+	 * the _PAGE_HASHPTE bit since we may not have invalidated the previous
+	 * translation in the hash yet (done in a subsequent flush_tlb_xxx())
+	 * and so we need to keep track that this PTE needs invalidating
+	 */
+	*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
+		      | (pte_val(pte) & ~_PAGE_HASHPTE));
+
+#else
+	/* Anything else just stores the PTE normally. That covers all 64-bit
+	 * cases, and 32-bit non-hash with 32-bit PTEs.
+	 */
+	*ptep = pte;
+#endif
+}
+
+
+#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
+extern int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address,
+				 pte_t *ptep, pte_t entry, int dirty);
+
+/*
+ * Macro to mark a page protection value as "uncacheable".
+ */
+
+#define _PAGE_CACHE_CTL	(_PAGE_COHERENT | _PAGE_GUARDED | _PAGE_NO_CACHE | \
+			 _PAGE_WRITETHRU)
+
+#define pgprot_noncached(prot)	  (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
+				            _PAGE_NO_CACHE | _PAGE_GUARDED))
+
+#define pgprot_noncached_wc(prot) (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
+				            _PAGE_NO_CACHE))
+
+#define pgprot_cached(prot)       (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
+				            _PAGE_COHERENT))
+
+#define pgprot_cached_wthru(prot) (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
+				            _PAGE_COHERENT | _PAGE_WRITETHRU))
+
+#define pgprot_cached_noncoherent(prot) \
+		(__pgprot(pgprot_val(prot) & ~_PAGE_CACHE_CTL))
+
+#define pgprot_writecombine pgprot_noncached_wc
+
+struct file;
+extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
+				     unsigned long size, pgprot_t vma_prot);
+#define __HAVE_PHYS_MEM_ACCESS_PROT
+
+#endif /* __ASSEMBLY__ */
 #endif
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable-book3e.h
similarity index 70%
copy from arch/powerpc/include/asm/pgtable.h
copy to arch/powerpc/include/asm/pgtable-book3e.h
index c304d0767919..a3221cff2e31 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable-book3e.h
@@ -1,41 +1,19 @@
-#ifndef _ASM_POWERPC_PGTABLE_H
-#define _ASM_POWERPC_PGTABLE_H
-#ifdef __KERNEL__
+#ifndef _ASM_POWERPC_PGTABLE_BOOK3E_H
+#define _ASM_POWERPC_PGTABLE_BOOK3E_H
 
-#ifndef __ASSEMBLY__
-#include <linux/mmdebug.h>
-#include <linux/mmzone.h>
-#include <asm/processor.h>		/* For TASK_SIZE */
-#include <asm/mmu.h>
-#include <asm/page.h>
-
-struct mm_struct;
-
-#endif /* !__ASSEMBLY__ */
-
-#ifdef CONFIG_PPC_BOOK3S
-#include <asm/book3s/pgtable.h>
-#else
 #if defined(CONFIG_PPC64)
-#  include <asm/pgtable-ppc64.h>
+#include <asm/pgtable-ppc64.h>
 #else
-#  include <asm/pgtable-ppc32.h>
+#include <asm/pgtable-ppc32.h>
 #endif
-#endif /* !CONFIG_PPC_BOOK3S */
-
-/*
- * We save the slot number & secondary bit in the second half of the
- * PTE page. We use the 8 bytes per each pte entry.
- */
-#define PTE_PAGE_HIDX_OFFSET (PTRS_PER_PTE * 8)
 
 #ifndef __ASSEMBLY__
 
-#include <asm/tlbflush.h>
-
 /* Generic accessors to PTE bits */
 static inline int pte_write(pte_t pte)
-{	return (pte_val(pte) & (_PAGE_RW | _PAGE_RO)) != _PAGE_RO; }
+{
+	return (pte_val(pte) & (_PAGE_RW | _PAGE_RO)) != _PAGE_RO;
+}
 static inline int pte_dirty(pte_t pte)		{ return pte_val(pte) & _PAGE_DIRTY; }
 static inline int pte_young(pte_t pte)		{ return pte_val(pte) & _PAGE_ACCESSED; }
 static inline int pte_special(pte_t pte)	{ return pte_val(pte) & _PAGE_SPECIAL; }
@@ -77,10 +55,6 @@ static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot) {
 static inline unsigned long pte_pfn(pte_t pte)	{
 	return pte_val(pte) >> PTE_RPN_SHIFT; }
 
-/* Keep these as a macros to avoid include dependency mess */
-#define pte_page(x)		pfn_to_page(pte_pfn(x))
-#define mk_pte(page, pgprot)	pfn_pte(page_to_pfn(page), (pgprot))
-
 /* Generic modifiers for PTE bits */
 static inline pte_t pte_wrprotect(pte_t pte) {
 	pte_val(pte) &= ~(_PAGE_RW | _PAGE_HWWRITE);
@@ -221,59 +195,5 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 				     unsigned long size, pgprot_t vma_prot);
 #define __HAVE_PHYS_MEM_ACCESS_PROT
 
-/*
- * ZERO_PAGE is a global shared page that is always zero: used
- * for zero-mapped memory areas etc..
- */
-extern unsigned long empty_zero_page[];
-#define ZERO_PAGE(vaddr) (virt_to_page(empty_zero_page))
-
-extern pgd_t swapper_pg_dir[];
-
-void limit_zone_pfn(enum zone_type zone, unsigned long max_pfn);
-int dma_pfn_limit_to_zone(u64 pfn_limit);
-extern void paging_init(void);
-
-/*
- * kern_addr_valid is intended to indicate whether an address is a valid
- * kernel address.  Most 32-bit archs define it as always true (like this)
- * but most 64-bit archs actually perform a test.  What should we do here?
- */
-#define kern_addr_valid(addr)	(1)
-
-#include <asm-generic/pgtable.h>
-
-
-/*
- * This gets called at the end of handling a page fault, when
- * the kernel has put a new PTE into the page table for the process.
- * We use it to ensure coherency between the i-cache and d-cache
- * for the page which has just been mapped in.
- * On machines which use an MMU hash table, we use this to put a
- * corresponding HPTE into the hash table ahead of time, instead of
- * waiting for the inevitable extra hash-table miss exception.
- */
-extern void update_mmu_cache(struct vm_area_struct *, unsigned long, pte_t *);
-
-extern int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
-		       unsigned long end, int write,
-		       struct page **pages, int *nr);
-#ifndef CONFIG_TRANSPARENT_HUGEPAGE
-#define pmd_large(pmd)		0
-#define has_transparent_hugepage() 0
-#endif
-pte_t *__find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
-				   bool *is_thp, unsigned *shift);
-static inline pte_t *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
-					       bool *is_thp, unsigned *shift)
-{
-	if (!arch_irqs_disabled()) {
-		pr_info("%s called with irq enabled\n", __func__);
-		dump_stack();
-	}
-	return __find_linux_pte_or_hugepte(pgdir, ea, is_thp, shift);
-}
 #endif /* __ASSEMBLY__ */
-
-#endif /* __KERNEL__ */
-#endif /* _ASM_POWERPC_PGTABLE_H */
+#endif
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index c304d0767919..a27b8cef51d7 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -1,6 +1,5 @@
 #ifndef _ASM_POWERPC_PGTABLE_H
 #define _ASM_POWERPC_PGTABLE_H
-#ifdef __KERNEL__
 
 #ifndef __ASSEMBLY__
 #include <linux/mmdebug.h>
@@ -16,11 +15,7 @@ struct mm_struct;
 #ifdef CONFIG_PPC_BOOK3S
 #include <asm/book3s/pgtable.h>
 #else
-#if defined(CONFIG_PPC64)
-#  include <asm/pgtable-ppc64.h>
-#else
-#  include <asm/pgtable-ppc32.h>
-#endif
+#include <asm/pgtable-book3e.h>
 #endif /* !CONFIG_PPC_BOOK3S */
 
 /*
@@ -33,194 +28,10 @@ struct mm_struct;
 
 #include <asm/tlbflush.h>
 
-/* Generic accessors to PTE bits */
-static inline int pte_write(pte_t pte)
-{	return (pte_val(pte) & (_PAGE_RW | _PAGE_RO)) != _PAGE_RO; }
-static inline int pte_dirty(pte_t pte)		{ return pte_val(pte) & _PAGE_DIRTY; }
-static inline int pte_young(pte_t pte)		{ return pte_val(pte) & _PAGE_ACCESSED; }
-static inline int pte_special(pte_t pte)	{ return pte_val(pte) & _PAGE_SPECIAL; }
-static inline int pte_none(pte_t pte)		{ return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; }
-static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PAGE_PROT_BITS); }
-
-#ifdef CONFIG_NUMA_BALANCING
-/*
- * These work without NUMA balancing but the kernel does not care. See the
- * comment in include/asm-generic/pgtable.h . On powerpc, this will only
- * work for user pages and always return true for kernel pages.
- */
-static inline int pte_protnone(pte_t pte)
-{
-	return (pte_val(pte) &
-		(_PAGE_PRESENT | _PAGE_USER)) == _PAGE_PRESENT;
-}
-
-static inline int pmd_protnone(pmd_t pmd)
-{
-	return pte_protnone(pmd_pte(pmd));
-}
-#endif /* CONFIG_NUMA_BALANCING */
-
-static inline int pte_present(pte_t pte)
-{
-	return pte_val(pte) & _PAGE_PRESENT;
-}
-
-/* Conversion functions: convert a page and protection to a page entry,
- * and a page entry and page directory to the page they refer to.
- *
- * Even if PTEs can be unsigned long long, a PFN is always an unsigned
- * long for now.
- */
-static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot) {
-	return __pte(((pte_basic_t)(pfn) << PTE_RPN_SHIFT) |
-		     pgprot_val(pgprot)); }
-static inline unsigned long pte_pfn(pte_t pte)	{
-	return pte_val(pte) >> PTE_RPN_SHIFT; }
-
 /* Keep these as a macros to avoid include dependency mess */
 #define pte_page(x)		pfn_to_page(pte_pfn(x))
 #define mk_pte(page, pgprot)	pfn_pte(page_to_pfn(page), (pgprot))
 
-/* Generic modifiers for PTE bits */
-static inline pte_t pte_wrprotect(pte_t pte) {
-	pte_val(pte) &= ~(_PAGE_RW | _PAGE_HWWRITE);
-	pte_val(pte) |= _PAGE_RO; return pte; }
-static inline pte_t pte_mkclean(pte_t pte) {
-	pte_val(pte) &= ~(_PAGE_DIRTY | _PAGE_HWWRITE); return pte; }
-static inline pte_t pte_mkold(pte_t pte) {
-	pte_val(pte) &= ~_PAGE_ACCESSED; return pte; }
-static inline pte_t pte_mkwrite(pte_t pte) {
-	pte_val(pte) &= ~_PAGE_RO;
-	pte_val(pte) |= _PAGE_RW; return pte; }
-static inline pte_t pte_mkdirty(pte_t pte) {
-	pte_val(pte) |= _PAGE_DIRTY; return pte; }
-static inline pte_t pte_mkyoung(pte_t pte) {
-	pte_val(pte) |= _PAGE_ACCESSED; return pte; }
-static inline pte_t pte_mkspecial(pte_t pte) {
-	pte_val(pte) |= _PAGE_SPECIAL; return pte; }
-static inline pte_t pte_mkhuge(pte_t pte) {
-	return pte; }
-static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
-{
-	pte_val(pte) = (pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot);
-	return pte;
-}
-
-
-/* Insert a PTE, top-level function is out of line. It uses an inline
- * low level function in the respective pgtable-* files
- */
-extern void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
-		       pte_t pte);
-
-/* This low level function performs the actual PTE insertion
- * Setting the PTE depends on the MMU type and other factors. It's
- * an horrible mess that I'm not going to try to clean up now but
- * I'm keeping it in one place rather than spread around
- */
-static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
-				pte_t *ptep, pte_t pte, int percpu)
-{
-#if defined(CONFIG_PPC_STD_MMU_32) && defined(CONFIG_SMP) && !defined(CONFIG_PTE_64BIT)
-	/* First case is 32-bit Hash MMU in SMP mode with 32-bit PTEs. We use the
-	 * helper pte_update() which does an atomic update. We need to do that
-	 * because a concurrent invalidation can clear _PAGE_HASHPTE. If it's a
-	 * per-CPU PTE such as a kmap_atomic, we do a simple update preserving
-	 * the hash bits instead (ie, same as the non-SMP case)
-	 */
-	if (percpu)
-		*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
-			      | (pte_val(pte) & ~_PAGE_HASHPTE));
-	else
-		pte_update(ptep, ~_PAGE_HASHPTE, pte_val(pte));
-
-#elif defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT)
-	/* Second case is 32-bit with 64-bit PTE.  In this case, we
-	 * can just store as long as we do the two halves in the right order
-	 * with a barrier in between. This is possible because we take care,
-	 * in the hash code, to pre-invalidate if the PTE was already hashed,
-	 * which synchronizes us with any concurrent invalidation.
-	 * In the percpu case, we also fallback to the simple update preserving
-	 * the hash bits
-	 */
-	if (percpu) {
-		*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
-			      | (pte_val(pte) & ~_PAGE_HASHPTE));
-		return;
-	}
-#if _PAGE_HASHPTE != 0
-	if (pte_val(*ptep) & _PAGE_HASHPTE)
-		flush_hash_entry(mm, ptep, addr);
-#endif
-	__asm__ __volatile__("\
-		stw%U0%X0 %2,%0\n\
-		eieio\n\
-		stw%U0%X0 %L2,%1"
-	: "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
-	: "r" (pte) : "memory");
-
-#elif defined(CONFIG_PPC_STD_MMU_32)
-	/* Third case is 32-bit hash table in UP mode, we need to preserve
-	 * the _PAGE_HASHPTE bit since we may not have invalidated the previous
-	 * translation in the hash yet (done in a subsequent flush_tlb_xxx())
-	 * and see we need to keep track that this PTE needs invalidating
-	 */
-	*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
-		      | (pte_val(pte) & ~_PAGE_HASHPTE));
-
-#else
-	/* Anything else just stores the PTE normally. That covers all 64-bit
-	 * cases, and 32-bit non-hash with 32-bit PTEs.
-	 */
-	*ptep = pte;
-
-#ifdef CONFIG_PPC_BOOK3E_64
-	/*
-	 * With hardware tablewalk, a sync is needed to ensure that
-	 * subsequent accesses see the PTE we just wrote.  Unlike userspace
-	 * mappings, we can't tolerate spurious faults, so make sure
-	 * the new PTE will be seen the first time.
-	 */
-	if (is_kernel_addr(addr))
-		mb();
-#endif
-#endif
-}
-
-
-#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
-extern int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address,
-				 pte_t *ptep, pte_t entry, int dirty);
-
-/*
- * Macro to mark a page protection value as "uncacheable".
- */
-
-#define _PAGE_CACHE_CTL	(_PAGE_COHERENT | _PAGE_GUARDED | _PAGE_NO_CACHE | \
-			 _PAGE_WRITETHRU)
-
-#define pgprot_noncached(prot)	  (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
-				            _PAGE_NO_CACHE | _PAGE_GUARDED))
-
-#define pgprot_noncached_wc(prot) (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
-				            _PAGE_NO_CACHE))
-
-#define pgprot_cached(prot)       (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
-				            _PAGE_COHERENT))
-
-#define pgprot_cached_wthru(prot) (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
-				            _PAGE_COHERENT | _PAGE_WRITETHRU))
-
-#define pgprot_cached_noncoherent(prot) \
-		(__pgprot(pgprot_val(prot) & ~_PAGE_CACHE_CTL))
-
-#define pgprot_writecombine pgprot_noncached_wc
-
-struct file;
-extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
-				     unsigned long size, pgprot_t vma_prot);
-#define __HAVE_PHYS_MEM_ACCESS_PROT
-
 /*
  * ZERO_PAGE is a global shared page that is always zero: used
  * for zero-mapped memory areas etc..
@@ -275,5 +86,4 @@ static inline pte_t *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
 }
 #endif /* __ASSEMBLY__ */
 
-#endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_PGTABLE_H */
-- 
2.5.0


* [PATCH V6 08/35] powerpc/mm: Drop pte-common.h from BOOK3S 64
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (6 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 07/35] powerpc/mm: Don't have generic headers introduce functions touching pte bits Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 09/35] powerpc/mm: Don't use pte_val as lvalue Aneesh Kumar K.V
                   ` (26 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

We copy only the needed PTE bit defines from pte-common.h to the
respective hash-related headers. This should greatly simplify later
patches, in which we are going to change the pte format for the hash
config.
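
For example, since memory coherence can be derived from _PAGE_NO_CACHE
on these platforms, _PAGE_COHERENT is carried as a zero bit: ORing a
zero into a mask is a no-op, so common expressions keep compiling
unchanged. A minimal sketch of the idea, using values from this patch:

	#define _PAGE_NO_CACHE	0x0020	/* I: cache inhibit */
	#define _PAGE_WRITETHRU	0x0040	/* W: cache write-through */
	#define _PAGE_COHERENT	0x0	/* implied; kept so shared code compiles */

	/* Strong Access Ordering: the zero bit simply drops out */
	#define _PAGE_SAO	(_PAGE_WRITETHRU | _PAGE_NO_CACHE | _PAGE_COHERENT)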

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h |   1 +
 arch/powerpc/include/asm/book3s/64/hash.h    |   2 +
 arch/powerpc/include/asm/book3s/64/pgtable.h | 104 ++++++++++++++++++++++++++-
 arch/powerpc/include/asm/book3s/pgtable.h    |  16 ++---
 4 files changed, 111 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index f2c51cd61f69..15518b620f5a 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -62,6 +62,7 @@
 /* shift to put page number into pte */
 #define PTE_RPN_SHIFT	(17)
 
+#define _PAGE_4K_PFN		0
 #ifndef __ASSEMBLY__
 /*
  * 4-level page tables related bits
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 8e60d4fa434d..7deb5063ff8c 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -20,6 +20,7 @@
 #define _PAGE_EXEC		0x0004 /* No execute on POWER4 and newer (we invert) */
 #define _PAGE_GUARDED		0x0008
 /* We can derive Memory coherence from _PAGE_NO_CACHE */
+#define _PAGE_COHERENT		0x0
 #define _PAGE_NO_CACHE		0x0020 /* I: cache inhibit */
 #define _PAGE_WRITETHRU		0x0040 /* W: cache write-through */
 #define _PAGE_DIRTY		0x0080 /* C: page changed */
@@ -30,6 +31,7 @@
 /* No separate kernel read-only */
 #define _PAGE_KERNEL_RW		(_PAGE_RW | _PAGE_DIRTY) /* user access blocked by key */
 #define _PAGE_KERNEL_RO		 _PAGE_KERNEL_RW
+#define _PAGE_KERNEL_RWX	(_PAGE_DIRTY | _PAGE_RW | _PAGE_EXEC)
 
 /* Strong Access Ordering */
 #define _PAGE_SAO		(_PAGE_WRITETHRU | _PAGE_NO_CACHE | _PAGE_COHERENT)
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index ddc08bf22709..f942b27e9c5f 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -94,11 +94,109 @@
 #define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS |		\
 			 _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPLITTING | \
 			 _PAGE_THP_HUGE)
+#define _PTE_NONE_MASK	_PAGE_HPTEFLAGS
 /*
- * Default defines for things which we don't use.
- * We should get this removed.
+ * The mask covered by the RPN must be a ULL on 32-bit platforms with
+ * 64-bit PTEs
  */
-#include <asm/pte-common.h>
+#define PTE_RPN_MASK	(~((1UL << PTE_RPN_SHIFT) - 1))
+/*
+ * _PAGE_CHG_MASK masks the bits that are to be preserved across
+ * pgprot changes
+ */
+#define _PAGE_CHG_MASK	(PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \
+			 _PAGE_ACCESSED | _PAGE_SPECIAL)
+/*
+ * Mask of bits returned by pte_pgprot()
+ */
+#define PAGE_PROT_BITS	(_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \
+			 _PAGE_WRITETHRU | _PAGE_4K_PFN | \
+			 _PAGE_USER | _PAGE_ACCESSED |  \
+			 _PAGE_RW |  _PAGE_DIRTY | _PAGE_EXEC)
+/*
+ * We define 2 sets of base prot bits, one for basic pages (ie,
+ * cacheable kernel and user pages) and one for non cacheable
+ * pages. We always set _PAGE_COHERENT when SMP is enabled or
+ * the processor might need it for DMA coherency.
+ */
+#define _PAGE_BASE_NC	(_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_PSIZE)
+#define _PAGE_BASE	(_PAGE_BASE_NC | _PAGE_COHERENT)
+
+/* Permission masks used to generate the __P and __S table,
+ *
+ * Note:__pgprot is defined in arch/powerpc/include/asm/page.h
+ *
+ * Write permissions imply read permissions for now (we could make write-only
+ * pages on BookE but we don't bother for now). Execute permission control is
+ * possible on platforms that define _PAGE_EXEC
+ *
+ * Note due to the way vm flags are laid out, the bits are XWR
+ */
+#define PAGE_NONE	__pgprot(_PAGE_BASE)
+#define PAGE_SHARED	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW)
+#define PAGE_SHARED_X	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW | \
+				 _PAGE_EXEC)
+#define PAGE_COPY	__pgprot(_PAGE_BASE | _PAGE_USER )
+#define PAGE_COPY_X	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
+#define PAGE_READONLY	__pgprot(_PAGE_BASE | _PAGE_USER )
+#define PAGE_READONLY_X	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
+
+#define __P000	PAGE_NONE
+#define __P001	PAGE_READONLY
+#define __P010	PAGE_COPY
+#define __P011	PAGE_COPY
+#define __P100	PAGE_READONLY_X
+#define __P101	PAGE_READONLY_X
+#define __P110	PAGE_COPY_X
+#define __P111	PAGE_COPY_X
+
+#define __S000	PAGE_NONE
+#define __S001	PAGE_READONLY
+#define __S010	PAGE_SHARED
+#define __S011	PAGE_SHARED
+#define __S100	PAGE_READONLY_X
+#define __S101	PAGE_READONLY_X
+#define __S110	PAGE_SHARED_X
+#define __S111	PAGE_SHARED_X
+
+/* Permission masks used for kernel mappings */
+#define PAGE_KERNEL	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)
+#define PAGE_KERNEL_NC	__pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | \
+				 _PAGE_NO_CACHE)
+#define PAGE_KERNEL_NCG	__pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | \
+				 _PAGE_NO_CACHE | _PAGE_GUARDED)
+#define PAGE_KERNEL_X	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RWX)
+#define PAGE_KERNEL_RO	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RO)
+#define PAGE_KERNEL_ROX	__pgprot(_PAGE_BASE | _PAGE_KERNEL_ROX)
+
+/* Protection used for kernel text. We want the debuggers to be able to
+ * set breakpoints anywhere, so don't write protect the kernel text
+ * on platforms where such control is possible.
+ */
+#if defined(CONFIG_KGDB) || defined(CONFIG_XMON) || defined(CONFIG_BDI_SWITCH) ||\
+	defined(CONFIG_KPROBES) || defined(CONFIG_DYNAMIC_FTRACE)
+#define PAGE_KERNEL_TEXT	PAGE_KERNEL_X
+#else
+#define PAGE_KERNEL_TEXT	PAGE_KERNEL_ROX
+#endif
+
+/* Make modules code happy. We don't set RO yet */
+#define PAGE_KERNEL_EXEC	PAGE_KERNEL_X
+
+/*
+ * Don't just check for any non zero bits in __PAGE_USER, since for book3e
+ * and PTE_64BIT, PAGE_KERNEL_X contains _PAGE_BAP_SR which is also in
+ * _PAGE_USER.  Need to explicitly match _PAGE_BAP_UR bit in that case too.
+ */
+#define pte_user(val)		((val & _PAGE_USER) == _PAGE_USER)
+
+/* Advertise special mapping type for AGP */
+#define PAGE_AGP		(PAGE_KERNEL_NC)
+#define HAVE_PAGE_AGP
+
+/* Advertise support for _PAGE_SPECIAL */
+#define __HAVE_ARCH_PTE_SPECIAL
+
 #ifndef __ASSEMBLY__
 
 /*
diff --git a/arch/powerpc/include/asm/book3s/pgtable.h b/arch/powerpc/include/asm/book3s/pgtable.h
index fa270cfcf30a..87333618af3b 100644
--- a/arch/powerpc/include/asm/book3s/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/pgtable.h
@@ -11,10 +11,7 @@
 #ifndef __ASSEMBLY__
 
 /* Generic accessors to PTE bits */
-static inline int pte_write(pte_t pte)
-{
-	return (pte_val(pte) & (_PAGE_RW | _PAGE_RO)) != _PAGE_RO;
-}
+static inline int pte_write(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_RW);}
 static inline int pte_dirty(pte_t pte)		{ return pte_val(pte) & _PAGE_DIRTY; }
 static inline int pte_young(pte_t pte)		{ return pte_val(pte) & _PAGE_ACCESSED; }
 static inline int pte_special(pte_t pte)	{ return pte_val(pte) & _PAGE_SPECIAL; }
@@ -57,15 +54,16 @@ static inline unsigned long pte_pfn(pte_t pte)	{
 	return pte_val(pte) >> PTE_RPN_SHIFT; }
 
 /* Generic modifiers for PTE bits */
-static inline pte_t pte_wrprotect(pte_t pte) {
-	pte_val(pte) &= ~(_PAGE_RW | _PAGE_HWWRITE);
-	pte_val(pte) |= _PAGE_RO; return pte; }
+static inline pte_t pte_wrprotect(pte_t pte)
+{
+	pte_val(pte) &= ~_PAGE_RW;
+	return pte;
+}
 static inline pte_t pte_mkclean(pte_t pte) {
-	pte_val(pte) &= ~(_PAGE_DIRTY | _PAGE_HWWRITE); return pte; }
+	pte_val(pte) &= ~_PAGE_DIRTY; return pte; }
 static inline pte_t pte_mkold(pte_t pte) {
 	pte_val(pte) &= ~_PAGE_ACCESSED; return pte; }
 static inline pte_t pte_mkwrite(pte_t pte) {
-	pte_val(pte) &= ~_PAGE_RO;
 	pte_val(pte) |= _PAGE_RW; return pte; }
 static inline pte_t pte_mkdirty(pte_t pte) {
 	pte_val(pte) |= _PAGE_DIRTY; return pte; }
-- 
2.5.0


* [PATCH V6 09/35] powerpc/mm: Don't use pte_val as lvalue
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (7 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 08/35] powerpc/mm: Drop pte-common.h from BOOK3S 64 Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 10/35] powerpc/mm: Don't use pmd_val, pud_val and pgd_val " Aneesh Kumar K.V
                   ` (25 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

We also convert a few #defines to static inline functions in this
patch, for better type checking.
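
Once pte_val() is a real function it can no longer be used as an
lvalue, so modifiers must construct a new pte with __pte() instead. A
minimal sketch of the difference:

	/* old: the macro expanded to an lvalue, so this compiled */
	pte_val(pte) |= _PAGE_DIRTY;

	/* new: pte_val() returns a value; build a fresh pte_t */
	pte = __pte(pte_val(pte) | _PAGE_DIRTY);

It also means passing anything other than a pte_t is now a compile
error instead of a silent macro expansion.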

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/pgtable.h | 118 +++++++++++++++++++++---------
 arch/powerpc/include/asm/page.h           |  10 ++-
 arch/powerpc/include/asm/pgtable-book3e.h |  68 ++++++++++++-----
 3 files changed, 139 insertions(+), 57 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/pgtable.h b/arch/powerpc/include/asm/book3s/pgtable.h
index 87333618af3b..ebd6677ea017 100644
--- a/arch/powerpc/include/asm/book3s/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/pgtable.h
@@ -12,9 +12,9 @@
 
 /* Generic accessors to PTE bits */
 static inline int pte_write(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_RW);}
-static inline int pte_dirty(pte_t pte)		{ return pte_val(pte) & _PAGE_DIRTY; }
-static inline int pte_young(pte_t pte)		{ return pte_val(pte) & _PAGE_ACCESSED; }
-static inline int pte_special(pte_t pte)	{ return pte_val(pte) & _PAGE_SPECIAL; }
+static inline int pte_dirty(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_DIRTY); }
+static inline int pte_young(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_ACCESSED); }
+static inline int pte_special(pte_t pte)	{ return !!(pte_val(pte) & _PAGE_SPECIAL); }
 static inline int pte_none(pte_t pte)		{ return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; }
 static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PAGE_PROT_BITS); }
 
@@ -47,36 +47,61 @@ static inline int pte_present(pte_t pte)
  * Even if PTEs can be unsigned long long, a PFN is always an unsigned
  * long for now.
  */
-static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot) {
+static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot)
+{
 	return __pte(((pte_basic_t)(pfn) << PTE_RPN_SHIFT) |
-		     pgprot_val(pgprot)); }
-static inline unsigned long pte_pfn(pte_t pte)	{
-	return pte_val(pte) >> PTE_RPN_SHIFT; }
+		     pgprot_val(pgprot));
+}
+
+static inline unsigned long pte_pfn(pte_t pte)
+{
+	return pte_val(pte) >> PTE_RPN_SHIFT;
+}
 
 /* Generic modifiers for PTE bits */
 static inline pte_t pte_wrprotect(pte_t pte)
 {
-	pte_val(pte) &= ~_PAGE_RW;
+	return __pte(pte_val(pte) & ~_PAGE_RW);
+}
+
+static inline pte_t pte_mkclean(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~_PAGE_DIRTY);
+}
+
+static inline pte_t pte_mkold(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~_PAGE_ACCESSED);
+}
+
+static inline pte_t pte_mkwrite(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_RW);
+}
+
+static inline pte_t pte_mkdirty(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_DIRTY);
+}
+
+static inline pte_t pte_mkyoung(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_ACCESSED);
+}
+
+static inline pte_t pte_mkspecial(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_SPECIAL);
+}
+
+static inline pte_t pte_mkhuge(pte_t pte)
+{
 	return pte;
 }
-static inline pte_t pte_mkclean(pte_t pte) {
-	pte_val(pte) &= ~_PAGE_DIRTY; return pte; }
-static inline pte_t pte_mkold(pte_t pte) {
-	pte_val(pte) &= ~_PAGE_ACCESSED; return pte; }
-static inline pte_t pte_mkwrite(pte_t pte) {
-	pte_val(pte) |= _PAGE_RW; return pte; }
-static inline pte_t pte_mkdirty(pte_t pte) {
-	pte_val(pte) |= _PAGE_DIRTY; return pte; }
-static inline pte_t pte_mkyoung(pte_t pte) {
-	pte_val(pte) |= _PAGE_ACCESSED; return pte; }
-static inline pte_t pte_mkspecial(pte_t pte) {
-	pte_val(pte) |= _PAGE_SPECIAL; return pte; }
-static inline pte_t pte_mkhuge(pte_t pte) {
-	return pte; }
+
 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 {
-	pte_val(pte) = (pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot);
-	return pte;
+	return __pte((pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot));
 }
 
 
@@ -159,22 +184,45 @@ extern int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long addre
 #define _PAGE_CACHE_CTL	(_PAGE_COHERENT | _PAGE_GUARDED | _PAGE_NO_CACHE | \
 			 _PAGE_WRITETHRU)
 
-#define pgprot_noncached(prot)	  (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
-				            _PAGE_NO_CACHE | _PAGE_GUARDED))
+#define pgprot_noncached pgprot_noncached
+static inline pgprot_t pgprot_noncached(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_NO_CACHE | _PAGE_GUARDED);
+}
 
-#define pgprot_noncached_wc(prot) (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
-				            _PAGE_NO_CACHE))
+#define pgprot_noncached_wc pgprot_noncached_wc
+static inline pgprot_t pgprot_noncached_wc(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_NO_CACHE);
+}
 
-#define pgprot_cached(prot)       (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
-				            _PAGE_COHERENT))
+#define pgprot_cached pgprot_cached
+static inline pgprot_t pgprot_cached(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_COHERENT);
+}
 
-#define pgprot_cached_wthru(prot) (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
-				            _PAGE_COHERENT | _PAGE_WRITETHRU))
+#define pgprot_cached_wthru pgprot_cached_wthru
+static inline pgprot_t pgprot_cached_wthru(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_COHERENT | _PAGE_WRITETHRU);
+}
 
-#define pgprot_cached_noncoherent(prot) \
-		(__pgprot(pgprot_val(prot) & ~_PAGE_CACHE_CTL))
+#define pgprot_cached_noncoherent pgprot_cached_noncoherent
+static inline pgprot_t pgprot_cached_noncoherent(pgprot_t prot)
+{
+	return __pgprot(pgprot_val(prot) & ~_PAGE_CACHE_CTL);
+}
 
-#define pgprot_writecombine pgprot_noncached_wc
+#define pgprot_writecombine pgprot_writecombine
+static inline pgprot_t pgprot_writecombine(pgprot_t prot)
+{
+	return pgprot_noncached_wc(prot);
+}
 
 struct file;
 extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index 96534b4e5a64..0e9d1d32053f 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -285,8 +285,11 @@ extern long long virt_phys_offset;
 
 /* PTE level */
 typedef struct { pte_basic_t pte; } pte_t;
-#define pte_val(x)	((x).pte)
 #define __pte(x)	((pte_t) { (x) })
+static inline pte_basic_t pte_val(pte_t x)
+{
+	return x.pte;
+}
 
 /* 64k pages additionally define a bigger "real PTE" type that gathers
  * the "second half" part of the PTE for pseudo 64k pages
@@ -328,8 +331,11 @@ typedef struct { unsigned long pgprot; } pgprot_t;
  */
 
 typedef pte_basic_t pte_t;
-#define pte_val(x)	(x)
 #define __pte(x)	(x)
+static inline pte_basic_t pte_val(pte_t pte)
+{
+	return pte;
+}
 
 #if defined(CONFIG_PPC_64K_PAGES) && defined(CONFIG_PPC_STD_MMU_64)
 typedef struct { pte_t pte; unsigned long hidx; } real_pte_t;
diff --git a/arch/powerpc/include/asm/pgtable-book3e.h b/arch/powerpc/include/asm/pgtable-book3e.h
index a3221cff2e31..91325997ba25 100644
--- a/arch/powerpc/include/asm/pgtable-book3e.h
+++ b/arch/powerpc/include/asm/pgtable-book3e.h
@@ -56,30 +56,58 @@ static inline unsigned long pte_pfn(pte_t pte)	{
 	return pte_val(pte) >> PTE_RPN_SHIFT; }
 
 /* Generic modifiers for PTE bits */
-static inline pte_t pte_wrprotect(pte_t pte) {
-	pte_val(pte) &= ~(_PAGE_RW | _PAGE_HWWRITE);
-	pte_val(pte) |= _PAGE_RO; return pte; }
-static inline pte_t pte_mkclean(pte_t pte) {
-	pte_val(pte) &= ~(_PAGE_DIRTY | _PAGE_HWWRITE); return pte; }
-static inline pte_t pte_mkold(pte_t pte) {
-	pte_val(pte) &= ~_PAGE_ACCESSED; return pte; }
-static inline pte_t pte_mkwrite(pte_t pte) {
-	pte_val(pte) &= ~_PAGE_RO;
-	pte_val(pte) |= _PAGE_RW; return pte; }
-static inline pte_t pte_mkdirty(pte_t pte) {
-	pte_val(pte) |= _PAGE_DIRTY; return pte; }
-static inline pte_t pte_mkyoung(pte_t pte) {
-	pte_val(pte) |= _PAGE_ACCESSED; return pte; }
-static inline pte_t pte_mkspecial(pte_t pte) {
-	pte_val(pte) |= _PAGE_SPECIAL; return pte; }
-static inline pte_t pte_mkhuge(pte_t pte) {
-	return pte; }
-static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
+static inline pte_t pte_wrprotect(pte_t pte)
+{
+	pte_basic_t ptev;
+
+	ptev = pte_val(pte) & ~(_PAGE_RW | _PAGE_HWWRITE);
+	ptev |= _PAGE_RO;
+	return __pte(ptev);
+}
+
+static inline pte_t pte_mkclean(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~(_PAGE_DIRTY | _PAGE_HWWRITE));
+}
+
+static inline pte_t pte_mkold(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~_PAGE_ACCESSED);
+}
+
+static inline pte_t pte_mkwrite(pte_t pte)
+{
+	pte_basic_t ptev;
+
+	ptev = pte_val(pte) & ~_PAGE_RO;
+	ptev |= _PAGE_RW;
+	return __pte(ptev);
+}
+
+static inline pte_t pte_mkdirty(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_DIRTY);
+}
+
+static inline pte_t pte_mkyoung(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_ACCESSED);
+}
+
+static inline pte_t pte_mkspecial(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_SPECIAL);
+}
+
+static inline pte_t pte_mkhuge(pte_t pte)
 {
-	pte_val(pte) = (pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot);
 	return pte;
 }
 
+static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
+{
+	return __pte((pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot));
+}
 
 /* Insert a PTE, top-level function is out of line. It uses an inline
  * low level function in the respective pgtable-* files
-- 
2.5.0


* [PATCH V6 10/35] powerpc/mm: Don't use pmd_val, pud_val and pgd_val as lvalue
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (8 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 09/35] powerpc/mm: Don't use pte_val as lvalue Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 11/35] powerpc/mm: Move hash64 PTE bits from book3s/64/pgtable.h to hash.h Aneesh Kumar K.V
                   ` (24 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

We convert them to static inline functions here, as we did with
pte_val in the previous patch.
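
The pattern is the same as for pte_val: instead of assigning through
the accessor, store a freshly constructed typed value. A minimal
sketch:

	/* old: pmd_val() as an lvalue */
	pmd_val(*pmdp) = 0;

	/* new: build a typed empty pmd and store it */
	*pmdp = __pmd(0);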

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/32/pgtable.h |  6 ++++-
 arch/powerpc/include/asm/book3s/64/hash-4k.h |  6 ++++-
 arch/powerpc/include/asm/book3s/64/pgtable.h | 36 +++++++++++++++++++++-------
 arch/powerpc/include/asm/page.h              | 34 +++++++++++++++++++-------
 arch/powerpc/include/asm/pgalloc-32.h        | 34 +++++++++++++++++++-------
 arch/powerpc/include/asm/pgalloc-64.h        | 17 +++++++++----
 arch/powerpc/include/asm/pgtable-ppc32.h     |  7 +++++-
 arch/powerpc/include/asm/pgtable-ppc64-4k.h  |  6 ++++-
 arch/powerpc/include/asm/pgtable-ppc64.h     | 36 +++++++++++++++++++++-------
 arch/powerpc/mm/40x_mmu.c                    | 10 ++++----
 arch/powerpc/mm/pgtable_64.c                 | 19 +++++++--------
 11 files changed, 154 insertions(+), 57 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 5438c0b6aeec..226f29d39332 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -105,7 +105,11 @@ extern unsigned long ioremap_bot;
 #define pmd_none(pmd)		(!pmd_val(pmd))
 #define	pmd_bad(pmd)		(pmd_val(pmd) & _PMD_BAD)
 #define	pmd_present(pmd)	(pmd_val(pmd) & _PMD_PRESENT_MASK)
-#define	pmd_clear(pmdp)		do { pmd_val(*(pmdp)) = 0; } while (0)
+static inline void pmd_clear(pmd_t *pmdp)
+{
+	*pmdp = __pmd(0);
+}
+
 
 /*
  * When flushing the tlb entry for a page, we also need to flush the hash
diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index 15518b620f5a..537eacecf6e9 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -71,9 +71,13 @@
 #define pgd_none(pgd)		(!pgd_val(pgd))
 #define pgd_bad(pgd)		(pgd_val(pgd) == 0)
 #define pgd_present(pgd)	(pgd_val(pgd) != 0)
-#define pgd_clear(pgdp)		(pgd_val(*(pgdp)) = 0)
 #define pgd_page_vaddr(pgd)	(pgd_val(pgd) & ~PGD_MASKED_BITS)
 
+static inline void pgd_clear(pgd_t *pgdp)
+{
+	*pgdp = __pgd(0);
+}
+
 static inline pte_t pgd_pte(pgd_t pgd)
 {
 	return __pte(pgd_val(pgd));
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index f942b27e9c5f..09c6474f89c7 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -236,21 +236,38 @@
 #define PMD_BAD_BITS		(PTE_TABLE_SIZE-1)
 #define PUD_BAD_BITS		(PMD_TABLE_SIZE-1)
 
-#define pmd_set(pmdp, pmdval)	(pmd_val(*(pmdp)) = (pmdval))
+static inline void pmd_set(pmd_t *pmdp, unsigned long val)
+{
+	*pmdp = __pmd(val);
+}
+
+static inline void pmd_clear(pmd_t *pmdp)
+{
+	*pmdp = __pmd(0);
+}
+
+
 #define pmd_none(pmd)		(!pmd_val(pmd))
 #define	pmd_bad(pmd)		(!is_kernel_addr(pmd_val(pmd)) \
 				 || (pmd_val(pmd) & PMD_BAD_BITS))
 #define	pmd_present(pmd)	(!pmd_none(pmd))
-#define	pmd_clear(pmdp)		(pmd_val(*(pmdp)) = 0)
 #define pmd_page_vaddr(pmd)	(pmd_val(pmd) & ~PMD_MASKED_BITS)
 extern struct page *pmd_page(pmd_t pmd);
 
-#define pud_set(pudp, pudval)	(pud_val(*(pudp)) = (pudval))
+static inline void pud_set(pud_t *pudp, unsigned long val)
+{
+	*pudp = __pud(val);
+}
+
+static inline void pud_clear(pud_t *pudp)
+{
+	*pudp = __pud(0);
+}
+
 #define pud_none(pud)		(!pud_val(pud))
 #define	pud_bad(pud)		(!is_kernel_addr(pud_val(pud)) \
 				 || (pud_val(pud) & PUD_BAD_BITS))
 #define pud_present(pud)	(pud_val(pud) != 0)
-#define pud_clear(pudp)		(pud_val(*(pudp)) = 0)
 #define pud_page_vaddr(pud)	(pud_val(pud) & ~PUD_MASKED_BITS)
 
 extern struct page *pud_page(pud_t pud);
@@ -265,8 +282,11 @@ static inline pud_t pte_pud(pte_t pte)
 	return __pud(pte_val(pte));
 }
 #define pud_write(pud)		pte_write(pud_pte(pud))
-#define pgd_set(pgdp, pudp)	({pgd_val(*(pgdp)) = (unsigned long)(pudp);})
 #define pgd_write(pgd)		pte_write(pgd_pte(pgd))
+static inline void pgd_set(pgd_t *pgdp, unsigned long val)
+{
+	*pgdp = __pgd(val);
+}
 
 /*
  * Find an entry in a page-table-directory.  We combine the address region
@@ -588,14 +608,12 @@ static inline pmd_t pmd_mkhuge(pmd_t pmd)
 
 static inline pmd_t pmd_mknotpresent(pmd_t pmd)
 {
-	pmd_val(pmd) &= ~_PAGE_PRESENT;
-	return pmd;
+	return __pmd(pmd_val(pmd) & ~_PAGE_PRESENT);
 }
 
 static inline pmd_t pmd_mksplitting(pmd_t pmd)
 {
-	pmd_val(pmd) |= _PAGE_SPLITTING;
-	return pmd;
+	return __pmd(pmd_val(pmd) | _PAGE_SPLITTING);
 }
 
 #define __HAVE_ARCH_PMD_SAME
diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index 0e9d1d32053f..9d2f38e1b21d 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -303,21 +303,30 @@ typedef struct { pte_t pte; } real_pte_t;
 /* PMD level */
 #ifdef CONFIG_PPC64
 typedef struct { unsigned long pmd; } pmd_t;
-#define pmd_val(x)	((x).pmd)
 #define __pmd(x)	((pmd_t) { (x) })
+static inline unsigned long pmd_val(pmd_t x)
+{
+	return x.pmd;
+}
 
 /* PUD level exists only on 4k pages */
 #ifndef CONFIG_PPC_64K_PAGES
 typedef struct { unsigned long pud; } pud_t;
-#define pud_val(x)	((x).pud)
 #define __pud(x)	((pud_t) { (x) })
+static inline unsigned long pud_val(pud_t x)
+{
+	return x.pud;
+}
 #endif /* !CONFIG_PPC_64K_PAGES */
 #endif /* CONFIG_PPC64 */
 
 /* PGD level */
 typedef struct { unsigned long pgd; } pgd_t;
-#define pgd_val(x)	((x).pgd)
 #define __pgd(x)	((pgd_t) { (x) })
+static inline unsigned long pgd_val(pgd_t x)
+{
+	return x.pgd;
+}
 
 /* Page protection bits */
 typedef struct { unsigned long pgprot; } pgprot_t;
@@ -346,22 +355,31 @@ typedef pte_t real_pte_t;
 
 #ifdef CONFIG_PPC64
 typedef unsigned long pmd_t;
-#define pmd_val(x)	(x)
 #define __pmd(x)	(x)
+static inline unsigned long pmd_val(pmd_t pmd)
+{
+	return pmd;
+}
 
 #ifndef CONFIG_PPC_64K_PAGES
 typedef unsigned long pud_t;
-#define pud_val(x)	(x)
 #define __pud(x)	(x)
+static inline unsigned long pud_val(pud_t pud)
+{
+	return pud;
+}
 #endif /* !CONFIG_PPC_64K_PAGES */
 #endif /* CONFIG_PPC64 */
 
 typedef unsigned long pgd_t;
-#define pgd_val(x)	(x)
-#define pgprot_val(x)	(x)
+#define __pgd(x)	(x)
+static inline unsigned long pgd_val(pgd_t pgd)
+{
+	return pgd;
+}
 
 typedef unsigned long pgprot_t;
-#define __pgd(x)	(x)
+#define pgprot_val(x)	(x)
 #define __pgprot(x)	(x)
 
 #endif
diff --git a/arch/powerpc/include/asm/pgalloc-32.h b/arch/powerpc/include/asm/pgalloc-32.h
index 842846c1b711..76d6b9e0c8a9 100644
--- a/arch/powerpc/include/asm/pgalloc-32.h
+++ b/arch/powerpc/include/asm/pgalloc-32.h
@@ -21,16 +21,34 @@ extern void pgd_free(struct mm_struct *mm, pgd_t *pgd);
 /* #define pgd_populate(mm, pmd, pte)      BUG() */
 
 #ifndef CONFIG_BOOKE
-#define pmd_populate_kernel(mm, pmd, pte)	\
-		(pmd_val(*(pmd)) = __pa(pte) | _PMD_PRESENT)
-#define pmd_populate(mm, pmd, pte)	\
-		(pmd_val(*(pmd)) = (page_to_pfn(pte) << PAGE_SHIFT) | _PMD_PRESENT)
+
+static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp,
+				       pte_t *pte)
+{
+	*pmdp = __pmd(__pa(pte) | _PMD_PRESENT);
+}
+
+static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmdp,
+				pgtable_t pte_page)
+{
+	*pmdp = __pmd((page_to_pfn(pte_page) << PAGE_SHIFT) | _PMD_PRESENT);
+}
+
 #define pmd_pgtable(pmd) pmd_page(pmd)
 #else
-#define pmd_populate_kernel(mm, pmd, pte)	\
-		(pmd_val(*(pmd)) = (unsigned long)pte | _PMD_PRESENT)
-#define pmd_populate(mm, pmd, pte)	\
-		(pmd_val(*(pmd)) = (unsigned long)lowmem_page_address(pte) | _PMD_PRESENT)
+
+static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp,
+				       pte_t *pte)
+{
+	*pmdp = __pmd((unsigned long)pte | _PMD_PRESENT);
+}
+
+static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmdp,
+				pgtable_t pte_page)
+{
+	*pmdp = __pmd((unsigned long)lowmem_page_address(pte_page) | _PMD_PRESENT);
+}
+
 #define pmd_pgtable(pmd) pmd_page(pmd)
 #endif
 
diff --git a/arch/powerpc/include/asm/pgalloc-64.h b/arch/powerpc/include/asm/pgalloc-64.h
index 4b0be20fcbfd..d8cde71f6734 100644
--- a/arch/powerpc/include/asm/pgalloc-64.h
+++ b/arch/powerpc/include/asm/pgalloc-64.h
@@ -53,7 +53,7 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
 
 #ifndef CONFIG_PPC_64K_PAGES
 
-#define pgd_populate(MM, PGD, PUD)	pgd_set(PGD, PUD)
+#define pgd_populate(MM, PGD, PUD)	pgd_set(PGD, (unsigned long)PUD)
 
 static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
 {
@@ -71,9 +71,18 @@ static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
 	pud_set(pud, (unsigned long)pmd);
 }
 
-#define pmd_populate(mm, pmd, pte_page) \
-	pmd_populate_kernel(mm, pmd, page_address(pte_page))
-#define pmd_populate_kernel(mm, pmd, pte) pmd_set(pmd, (unsigned long)(pte))
+static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
+				       pte_t *pte)
+{
+	pmd_set(pmd, (unsigned long)pte);
+}
+
+static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
+				pgtable_t pte_page)
+{
+	pmd_set(pmd, (unsigned long)page_address(pte_page));
+}
+
 #define pmd_pgtable(pmd) pmd_page(pmd)
 
 static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
diff --git a/arch/powerpc/include/asm/pgtable-ppc32.h b/arch/powerpc/include/asm/pgtable-ppc32.h
index aac6547b0823..fbb23c54b998 100644
--- a/arch/powerpc/include/asm/pgtable-ppc32.h
+++ b/arch/powerpc/include/asm/pgtable-ppc32.h
@@ -128,7 +128,12 @@ extern int icache_44x_need_flush;
 #define pmd_none(pmd)		(!pmd_val(pmd))
 #define	pmd_bad(pmd)		(pmd_val(pmd) & _PMD_BAD)
 #define	pmd_present(pmd)	(pmd_val(pmd) & _PMD_PRESENT_MASK)
-#define	pmd_clear(pmdp)		do { pmd_val(*(pmdp)) = 0; } while (0)
+static inline void pmd_clear(pmd_t *pmdp)
+{
+	*pmdp = __pmd(0);
+}
+
+
 
 /*
  * When flushing the tlb entry for a page, we also need to flush the hash
diff --git a/arch/powerpc/include/asm/pgtable-ppc64-4k.h b/arch/powerpc/include/asm/pgtable-ppc64-4k.h
index 132ee1d482c2..7bace25d6b62 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64-4k.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64-4k.h
@@ -55,11 +55,15 @@
 #define pgd_none(pgd)		(!pgd_val(pgd))
 #define pgd_bad(pgd)		(pgd_val(pgd) == 0)
 #define pgd_present(pgd)	(pgd_val(pgd) != 0)
-#define pgd_clear(pgdp)		(pgd_val(*(pgdp)) = 0)
 #define pgd_page_vaddr(pgd)	(pgd_val(pgd) & ~PGD_MASKED_BITS)
 
 #ifndef __ASSEMBLY__
 
+static inline void pgd_clear(pgd_t *pgdp)
+{
+	*pgdp = __pgd(0);
+}
+
 static inline pte_t pgd_pte(pgd_t pgd)
 {
 	return __pte(pgd_val(pgd));
diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index 1ef0fea32e1e..6be203d43fd1 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -144,21 +144,37 @@
 #define PMD_BAD_BITS		(PTE_TABLE_SIZE-1)
 #define PUD_BAD_BITS		(PMD_TABLE_SIZE-1)
 
-#define pmd_set(pmdp, pmdval) 	(pmd_val(*(pmdp)) = (pmdval))
+static inline void pmd_set(pmd_t *pmdp, unsigned long val)
+{
+	*pmdp = __pmd(val);
+}
+
+static inline void pmd_clear(pmd_t *pmdp)
+{
+	*pmdp = __pmd(0);
+}
+
 #define pmd_none(pmd)		(!pmd_val(pmd))
 #define	pmd_bad(pmd)		(!is_kernel_addr(pmd_val(pmd)) \
 				 || (pmd_val(pmd) & PMD_BAD_BITS))
 #define	pmd_present(pmd)	(!pmd_none(pmd))
-#define	pmd_clear(pmdp)		(pmd_val(*(pmdp)) = 0)
 #define pmd_page_vaddr(pmd)	(pmd_val(pmd) & ~PMD_MASKED_BITS)
 extern struct page *pmd_page(pmd_t pmd);
 
-#define pud_set(pudp, pudval)	(pud_val(*(pudp)) = (pudval))
+static inline void pud_set(pud_t *pudp, unsigned long val)
+{
+	*pudp = __pud(val);
+}
+
+static inline void pud_clear(pud_t *pudp)
+{
+	*pudp = __pud(0);
+}
+
 #define pud_none(pud)		(!pud_val(pud))
 #define	pud_bad(pud)		(!is_kernel_addr(pud_val(pud)) \
 				 || (pud_val(pud) & PUD_BAD_BITS))
 #define pud_present(pud)	(pud_val(pud) != 0)
-#define pud_clear(pudp)		(pud_val(*(pudp)) = 0)
 #define pud_page_vaddr(pud)	(pud_val(pud) & ~PUD_MASKED_BITS)
 
 extern struct page *pud_page(pud_t pud);
@@ -173,9 +189,13 @@ static inline pud_t pte_pud(pte_t pte)
 	return __pud(pte_val(pte));
 }
 #define pud_write(pud)		pte_write(pud_pte(pud))
-#define pgd_set(pgdp, pudp)	({pgd_val(*(pgdp)) = (unsigned long)(pudp);})
 #define pgd_write(pgd)		pte_write(pgd_pte(pgd))
 
+static inline void pgd_set(pgd_t *pgdp, unsigned long val)
+{
+	*pgdp = __pgd(val);
+}
+
 /*
  * Find an entry in a page-table-directory.  We combine the address region
  * (the high order N bits) and the pgd portion of the address.
@@ -528,14 +548,12 @@ static inline pmd_t pmd_mkhuge(pmd_t pmd)
 
 static inline pmd_t pmd_mknotpresent(pmd_t pmd)
 {
-	pmd_val(pmd) &= ~_PAGE_PRESENT;
-	return pmd;
+	return __pmd(pmd_val(pmd) & ~_PAGE_PRESENT);
 }
 
 static inline pmd_t pmd_mksplitting(pmd_t pmd)
 {
-	pmd_val(pmd) |= _PAGE_SPLITTING;
-	return pmd;
+	return __pmd(pmd_val(pmd) | _PAGE_SPLITTING);
 }
 
 #define __HAVE_ARCH_PMD_SAME
diff --git a/arch/powerpc/mm/40x_mmu.c b/arch/powerpc/mm/40x_mmu.c
index 5810967511d4..31a5d42df8c9 100644
--- a/arch/powerpc/mm/40x_mmu.c
+++ b/arch/powerpc/mm/40x_mmu.c
@@ -110,10 +110,10 @@ unsigned long __init mmu_mapin_ram(unsigned long top)
 		unsigned long val = p | _PMD_SIZE_16M | _PAGE_EXEC | _PAGE_HWWRITE;
 
 		pmdp = pmd_offset(pud_offset(pgd_offset_k(v), v), v);
-		pmd_val(*pmdp++) = val;
-		pmd_val(*pmdp++) = val;
-		pmd_val(*pmdp++) = val;
-		pmd_val(*pmdp++) = val;
+		*pmdp++ = __pmd(val);
+		*pmdp++ = __pmd(val);
+		*pmdp++ = __pmd(val);
+		*pmdp++ = __pmd(val);
 
 		v += LARGE_PAGE_SIZE_16M;
 		p += LARGE_PAGE_SIZE_16M;
@@ -125,7 +125,7 @@ unsigned long __init mmu_mapin_ram(unsigned long top)
 		unsigned long val = p | _PMD_SIZE_4M | _PAGE_EXEC | _PAGE_HWWRITE;
 
 		pmdp = pmd_offset(pud_offset(pgd_offset_k(v), v), v);
-		pmd_val(*pmdp) = val;
+		*pmdp = __pmd(val);
 
 		v += LARGE_PAGE_SIZE_4M;
 		p += LARGE_PAGE_SIZE_4M;
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index e92cb2146b18..d692ae31cfc7 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -759,22 +759,20 @@ void hpte_do_hugepage_flush(struct mm_struct *mm, unsigned long addr,
 
 static pmd_t pmd_set_protbits(pmd_t pmd, pgprot_t pgprot)
 {
-	pmd_val(pmd) |= pgprot_val(pgprot);
-	return pmd;
+	return __pmd(pmd_val(pmd) | pgprot_val(pgprot));
 }
 
 pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot)
 {
-	pmd_t pmd;
+	unsigned long pmdv;
 	/*
 	 * For a valid pte, we would have _PAGE_PRESENT always
 	 * set. We use this to check THP page at pmd level.
 	 * leaf pte for huge page, bottom two bits != 00
 	 */
-	pmd_val(pmd) = pfn << PTE_RPN_SHIFT;
-	pmd_val(pmd) |= _PAGE_THP_HUGE;
-	pmd = pmd_set_protbits(pmd, pgprot);
-	return pmd;
+	pmdv = pfn << PTE_RPN_SHIFT;
+	pmdv |= _PAGE_THP_HUGE;
+	return pmd_set_protbits(__pmd(pmdv), pgprot);
 }
 
 pmd_t mk_pmd(struct page *page, pgprot_t pgprot)
@@ -784,10 +782,11 @@ pmd_t mk_pmd(struct page *page, pgprot_t pgprot)
 
 pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot)
 {
+	unsigned long pmdv;
 
-	pmd_val(pmd) &= _HPAGE_CHG_MASK;
-	pmd = pmd_set_protbits(pmd, newprot);
-	return pmd;
+	pmdv = pmd_val(pmd);
+	pmdv &= _HPAGE_CHG_MASK;
+	return pmd_set_protbits(__pmd(pmdv), newprot);
 }
 
 /*
-- 
2.5.0


* [PATCH V6 11/35] powerpc/mm: Move hash64 PTE bits from book3s/64/pgtable.h to hash.h
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (9 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 10/35] powerpc/mm: Don't use pmd_val, pud_val and pgd_val " Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 12/35] powerpc/mm: Move PTE bits from generic functions to hash64 functions Aneesh Kumar K.V
                   ` (23 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

This keeps the hash64-related bits together and makes the code easier
to follow.
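
After the move, the book3s 64 header chain looks roughly like this
(sketch):

	book3s/64/pgtable.h
	  -> book3s/64/hash.h	(hash PTE bits, atomic PTE/PMD updates)
	     -> book3s/64/hash-64k.h or hash-4k.h	(page-size specifics)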

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash.h    | 448 ++++++++++++++++++++++++++-
 arch/powerpc/include/asm/book3s/64/pgtable.h | 445 +-------------------------
 arch/powerpc/include/asm/pgtable.h           |   6 -
 3 files changed, 448 insertions(+), 451 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 7deb5063ff8c..447b212649c8 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -2,6 +2,61 @@
 #define _ASM_POWERPC_BOOK3S_64_HASH_H
 #ifdef __KERNEL__
 
+#ifdef CONFIG_PPC_64K_PAGES
+#include <asm/book3s/64/hash-64k.h>
+#else
+#include <asm/book3s/64/hash-4k.h>
+#endif
+
+/*
+ * Size of EA range mapped by our pagetables.
+ */
+#define PGTABLE_EADDR_SIZE	(PTE_INDEX_SIZE + PMD_INDEX_SIZE + \
+				 PUD_INDEX_SIZE + PGD_INDEX_SIZE + PAGE_SHIFT)
+#define PGTABLE_RANGE		(ASM_CONST(1) << PGTABLE_EADDR_SIZE)
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#define PMD_CACHE_INDEX	(PMD_INDEX_SIZE + 1)
+#else
+#define PMD_CACHE_INDEX	PMD_INDEX_SIZE
+#endif
+/*
+ * Define the address range of the kernel non-linear virtual area
+ */
+#define KERN_VIRT_START ASM_CONST(0xD000000000000000)
+#define KERN_VIRT_SIZE	ASM_CONST(0x0000100000000000)
+
+/*
+ * The vmalloc space starts at the beginning of that region, and
+ * occupies half of it on hash CPUs and a quarter of it on Book3E
+ * (we keep a quarter for the virtual memmap)
+ */
+#define VMALLOC_START	KERN_VIRT_START
+#define VMALLOC_SIZE	(KERN_VIRT_SIZE >> 1)
+#define VMALLOC_END	(VMALLOC_START + VMALLOC_SIZE)
+
+/*
+ * Region IDs
+ */
+#define REGION_SHIFT		60UL
+#define REGION_MASK		(0xfUL << REGION_SHIFT)
+#define REGION_ID(ea)		(((unsigned long)(ea)) >> REGION_SHIFT)
+
+#define VMALLOC_REGION_ID	(REGION_ID(VMALLOC_START))
+#define KERNEL_REGION_ID	(REGION_ID(PAGE_OFFSET))
+#define VMEMMAP_REGION_ID	(0xfUL)	/* Server only */
+#define USER_REGION_ID		(0UL)
+
+/*
+ * Defines the address of the vmemmap area, in its own region on
+ * hash table CPUs.
+ */
+#define VMEMMAP_BASE		(VMEMMAP_REGION_ID << REGION_SHIFT)
+
+#ifdef CONFIG_PPC_MM_SLICES
+#define HAVE_ARCH_UNMAPPED_AREA
+#define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
+#endif /* CONFIG_PPC_MM_SLICES */
 /*
  * Common bits between 4K and 64K pages in a linux-style PTE.
  * These match the bits in the (hardware-defined) PowerPC PTE as closely
@@ -46,11 +101,398 @@
 /* Hash table based platforms need atomic updates of the linux PTE */
 #define PTE_ATOMIC_UPDATES	1
 
-#ifdef CONFIG_PPC_64K_PAGES
-#include <asm/book3s/64/hash-64k.h>
+/*
+ * THP pages can't be special. So use the _PAGE_SPECIAL
+ */
+#define _PAGE_SPLITTING _PAGE_SPECIAL
+
+/*
+ * We need to differentiate between explicit huge page and THP huge
+ * page, since a THP huge page also needs to track real subpage details
+ */
+#define _PAGE_THP_HUGE  _PAGE_4K_PFN
+
+/*
+ * set of bits not changed in pmd_modify.
+ */
+#define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS |		\
+			 _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPLITTING | \
+			 _PAGE_THP_HUGE)
+#define _PTE_NONE_MASK	_PAGE_HPTEFLAGS
+/*
+ * The mask covered by the RPN must be a ULL on 32-bit platforms with
+ * 64-bit PTEs
+ */
+#define PTE_RPN_MASK	(~((1UL << PTE_RPN_SHIFT) - 1))
+/*
+ * _PAGE_CHG_MASK masks the bits that are to be preserved across
+ * pgprot changes
+ */
+#define _PAGE_CHG_MASK	(PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \
+			 _PAGE_ACCESSED | _PAGE_SPECIAL)
+/*
+ * Mask of bits returned by pte_pgprot()
+ */
+#define PAGE_PROT_BITS	(_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \
+			 _PAGE_WRITETHRU | _PAGE_4K_PFN | \
+			 _PAGE_USER | _PAGE_ACCESSED |  \
+			 _PAGE_RW |  _PAGE_DIRTY | _PAGE_EXEC)
+/*
+ * We define 2 sets of base prot bits, one for basic pages (ie,
+ * cacheable kernel and user pages) and one for non cacheable
+ * pages. We always set _PAGE_COHERENT when SMP is enabled or
+ * the processor might need it for DMA coherency.
+ */
+#define _PAGE_BASE_NC	(_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_PSIZE)
+#define _PAGE_BASE	(_PAGE_BASE_NC | _PAGE_COHERENT)
+
+/* Permission masks used to generate the __P and __S table,
+ *
+ * Note:__pgprot is defined in arch/powerpc/include/asm/page.h
+ *
+ * Write permissions imply read permissions for now (we could make write-only
+ * pages on BookE but we don't bother for now). Execute permission control is
+ * possible on platforms that define _PAGE_EXEC
+ *
+ * Note due to the way vm flags are laid out, the bits are XWR
+ */
+#define PAGE_NONE	__pgprot(_PAGE_BASE)
+#define PAGE_SHARED	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW)
+#define PAGE_SHARED_X	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW | \
+				 _PAGE_EXEC)
+#define PAGE_COPY	__pgprot(_PAGE_BASE | _PAGE_USER )
+#define PAGE_COPY_X	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
+#define PAGE_READONLY	__pgprot(_PAGE_BASE | _PAGE_USER )
+#define PAGE_READONLY_X	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
+
+#define __P000	PAGE_NONE
+#define __P001	PAGE_READONLY
+#define __P010	PAGE_COPY
+#define __P011	PAGE_COPY
+#define __P100	PAGE_READONLY_X
+#define __P101	PAGE_READONLY_X
+#define __P110	PAGE_COPY_X
+#define __P111	PAGE_COPY_X
+
+#define __S000	PAGE_NONE
+#define __S001	PAGE_READONLY
+#define __S010	PAGE_SHARED
+#define __S011	PAGE_SHARED
+#define __S100	PAGE_READONLY_X
+#define __S101	PAGE_READONLY_X
+#define __S110	PAGE_SHARED_X
+#define __S111	PAGE_SHARED_X
+
+/* Permission masks used for kernel mappings */
+#define PAGE_KERNEL	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)
+#define PAGE_KERNEL_NC	__pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | \
+				 _PAGE_NO_CACHE)
+#define PAGE_KERNEL_NCG	__pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | \
+				 _PAGE_NO_CACHE | _PAGE_GUARDED)
+#define PAGE_KERNEL_X	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RWX)
+#define PAGE_KERNEL_RO	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RO)
+#define PAGE_KERNEL_ROX	__pgprot(_PAGE_BASE | _PAGE_KERNEL_ROX)
+
+/* Protection used for kernel text. We want the debuggers to be able to
+ * set breakpoints anywhere, so don't write protect the kernel text
+ * on platforms where such control is possible.
+ */
+#if defined(CONFIG_KGDB) || defined(CONFIG_XMON) || defined(CONFIG_BDI_SWITCH) ||\
+	defined(CONFIG_KPROBES) || defined(CONFIG_DYNAMIC_FTRACE)
+#define PAGE_KERNEL_TEXT	PAGE_KERNEL_X
 #else
-#include <asm/book3s/64/hash-4k.h>
+#define PAGE_KERNEL_TEXT	PAGE_KERNEL_ROX
 #endif
 
+/* Make modules code happy. We don't set RO yet */
+#define PAGE_KERNEL_EXEC	PAGE_KERNEL_X
+#define PAGE_AGP		(PAGE_KERNEL_NC)
+
+#define PMD_BAD_BITS		(PTE_TABLE_SIZE-1)
+#define PUD_BAD_BITS		(PMD_TABLE_SIZE-1)
+/*
+ * We save the slot number & secondary bit in the second half of the
+ * PTE page. We use 8 bytes for each pte entry.
+ */
+#define PTE_PAGE_HIDX_OFFSET (PTRS_PER_PTE * 8)
+
+#ifndef __ASSEMBLY__
+#define	pmd_bad(pmd)		(!is_kernel_addr(pmd_val(pmd)) \
+				 || (pmd_val(pmd) & PMD_BAD_BITS))
+#define pmd_page_vaddr(pmd)	(pmd_val(pmd) & ~PMD_MASKED_BITS)
+
+#define	pud_bad(pud)		(!is_kernel_addr(pud_val(pud)) \
+				 || (pud_val(pud) & PUD_BAD_BITS))
+#define pud_page_vaddr(pud)	(pud_val(pud) & ~PUD_MASKED_BITS)
+
+#define pgd_index(address) (((address) >> (PGDIR_SHIFT)) & (PTRS_PER_PGD - 1))
+#define pmd_index(address) (((address) >> (PMD_SHIFT)) & (PTRS_PER_PMD - 1))
+#define pte_index(address) (((address) >> (PAGE_SHIFT)) & (PTRS_PER_PTE - 1))
+
+extern void hpte_need_flush(struct mm_struct *mm, unsigned long addr,
+			    pte_t *ptep, unsigned long pte, int huge);
+extern unsigned long pmd_hugepage_update(struct mm_struct *mm,
+					 unsigned long addr,
+					 pmd_t *pmdp,
+					 unsigned long clr,
+					 unsigned long set);
+/* Atomic PTE updates */
+static inline unsigned long pte_update(struct mm_struct *mm,
+				       unsigned long addr,
+				       pte_t *ptep, unsigned long clr,
+				       unsigned long set,
+				       int huge)
+{
+	unsigned long old, tmp;
+
+	__asm__ __volatile__(
+	"1:	ldarx	%0,0,%3		# pte_update\n\
+	andi.	%1,%0,%6\n\
+	bne-	1b \n\
+	andc	%1,%0,%4 \n\
+	or	%1,%1,%7\n\
+	stdcx.	%1,0,%3 \n\
+	bne-	1b"
+	: "=&r" (old), "=&r" (tmp), "=m" (*ptep)
+	: "r" (ptep), "r" (clr), "m" (*ptep), "i" (_PAGE_BUSY), "r" (set)
+	: "cc" );
+	/* huge pages use the old page table lock */
+	if (!huge)
+		assert_pte_locked(mm, addr);
+
+	if (old & _PAGE_HASHPTE)
+		hpte_need_flush(mm, addr, ptep, old, huge);
+
+	return old;
+}
+
+static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
+					      unsigned long addr, pte_t *ptep)
+{
+	unsigned long old;
+
+	if ((pte_val(*ptep) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
+		return 0;
+	old = pte_update(mm, addr, ptep, _PAGE_ACCESSED, 0, 0);
+	return (old & _PAGE_ACCESSED) != 0;
+}
+#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
+#define ptep_test_and_clear_young(__vma, __addr, __ptep)		   \
+({									   \
+	int __r;							   \
+	__r = __ptep_test_and_clear_young((__vma)->vm_mm, __addr, __ptep); \
+	__r;								   \
+})
+
+#define __HAVE_ARCH_PTEP_SET_WRPROTECT
+static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
+				      pte_t *ptep)
+{
+
+	if ((pte_val(*ptep) & _PAGE_RW) == 0)
+		return;
+
+	pte_update(mm, addr, ptep, _PAGE_RW, 0, 0);
+}
+
+static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
+					   unsigned long addr, pte_t *ptep)
+{
+	if ((pte_val(*ptep) & _PAGE_RW) == 0)
+		return;
+
+	pte_update(mm, addr, ptep, _PAGE_RW, 0, 1);
+}
+
+/*
+ * We currently remove entries from the hashtable regardless of whether
+ * the entry was young or dirty. The generic routines only flush if the
+ * entry was young or dirty which is not good enough.
+ *
+ * We should be more intelligent about this but for the moment we override
+ * these functions and force a tlb flush unconditionally
+ */
+#define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
+#define ptep_clear_flush_young(__vma, __address, __ptep)		\
+({									\
+	int __young = __ptep_test_and_clear_young((__vma)->vm_mm, __address, \
+						  __ptep);		\
+	__young;							\
+})
+
+#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
+static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
+				       unsigned long addr, pte_t *ptep)
+{
+	unsigned long old = pte_update(mm, addr, ptep, ~0UL, 0, 0);
+	return __pte(old);
+}
+
+static inline void pte_clear(struct mm_struct *mm, unsigned long addr,
+			     pte_t * ptep)
+{
+	pte_update(mm, addr, ptep, ~0UL, 0, 0);
+}
+
+
+/* Set the dirty and/or accessed bits atomically in a linux PTE; this
+ * function doesn't need to flush the hash entry
+ */
+static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
+{
+	unsigned long bits = pte_val(entry) &
+		(_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC);
+
+	unsigned long old, tmp;
+
+	__asm__ __volatile__(
+	"1:	ldarx	%0,0,%4\n\
+		andi.	%1,%0,%6\n\
+		bne-	1b \n\
+		or	%0,%3,%0\n\
+		stdcx.	%0,0,%4\n\
+		bne-	1b"
+	:"=&r" (old), "=&r" (tmp), "=m" (*ptep)
+	:"r" (bits), "r" (ptep), "m" (*ptep), "i" (_PAGE_BUSY)
+	:"cc");
+}
+
+#define __HAVE_ARCH_PTE_SAME
+#define pte_same(A,B)	(((pte_val(A) ^ pte_val(B)) & ~_PAGE_HPTEFLAGS) == 0)
+
+static inline char *get_hpte_slot_array(pmd_t *pmdp)
+{
+	/*
+	 * The hpte hindex is stored in the pgtable whose address is in the
+	 * second half of the PMD
+	 *
+	 * Order this load with the test for pmd_trans_huge in the caller
+	 */
+	smp_rmb();
+	return *(char **)(pmdp + PTRS_PER_PMD);
+
+
+}
+/*
+ * The linux hugepage PMD now includes the pmd entries followed by the address
+ * of the stashed pgtable_t. The stashed pgtable_t contains the hpte bits.
+ * [ 1 bit secondary | 3 bit hidx | 1 bit valid | 000]. We use one byte per
+ * each HPTE entry. With 16MB hugepage and 64K HPTE we need 256 entries and
+ * with 4K HPTE we need 4096 entries. Both will fit in a 4K pgtable_t.
+ *
+ * The last three bits are intentionally left as zero. These memory locations
+ * are also used as normal page PTE pointers. So if we have any pointers
+ * left around while we collapse a hugepage, we need to make sure the
+ * _PAGE_PRESENT bit of those is zero when we look at them.
+ */
+static inline unsigned int hpte_valid(unsigned char *hpte_slot_array, int index)
+{
+	return (hpte_slot_array[index] >> 3) & 0x1;
+}
+
+static inline unsigned int hpte_hash_index(unsigned char *hpte_slot_array,
+					   int index)
+{
+	return hpte_slot_array[index] >> 4;
+}
+
+static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
+					unsigned int index, unsigned int hidx)
+{
+	hpte_slot_array[index] = hidx << 4 | 0x1 << 3;
+}
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+/*
+ *
+ * For core kernel code by design pmd_trans_huge is never run on any hugetlbfs
+ * page. The hugetlbfs page table walking and mangling paths are totally
+ * separated from the core VM paths and they're differentiated by
+ *  VM_HUGETLB being set on vm_flags well before any pmd_trans_huge could run.
+ *
+ * pmd_trans_huge() is defined as false at build time if
+ * CONFIG_TRANSPARENT_HUGEPAGE=n, so that code blocks guarded by it are
+ * optimized away in that case.
+ *
+ * For ppc64 we need to differentiate explicit hugepages from THP, because
+ * for THP we also track the subpage details at the pmd level. We don't do
+ * that for explicit huge pages.
+ *
+ */
+static inline int pmd_trans_huge(pmd_t pmd)
+{
+	/*
+	 * leaf pte for huge page, bottom two bits != 00
+	 */
+	return (pmd_val(pmd) & 0x3) && (pmd_val(pmd) & _PAGE_THP_HUGE);
+}
+
+static inline int pmd_trans_splitting(pmd_t pmd)
+{
+	if (pmd_trans_huge(pmd))
+		return pmd_val(pmd) & _PAGE_SPLITTING;
+	return 0;
+}
+
+#endif
+static inline int pmd_large(pmd_t pmd)
+{
+	/*
+	 * leaf pte for huge page, bottom two bits != 00
+	 */
+	return ((pmd_val(pmd) & 0x3) != 0x0);
+}
+
+static inline pmd_t pmd_mknotpresent(pmd_t pmd)
+{
+	return __pmd(pmd_val(pmd) & ~_PAGE_PRESENT);
+}
+
+static inline pmd_t pmd_mksplitting(pmd_t pmd)
+{
+	return __pmd(pmd_val(pmd) | _PAGE_SPLITTING);
+}
+
+#define __HAVE_ARCH_PMD_SAME
+static inline int pmd_same(pmd_t pmd_a, pmd_t pmd_b)
+{
+	return (((pmd_val(pmd_a) ^ pmd_val(pmd_b)) & ~_PAGE_HPTEFLAGS) == 0);
+}
+
+static inline int __pmdp_test_and_clear_young(struct mm_struct *mm,
+					      unsigned long addr, pmd_t *pmdp)
+{
+	unsigned long old;
+
+	if ((pmd_val(*pmdp) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
+		return 0;
+	old = pmd_hugepage_update(mm, addr, pmdp, _PAGE_ACCESSED, 0);
+	return ((old & _PAGE_ACCESSED) != 0);
+}
+
+#define __HAVE_ARCH_PMDP_SET_WRPROTECT
+static inline void pmdp_set_wrprotect(struct mm_struct *mm, unsigned long addr,
+				      pmd_t *pmdp)
+{
+
+	if ((pmd_val(*pmdp) & _PAGE_RW) == 0)
+		return;
+
+	pmd_hugepage_update(mm, addr, pmdp, _PAGE_RW, 0);
+}
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+extern void hpte_do_hugepage_flush(struct mm_struct *mm, unsigned long addr,
+				   pmd_t *pmdp, unsigned long old_pmd);
+#else
+static inline void hpte_do_hugepage_flush(struct mm_struct *mm,
+					  unsigned long addr, pmd_t *pmdp,
+					  unsigned long old_pmd)
+{
+	WARN(1, "%s called with THP disabled\n", __func__);
+}
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
+#endif /* !__ASSEMBLY__ */
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_BOOK3S_64_HASH_H */
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 09c6474f89c7..aac630b4a15e 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -8,32 +8,6 @@
 #include <asm/book3s/64/hash.h>
 #include <asm/barrier.h>
 
-
-/*
- * Size of EA range mapped by our pagetables.
- */
-#define PGTABLE_EADDR_SIZE (PTE_INDEX_SIZE + PMD_INDEX_SIZE + \
-			    PUD_INDEX_SIZE + PGD_INDEX_SIZE + PAGE_SHIFT)
-#define PGTABLE_RANGE (ASM_CONST(1) << PGTABLE_EADDR_SIZE)
-
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-#define PMD_CACHE_INDEX	(PMD_INDEX_SIZE + 1)
-#else
-#define PMD_CACHE_INDEX	PMD_INDEX_SIZE
-#endif
-/*
- * Define the address range of the kernel non-linear virtual area
- */
-#define KERN_VIRT_START ASM_CONST(0xD000000000000000)
-#define KERN_VIRT_SIZE	ASM_CONST(0x0000100000000000)
-/*
- * The vmalloc space starts at the beginning of that region, and
- * occupies half of it on hash CPUs and a quarter of it on Book3E
- * (we keep a quarter for the virtual memmap)
- */
-#define VMALLOC_START	KERN_VIRT_START
-#define VMALLOC_SIZE	(KERN_VIRT_SIZE >> 1)
-#define VMALLOC_END	(VMALLOC_START + VMALLOC_SIZE)
 /*
  * The second half of the kernel virtual space is used for IO mappings,
  * it's itself carved into the PIO region (ISA and PHB IO space) and
@@ -52,146 +26,9 @@
 #define IOREMAP_BASE	(PHB_IO_END)
 #define IOREMAP_END	(KERN_VIRT_START + KERN_VIRT_SIZE)
 
-/*
- * Region IDs
- */
-#define REGION_SHIFT		60UL
-#define REGION_MASK		(0xfUL << REGION_SHIFT)
-#define REGION_ID(ea)		(((unsigned long)(ea)) >> REGION_SHIFT)
-
-#define VMALLOC_REGION_ID	(REGION_ID(VMALLOC_START))
-#define KERNEL_REGION_ID	(REGION_ID(PAGE_OFFSET))
-#define VMEMMAP_REGION_ID	(0xfUL)	/* Server only */
-#define USER_REGION_ID		(0UL)
-
-/*
- * Defines the address of the vmemap area, in its own region on
- * hash table CPUs.
- */
-#define VMEMMAP_BASE		(VMEMMAP_REGION_ID << REGION_SHIFT)
 #define vmemmap			((struct page *)VMEMMAP_BASE)
 
-
-#ifdef CONFIG_PPC_MM_SLICES
-#define HAVE_ARCH_UNMAPPED_AREA
-#define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
-#endif /* CONFIG_PPC_MM_SLICES */
-
-/*
- * THP pages can't be special. So use the _PAGE_SPECIAL
- */
-#define _PAGE_SPLITTING _PAGE_SPECIAL
-
-/*
- * We need to differentiate between explicit huge page and THP huge
- * page, since THP huge page also need to track real subpage details
- */
-#define _PAGE_THP_HUGE  _PAGE_4K_PFN
-
-/*
- * set of bits not changed in pmd_modify.
- */
-#define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS |		\
-			 _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPLITTING | \
-			 _PAGE_THP_HUGE)
-#define _PTE_NONE_MASK	_PAGE_HPTEFLAGS
-/*
- * The mask covered by the RPN must be a ULL on 32-bit platforms with
- * 64-bit PTEs
- */
-#define PTE_RPN_MASK	(~((1UL << PTE_RPN_SHIFT) - 1))
-/*
- * _PAGE_CHG_MASK masks the bits that are to be preserved across
- * pgprot changes
- */
-#define _PAGE_CHG_MASK	(PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \
-			 _PAGE_ACCESSED | _PAGE_SPECIAL)
-/*
- * Mask of bits returned by pte_pgprot()
- */
-#define PAGE_PROT_BITS	(_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \
-			 _PAGE_WRITETHRU | _PAGE_4K_PFN | \
-			 _PAGE_USER | _PAGE_ACCESSED |  \
-			 _PAGE_RW |  _PAGE_DIRTY | _PAGE_EXEC)
-/*
- * We define 2 sets of base prot bits, one for basic pages (ie,
- * cacheable kernel and user pages) and one for non cacheable
- * pages. We always set _PAGE_COHERENT when SMP is enabled or
- * the processor might need it for DMA coherency.
- */
-#define _PAGE_BASE_NC	(_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_PSIZE)
-#define _PAGE_BASE	(_PAGE_BASE_NC | _PAGE_COHERENT)
-
-/* Permission masks used to generate the __P and __S table,
- *
- * Note:__pgprot is defined in arch/powerpc/include/asm/page.h
- *
- * Write permissions imply read permissions for now (we could make write-only
- * pages on BookE but we don't bother for now). Execute permission control is
- * possible on platforms that define _PAGE_EXEC
- *
- * Note due to the way vm flags are laid out, the bits are XWR
- */
-#define PAGE_NONE	__pgprot(_PAGE_BASE)
-#define PAGE_SHARED	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW)
-#define PAGE_SHARED_X	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW | \
-				 _PAGE_EXEC)
-#define PAGE_COPY	__pgprot(_PAGE_BASE | _PAGE_USER )
-#define PAGE_COPY_X	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
-#define PAGE_READONLY	__pgprot(_PAGE_BASE | _PAGE_USER )
-#define PAGE_READONLY_X	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
-
-#define __P000	PAGE_NONE
-#define __P001	PAGE_READONLY
-#define __P010	PAGE_COPY
-#define __P011	PAGE_COPY
-#define __P100	PAGE_READONLY_X
-#define __P101	PAGE_READONLY_X
-#define __P110	PAGE_COPY_X
-#define __P111	PAGE_COPY_X
-
-#define __S000	PAGE_NONE
-#define __S001	PAGE_READONLY
-#define __S010	PAGE_SHARED
-#define __S011	PAGE_SHARED
-#define __S100	PAGE_READONLY_X
-#define __S101	PAGE_READONLY_X
-#define __S110	PAGE_SHARED_X
-#define __S111	PAGE_SHARED_X
-
-/* Permission masks used for kernel mappings */
-#define PAGE_KERNEL	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)
-#define PAGE_KERNEL_NC	__pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | \
-				 _PAGE_NO_CACHE)
-#define PAGE_KERNEL_NCG	__pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | \
-				 _PAGE_NO_CACHE | _PAGE_GUARDED)
-#define PAGE_KERNEL_X	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RWX)
-#define PAGE_KERNEL_RO	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RO)
-#define PAGE_KERNEL_ROX	__pgprot(_PAGE_BASE | _PAGE_KERNEL_ROX)
-
-/* Protection used for kernel text. We want the debuggers to be able to
- * set breakpoints anywhere, so don't write protect the kernel text
- * on platforms where such control is possible.
- */
-#if defined(CONFIG_KGDB) || defined(CONFIG_XMON) || defined(CONFIG_BDI_SWITCH) ||\
-	defined(CONFIG_KPROBES) || defined(CONFIG_DYNAMIC_FTRACE)
-#define PAGE_KERNEL_TEXT	PAGE_KERNEL_X
-#else
-#define PAGE_KERNEL_TEXT	PAGE_KERNEL_ROX
-#endif
-
-/* Make modules code happy. We don't set RO yet */
-#define PAGE_KERNEL_EXEC	PAGE_KERNEL_X
-
-/*
- * Don't just check for any non zero bits in __PAGE_USER, since for book3e
- * and PTE_64BIT, PAGE_KERNEL_X contains _PAGE_BAP_SR which is also in
- * _PAGE_USER.  Need to explicitly match _PAGE_BAP_UR bit in that case too.
- */
-#define pte_user(val)		((val & _PAGE_USER) == _PAGE_USER)
-
 /* Advertise special mapping type for AGP */
-#define PAGE_AGP		(PAGE_KERNEL_NC)
 #define HAVE_PAGE_AGP
 
 /* Advertise support for _PAGE_SPECIAL */
@@ -230,12 +67,6 @@
 
 #endif /* __real_pte */
 
-
-/* pte_clear moved to later in this file */
-
-#define PMD_BAD_BITS		(PTE_TABLE_SIZE-1)
-#define PUD_BAD_BITS		(PMD_TABLE_SIZE-1)
-
 static inline void pmd_set(pmd_t *pmdp, unsigned long val)
 {
 	*pmdp = __pmd(val);
@@ -246,13 +77,8 @@ static inline void pmd_clear(pmd_t *pmdp)
 	*pmdp = __pmd(0);
 }
 
-
 #define pmd_none(pmd)		(!pmd_val(pmd))
-#define	pmd_bad(pmd)		(!is_kernel_addr(pmd_val(pmd)) \
-				 || (pmd_val(pmd) & PMD_BAD_BITS))
 #define	pmd_present(pmd)	(!pmd_none(pmd))
-#define pmd_page_vaddr(pmd)	(pmd_val(pmd) & ~PMD_MASKED_BITS)
-extern struct page *pmd_page(pmd_t pmd);
 
 static inline void pud_set(pud_t *pudp, unsigned long val)
 {
@@ -265,13 +91,10 @@ static inline void pud_clear(pud_t *pudp)
 }
 
 #define pud_none(pud)		(!pud_val(pud))
-#define	pud_bad(pud)		(!is_kernel_addr(pud_val(pud)) \
-				 || (pud_val(pud) & PUD_BAD_BITS))
 #define pud_present(pud)	(pud_val(pud) != 0)
-#define pud_page_vaddr(pud)	(pud_val(pud) & ~PUD_MASKED_BITS)
 
 extern struct page *pud_page(pud_t pud);
-
+extern struct page *pmd_page(pmd_t pmd);
 static inline pte_t pud_pte(pud_t pud)
 {
 	return __pte(pud_val(pud));
@@ -292,15 +115,14 @@ static inline void pgd_set(pgd_t *pgdp, unsigned long val)
  * Find an entry in a page-table-directory.  We combine the address region
  * (the high order N bits) and the pgd portion of the address.
  */
-#define pgd_index(address) (((address) >> (PGDIR_SHIFT)) & (PTRS_PER_PGD - 1))
 
 #define pgd_offset(mm, address)	 ((mm)->pgd + pgd_index(address))
 
 #define pmd_offset(pudp,addr) \
-  (((pmd_t *) pud_page_vaddr(*(pudp))) + (((addr) >> PMD_SHIFT) & (PTRS_PER_PMD - 1)))
+	(((pmd_t *) pud_page_vaddr(*(pudp))) + pmd_index(addr))
 
 #define pte_offset_kernel(dir,addr) \
-  (((pte_t *) pmd_page_vaddr(*(dir))) + (((addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)))
+	(((pte_t *) pmd_page_vaddr(*(dir))) + pte_index(addr))
 
 #define pte_offset_map(dir,addr)	pte_offset_kernel((dir), (addr))
 #define pte_unmap(pte)			do { } while(0)
@@ -308,132 +130,6 @@ static inline void pgd_set(pgd_t *pgdp, unsigned long val)
 /* to find an entry in a kernel page-table-directory */
 /* This now only contains the vmalloc pages */
 #define pgd_offset_k(address) pgd_offset(&init_mm, address)
-extern void hpte_need_flush(struct mm_struct *mm, unsigned long addr,
-			    pte_t *ptep, unsigned long pte, int huge);
-
-/* Atomic PTE updates */
-static inline unsigned long pte_update(struct mm_struct *mm,
-				       unsigned long addr,
-				       pte_t *ptep, unsigned long clr,
-				       unsigned long set,
-				       int huge)
-{
-	unsigned long old, tmp;
-
-	__asm__ __volatile__(
-	"1:	ldarx	%0,0,%3		# pte_update\n\
-	andi.	%1,%0,%6\n\
-	bne-	1b \n\
-	andc	%1,%0,%4 \n\
-	or	%1,%1,%7\n\
-	stdcx.	%1,0,%3 \n\
-	bne-	1b"
-	: "=&r" (old), "=&r" (tmp), "=m" (*ptep)
-	: "r" (ptep), "r" (clr), "m" (*ptep), "i" (_PAGE_BUSY), "r" (set)
-	: "cc" );
-	/* huge pages use the old page table lock */
-	if (!huge)
-		assert_pte_locked(mm, addr);
-
-	if (old & _PAGE_HASHPTE)
-		hpte_need_flush(mm, addr, ptep, old, huge);
-
-	return old;
-}
-
-static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
-					      unsigned long addr, pte_t *ptep)
-{
-	unsigned long old;
-
-	if ((pte_val(*ptep) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
-		return 0;
-	old = pte_update(mm, addr, ptep, _PAGE_ACCESSED, 0, 0);
-	return (old & _PAGE_ACCESSED) != 0;
-}
-#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
-#define ptep_test_and_clear_young(__vma, __addr, __ptep)		   \
-({									   \
-	int __r;							   \
-	__r = __ptep_test_and_clear_young((__vma)->vm_mm, __addr, __ptep); \
-	__r;								   \
-})
-
-#define __HAVE_ARCH_PTEP_SET_WRPROTECT
-static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
-				      pte_t *ptep)
-{
-
-	if ((pte_val(*ptep) & _PAGE_RW) == 0)
-		return;
-
-	pte_update(mm, addr, ptep, _PAGE_RW, 0, 0);
-}
-
-static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
-					   unsigned long addr, pte_t *ptep)
-{
-	if ((pte_val(*ptep) & _PAGE_RW) == 0)
-		return;
-
-	pte_update(mm, addr, ptep, _PAGE_RW, 0, 1);
-}
-
-/*
- * We currently remove entries from the hashtable regardless of whether
- * the entry was young or dirty. The generic routines only flush if the
- * entry was young or dirty which is not good enough.
- *
- * We should be more intelligent about this but for the moment we override
- * these functions and force a tlb flush unconditionally
- */
-#define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
-#define ptep_clear_flush_young(__vma, __address, __ptep)		\
-({									\
-	int __young = __ptep_test_and_clear_young((__vma)->vm_mm, __address, \
-						  __ptep);		\
-	__young;							\
-})
-
-#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
-static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
-				       unsigned long addr, pte_t *ptep)
-{
-	unsigned long old = pte_update(mm, addr, ptep, ~0UL, 0, 0);
-	return __pte(old);
-}
-
-static inline void pte_clear(struct mm_struct *mm, unsigned long addr,
-			     pte_t * ptep)
-{
-	pte_update(mm, addr, ptep, ~0UL, 0, 0);
-}
-
-
-/* Set the dirty and/or accessed bits atomically in a linux PTE, this
- * function doesn't need to flush the hash entry
- */
-static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
-{
-	unsigned long bits = pte_val(entry) &
-		(_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC);
-
-	unsigned long old, tmp;
-
-	__asm__ __volatile__(
-	"1:	ldarx	%0,0,%4\n\
-		andi.	%1,%0,%6\n\
-		bne-	1b \n\
-		or	%0,%3,%0\n\
-		stdcx.	%0,0,%4\n\
-		bne-	1b"
-	:"=&r" (old), "=&r" (tmp), "=m" (*ptep)
-	:"r" (bits), "r" (ptep), "m" (*ptep), "i" (_PAGE_BUSY)
-	:"cc");
-}
-
-#define __HAVE_ARCH_PTE_SAME
-#define pte_same(A,B)	(((pte_val(A) ^ pte_val(B)) & ~_PAGE_HPTEFLAGS) == 0)
 
 #define pte_ERROR(e) \
 	pr_err("%s:%d: bad pte %08lx.\n", __FILE__, __LINE__, pte_val(e))
@@ -468,54 +164,9 @@ static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
 void pgtable_cache_add(unsigned shift, void (*ctor)(void *));
 void pgtable_cache_init(void);
 
-/*
- * The linux hugepage PMD now include the pmd entries followed by the address
- * to the stashed pgtable_t. The stashed pgtable_t contains the hpte bits.
- * [ 1 bit secondary | 3 bit hidx | 1 bit valid | 000]. We use one byte for
- * each HPTE entry.
- * with 4K HPTE we need 4096 entries. Both will fit in a 4K pgtable_t.
- *
- * The last three bits are intentionally left as zero. These memory locations
- * are also used as normal page PTE pointers. So if we have any pointers
- * left around while we collapse a hugepage, we need to make sure the
- * _PAGE_PRESENT bit of those is zero when we look at them
- */
-static inline unsigned int hpte_valid(unsigned char *hpte_slot_array, int index)
-{
-	return (hpte_slot_array[index] >> 3) & 0x1;
-}
-
-static inline unsigned int hpte_hash_index(unsigned char *hpte_slot_array,
-					   int index)
-{
-	return hpte_slot_array[index] >> 4;
-}
-
-static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
-					unsigned int index, unsigned int hidx)
-{
-	hpte_slot_array[index] = hidx << 4 | 0x1 << 3;
-}
-
 struct page *realmode_pfn_to_page(unsigned long pfn);
 
-static inline char *get_hpte_slot_array(pmd_t *pmdp)
-{
-	/*
-	 * The hpte hindex is stored in the pgtable whose address is in the
-	 * second half of the PMD
-	 *
-	 * Order this load with the test for pmd_trans_huge in the caller
-	 */
-	smp_rmb();
-	return *(char **)(pmdp + PTRS_PER_PMD);
-
-
-}
-
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-extern void hpte_do_hugepage_flush(struct mm_struct *mm, unsigned long addr,
-				   pmd_t *pmdp, unsigned long old_pmd);
 extern pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot);
 extern pmd_t mk_pmd(struct page *page, pgprot_t pgprot);
 extern pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot);
@@ -523,55 +174,9 @@ extern void set_pmd_at(struct mm_struct *mm, unsigned long addr,
 		       pmd_t *pmdp, pmd_t pmd);
 extern void update_mmu_cache_pmd(struct vm_area_struct *vma, unsigned long addr,
 				 pmd_t *pmd);
-/*
- *
- * For core kernel code by design pmd_trans_huge is never run on any hugetlbfs
- * page. The hugetlbfs page table walking and mangling paths are totally
- * separated from the core VM paths and they're differentiated by
- *  VM_HUGETLB being set on vm_flags well before any pmd_trans_huge could run.
- *
- * pmd_trans_huge() is defined as false at build time if
- * CONFIG_TRANSPARENT_HUGEPAGE=n to optimize away code blocks at build
- * time in such case.
- *
- * For ppc64 we need to differentiate explicit hugepages from THP, because
- * for THP we also track the subpage details at the pmd level. We don't do
- * that for explicit huge pages.
- *
- */
-static inline int pmd_trans_huge(pmd_t pmd)
-{
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return (pmd_val(pmd) & 0x3) && (pmd_val(pmd) & _PAGE_THP_HUGE);
-}
-
-static inline int pmd_trans_splitting(pmd_t pmd)
-{
-	if (pmd_trans_huge(pmd))
-		return pmd_val(pmd) & _PAGE_SPLITTING;
-	return 0;
-}
-
 extern int has_transparent_hugepage(void);
-#else
-static inline void hpte_do_hugepage_flush(struct mm_struct *mm,
-					  unsigned long addr, pmd_t *pmdp,
-					  unsigned long old_pmd)
-{
-
-	WARN(1, "%s called with THP disabled\n", __func__);
-}
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
-static inline int pmd_large(pmd_t pmd)
-{
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return ((pmd_val(pmd) & 0x3) != 0x0);
-}
 
 static inline pte_t pmd_pte(pmd_t pmd)
 {
@@ -606,44 +211,11 @@ static inline pmd_t pmd_mkhuge(pmd_t pmd)
 	return pmd;
 }
 
-static inline pmd_t pmd_mknotpresent(pmd_t pmd)
-{
-	return __pmd(pmd_val(pmd) & ~_PAGE_PRESENT);
-}
-
-static inline pmd_t pmd_mksplitting(pmd_t pmd)
-{
-	return __pmd(pmd_val(pmd) | _PAGE_SPLITTING);
-}
-
-#define __HAVE_ARCH_PMD_SAME
-static inline int pmd_same(pmd_t pmd_a, pmd_t pmd_b)
-{
-	return (((pmd_val(pmd_a) ^ pmd_val(pmd_b)) & ~_PAGE_HPTEFLAGS) == 0);
-}
-
 #define __HAVE_ARCH_PMDP_SET_ACCESS_FLAGS
 extern int pmdp_set_access_flags(struct vm_area_struct *vma,
 				 unsigned long address, pmd_t *pmdp,
 				 pmd_t entry, int dirty);
 
-extern unsigned long pmd_hugepage_update(struct mm_struct *mm,
-					 unsigned long addr,
-					 pmd_t *pmdp,
-					 unsigned long clr,
-					 unsigned long set);
-
-static inline int __pmdp_test_and_clear_young(struct mm_struct *mm,
-					      unsigned long addr, pmd_t *pmdp)
-{
-	unsigned long old;
-
-	if ((pmd_val(*pmdp) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
-		return 0;
-	old = pmd_hugepage_update(mm, addr, pmdp, _PAGE_ACCESSED, 0);
-	return ((old & _PAGE_ACCESSED) != 0);
-}
-
 #define __HAVE_ARCH_PMDP_TEST_AND_CLEAR_YOUNG
 extern int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 				     unsigned long address, pmd_t *pmdp);
@@ -655,17 +227,6 @@ extern int pmdp_clear_flush_young(struct vm_area_struct *vma,
 extern pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
 				     unsigned long addr, pmd_t *pmdp);
 
-#define __HAVE_ARCH_PMDP_SET_WRPROTECT
-static inline void pmdp_set_wrprotect(struct mm_struct *mm, unsigned long addr,
-				      pmd_t *pmdp)
-{
-
-	if ((pmd_val(*pmdp) & _PAGE_RW) == 0)
-		return;
-
-	pmd_hugepage_update(mm, addr, pmdp, _PAGE_RW, 0);
-}
-
 #define __HAVE_ARCH_PMDP_SPLITTING_FLUSH
 extern void pmdp_splitting_flush(struct vm_area_struct *vma,
 				 unsigned long address, pmd_t *pmdp);
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index a27b8cef51d7..8f7338678fdc 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -18,12 +18,6 @@ struct mm_struct;
 #include <asm/pgtable-book3e.h>
 #endif /* !CONFIG_PPC_BOOK3S */
 
-/*
- * We save the slot number & secondary bit in the second half of the
- * PTE page. We use the 8 bytes per each pte entry.
- */
-#define PTE_PAGE_HIDX_OFFSET (PTRS_PER_PTE * 8)
-
 #ifndef __ASSEMBLY__
 
 #include <asm/tlbflush.h>
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread
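
The hpte_slot_array encoding removed above packs one byte per 4K subpage:
bit 3 is the valid bit, bits 4-7 carry the secondary bit plus the 3-bit
hash group index, and the low three bits stay zero so a stale pointer
stored at the same location can never look _PAGE_PRESENT. A standalone
sketch of that byte layout (illustrative only, not the kernel code; the
test value is made up):

  #include <stdio.h>

  /* One byte per 4K subpage: [ secondary | hidx:3 | valid | 0 0 0 ] */
  static unsigned char mark_slot_valid(unsigned int hidx)
  {
          return (unsigned char)(hidx << 4 | 0x1 << 3);
  }

  static unsigned int slot_valid(unsigned char slot)
  {
          return (slot >> 3) & 0x1;
  }

  static unsigned int slot_hidx(unsigned char slot)
  {
          return slot >> 4;       /* secondary bit + 3-bit group index */
  }

  int main(void)
  {
          /* 0xa = secondary bit set, group index 2 (made-up value) */
          unsigned char slot = mark_slot_valid(0xa);

          printf("valid=%u hidx=%#x low-bits=%#x\n",
                 slot_valid(slot), slot_hidx(slot), slot & 0x7);
          return 0;
  }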

* [PATCH V6 12/35] powerpc/mm: Move PTE bits from generic functions to hash64 functions.
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (10 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 11/35] powerpc/mm: Move hash64 PTE bits from book3s/64/pgtable.h to hash.h Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 13/35] powerpc/booke: Move nohash headers (part 1) Aneesh Kumar K.V
                   ` (22 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

Functions that operate on PTE bits are moved to hash*.h, and other
generic functions are moved to pgtable.h.

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
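
The moved accessors and modifiers (pte_write(), pte_wrprotect(),
pte_mkdirty(), ...) all share one pure-function shape: read or rewrite
bits of the PTE value and return a fresh pte_t. A self-contained sketch
of that shape (not kernel code; the bit values below are invented, the
real ones live in the hash.h and pte-*.h headers):

  #include <stdio.h>

  /* Invented bit values, for illustration only */
  #define _PAGE_RW        0x004UL
  #define _PAGE_DIRTY     0x080UL

  typedef struct { unsigned long val; } pte_t;

  static pte_t __pte(unsigned long v)       { pte_t p = { v }; return p; }
  static unsigned long pte_val(pte_t p)     { return p.val; }

  static int pte_write(pte_t pte)           { return !!(pte_val(pte) & _PAGE_RW); }
  static pte_t pte_wrprotect(pte_t pte)     { return __pte(pte_val(pte) & ~_PAGE_RW); }
  static pte_t pte_mkdirty(pte_t pte)       { return __pte(pte_val(pte) | _PAGE_DIRTY); }

  int main(void)
  {
          pte_t pte = __pte(_PAGE_RW);

          pte = pte_mkdirty(pte_wrprotect(pte));
          printf("writable=%d dirty=%d\n",
                 pte_write(pte), !!(pte_val(pte) & _PAGE_DIRTY)); /* 0 1 */
          return 0;
  }
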
 arch/powerpc/include/asm/book3s/32/pgtable.h | 183 ++++++++++++++++++++++++
 arch/powerpc/include/asm/book3s/64/hash.h    | 151 ++++++++++++++++++++
 arch/powerpc/include/asm/book3s/64/pgtable.h |   6 +
 arch/powerpc/include/asm/book3s/pgtable.h    | 204 ---------------------------
 4 files changed, 340 insertions(+), 204 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 226f29d39332..38b33dcfcc9d 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -294,6 +294,189 @@ void pgtable_cache_init(void);
 extern int get_pteptr(struct mm_struct *mm, unsigned long addr, pte_t **ptep,
 		      pmd_t **pmdp);
 
+/* Generic accessors to PTE bits */
+static inline int pte_write(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_RW);}
+static inline int pte_dirty(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_DIRTY); }
+static inline int pte_young(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_ACCESSED); }
+static inline int pte_special(pte_t pte)	{ return !!(pte_val(pte) & _PAGE_SPECIAL); }
+static inline int pte_none(pte_t pte)		{ return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; }
+static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PAGE_PROT_BITS); }
+
+static inline int pte_present(pte_t pte)
+{
+	return pte_val(pte) & _PAGE_PRESENT;
+}
+
+/* Conversion functions: convert a page and protection to a page entry,
+ * and a page entry and page directory to the page they refer to.
+ *
+ * Even if PTEs can be unsigned long long, a PFN is always an unsigned
+ * long for now.
+ */
+static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot)
+{
+	return __pte(((pte_basic_t)(pfn) << PTE_RPN_SHIFT) |
+		     pgprot_val(pgprot));
+}
+
+static inline unsigned long pte_pfn(pte_t pte)
+{
+	return pte_val(pte) >> PTE_RPN_SHIFT;
+}
+
+/* Generic modifiers for PTE bits */
+static inline pte_t pte_wrprotect(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~_PAGE_RW);
+}
+
+static inline pte_t pte_mkclean(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~_PAGE_DIRTY);
+}
+
+static inline pte_t pte_mkold(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~_PAGE_ACCESSED);
+}
+
+static inline pte_t pte_mkwrite(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_RW);
+}
+
+static inline pte_t pte_mkdirty(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_DIRTY);
+}
+
+static inline pte_t pte_mkyoung(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_ACCESSED);
+}
+
+static inline pte_t pte_mkspecial(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_SPECIAL);
+}
+
+static inline pte_t pte_mkhuge(pte_t pte)
+{
+	return pte;
+}
+
+static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
+{
+	return __pte((pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot));
+}
+
+
+
+/* This low level function performs the actual PTE insertion
+ * Setting the PTE depends on the MMU type and other factors. It's
+ * an horrible mess that I'm not going to try to clean up now but
+ * I'm keeping it in one place rather than spread around
+ */
+static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
+				pte_t *ptep, pte_t pte, int percpu)
+{
+#if defined(CONFIG_PPC_STD_MMU_32) && defined(CONFIG_SMP) && !defined(CONFIG_PTE_64BIT)
+	/* First case is 32-bit Hash MMU in SMP mode with 32-bit PTEs. We use the
+	 * helper pte_update() which does an atomic update. We need to do that
+	 * because a concurrent invalidation can clear _PAGE_HASHPTE. If it's a
+	 * per-CPU PTE such as a kmap_atomic, we do a simple update preserving
+	 * the hash bits instead (ie, same as the non-SMP case)
+	 */
+	if (percpu)
+		*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
+			      | (pte_val(pte) & ~_PAGE_HASHPTE));
+	else
+		pte_update(ptep, ~_PAGE_HASHPTE, pte_val(pte));
+
+#elif defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT)
+	/* Second case is 32-bit with 64-bit PTE.  In this case, we
+	 * can just store as long as we do the two halves in the right order
+	 * with a barrier in between. This is possible because we take care,
+	 * in the hash code, to pre-invalidate if the PTE was already hashed,
+	 * which synchronizes us with any concurrent invalidation.
+	 * In the percpu case, we also fallback to the simple update preserving
+	 * the hash bits
+	 */
+	if (percpu) {
+		*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
+			      | (pte_val(pte) & ~_PAGE_HASHPTE));
+		return;
+	}
+	if (pte_val(*ptep) & _PAGE_HASHPTE)
+		flush_hash_entry(mm, ptep, addr);
+	__asm__ __volatile__("\
+		stw%U0%X0 %2,%0\n\
+		eieio\n\
+		stw%U0%X0 %L2,%1"
+	: "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
+	: "r" (pte) : "memory");
+
+#elif defined(CONFIG_PPC_STD_MMU_32)
+	/* Third case is 32-bit hash table in UP mode, we need to preserve
+	 * the _PAGE_HASHPTE bit since we may not have invalidated the previous
+	 * translation in the hash yet (done in a subsequent flush_tlb_xxx())
+	 * and so we need to keep track that this PTE needs invalidating
+	 */
+	*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
+		      | (pte_val(pte) & ~_PAGE_HASHPTE));
+
+#else
+#error "Not supported "
+#endif
+}
+
+/*
+ * Macro to mark a page protection value as "uncacheable".
+ */
+
+#define _PAGE_CACHE_CTL	(_PAGE_COHERENT | _PAGE_GUARDED | _PAGE_NO_CACHE | \
+			 _PAGE_WRITETHRU)
+
+#define pgprot_noncached pgprot_noncached
+static inline pgprot_t pgprot_noncached(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_NO_CACHE | _PAGE_GUARDED);
+}
+
+#define pgprot_noncached_wc pgprot_noncached_wc
+static inline pgprot_t pgprot_noncached_wc(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_NO_CACHE);
+}
+
+#define pgprot_cached pgprot_cached
+static inline pgprot_t pgprot_cached(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_COHERENT);
+}
+
+#define pgprot_cached_wthru pgprot_cached_wthru
+static inline pgprot_t pgprot_cached_wthru(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_COHERENT | _PAGE_WRITETHRU);
+}
+
+#define pgprot_cached_noncoherent pgprot_cached_noncoherent
+static inline pgprot_t pgprot_cached_noncoherent(pgprot_t prot)
+{
+	return __pgprot(pgprot_val(prot) & ~_PAGE_CACHE_CTL);
+}
+
+#define pgprot_writecombine pgprot_writecombine
+static inline pgprot_t pgprot_writecombine(pgprot_t prot)
+{
+	return pgprot_noncached_wc(prot);
+}
+
 #endif /* !__ASSEMBLY__ */
 
 #endif /*  _ASM_POWERPC_BOOK3S_32_PGTABLE_H */
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 447b212649c8..48237e66e823 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -481,6 +481,157 @@ static inline void pmdp_set_wrprotect(struct mm_struct *mm, unsigned long addr,
 	pmd_hugepage_update(mm, addr, pmdp, _PAGE_RW, 0);
 }
 
+/* Generic accessors to PTE bits */
+static inline int pte_write(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_RW);}
+static inline int pte_dirty(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_DIRTY); }
+static inline int pte_young(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_ACCESSED); }
+static inline int pte_special(pte_t pte)	{ return !!(pte_val(pte) & _PAGE_SPECIAL); }
+static inline int pte_none(pte_t pte)		{ return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; }
+static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PAGE_PROT_BITS); }
+
+#ifdef CONFIG_NUMA_BALANCING
+/*
+ * These work without NUMA balancing but the kernel does not care. See the
+ * comment in include/asm-generic/pgtable.h . On powerpc, this will only
+ * work for user pages and always return true for kernel pages.
+ */
+static inline int pte_protnone(pte_t pte)
+{
+	return (pte_val(pte) &
+		(_PAGE_PRESENT | _PAGE_USER)) == _PAGE_PRESENT;
+}
+#endif /* CONFIG_NUMA_BALANCING */
+
+static inline int pte_present(pte_t pte)
+{
+	return pte_val(pte) & _PAGE_PRESENT;
+}
+
+/* Conversion functions: convert a page and protection to a page entry,
+ * and a page entry and page directory to the page they refer to.
+ *
+ * Even if PTEs can be unsigned long long, a PFN is always an unsigned
+ * long for now.
+ */
+static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot)
+{
+	return __pte(((pte_basic_t)(pfn) << PTE_RPN_SHIFT) |
+		     pgprot_val(pgprot));
+}
+
+static inline unsigned long pte_pfn(pte_t pte)
+{
+	return pte_val(pte) >> PTE_RPN_SHIFT;
+}
+
+/* Generic modifiers for PTE bits */
+static inline pte_t pte_wrprotect(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~_PAGE_RW);
+}
+
+static inline pte_t pte_mkclean(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~_PAGE_DIRTY);
+}
+
+static inline pte_t pte_mkold(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~_PAGE_ACCESSED);
+}
+
+static inline pte_t pte_mkwrite(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_RW);
+}
+
+static inline pte_t pte_mkdirty(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_DIRTY);
+}
+
+static inline pte_t pte_mkyoung(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_ACCESSED);
+}
+
+static inline pte_t pte_mkspecial(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_SPECIAL);
+}
+
+static inline pte_t pte_mkhuge(pte_t pte)
+{
+	return pte;
+}
+
+static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
+{
+	return __pte((pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot));
+}
+
+/* This low level function performs the actual PTE insertion
+ * Setting the PTE depends on the MMU type and other factors. It's
+ * an horrible mess that I'm not going to try to clean up now but
+ * I'm keeping it in one place rather than spread around
+ */
+static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
+				pte_t *ptep, pte_t pte, int percpu)
+{
+	/*
+	 * Anything else just stores the PTE normally. That covers all 64-bit
+	 * cases, and 32-bit non-hash with 32-bit PTEs.
+	 */
+	*ptep = pte;
+}
+
+/*
+ * Macro to mark a page protection value as "uncacheable".
+ */
+
+#define _PAGE_CACHE_CTL	(_PAGE_COHERENT | _PAGE_GUARDED | _PAGE_NO_CACHE | \
+			 _PAGE_WRITETHRU)
+
+#define pgprot_noncached pgprot_noncached
+static inline pgprot_t pgprot_noncached(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_NO_CACHE | _PAGE_GUARDED);
+}
+
+#define pgprot_noncached_wc pgprot_noncached_wc
+static inline pgprot_t pgprot_noncached_wc(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_NO_CACHE);
+}
+
+#define pgprot_cached pgprot_cached
+static inline pgprot_t pgprot_cached(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_COHERENT);
+}
+
+#define pgprot_cached_wthru pgprot_cached_wthru
+static inline pgprot_t pgprot_cached_wthru(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_COHERENT | _PAGE_WRITETHRU);
+}
+
+#define pgprot_cached_noncoherent pgprot_cached_noncoherent
+static inline pgprot_t pgprot_cached_noncoherent(pgprot_t prot)
+{
+	return __pgprot(pgprot_val(prot) & ~_PAGE_CACHE_CTL);
+}
+
+#define pgprot_writecombine pgprot_writecombine
+static inline pgprot_t pgprot_writecombine(pgprot_t prot)
+{
+	return pgprot_noncached_wc(prot);
+}
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 extern void hpte_do_hugepage_flush(struct mm_struct *mm, unsigned long addr,
 				   pmd_t *pmdp, unsigned long old_pmd);
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index aac630b4a15e..f2ace2cac7bb 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -201,6 +201,12 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
 #define pmd_mkdirty(pmd)	pte_pmd(pte_mkdirty(pmd_pte(pmd)))
 #define pmd_mkyoung(pmd)	pte_pmd(pte_mkyoung(pmd_pte(pmd)))
 #define pmd_mkwrite(pmd)	pte_pmd(pte_mkwrite(pmd_pte(pmd)))
+#ifdef CONFIG_NUMA_BALANCING
+static inline int pmd_protnone(pmd_t pmd)
+{
+	return pte_protnone(pmd_pte(pmd));
+}
+#endif /* CONFIG_NUMA_BALANCING */
 
 #define __HAVE_ARCH_PMD_WRITE
 #define pmd_write(pmd)		pte_write(pmd_pte(pmd))
diff --git a/arch/powerpc/include/asm/book3s/pgtable.h b/arch/powerpc/include/asm/book3s/pgtable.h
index ebd6677ea017..8b0f4a29259a 100644
--- a/arch/powerpc/include/asm/book3s/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/pgtable.h
@@ -9,221 +9,17 @@
 
 #define FIRST_USER_ADDRESS	0UL
 #ifndef __ASSEMBLY__
-
-/* Generic accessors to PTE bits */
-static inline int pte_write(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_RW);}
-static inline int pte_dirty(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_DIRTY); }
-static inline int pte_young(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_ACCESSED); }
-static inline int pte_special(pte_t pte)	{ return !!(pte_val(pte) & _PAGE_SPECIAL); }
-static inline int pte_none(pte_t pte)		{ return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; }
-static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PAGE_PROT_BITS); }
-
-#ifdef CONFIG_NUMA_BALANCING
-/*
- * These work without NUMA balancing but the kernel does not care. See the
- * comment in include/asm-generic/pgtable.h . On powerpc, this will only
- * work for user pages and always return true for kernel pages.
- */
-static inline int pte_protnone(pte_t pte)
-{
-	return (pte_val(pte) &
-		(_PAGE_PRESENT | _PAGE_USER)) == _PAGE_PRESENT;
-}
-
-static inline int pmd_protnone(pmd_t pmd)
-{
-	return pte_protnone(pmd_pte(pmd));
-}
-#endif /* CONFIG_NUMA_BALANCING */
-
-static inline int pte_present(pte_t pte)
-{
-	return pte_val(pte) & _PAGE_PRESENT;
-}
-
-/* Conversion functions: convert a page and protection to a page entry,
- * and a page entry and page directory to the page they refer to.
- *
- * Even if PTEs can be unsigned long long, a PFN is always an unsigned
- * long for now.
- */
-static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot)
-{
-	return __pte(((pte_basic_t)(pfn) << PTE_RPN_SHIFT) |
-		     pgprot_val(pgprot));
-}
-
-static inline unsigned long pte_pfn(pte_t pte)
-{
-	return pte_val(pte) >> PTE_RPN_SHIFT;
-}
-
-/* Generic modifiers for PTE bits */
-static inline pte_t pte_wrprotect(pte_t pte)
-{
-	return __pte(pte_val(pte) & ~_PAGE_RW);
-}
-
-static inline pte_t pte_mkclean(pte_t pte)
-{
-	return __pte(pte_val(pte) & ~_PAGE_DIRTY);
-}
-
-static inline pte_t pte_mkold(pte_t pte)
-{
-	return __pte(pte_val(pte) & ~_PAGE_ACCESSED);
-}
-
-static inline pte_t pte_mkwrite(pte_t pte)
-{
-	return __pte(pte_val(pte) | _PAGE_RW);
-}
-
-static inline pte_t pte_mkdirty(pte_t pte)
-{
-	return __pte(pte_val(pte) | _PAGE_DIRTY);
-}
-
-static inline pte_t pte_mkyoung(pte_t pte)
-{
-	return __pte(pte_val(pte) | _PAGE_ACCESSED);
-}
-
-static inline pte_t pte_mkspecial(pte_t pte)
-{
-	return __pte(pte_val(pte) | _PAGE_SPECIAL);
-}
-
-static inline pte_t pte_mkhuge(pte_t pte)
-{
-	return pte;
-}
-
-static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
-{
-	return __pte((pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot));
-}
-
-
 /* Insert a PTE, top-level function is out of line. It uses an inline
  * low level function in the respective pgtable-* files
  */
 extern void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
 		       pte_t pte);
 
-/* This low level function performs the actual PTE insertion
- * Setting the PTE depends on the MMU type and other factors. It's
- * an horrible mess that I'm not going to try to clean up now but
- * I'm keeping it in one place rather than spread around
- */
-static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
-				pte_t *ptep, pte_t pte, int percpu)
-{
-#if defined(CONFIG_PPC_STD_MMU_32) && defined(CONFIG_SMP) && !defined(CONFIG_PTE_64BIT)
-	/* First case is 32-bit Hash MMU in SMP mode with 32-bit PTEs. We use the
-	 * helper pte_update() which does an atomic update. We need to do that
-	 * because a concurrent invalidation can clear _PAGE_HASHPTE. If it's a
-	 * per-CPU PTE such as a kmap_atomic, we do a simple update preserving
-	 * the hash bits instead (ie, same as the non-SMP case)
-	 */
-	if (percpu)
-		*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
-			      | (pte_val(pte) & ~_PAGE_HASHPTE));
-	else
-		pte_update(ptep, ~_PAGE_HASHPTE, pte_val(pte));
-
-#elif defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT)
-	/* Second case is 32-bit with 64-bit PTE.  In this case, we
-	 * can just store as long as we do the two halves in the right order
-	 * with a barrier in between. This is possible because we take care,
-	 * in the hash code, to pre-invalidate if the PTE was already hashed,
-	 * which synchronizes us with any concurrent invalidation.
-	 * In the percpu case, we also fallback to the simple update preserving
-	 * the hash bits
-	 */
-	if (percpu) {
-		*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
-			      | (pte_val(pte) & ~_PAGE_HASHPTE));
-		return;
-	}
-	if (pte_val(*ptep) & _PAGE_HASHPTE)
-		flush_hash_entry(mm, ptep, addr);
-	__asm__ __volatile__("\
-		stw%U0%X0 %2,%0\n\
-		eieio\n\
-		stw%U0%X0 %L2,%1"
-	: "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
-	: "r" (pte) : "memory");
-
-#elif defined(CONFIG_PPC_STD_MMU_32)
-	/* Third case is 32-bit hash table in UP mode, we need to preserve
-	 * the _PAGE_HASHPTE bit since we may not have invalidated the previous
-	 * translation in the hash yet (done in a subsequent flush_tlb_xxx())
-	 * and so we need to keep track that this PTE needs invalidating
-	 */
-	*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
-		      | (pte_val(pte) & ~_PAGE_HASHPTE));
-
-#else
-	/* Anything else just stores the PTE normally. That covers all 64-bit
-	 * cases, and 32-bit non-hash with 32-bit PTEs.
-	 */
-	*ptep = pte;
-#endif
-}
-
 
 #define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
 extern int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address,
 				 pte_t *ptep, pte_t entry, int dirty);
 
-/*
- * Macro to mark a page protection value as "uncacheable".
- */
-
-#define _PAGE_CACHE_CTL	(_PAGE_COHERENT | _PAGE_GUARDED | _PAGE_NO_CACHE | \
-			 _PAGE_WRITETHRU)
-
-#define pgprot_noncached pgprot_noncached
-static inline pgprot_t pgprot_noncached(pgprot_t prot)
-{
-	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
-			_PAGE_NO_CACHE | _PAGE_GUARDED);
-}
-
-#define pgprot_noncached_wc pgprot_noncached_wc
-static inline pgprot_t pgprot_noncached_wc(pgprot_t prot)
-{
-	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
-			_PAGE_NO_CACHE);
-}
-
-#define pgprot_cached pgprot_cached
-static inline pgprot_t pgprot_cached(pgprot_t prot)
-{
-	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
-			_PAGE_COHERENT);
-}
-
-#define pgprot_cached_wthru pgprot_cached_wthru
-static inline pgprot_t pgprot_cached_wthru(pgprot_t prot)
-{
-	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
-			_PAGE_COHERENT | _PAGE_WRITETHRU);
-}
-
-#define pgprot_cached_noncoherent pgprot_cached_noncoherent
-static inline pgprot_t pgprot_cached_noncoherent(pgprot_t prot)
-{
-	return __pgprot(pgprot_val(prot) & ~_PAGE_CACHE_CTL);
-}
-
-#define pgprot_writecombine pgprot_writecombine
-static inline pgprot_t pgprot_writecombine(pgprot_t prot)
-{
-	return pgprot_noncached_wc(prot);
-}
-
 struct file;
 extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 				     unsigned long size, pgprot_t vma_prot);
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread
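
All of the pgprot_* cache-control helpers moved above follow a single
masking pattern: clear every bit in _PAGE_CACHE_CTL, then OR in the bits
for the requested mode, leaving the rest of the protection value intact.
A standalone sketch of the pattern (bit values invented for the example):

  #include <stdio.h>

  /* Invented bit values for the sketch */
  #define _PAGE_COHERENT  0x010UL
  #define _PAGE_GUARDED   0x020UL
  #define _PAGE_NO_CACHE  0x040UL
  #define _PAGE_WRITETHRU 0x080UL

  #define _PAGE_CACHE_CTL (_PAGE_COHERENT | _PAGE_GUARDED | \
                           _PAGE_NO_CACHE | _PAGE_WRITETHRU)

  /* Clear all cache-control bits, then set the ones for the new mode */
  static unsigned long set_cache_mode(unsigned long prot, unsigned long mode)
  {
          return (prot & ~_PAGE_CACHE_CTL) | mode;
  }

  int main(void)
  {
          unsigned long prot = _PAGE_COHERENT | 0x4UL; /* cached, plus an RW bit */

          /* pgprot_noncached(): uncached and guarded */
          prot = set_cache_mode(prot, _PAGE_NO_CACHE | _PAGE_GUARDED);
          printf("prot=%#lx\n", prot); /* 0x64: RW bit kept, caching replaced */
          return 0;
  }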

* [PATCH V6 13/35] powerpc/booke: Move nohash headers (part 1)
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (11 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 12/35] powerpc/mm: Move PTE bits from generic functions to hash64 functions Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 14/35] powerpc/booke: Move nohash headers (part 2) Aneesh Kumar K.V
                   ` (21 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

Move the booke-related headers under nohash/ (and, in later parts,
nohash/32 and nohash/64).

We are splitting this change into multiple patches to make rebasing
easier. The following patches can be folded into this one if needed;
they are kept separate for easier review.

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/{pgtable-book3e.h => nohash/pgtable.h} | 0
 arch/powerpc/include/asm/pgtable.h                              | 2 +-
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename arch/powerpc/include/asm/{pgtable-book3e.h => nohash/pgtable.h} (100%)

diff --git a/arch/powerpc/include/asm/pgtable-book3e.h b/arch/powerpc/include/asm/nohash/pgtable.h
similarity index 100%
rename from arch/powerpc/include/asm/pgtable-book3e.h
rename to arch/powerpc/include/asm/nohash/pgtable.h
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index 8f7338678fdc..ac9fb114e25d 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -15,7 +15,7 @@ struct mm_struct;
 #ifdef CONFIG_PPC_BOOK3S
 #include <asm/book3s/pgtable.h>
 #else
-#include <asm/pgtable-book3e.h>
+#include <asm/nohash/pgtable.h>
 #endif /* !CONFIG_PPC_BOOK3S */
 
 #ifndef __ASSEMBLY__
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 14/35] powerpc/booke: Move nohash headers (part 2)
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (12 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 13/35] powerpc/booke: Move nohash headers (part 1) Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 15/35] powerpc/booke: Move nohash headers (part 3) Aneesh Kumar K.V
                   ` (20 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/{pgtable-ppc32.h => nohash/32/pgtable.h} | 0
 arch/powerpc/include/asm/{pgtable-ppc64.h => nohash/64/pgtable.h} | 2 +-
 arch/powerpc/include/asm/nohash/pgtable.h                         | 8 ++++----
 3 files changed, 5 insertions(+), 5 deletions(-)
 rename arch/powerpc/include/asm/{pgtable-ppc32.h => nohash/32/pgtable.h} (100%)
 rename arch/powerpc/include/asm/{pgtable-ppc64.h => nohash/64/pgtable.h} (99%)

diff --git a/arch/powerpc/include/asm/pgtable-ppc32.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
similarity index 100%
rename from arch/powerpc/include/asm/pgtable-ppc32.h
rename to arch/powerpc/include/asm/nohash/32/pgtable.h
diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
similarity index 99%
rename from arch/powerpc/include/asm/pgtable-ppc64.h
rename to arch/powerpc/include/asm/nohash/64/pgtable.h
index 6be203d43fd1..9b4f9fcd64de 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -18,7 +18,7 @@
  * Size of EA range mapped by our pagetables.
  */
 #define PGTABLE_EADDR_SIZE (PTE_INDEX_SIZE + PMD_INDEX_SIZE + \
-                	    PUD_INDEX_SIZE + PGD_INDEX_SIZE + PAGE_SHIFT)
+			    PUD_INDEX_SIZE + PGD_INDEX_SIZE + PAGE_SHIFT)
 #define PGTABLE_RANGE (ASM_CONST(1) << PGTABLE_EADDR_SIZE)
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index 91325997ba25..c0c41a2409d2 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -1,10 +1,10 @@
-#ifndef _ASM_POWERPC_PGTABLE_BOOK3E_H
-#define _ASM_POWERPC_PGTABLE_BOOK3E_H
+#ifndef _ASM_POWERPC_NOHASH_PGTABLE_H
+#define _ASM_POWERPC_NOHASH_PGTABLE_H
 
 #if defined(CONFIG_PPC64)
-#include <asm/pgtable-ppc64.h>
+#include <asm/nohash/64/pgtable.h>
 #else
-#include <asm/pgtable-ppc32.h>
+#include <asm/nohash/32/pgtable.h>
 #endif
 
 #ifndef __ASSEMBLY__
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 15/35] powerpc/booke: Move nohash headers (part 3)
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (13 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 14/35] powerpc/booke: Move nohash headers (part 2) Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 16/35] powerpc/booke: Move nohash headers (part 4) Aneesh Kumar K.V
                   ` (19 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 .../include/asm/{pgtable-ppc64-4k.h => nohash/64/pgtable-4k.h} |  0
 .../asm/{pgtable-ppc64-64k.h => nohash/64/pgtable-64k.h}       |  0
 arch/powerpc/include/asm/nohash/64/pgtable.h                   | 10 +++++-----
 3 files changed, 5 insertions(+), 5 deletions(-)
 rename arch/powerpc/include/asm/{pgtable-ppc64-4k.h => nohash/64/pgtable-4k.h} (100%)
 rename arch/powerpc/include/asm/{pgtable-ppc64-64k.h => nohash/64/pgtable-64k.h} (100%)

diff --git a/arch/powerpc/include/asm/pgtable-ppc64-4k.h b/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
similarity index 100%
rename from arch/powerpc/include/asm/pgtable-ppc64-4k.h
rename to arch/powerpc/include/asm/nohash/64/pgtable-4k.h
diff --git a/arch/powerpc/include/asm/pgtable-ppc64-64k.h b/arch/powerpc/include/asm/nohash/64/pgtable-64k.h
similarity index 100%
rename from arch/powerpc/include/asm/pgtable-ppc64-64k.h
rename to arch/powerpc/include/asm/nohash/64/pgtable-64k.h
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 9b4f9fcd64de..aebb83f84b69 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -1,14 +1,14 @@
-#ifndef _ASM_POWERPC_PGTABLE_PPC64_H_
-#define _ASM_POWERPC_PGTABLE_PPC64_H_
+#ifndef _ASM_POWERPC_NOHASH_64_PGTABLE_H
+#define _ASM_POWERPC_NOHASH_64_PGTABLE_H
 /*
  * This file contains the functions and defines necessary to modify and use
  * the ppc64 hashed page table.
  */
 
 #ifdef CONFIG_PPC_64K_PAGES
-#include <asm/pgtable-ppc64-64k.h>
+#include <asm/nohash/64/pgtable-64k.h>
 #else
-#include <asm/pgtable-ppc64-4k.h>
+#include <asm/nohash/64/pgtable-4k.h>
 #endif
 #include <asm/barrier.h>
 
@@ -637,4 +637,4 @@ static inline int pmd_move_must_withdraw(struct spinlock *new_pmd_ptl,
 	return true;
 }
 #endif /* __ASSEMBLY__ */
-#endif /* _ASM_POWERPC_PGTABLE_PPC64_H_ */
+#endif /* _ASM_POWERPC_NOHASH_64_PGTABLE_H */
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 16/35] powerpc/booke: Move nohash headers (part 4)
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (14 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 15/35] powerpc/booke: Move nohash headers (part 3) Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 17/35] powerpc/booke: Move nohash headers (part 5) Aneesh Kumar K.V
                   ` (18 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/nohash/32/pgtable.h             | 16 ++++++++--------
 arch/powerpc/include/asm/{ => nohash/32}/pte-40x.h       |  0
 arch/powerpc/include/asm/{ => nohash/32}/pte-44x.h       |  0
 arch/powerpc/include/asm/{ => nohash/32}/pte-8xx.h       |  0
 arch/powerpc/include/asm/{ => nohash/32}/pte-fsl-booke.h |  0
 arch/powerpc/include/asm/nohash/64/pgtable.h             |  2 +-
 arch/powerpc/include/asm/{ => nohash}/pte-book3e.h       |  0
 7 files changed, 9 insertions(+), 9 deletions(-)
 rename arch/powerpc/include/asm/{ => nohash/32}/pte-40x.h (100%)
 rename arch/powerpc/include/asm/{ => nohash/32}/pte-44x.h (100%)
 rename arch/powerpc/include/asm/{ => nohash/32}/pte-8xx.h (100%)
 rename arch/powerpc/include/asm/{ => nohash/32}/pte-fsl-booke.h (100%)
 rename arch/powerpc/include/asm/{ => nohash}/pte-book3e.h (100%)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index fbb23c54b998..c82cbf52d19e 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PGTABLE_PPC32_H
-#define _ASM_POWERPC_PGTABLE_PPC32_H
+#ifndef _ASM_POWERPC_NOHASH_32_PGTABLE_H
+#define _ASM_POWERPC_NOHASH_32_PGTABLE_H
 
 #include <asm-generic/pgtable-nopmd.h>
 
@@ -106,15 +106,15 @@ extern int icache_44x_need_flush;
  */
 
 #if defined(CONFIG_40x)
-#include <asm/pte-40x.h>
+#include <asm/nohash/32/pte-40x.h>
 #elif defined(CONFIG_44x)
-#include <asm/pte-44x.h>
+#include <asm/nohash/32/pte-44x.h>
 #elif defined(CONFIG_FSL_BOOKE) && defined(CONFIG_PTE_64BIT)
-#include <asm/pte-book3e.h>
+#include <asm/nohash/pte-book3e.h>
 #elif defined(CONFIG_FSL_BOOKE)
-#include <asm/pte-fsl-booke.h>
+#include <asm/nohash/32/pte-fsl-booke.h>
 #elif defined(CONFIG_8xx)
-#include <asm/pte-8xx.h>
+#include <asm/nohash/32/pte-8xx.h>
 #endif
 
 /* And here we include common definitions */
@@ -340,4 +340,4 @@ extern int get_pteptr(struct mm_struct *mm, unsigned long addr, pte_t **ptep,
 
 #endif /* !__ASSEMBLY__ */
 
-#endif /* _ASM_POWERPC_PGTABLE_PPC32_H */
+#endif /* _ASM_POWERPC_NOHASH_32_PGTABLE_H */
diff --git a/arch/powerpc/include/asm/pte-40x.h b/arch/powerpc/include/asm/nohash/32/pte-40x.h
similarity index 100%
rename from arch/powerpc/include/asm/pte-40x.h
rename to arch/powerpc/include/asm/nohash/32/pte-40x.h
diff --git a/arch/powerpc/include/asm/pte-44x.h b/arch/powerpc/include/asm/nohash/32/pte-44x.h
similarity index 100%
rename from arch/powerpc/include/asm/pte-44x.h
rename to arch/powerpc/include/asm/nohash/32/pte-44x.h
diff --git a/arch/powerpc/include/asm/pte-8xx.h b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
similarity index 100%
rename from arch/powerpc/include/asm/pte-8xx.h
rename to arch/powerpc/include/asm/nohash/32/pte-8xx.h
diff --git a/arch/powerpc/include/asm/pte-fsl-booke.h b/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h
similarity index 100%
rename from arch/powerpc/include/asm/pte-fsl-booke.h
rename to arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index aebb83f84b69..c24e03f22655 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -97,7 +97,7 @@
 /*
  * Include the PTE bits definitions
  */
-#include <asm/pte-book3e.h>
+#include <asm/nohash/pte-book3e.h>
 #include <asm/pte-common.h>
 
 #ifdef CONFIG_PPC_MM_SLICES
diff --git a/arch/powerpc/include/asm/pte-book3e.h b/arch/powerpc/include/asm/nohash/pte-book3e.h
similarity index 100%
rename from arch/powerpc/include/asm/pte-book3e.h
rename to arch/powerpc/include/asm/nohash/pte-book3e.h
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 17/35] powerpc/booke: Move nohash headers (part 5)
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (15 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 16/35] powerpc/booke: Move nohash headers (part 4) Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 18/35] powerpc/mm: Convert 4k hash insert to C Aneesh Kumar K.V
                   ` (17 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/nohash/32/pte-40x.h       | 6 +++---
 arch/powerpc/include/asm/nohash/32/pte-44x.h       | 6 +++---
 arch/powerpc/include/asm/nohash/32/pte-8xx.h       | 6 +++---
 arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h | 6 +++---
 arch/powerpc/include/asm/nohash/64/pgtable-4k.h    | 6 +++---
 arch/powerpc/include/asm/nohash/64/pgtable-64k.h   | 6 +++---
 arch/powerpc/include/asm/nohash/pte-book3e.h       | 6 +++---
 7 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pte-40x.h b/arch/powerpc/include/asm/nohash/32/pte-40x.h
index 486b1ef81338..9624ebdacc47 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-40x.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-40x.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PTE_40x_H
-#define _ASM_POWERPC_PTE_40x_H
+#ifndef _ASM_POWERPC_NOHASH_32_PTE_40x_H
+#define _ASM_POWERPC_NOHASH_32_PTE_40x_H
 #ifdef __KERNEL__
 
 /*
@@ -61,4 +61,4 @@
 #define PTE_ATOMIC_UPDATES	1
 
 #endif /* __KERNEL__ */
-#endif /*  _ASM_POWERPC_PTE_40x_H */
+#endif /*  _ASM_POWERPC_NOHASH_32_PTE_40x_H */
diff --git a/arch/powerpc/include/asm/nohash/32/pte-44x.h b/arch/powerpc/include/asm/nohash/32/pte-44x.h
index 36f75fab23f5..fdab41c654ef 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-44x.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-44x.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PTE_44x_H
-#define _ASM_POWERPC_PTE_44x_H
+#ifndef _ASM_POWERPC_NOHASH_32_PTE_44x_H
+#define _ASM_POWERPC_NOHASH_32_PTE_44x_H
 #ifdef __KERNEL__
 
 /*
@@ -94,4 +94,4 @@
 
 
 #endif /* __KERNEL__ */
-#endif /*  _ASM_POWERPC_PTE_44x_H */
+#endif /*  _ASM_POWERPC_NOHASH_32_PTE_44x_H */
diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
index a0e2ba960976..3742b1919661 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PTE_8xx_H
-#define _ASM_POWERPC_PTE_8xx_H
+#ifndef _ASM_POWERPC_NOHASH_32_PTE_8xx_H
+#define _ASM_POWERPC_NOHASH_32_PTE_8xx_H
 #ifdef __KERNEL__
 
 /*
@@ -62,4 +62,4 @@
 				 _PAGE_HWWRITE | _PAGE_EXEC)
 
 #endif /* __KERNEL__ */
-#endif /*  _ASM_POWERPC_PTE_8xx_H */
+#endif /*  _ASM_POWERPC_NOHASH_32_PTE_8xx_H */
diff --git a/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h b/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h
index 9f5c3d04a1a3..5422d00c6145 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PTE_FSL_BOOKE_H
-#define _ASM_POWERPC_PTE_FSL_BOOKE_H
+#ifndef _ASM_POWERPC_NOHASH_32_PTE_FSL_BOOKE_H
+#define _ASM_POWERPC_NOHASH_32_PTE_FSL_BOOKE_H
 #ifdef __KERNEL__
 
 /* PTE bit definitions for Freescale BookE SW loaded TLB MMU based
@@ -37,4 +37,4 @@
 #define PTE_WIMGE_SHIFT (6)
 
 #endif /* __KERNEL__ */
-#endif /*  _ASM_POWERPC_PTE_FSL_BOOKE_H */
+#endif /*  _ASM_POWERPC_NOHASH_32_PTE_FSL_BOOKE_H */
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable-4k.h b/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
index 7bace25d6b62..fc7d51753f81 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PGTABLE_PPC64_4K_H
-#define _ASM_POWERPC_PGTABLE_PPC64_4K_H
+#ifndef _ASM_POWERPC_NOHASH_64_PGTABLE_4K_H
+#define _ASM_POWERPC_NOHASH_64_PGTABLE_4K_H
 /*
  * Entries per page directory level.  The PTE level must use a 64b record
  * for each page table entry.  The PMD and PGD level use a 32b record for
@@ -89,4 +89,4 @@ extern struct page *pgd_page(pgd_t pgd);
 #define remap_4k_pfn(vma, addr, pfn, prot)	\
 	remap_pfn_range((vma), (addr), (pfn), PAGE_SIZE, (prot))
 
-#endif /* _ASM_POWERPC_PGTABLE_PPC64_4K_H */
+#endif /* _ASM_POWERPC_NOHASH_64_PGTABLE_4K_H */
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable-64k.h b/arch/powerpc/include/asm/nohash/64/pgtable-64k.h
index 1de35bbd02a6..a44660d76096 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable-64k.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable-64k.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PGTABLE_PPC64_64K_H
-#define _ASM_POWERPC_PGTABLE_PPC64_64K_H
+#ifndef _ASM_POWERPC_NOHASH_64_PGTABLE_64K_H
+#define _ASM_POWERPC_NOHASH_64_PGTABLE_64K_H
 
 #include <asm-generic/pgtable-nopud.h>
 
@@ -41,4 +41,4 @@
 #define pgd_pte(pgd)	(pud_pte(((pud_t){ pgd })))
 #define pte_pgd(pte)	((pgd_t)pte_pud(pte))
 
-#endif /* _ASM_POWERPC_PGTABLE_PPC64_64K_H */
+#endif /* _ASM_POWERPC_NOHASH_64_PGTABLE_64K_H */
diff --git a/arch/powerpc/include/asm/nohash/pte-book3e.h b/arch/powerpc/include/asm/nohash/pte-book3e.h
index 8d8473278d91..e16807b78edf 100644
--- a/arch/powerpc/include/asm/nohash/pte-book3e.h
+++ b/arch/powerpc/include/asm/nohash/pte-book3e.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PTE_BOOK3E_H
-#define _ASM_POWERPC_PTE_BOOK3E_H
+#ifndef _ASM_POWERPC_NOHASH_PTE_BOOK3E_H
+#define _ASM_POWERPC_NOHASH_PTE_BOOK3E_H
 #ifdef __KERNEL__
 
 /* PTE bit definitions for processors compliant to the Book3E
@@ -84,4 +84,4 @@
 #endif
 
 #endif /* __KERNEL__ */
-#endif /*  _ASM_POWERPC_PTE_FSL_BOOKE_H */
+#endif /*  _ASM_POWERPC_NOHASH_PTE_BOOK3E_H */
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 18/35] powerpc/mm: Convert 4k hash insert to C
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (16 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 17/35] powerpc/booke: Move nohash headers (part 5) Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 19/35] powerpc/mm: Remove the dependency on pte bit position in asm code Aneesh Kumar K.V
                   ` (16 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/mm/Makefile        |   3 +
 arch/powerpc/mm/hash64_64k.c    | 202 +++++++++++++++++++++
 arch/powerpc/mm/hash_low_64.S   | 380 ----------------------------------------
 arch/powerpc/mm/hash_utils_64.c |   4 +-
 4 files changed, 208 insertions(+), 381 deletions(-)
 create mode 100644 arch/powerpc/mm/hash64_64k.c
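
The __hash_page_4K() added below locks the Linux PTE before touching the
hash table: it re-reads the PTE, bails out if _PAGE_BUSY is already set,
and otherwise installs the busy/accessed bits with a 64-bit
compare-and-swap, retrying if the PTE changed underneath. A portable
sketch of that locking step using the GCC/Clang __atomic builtins (bit
values invented; the kernel uses READ_ONCE() and __cmpxchg_u64()):

  #include <stdio.h>

  /* Invented bit values for the sketch */
  #define _PAGE_BUSY      0x0800UL
  #define _PAGE_ACCESSED  0x0008UL

  /* Return 0 with *ptep marked busy+accessed, or -1 if already busy */
  static int pte_trylock(unsigned long *ptep)
  {
          unsigned long old, new;

          do {
                  old = __atomic_load_n(ptep, __ATOMIC_RELAXED);
                  if (old & _PAGE_BUSY)
                          return -1; /* another CPU owns it; retry the fault */
                  new = old | _PAGE_BUSY | _PAGE_ACCESSED;
          } while (!__atomic_compare_exchange_n(ptep, &old, new, 0,
                                                __ATOMIC_ACQUIRE,
                                                __ATOMIC_RELAXED));
          return 0;
  }

  int main(void)
  {
          unsigned long pte = 0x1000UL; /* made-up PTE value */

          printf("first lock:  %d, pte=%#lx\n", pte_trylock(&pte), pte);
          printf("second lock: %d, pte=%#lx\n", pte_trylock(&pte), pte);
          return 0;
  }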

diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
index 3eb73a38220d..f80ad1a76cc8 100644
--- a/arch/powerpc/mm/Makefile
+++ b/arch/powerpc/mm/Makefile
@@ -18,6 +18,9 @@ obj-$(CONFIG_PPC_STD_MMU_32)	+= ppc_mmu_32.o
 obj-$(CONFIG_PPC_STD_MMU)	+= hash_low_$(CONFIG_WORD_SIZE).o \
 				   tlb_hash$(CONFIG_WORD_SIZE).o \
 				   mmu_context_hash$(CONFIG_WORD_SIZE).o
+ifeq ($(CONFIG_PPC_STD_MMU_64),y)
+obj-$(CONFIG_PPC_64K_PAGES)	+= hash64_64k.o
+endif
 obj-$(CONFIG_PPC_ICSWX)		+= icswx.o
 obj-$(CONFIG_PPC_ICSWX_PID)	+= icswx_pid.o
 obj-$(CONFIG_40x)		+= 40x_mmu.o
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
new file mode 100644
index 000000000000..9ffeae2cbb57
--- /dev/null
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -0,0 +1,202 @@
+/*
+ * Copyright IBM Corporation, 2015
+ * Author Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2 of the GNU Lesser General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ *
+ */
+
+#include <linux/mm.h>
+#include <asm/machdep.h>
+#include <asm/mmu.h>
+
+int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
+		   pte_t *ptep, unsigned long trap, unsigned long flags,
+		   int ssize, int subpg_prot)
+{
+	real_pte_t rpte;
+	unsigned long *hidxp;
+	unsigned long hpte_group;
+	unsigned int subpg_index;
+	unsigned long rflags, pa, hidx;
+	unsigned long old_pte, new_pte, subpg_pte;
+	unsigned long vpn, hash, slot;
+	unsigned long shift = mmu_psize_defs[MMU_PAGE_4K].shift;
+
+	/*
+	 * atomically mark the linux large page PTE busy and dirty
+	 */
+	do {
+		pte_t pte = READ_ONCE(*ptep);
+
+		old_pte = pte_val(pte);
+		/* If PTE busy, retry the access */
+		if (unlikely(old_pte & _PAGE_BUSY))
+			return 0;
+		/* If PTE permissions don't match, take page fault */
+		if (unlikely(access & ~old_pte))
+			return 1;
+		/*
+		 * Try to lock the PTE, add ACCESSED and DIRTY if it was
+		 * a write access. Since this is 4K insert of 64K page size
+		 * also add _PAGE_COMBO
+		 */
+		new_pte = old_pte | _PAGE_BUSY | _PAGE_ACCESSED | _PAGE_COMBO;
+		if (access & _PAGE_RW)
+			new_pte |= _PAGE_DIRTY;
+	} while (old_pte != __cmpxchg_u64((unsigned long *)ptep,
+					  old_pte, new_pte));
+	/*
+	 * Handle the subpage protection bits
+	 */
+	subpg_pte = new_pte & ~subpg_prot;
+	/*
+	 * PP bits. _PAGE_USER is already PP bit 0x2, so we only
+	 * need to add in 0x1 if it's a read-only user page
+	 */
+	rflags = subpg_pte & _PAGE_USER;
+	if ((subpg_pte & _PAGE_USER) && !((subpg_pte & _PAGE_RW) &&
+					(subpg_pte & _PAGE_DIRTY)))
+		rflags |= 0x1;
+	/*
+	 * _PAGE_EXEC -> HW_NO_EXEC since it's inverted
+	 */
+	rflags |= ((subpg_pte & _PAGE_EXEC) ? 0 : HPTE_R_N);
+	/*
+	 * Always add C and Memory coherence bit
+	 */
+	rflags |= HPTE_R_C | HPTE_R_M;
+	/*
+	 * Add in WIMG bits
+	 */
+	rflags |= (subpg_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
+				_PAGE_COHERENT | _PAGE_GUARDED));
+
+	if (!cpu_has_feature(CPU_FTR_NOEXECUTE) &&
+	    !cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) {
+
+		/*
+		 * No CPU has hugepages but lacks no execute, so we
+		 * don't need to worry about that case
+		 */
+		rflags = hash_page_do_lazy_icache(rflags, __pte(old_pte), trap);
+	}
+
+	subpg_index = (ea & (PAGE_SIZE - 1)) >> shift;
+	vpn  = hpt_vpn(ea, vsid, ssize);
+	rpte = __real_pte(__pte(old_pte), ptep);
+	/*
+	 * None of the 4k subpages is hashed
+	 */
+	if (!(old_pte & _PAGE_HASHPTE))
+		goto htab_insert_hpte;
+	/*
+	 * Check if the pte was already inserted into the hash table
+	 * as a 64k HW page, and invalidate the 64k HPTE if so.
+	 */
+	if (!(old_pte & _PAGE_COMBO)) {
+		flush_hash_page(vpn, rpte, MMU_PAGE_64K, ssize, flags);
+		old_pte &= ~_PAGE_HPTE_SUB;
+		goto htab_insert_hpte;
+	}
+	/*
+	 * Check for sub page valid and update
+	 */
+	if (__rpte_sub_valid(rpte, subpg_index)) {
+		int ret;
+
+		hash = hpt_hash(vpn, shift, ssize);
+		hidx = __rpte_to_hidx(rpte, subpg_index);
+		if (hidx & _PTEIDX_SECONDARY)
+			hash = ~hash;
+		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
+		slot += hidx & _PTEIDX_GROUP_IX;
+
+		ret = ppc_md.hpte_updatepp(slot, rflags, vpn,
+					   MMU_PAGE_4K, MMU_PAGE_4K,
+					   ssize, flags);
+		/*
+		 * If we failed, it's typically because the HPTE wasn't
+		 * really here; try an insertion.
+		 */
+		if (ret == -1)
+			goto htab_insert_hpte;
+
+		*ptep = __pte(new_pte & ~_PAGE_BUSY);
+		return 0;
+	}
+
+htab_insert_hpte:
+	/*
+	 * handle _PAGE_4K_PFN case
+	 */
+	if (old_pte & _PAGE_4K_PFN) {
+		/*
+		 * All the 4k subpages have the same
+		 * physical address.
+		 */
+		pa = pte_pfn(__pte(old_pte)) << HW_PAGE_SHIFT;
+	} else {
+		pa = pte_pfn(__pte(old_pte)) << PAGE_SHIFT;
+		pa += (subpg_index << shift);
+	}
+	hash = hpt_hash(vpn, shift, ssize);
+repeat:
+	hpte_group = ((hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL;
+
+	/* Insert into the hash table, primary slot */
+	slot = ppc_md.hpte_insert(hpte_group, vpn, pa, rflags, 0,
+				  MMU_PAGE_4K, MMU_PAGE_4K, ssize);
+	/*
+	 * Primary is full, try the secondary
+	 */
+	if (unlikely(slot == -1)) {
+		hpte_group = ((~hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL;
+		slot = ppc_md.hpte_insert(hpte_group, vpn, pa,
+					  rflags, HPTE_V_SECONDARY,
+					  MMU_PAGE_4K, MMU_PAGE_4K, ssize);
+		if (slot == -1) {
+			if (mftb() & 0x1)
+				hpte_group = ((hash & htab_hash_mask) *
+					      HPTES_PER_GROUP) & ~0x7UL;
+			ppc_md.hpte_remove(hpte_group);
+			/*
+			 * FIXME!! Should we retry the group from which we removed?
+			 */
+			goto repeat;
+		}
+	}
+	/*
+	 * Hypervisor failure. Restore the old pte and return -1,
+	 * similar to __hash_page_*
+	 */
+	if (unlikely(slot == -2)) {
+		*ptep = __pte(old_pte);
+		hash_failure_debug(ea, access, vsid, trap, ssize,
+				   MMU_PAGE_4K, MMU_PAGE_4K, old_pte);
+		return -1;
+	}
+	/*
+	 * Insert slot number & secondary bit in PTE second half,
+	 * clear _PAGE_BUSY and set appropriate HPTE slot bit
+	 * Since we have _PAGE_BUSY set on ptep, we can be sure
+	 * nobody is updating hidx.
+	 */
+	hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
+	/* __real_pte use pte_val() any idea why ? FIXME!! */
+	rpte.hidx &= ~(0xfUL << (subpg_index << 2));
+	*hidxp = rpte.hidx  | (slot << (subpg_index << 2));
+	new_pte |= (_PAGE_HPTE_SUB0 >> subpg_index);
+	/*
+	 * check __real_pte for details on matching smp_rmb()
+	 */
+	smp_wmb();
+	*ptep = __pte(new_pte & ~_PAGE_BUSY);
+	return 0;
+}
diff --git a/arch/powerpc/mm/hash_low_64.S b/arch/powerpc/mm/hash_low_64.S
index 3b49e3295901..6b4d4c1d0628 100644
--- a/arch/powerpc/mm/hash_low_64.S
+++ b/arch/powerpc/mm/hash_low_64.S
@@ -328,381 +328,8 @@ htab_pte_insert_failure:
 	li	r3,-1
 	b	htab_bail
 
-
 #else /* CONFIG_PPC_64K_PAGES */
 
-
-/*****************************************************************************
- *                                                                           *
- *           64K SW & 4K or 64K HW in a 4K segment pages implementation      *
- *                                                                           *
- *****************************************************************************/
-
-/* _hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
- *		 pte_t *ptep, unsigned long trap, unsigned local flags,
- *		 int ssize, int subpg_prot)
- */
-
-/*
- * For now, we do NOT implement Admixed pages
- */
-_GLOBAL(__hash_page_4K)
-	mflr	r0
-	std	r0,16(r1)
-	stdu	r1,-STACKFRAMESIZE(r1)
-	/* Save all params that we need after a function call */
-	std	r6,STK_PARAM(R6)(r1)
-	std	r8,STK_PARAM(R8)(r1)
-	std	r9,STK_PARAM(R9)(r1)
-
-	/* Save non-volatile registers.
-	 * r31 will hold "old PTE"
-	 * r30 is "new PTE"
-	 * r29 is vpn
-	 * r28 is a hash value
-	 * r27 is hashtab mask (maybe dynamic patched instead ?)
-	 * r26 is the hidx mask
-	 * r25 is the index in combo page
-	 */
-	std	r25,STK_REG(R25)(r1)
-	std	r26,STK_REG(R26)(r1)
-	std	r27,STK_REG(R27)(r1)
-	std	r28,STK_REG(R28)(r1)
-	std	r29,STK_REG(R29)(r1)
-	std	r30,STK_REG(R30)(r1)
-	std	r31,STK_REG(R31)(r1)
-
-	/* Step 1:
-	 *
-	 * Check permissions, atomically mark the linux PTE busy
-	 * and hashed.
-	 */
-1:
-	ldarx	r31,0,r6
-	/* Check access rights (access & ~(pte_val(*ptep))) */
-	andc.	r0,r4,r31
-	bne-	htab_wrong_access
-	/* Check if PTE is busy */
-	andi.	r0,r31,_PAGE_BUSY
-	/* If so, just bail out and refault if needed. Someone else
-	 * is changing this PTE anyway and might hash it.
-	 */
-	bne-	htab_bail_ok
-	/* Prepare new PTE value (turn access RW into DIRTY, then
-	 * add BUSY and ACCESSED)
-	 */
-	rlwinm	r30,r4,32-9+7,31-7,31-7	/* _PAGE_RW -> _PAGE_DIRTY */
-	or	r30,r30,r31
-	ori	r30,r30,_PAGE_BUSY | _PAGE_ACCESSED
-	oris	r30,r30,_PAGE_COMBO@h
-	/* Write the linux PTE atomically (setting busy) */
-	stdcx.	r30,0,r6
-	bne-	1b
-	isync
-
-	/* Step 2:
-	 *
-	 * Insert/Update the HPTE in the hash table. At this point,
-	 * r4 (access) is re-useable, we use it for the new HPTE flags
-	 */
-
-	/* Load the hidx index */
-	rldicl	r25,r3,64-12,60
-
-BEGIN_FTR_SECTION
-	cmpdi	r9,0			/* check segment size */
-	bne	3f
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_1T_SEGMENT)
-	/* Calc vpn and put it in r29 */
-	sldi	r29,r5,SID_SHIFT - VPN_SHIFT
-	/*
-	 * clrldi r3,r3,64 - SID_SHIFT -->  ea & 0xfffffff
-	 * srdi	 r28,r3,VPN_SHIFT
-	 */
-	rldicl  r28,r3,64 - VPN_SHIFT,64 - (SID_SHIFT - VPN_SHIFT)
-	or	r29,r28,r29
-	/*
-	 * Calculate hash value for primary slot and store it in r28
-	 * r3 = va, r5 = vsid
-	 * r0 = (va >> 12) & ((1ul << (28 - 12)) -1)
-	 */
-	rldicl	r0,r3,64-12,48
-	xor	r28,r5,r0		/* hash */
-	b	4f
-
-3:	/* Calc vpn and put it in r29 */
-	sldi	r29,r5,SID_SHIFT_1T - VPN_SHIFT
-	/*
-	 * clrldi r3,r3,64 - SID_SHIFT_1T -->  ea & 0xffffffffff
-	 * srdi	r28,r3,VPN_SHIFT
-	 */
-	rldicl  r28,r3,64 - VPN_SHIFT,64 - (SID_SHIFT_1T - VPN_SHIFT)
-	or	r29,r28,r29
-
-	/*
-	 * Calculate hash value for primary slot and
-	 * store it in r28  for 1T segment
-	 * r3 = va, r5 = vsid
-	 */
-	sldi	r28,r5,25		/* vsid << 25 */
-	/* r0 = (va >> 12) & ((1ul << (40 - 12)) -1) */
-	rldicl	r0,r3,64-12,36
-	xor	r28,r28,r5		/* vsid ^ ( vsid << 25) */
-	xor	r28,r28,r0		/* hash */
-
-	/* Convert linux PTE bits into HW equivalents */
-4:
-#ifdef CONFIG_PPC_SUBPAGE_PROT
-	andc	r10,r30,r10
-	andi.	r3,r10,0x1fe		/* Get basic set of flags */
-	rlwinm	r0,r10,32-9+1,30,30	/* _PAGE_RW -> _PAGE_USER (r0) */
-#else
-	andi.	r3,r30,0x1fe		/* Get basic set of flags */
-	rlwinm	r0,r30,32-9+1,30,30	/* _PAGE_RW -> _PAGE_USER (r0) */
-#endif
-	xori	r3,r3,HPTE_R_N		/* _PAGE_EXEC -> NOEXEC */
-	rlwinm	r4,r30,32-7+1,30,30	/* _PAGE_DIRTY -> _PAGE_USER (r4) */
-	and	r0,r0,r4		/* _PAGE_RW & _PAGE_DIRTY ->r0 bit 30*/
-	andc	r0,r3,r0		/* r0 = pte & ~r0 */
-	rlwimi	r3,r0,32-1,31,31	/* Insert result into PP lsb */
-	/*
-	 * Always add "C" bit for perf. Memory coherence is always enabled
-	 */
-	ori	r3,r3,HPTE_R_C | HPTE_R_M
-
-	/* We eventually do the icache sync here (maybe inline that
-	 * code rather than call a C function...)
-	 */
-BEGIN_FTR_SECTION
-	mr	r4,r30
-	mr	r5,r7
-	bl	hash_page_do_lazy_icache
-END_FTR_SECTION(CPU_FTR_NOEXECUTE|CPU_FTR_COHERENT_ICACHE, CPU_FTR_NOEXECUTE)
-
-	/* At this point, r3 contains new PP bits, save them in
-	 * place of "access" in the param area (sic)
-	 */
-	std	r3,STK_PARAM(R4)(r1)
-
-	/* Get htab_hash_mask */
-	ld	r4,htab_hash_mask@got(2)
-	ld	r27,0(r4)	/* htab_hash_mask -> r27 */
-
-	/* Check if we may already be in the hashtable, in this case, we
-	 * go to out-of-line code to try to modify the HPTE. We look for
-	 * the bit at (1 >> (index + 32))
-	 */
-	rldicl.	r0,r31,64-12,48
-	li	r26,0			/* Default hidx */
-	beq	htab_insert_pte
-
-	/*
-	 * Check if the pte was already inserted into the hash table
-	 * as a 64k HW page, and invalidate the 64k HPTE if so.
-	 */
-	andis.	r0,r31,_PAGE_COMBO@h
-	beq	htab_inval_old_hpte
-
-	ld	r6,STK_PARAM(R6)(r1)
-	ori	r26,r6,PTE_PAGE_HIDX_OFFSET /* Load the hidx mask. */
-	ld	r26,0(r26)
-	addi	r5,r25,36		/* Check actual HPTE_SUB bit, this */
-	rldcr.	r0,r31,r5,0		/* must match pgtable.h definition */
-	bne	htab_modify_pte
-
-htab_insert_pte:
-	/* real page number in r5, PTE RPN value + index */
-	andis.	r0,r31,_PAGE_4K_PFN@h
-	srdi	r5,r31,PTE_RPN_SHIFT
-	bne-	htab_special_pfn
-	sldi	r5,r5,PAGE_FACTOR
-	add	r5,r5,r25
-htab_special_pfn:
-	sldi	r5,r5,HW_PAGE_SHIFT
-
-	/* Calculate primary group hash */
-	and	r0,r28,r27
-	rldicr	r3,r0,3,63-3		/* r0 = (hash & mask) << 3 */
-
-	/* Call ppc_md.hpte_insert */
-	ld	r6,STK_PARAM(R4)(r1)	/* Retrieve new pp bits */
-	mr	r4,r29			/* Retrieve vpn */
-	li	r7,0			/* !bolted, !secondary */
-	li	r8,MMU_PAGE_4K		/* page size */
-	li	r9,MMU_PAGE_4K		/* actual page size */
-	ld	r10,STK_PARAM(R9)(r1)	/* segment size */
-.globl htab_call_hpte_insert1
-htab_call_hpte_insert1:
-	bl	.			/* patched by htab_finish_init() */
-	cmpdi	0,r3,0
-	bge	htab_pte_insert_ok	/* Insertion successful */
-	cmpdi	0,r3,-2			/* Critical failure */
-	beq-	htab_pte_insert_failure
-
-	/* Now try secondary slot */
-
-	/* real page number in r5, PTE RPN value + index */
-	andis.	r0,r31,_PAGE_4K_PFN@h
-	srdi	r5,r31,PTE_RPN_SHIFT
-	bne-	3f
-	sldi	r5,r5,PAGE_FACTOR
-	add	r5,r5,r25
-3:	sldi	r5,r5,HW_PAGE_SHIFT
-
-	/* Calculate secondary group hash */
-	andc	r0,r27,r28
-	rldicr	r3,r0,3,63-3		/* r0 = (~hash & mask) << 3 */
-
-	/* Call ppc_md.hpte_insert */
-	ld	r6,STK_PARAM(R4)(r1)	/* Retrieve new pp bits */
-	mr	r4,r29			/* Retrieve vpn */
-	li	r7,HPTE_V_SECONDARY	/* !bolted, secondary */
-	li	r8,MMU_PAGE_4K		/* page size */
-	li	r9,MMU_PAGE_4K		/* actual page size */
-	ld	r10,STK_PARAM(R9)(r1)	/* segment size */
-.globl htab_call_hpte_insert2
-htab_call_hpte_insert2:
-	bl	.			/* patched by htab_finish_init() */
-	cmpdi	0,r3,0
-	bge+	htab_pte_insert_ok	/* Insertion successful */
-	cmpdi	0,r3,-2			/* Critical failure */
-	beq-	htab_pte_insert_failure
-
-	/* Both are full, we need to evict something */
-	mftb	r0
-	/* Pick a random group based on TB */
-	andi.	r0,r0,1
-	mr	r5,r28
-	bne	2f
-	not	r5,r5
-2:	and	r0,r5,r27
-	rldicr	r3,r0,3,63-3		/* r0 = (hash & mask) << 3 */
-	/* Call ppc_md.hpte_remove */
-.globl htab_call_hpte_remove
-htab_call_hpte_remove:
-	bl	.			/* patched by htab_finish_init() */
-
-	/* Try all again */
-	b	htab_insert_pte
-
-	/*
-	 * Call out to C code to invalidate an 64k HW HPTE that is
-	 * useless now that the segment has been switched to 4k pages.
-	 */
-htab_inval_old_hpte:
-	mr	r3,r29			/* vpn */
-	mr	r4,r31			/* PTE.pte */
-	li	r5,0			/* PTE.hidx */
-	li	r6,MMU_PAGE_64K		/* psize */
-	ld	r7,STK_PARAM(R9)(r1)	/* ssize */
-	ld	r8,STK_PARAM(R8)(r1)	/* flags */
-	bl	flush_hash_page
-	/* Clear out _PAGE_HPTE_SUB bits in the new linux PTE */
-	lis	r0,_PAGE_HPTE_SUB@h
-	ori	r0,r0,_PAGE_HPTE_SUB@l
-	andc	r30,r30,r0
-	b	htab_insert_pte
-	
-htab_bail_ok:
-	li	r3,0
-	b	htab_bail
-
-htab_pte_insert_ok:
-	/* Insert slot number & secondary bit in PTE second half,
-	 * clear _PAGE_BUSY and set approriate HPTE slot bit
-	 */
-	ld	r6,STK_PARAM(R6)(r1)
-	li	r0,_PAGE_BUSY
-	andc	r30,r30,r0
-	/* HPTE SUB bit */
-	li	r0,1
-	subfic	r5,r25,27		/* Must match bit position in */
-	sld	r0,r0,r5		/* pgtable.h */
-	or	r30,r30,r0
-	/* hindx */
-	sldi	r5,r25,2
-	sld	r3,r3,r5
-	li	r4,0xf
-	sld	r4,r4,r5
-	andc	r26,r26,r4
-	or	r26,r26,r3
-	ori	r5,r6,PTE_PAGE_HIDX_OFFSET
-	std	r26,0(r5)
-	lwsync
-	std	r30,0(r6)
-	li	r3, 0
-htab_bail:
-	ld	r25,STK_REG(R25)(r1)
-	ld	r26,STK_REG(R26)(r1)
-	ld	r27,STK_REG(R27)(r1)
-	ld	r28,STK_REG(R28)(r1)
-	ld	r29,STK_REG(R29)(r1)
-	ld      r30,STK_REG(R30)(r1)
-	ld      r31,STK_REG(R31)(r1)
-	addi    r1,r1,STACKFRAMESIZE
-	ld      r0,16(r1)
-	mtlr    r0
-	blr
-
-htab_modify_pte:
-	/* Keep PP bits in r4 and slot idx from the PTE around in r3 */
-	mr	r4,r3
-	sldi	r5,r25,2
-	srd	r3,r26,r5
-
-	/* Secondary group ? if yes, get a inverted hash value */
-	mr	r5,r28
-	andi.	r0,r3,0x8 /* page secondary ? */
-	beq	1f
-	not	r5,r5
-1:	andi.	r3,r3,0x7 /* extract idx alone */
-
-	/* Calculate proper slot value for ppc_md.hpte_updatepp */
-	and	r0,r5,r27
-	rldicr	r0,r0,3,63-3	/* r0 = (hash & mask) << 3 */
-	add	r3,r0,r3	/* add slot idx */
-
-	/* Call ppc_md.hpte_updatepp */
-	mr	r5,r29			/* vpn */
-	li	r6,MMU_PAGE_4K		/* base page size */
-	li	r7,MMU_PAGE_4K		/* actual page size */
-	ld	r8,STK_PARAM(R9)(r1)	/* segment size */
-	ld	r9,STK_PARAM(R8)(r1)	/* get "flags" param */
-.globl htab_call_hpte_updatepp
-htab_call_hpte_updatepp:
-	bl	.			/* patched by htab_finish_init() */
-
-	/* if we failed because typically the HPTE wasn't really here
-	 * we try an insertion.
-	 */
-	cmpdi	0,r3,-1
-	beq-	htab_insert_pte
-
-	/* Clear the BUSY bit and Write out the PTE */
-	li	r0,_PAGE_BUSY
-	andc	r30,r30,r0
-	ld	r6,STK_PARAM(R6)(r1)
-	std	r30,0(r6)
-	li	r3,0
-	b	htab_bail
-
-htab_wrong_access:
-	/* Bail out clearing reservation */
-	stdcx.	r31,0,r6
-	li	r3,1
-	b	htab_bail
-
-htab_pte_insert_failure:
-	/* Bail out restoring old PTE */
-	ld	r6,STK_PARAM(R6)(r1)
-	std	r31,0(r6)
-	li	r3,-1
-	b	htab_bail
-
-#endif /* CONFIG_PPC_64K_PAGES */
-
-#ifdef CONFIG_PPC_64K_PAGES
-
 /*****************************************************************************
  *                                                                           *
  *           64K SW & 64K HW in a 64K segment pages implementation           *
@@ -994,10 +621,3 @@ ht64_pte_insert_failure:
 
 
 #endif /* CONFIG_PPC_64K_PAGES */
-
-
-/*****************************************************************************
- *                                                                           *
- *           Huge pages implementation is in hugetlbpage.c                   *
- *                                                                           *
- *****************************************************************************/
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 7f9616f7c479..9fcad40c16e9 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -653,7 +653,7 @@ static void __init htab_finish_init(void)
 	patch_branch(ht64_call_hpte_updatepp,
 		ppc_function_entry(ppc_md.hpte_updatepp),
 		BRANCH_SET_LINK);
-#endif /* CONFIG_PPC_64K_PAGES */
+#else /* !CONFIG_PPC_64K_PAGES */
 
 	patch_branch(htab_call_hpte_insert1,
 		ppc_function_entry(ppc_md.hpte_insert),
@@ -667,6 +667,8 @@ static void __init htab_finish_init(void)
 	patch_branch(htab_call_hpte_updatepp,
 		ppc_function_entry(ppc_md.hpte_updatepp),
 		BRANCH_SET_LINK);
+#endif
+
 }
 
 static void __init htab_initialize(void)
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 19/35] powerpc/mm: Remove the dependency on pte bit position in asm code
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (17 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 18/35] powerpc/mm: Convert 4k hash insert to C Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 20/35] powerpc/mm: Don't track subpage valid bit in pte_t Aneesh Kumar K.V
                   ` (15 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

We should not depend on pte bit positions in asm code. Fix this by
moving that logic to C.

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
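To make the point concrete, a user-space sketch (not part of the patch,
with example flag values only): the asm path had to encode the distance
between the DSISR store bit and _PAGE_RW in rotate-and-mask
instructions, while the C path only names the flags, so the pte bit
positions may change freely.

#include <stdio.h>

/* example values only; the real definitions live in the powerpc headers */
#define DSISR_ISSTORE	0x02000000UL
#define _PAGE_PRESENT	0x001UL
#define _PAGE_RW	0x200UL

/* derive the required access bits from the fault's DSISR, as
 * __hash_page now does in C */
static unsigned long fault_access(unsigned long dsisr)
{
	unsigned long access = _PAGE_PRESENT;

	if (dsisr & DSISR_ISSTORE)	/* store fault needs write permission */
		access |= _PAGE_RW;
	return access;
}

int main(void)
{
	printf("load  fault access: 0x%lx\n", fault_access(0));
	printf("store fault access: 0x%lx\n", fault_access(DSISR_ISSTORE));
	return 0;
}
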
 arch/powerpc/kernel/exceptions-64s.S | 18 ++++--------------
 arch/powerpc/mm/hash_utils_64.c      | 29 +++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 0a0399c2af11..7f82c8e170b6 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1556,29 +1556,19 @@ do_hash_page:
 	lwz	r0,TI_PREEMPT(r11)	/* If we're in an "NMI" */
 	andis.	r0,r0,NMI_MASK@h	/* (i.e. an irq when soft-disabled) */
 	bne	77f			/* then don't call hash_page now */
-	/*
-	 * We need to set the _PAGE_USER bit if MSR_PR is set or if we are
-	 * accessing a userspace segment (even from the kernel). We assume
-	 * kernel addresses always have the high bit set.
-	 */
-	rlwinm	r4,r4,32-25+9,31-9,31-9	/* DSISR_STORE -> _PAGE_RW */
-	rotldi	r0,r3,15		/* Move high bit into MSR_PR posn */
-	orc	r0,r12,r0		/* MSR_PR | ~high_bit */
-	rlwimi	r4,r0,32-13,30,30	/* becomes _PAGE_USER access bit */
-	ori	r4,r4,1			/* add _PAGE_PRESENT */
-	rlwimi	r4,r5,22+2,31-2,31-2	/* Set _PAGE_EXEC if trap is 0x400 */
 
 	/*
 	 * r3 contains the faulting address
-	 * r4 contains the required access permissions
+	 * r4 contains the MSR value
 	 * r5 contains the trap number
 	 * r6 contains dsisr
 	 *
 	 * at return r3 = 0 for success, 1 for page fault, negative for error
 	 */
+	mr	r4,r12
 	ld      r6,_DSISR(r1)
-	bl	hash_page		/* build HPTE if possible */
-	cmpdi	r3,0			/* see if hash_page succeeded */
+	bl	__hash_page		/* build HPTE if possible */
+	cmpdi	r3,0			/* see if __hash_page succeeded */
 
 	/* Success */
 	beq	fast_exc_return_irq	/* Return from exception on success */
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 9fcad40c16e9..12eff232840b 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1205,6 +1205,35 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap,
 }
 EXPORT_SYMBOL_GPL(hash_page);
 
+int __hash_page(unsigned long ea, unsigned long msr, unsigned long trap,
+		unsigned long dsisr)
+{
+	unsigned long access = _PAGE_PRESENT;
+	unsigned long flags = 0;
+	struct mm_struct *mm = current->mm;
+
+	if (REGION_ID(ea) == VMALLOC_REGION_ID)
+		mm = &init_mm;
+
+	if (dsisr & DSISR_NOHPTE)
+		flags |= HPTE_NOHPTE_UPDATE;
+
+	if (dsisr & DSISR_ISSTORE)
+		access |= _PAGE_RW;
+	/*
+	 * We need to set the _PAGE_USER bit if MSR_PR is set or if we are
+	 * accessing a userspace segment (even from the kernel). We assume
+	 * kernel addresses always have the high bit set.
+	 */
+	if ((msr & MSR_PR) || (REGION_ID(ea) == USER_REGION_ID))
+		access |= _PAGE_USER;
+
+	if (trap == 0x400)
+		access |= _PAGE_EXEC;
+
+	return hash_page_mm(mm, ea, access, trap, flags);
+}
+
 void hash_preload(struct mm_struct *mm, unsigned long ea,
 		  unsigned long access, unsigned long trap)
 {
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 20/35] powerpc/mm: Don't track subpage valid bit in pte_t
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (18 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 19/35] powerpc/mm: Remove the dependency on pte bit position in asm code Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-03  7:13   ` Anshuman Khandual
  2016-02-18  1:11   ` Paul Mackerras
  2015-12-01  3:36 ` [PATCH V6 21/35] powerpc/mm: Remove pte_val usage for the second half of pgtable_t Aneesh Kumar K.V
                   ` (14 subsequent siblings)
  34 siblings, 2 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

This frees up 11 bits in pte_t. In a later patch we also change
the pte_t format so that we can start supporting migration ptes
at the pmd level. We now track the 4k subpage valid bits as below.

If we have _PAGE_COMBO set, we overload _PAGE_F_GIX and
_PAGE_F_SECOND. Together they give us 4 bits, each of which
indicates whether any of the four 4k subpages in that group
is valid, i.e.,

[ group 1 bit ]   [ group 2 bit ]  ..... [ group 4 bit ]
[ subpage 1 - 4]  [ subpage 5 - 8]  ..... [ subpage 13 - 16]

We still track each 4k subpage's slot number and secondary hash
information in the second half of pgtable_t. Removing that tracking
showed significant overhead on the aim9 and ebizzy benchmarks, and
to support THP with 4K subpages we do need a pgtable_t of 4096 bytes.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
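A small sketch (not part of the patch) of the new valid-bit scheme: a
subpage index 0-15 selects one of four group bits, which then overload
_PAGE_F_GIX/_PAGE_F_SECOND. The constants mirror the hash.h definitions
in this series.

#include <stdio.h>

#define _PAGE_F_GIX_SHIFT	12
#define _PAGE_F_GIX		0x7000UL
#define _PAGE_F_SECOND		0x8000UL
#define _PAGE_COMBO_VALID	(_PAGE_F_GIX | _PAGE_F_SECOND)

int main(void)
{
	unsigned long index;

	for (index = 0; index < 16; index++) {
		/* subpages 0-3 share group bit 0, 4-7 bit 1, and so on */
		unsigned long g_idx = 0x1UL << (index >> 2);

		printf("subpage %2lu -> pte bit 0x%04lx\n", index,
		       (g_idx << _PAGE_F_GIX_SHIFT) & _PAGE_COMBO_VALID);
	}
	return 0;
}
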
 arch/powerpc/include/asm/book3s/64/hash-4k.h  | 10 +-------
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 35 ++++++---------------------
 arch/powerpc/include/asm/book3s/64/hash.h     | 10 ++++----
 arch/powerpc/mm/hash64_64k.c                  | 34 ++++++++++++++++++++++++--
 arch/powerpc/mm/hash_low_64.S                 |  6 +----
 arch/powerpc/mm/hugetlbpage-hash64.c          |  5 +---
 arch/powerpc/mm/pgtable_64.c                  |  2 +-
 7 files changed, 48 insertions(+), 54 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index 537eacecf6e9..75e8b9326e4b 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -47,17 +47,9 @@
 /* Bits to mask out from a PGD to get to the PUD page */
 #define PGD_MASKED_BITS		0
 
-/* PTE bits */
-#define _PAGE_HASHPTE	0x0400 /* software: pte has an associated HPTE */
-#define _PAGE_SECONDARY 0x8000 /* software: HPTE is in secondary group */
-#define _PAGE_GROUP_IX  0x7000 /* software: HPTE index within group */
-#define _PAGE_F_SECOND  _PAGE_SECONDARY
-#define _PAGE_F_GIX     _PAGE_GROUP_IX
-#define _PAGE_SPECIAL	0x10000 /* software: special page */
-
 /* PTE flags to conserve for HPTE identification */
 #define _PAGE_HPTEFLAGS (_PAGE_BUSY | _PAGE_HASHPTE | \
-			 _PAGE_SECONDARY | _PAGE_GROUP_IX)
+			 _PAGE_F_SECOND | _PAGE_F_GIX)
 
 /* shift to put page number into pte */
 #define PTE_RPN_SHIFT	(17)
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index ee073822145d..a268416ca4a4 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -31,33 +31,13 @@
 /* Bits to mask out from a PGD/PUD to get to the PMD page */
 #define PUD_MASKED_BITS		0x1ff
 
-/* Additional PTE bits (don't change without checking asm in hash_low.S) */
-#define _PAGE_SPECIAL	0x00000400 /* software: special page */
-#define _PAGE_HPTE_SUB	0x0ffff000 /* combo only: sub pages HPTE bits */
-#define _PAGE_HPTE_SUB0	0x08000000 /* combo only: first sub page */
-#define _PAGE_COMBO	0x10000000 /* this is a combo 4k page */
-#define _PAGE_4K_PFN	0x20000000 /* PFN is for a single 4k page */
-
-/* For 64K page, we don't have a separate _PAGE_HASHPTE bit. Instead,
- * we set that to be the whole sub-bits mask. The C code will only
- * test this, so a multi-bit mask will work. For combo pages, this
- * is equivalent as effectively, the old _PAGE_HASHPTE was an OR of
- * all the sub bits. For real 64k pages, we now have the assembly set
- * _PAGE_HPTE_SUB0 in addition to setting the HIDX bits which overlap
- * that mask. This is fine as long as the HIDX bits are never set on
- * a PTE that isn't hashed, which is the case today.
- *
- * A little nit is for the huge page C code, which does the hashing
- * in C, we need to provide which bit to use.
- */
-#define _PAGE_HASHPTE	_PAGE_HPTE_SUB
-
-/* Note the full page bits must be in the same location as for normal
- * 4k pages as the same assembly will be used to insert 64K pages
- * whether the kernel has CONFIG_PPC_64K_PAGES or not
+#define _PAGE_COMBO	0x00020000 /* this is a combo 4k page */
+#define _PAGE_4K_PFN	0x00040000 /* PFN is for a single 4k page */
+/*
+ * Used to track subpage group valid if _PAGE_COMBO is set
+ * This overloads _PAGE_F_GIX and _PAGE_F_SECOND
  */
-#define _PAGE_F_SECOND  0x00008000 /* full page: hidx bits */
-#define _PAGE_F_GIX     0x00007000 /* full page: hidx bits */
+#define _PAGE_COMBO_VALID	(_PAGE_F_GIX | _PAGE_F_SECOND)
 
 /* PTE flags to conserve for HPTE identification */
 #define _PAGE_HPTEFLAGS (_PAGE_BUSY | _PAGE_HASHPTE | _PAGE_COMBO)
@@ -103,8 +83,7 @@ static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
 }
 
 #define __rpte_to_pte(r)	((r).pte)
-#define __rpte_sub_valid(rpte, index) \
-	(pte_val(rpte.pte) & (_PAGE_HPTE_SUB0 >> (index)))
+extern bool __rpte_sub_valid(real_pte_t rpte, unsigned long index);
 /*
  * Trick: we set __end to va + 64k, which happens to work for
  * a 16M page as well, as we want only one iteration
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 48237e66e823..2f2034621a69 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -81,7 +81,12 @@
 #define _PAGE_DIRTY		0x0080 /* C: page changed */
 #define _PAGE_ACCESSED		0x0100 /* R: page referenced */
 #define _PAGE_RW		0x0200 /* software: user write access allowed */
+#define _PAGE_HASHPTE		0x0400 /* software: pte has an associated HPTE */
 #define _PAGE_BUSY		0x0800 /* software: PTE & hash are busy */
+#define _PAGE_F_GIX		0x7000 /* full page: hidx bits */
+#define _PAGE_F_GIX_SHIFT	12
+#define _PAGE_F_SECOND		0x8000 /* Whether to use secondary hash or not */
+#define _PAGE_SPECIAL		0x10000 /* software: special page */
 
 /* No separate kernel read-only */
 #define _PAGE_KERNEL_RW		(_PAGE_RW | _PAGE_DIRTY) /* user access blocked by key */
@@ -210,11 +215,6 @@
 
 #define PMD_BAD_BITS		(PTE_TABLE_SIZE-1)
 #define PUD_BAD_BITS		(PMD_TABLE_SIZE-1)
-/*
- * We save the slot number & secondary bit in the second half of the
- * PTE page. We use the 8 bytes per each pte entry.
- */
-#define PTE_PAGE_HIDX_OFFSET (PTRS_PER_PTE * 8)
 
 #ifndef __ASSEMBLY__
 #define	pmd_bad(pmd)		(!is_kernel_addr(pmd_val(pmd)) \
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index 9ffeae2cbb57..f1b86ba63430 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -15,6 +15,35 @@
 #include <linux/mm.h>
 #include <asm/machdep.h>
 #include <asm/mmu.h>
+/*
+ * index from 0 - 15
+ */
+bool __rpte_sub_valid(real_pte_t rpte, unsigned long index)
+{
+	unsigned long g_idx;
+	unsigned long ptev = pte_val(rpte.pte);
+
+	g_idx = (ptev & _PAGE_COMBO_VALID) >> _PAGE_F_GIX_SHIFT;
+	index = index >> 2;
+	if (g_idx & (0x1 << index))
+		return true;
+	else
+		return false;
+}
+/*
+ * index from 0 - 15
+ */
+static unsigned long mark_subptegroup_valid(unsigned long ptev, unsigned long index)
+{
+	unsigned long g_idx;
+
+	if (!(ptev & _PAGE_COMBO))
+		return ptev;
+	index = index >> 2;
+	g_idx = 0x1 << index;
+
+	return ptev | (g_idx << _PAGE_F_GIX_SHIFT);
+}
 
 int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 		   pte_t *ptep, unsigned long trap, unsigned long flags,
@@ -102,7 +131,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	 */
 	if (!(old_pte & _PAGE_COMBO)) {
 		flush_hash_page(vpn, rpte, MMU_PAGE_64K, ssize, flags);
-		old_pte &= ~_PAGE_HPTE_SUB;
+		old_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);
 		goto htab_insert_hpte;
 	}
 	/*
@@ -192,7 +221,8 @@ repeat:
 	/* __real_pte use pte_val() any idea why ? FIXME!! */
 	rpte.hidx &= ~(0xfUL << (subpg_index << 2));
 	*hidxp = rpte.hidx  | (slot << (subpg_index << 2));
-	new_pte |= (_PAGE_HPTE_SUB0 >> subpg_index);
+	new_pte = mark_subptegroup_valid(new_pte, subpg_index);
+	new_pte |=  _PAGE_HASHPTE;
 	/*
 	 * check __real_pte for details on matching smp_rmb()
 	 */
diff --git a/arch/powerpc/mm/hash_low_64.S b/arch/powerpc/mm/hash_low_64.S
index 6b4d4c1d0628..359839a57f26 100644
--- a/arch/powerpc/mm/hash_low_64.S
+++ b/arch/powerpc/mm/hash_low_64.S
@@ -285,7 +285,7 @@ htab_modify_pte:
 
 	/* Secondary group ? if yes, get a inverted hash value */
 	mr	r5,r28
-	andi.	r0,r31,_PAGE_SECONDARY
+	andi.	r0,r31,_PAGE_F_SECOND
 	beq	1f
 	not	r5,r5
 1:
@@ -473,11 +473,7 @@ ht64_insert_pte:
 	lis	r0,_PAGE_HPTEFLAGS@h
 	ori	r0,r0,_PAGE_HPTEFLAGS@l
 	andc	r30,r30,r0
-#ifdef CONFIG_PPC_64K_PAGES
-	oris	r30,r30,_PAGE_HPTE_SUB0@h
-#else
 	ori	r30,r30,_PAGE_HASHPTE
-#endif
 	/* Phyical address in r5 */
 	rldicl	r5,r31,64-PTE_RPN_SHIFT,PTE_RPN_SHIFT
 	sldi	r5,r5,PAGE_SHIFT
diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
index d94b1af53a93..7584e8445512 100644
--- a/arch/powerpc/mm/hugetlbpage-hash64.c
+++ b/arch/powerpc/mm/hugetlbpage-hash64.c
@@ -91,11 +91,8 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 		pa = pte_pfn(__pte(old_pte)) << PAGE_SHIFT;
 
 		/* clear HPTE slot informations in new PTE */
-#ifdef CONFIG_PPC_64K_PAGES
-		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HPTE_SUB0;
-#else
 		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HASHPTE;
-#endif
+
 		/* Add in WIMG bits */
 		rflags |= (new_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
 				      _PAGE_COHERENT | _PAGE_GUARDED));
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index d692ae31cfc7..3967e3cce03e 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -625,7 +625,7 @@ void pmdp_splitting_flush(struct vm_area_struct *vma,
 	"1:	ldarx	%0,0,%3\n\
 		andi.	%1,%0,%6\n\
 		bne-	1b \n\
-		ori	%1,%0,%4 \n\
+		oris	%1,%0,%4@h \n\
 		stdcx.	%1,0,%3 \n\
 		bne-	1b"
 	: "=&r" (old), "=&r" (tmp), "=m" (*pmdp)
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 21/35] powerpc/mm: Remove pte_val usage for the second half of pgtable_t
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (19 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 20/35] powerpc/mm: Don't track subpage valid bit in pte_t Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 22/35] powerpc/mm: Increase the width of #define Aneesh Kumar K.V
                   ` (13 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

The second half of pgtable_t does not hold pte_t values, so read the
hidx word with a plain unsigned long load instead of going through
pte_val().

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 4 +++-
 arch/powerpc/mm/hash64_64k.c                  | 1 -
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index a268416ca4a4..9fca7fae434b 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -61,6 +61,7 @@
 static inline real_pte_t __real_pte(pte_t pte, pte_t *ptep)
 {
 	real_pte_t rpte;
+	unsigned long *hidxp;
 
 	rpte.pte = pte;
 	rpte.hidx = 0;
@@ -70,7 +71,8 @@ static inline real_pte_t __real_pte(pte_t pte, pte_t *ptep)
 		 * check. The store side ordering is done in __hash_page_4K
 		 */
 		smp_rmb();
-		rpte.hidx = pte_val(*((ptep) + PTRS_PER_PTE));
+		hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
+		rpte.hidx = *hidxp;
 	}
 	return rpte;
 }
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index f1b86ba63430..8f7328075f04 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -218,7 +218,6 @@ repeat:
	 * nobody is updating hidx.
 	 */
 	hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
-	/* __real_pte use pte_val() any idea why ? FIXME!! */
 	rpte.hidx &= ~(0xfUL << (subpg_index << 2));
 	*hidxp = rpte.hidx  | (slot << (subpg_index << 2));
 	new_pte = mark_subptegroup_valid(new_pte, subpg_index);
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 22/35] powerpc/mm: Increase the width of #define
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (20 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 21/35] powerpc/mm: Remove pte_val usage for the second half of pgtable_t Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 23/35] powerpc/mm: Convert __hash_page_64K to C Aneesh Kumar K.V
                   ` (12 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

No functional change; widen the PTE flag constants to five hex digits
so they line up with _PAGE_SPECIAL (0x10000).

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash.h | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 2f2034621a69..e4ea9d73a541 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -69,23 +69,23 @@
  * We could create separate kernel read-only if we used the 3 PP bits
  * combinations that newer processors provide but we currently don't.
  */
-#define _PAGE_PRESENT		0x0001 /* software: pte contains a translation */
-#define _PAGE_USER		0x0002 /* matches one of the PP bits */
+#define _PAGE_PRESENT		0x00001 /* software: pte contains a translation */
+#define _PAGE_USER		0x00002 /* matches one of the PP bits */
 #define _PAGE_BIT_SWAP_TYPE	2
-#define _PAGE_EXEC		0x0004 /* No execute on POWER4 and newer (we invert) */
-#define _PAGE_GUARDED		0x0008
+#define _PAGE_EXEC		0x00004 /* No execute on POWER4 and newer (we invert) */
+#define _PAGE_GUARDED		0x00008
 /* We can derive Memory coherence from _PAGE_NO_CACHE */
 #define _PAGE_COHERENT		0x0
-#define _PAGE_NO_CACHE		0x0020 /* I: cache inhibit */
-#define _PAGE_WRITETHRU		0x0040 /* W: cache write-through */
-#define _PAGE_DIRTY		0x0080 /* C: page changed */
-#define _PAGE_ACCESSED		0x0100 /* R: page referenced */
-#define _PAGE_RW		0x0200 /* software: user write access allowed */
-#define _PAGE_HASHPTE		0x0400 /* software: pte has an associated HPTE */
-#define _PAGE_BUSY		0x0800 /* software: PTE & hash are busy */
-#define _PAGE_F_GIX		0x7000 /* full page: hidx bits */
+#define _PAGE_NO_CACHE		0x00020 /* I: cache inhibit */
+#define _PAGE_WRITETHRU		0x00040 /* W: cache write-through */
+#define _PAGE_DIRTY		0x00080 /* C: page changed */
+#define _PAGE_ACCESSED		0x00100 /* R: page referenced */
+#define _PAGE_RW		0x00200 /* software: user write access allowed */
+#define _PAGE_HASHPTE		0x00400 /* software: pte has an associated HPTE */
+#define _PAGE_BUSY		0x00800 /* software: PTE & hash are busy */
+#define _PAGE_F_GIX		0x07000 /* full page: hidx bits */
 #define _PAGE_F_GIX_SHIFT	12
-#define _PAGE_F_SECOND		0x8000 /* Whether to use secondary hash or not */
+#define _PAGE_F_SECOND		0x08000 /* Whether to use secondary hash or not */
 #define _PAGE_SPECIAL		0x10000 /* software: special page */
 
 /* No separate kernel read-only */
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 23/35] powerpc/mm: Convert __hash_page_64K to C
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (21 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 22/35] powerpc/mm: Increase the width of #define Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 24/35] powerpc/mm: Convert 4k insert from asm " Aneesh Kumar K.V
                   ` (11 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

Convert from asm to C

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
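For reference, a sketch (not part of the patch) of how the slot value
ends up in the pte, assuming hpte_insert() reports the secondary group
in bit 3 of its return value as the native implementation does:

#include <stdio.h>

#define _PAGE_F_GIX_SHIFT	12
#define _PAGE_F_GIX		0x07000UL
#define _PAGE_F_SECOND		0x08000UL

int main(void)
{
	/* e.g. hpte_insert() picked slot 3 in the secondary group */
	unsigned long slot = (1UL << 3) | 3;
	unsigned long pte_bits;

	pte_bits = (slot << _PAGE_F_GIX_SHIFT) &
		   (_PAGE_F_SECOND | _PAGE_F_GIX);
	printf("group index = %lu, secondary = %d\n",
	       (pte_bits & _PAGE_F_GIX) >> _PAGE_F_GIX_SHIFT,
	       (pte_bits & _PAGE_F_SECOND) != 0);
	return 0;
}
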
 arch/powerpc/include/asm/book3s/64/hash-64k.h |   3 +-
 arch/powerpc/mm/hash64_64k.c                  | 130 ++++++++++++
 arch/powerpc/mm/hash_low_64.S                 | 290 +-------------------------
 arch/powerpc/mm/hash_utils_64.c               |  19 +-
 4 files changed, 134 insertions(+), 308 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 9fca7fae434b..46b5d0ab11de 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -40,7 +40,8 @@
 #define _PAGE_COMBO_VALID	(_PAGE_F_GIX | _PAGE_F_SECOND)
 
 /* PTE flags to conserve for HPTE identification */
-#define _PAGE_HPTEFLAGS (_PAGE_BUSY | _PAGE_HASHPTE | _PAGE_COMBO)
+#define _PAGE_HPTEFLAGS (_PAGE_BUSY | _PAGE_F_SECOND | \
+			 _PAGE_F_GIX | _PAGE_HASHPTE | _PAGE_COMBO)
 
 /* Shift to put page number into pte.
  *
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index 8f7328075f04..fc0898eb309d 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -229,3 +229,133 @@ repeat:
 	*ptep = __pte(new_pte & ~_PAGE_BUSY);
 	return 0;
 }
+
+int __hash_page_64K(unsigned long ea, unsigned long access,
+		    unsigned long vsid, pte_t *ptep, unsigned long trap,
+		    unsigned long flags, int ssize)
+{
+
+	unsigned long hpte_group;
+	unsigned long rflags, pa;
+	unsigned long old_pte, new_pte;
+	unsigned long vpn, hash, slot;
+	unsigned long shift = mmu_psize_defs[MMU_PAGE_64K].shift;
+
+	/*
+	 * atomically mark the linux large page PTE busy and dirty
+	 */
+	do {
+		pte_t pte = READ_ONCE(*ptep);
+
+		old_pte = pte_val(pte);
+		/* If PTE busy, retry the access */
+		if (unlikely(old_pte & _PAGE_BUSY))
+			return 0;
+		/* If PTE permissions don't match, take page fault */
+		if (unlikely(access & ~old_pte))
+			return 1;
+		/*
+		 * Check if PTE has the cache-inhibit bit set
+		 * If so, bail out and refault as a 4k page
+		 */
+		if (!mmu_has_feature(MMU_FTR_CI_LARGE_PAGE) &&
+		    unlikely(old_pte & _PAGE_NO_CACHE))
+			return 0;
+		/*
+		 * Try to lock the PTE, add ACCESSED and DIRTY if it was
+		 * a write access.
+		 */
+		new_pte = old_pte | _PAGE_BUSY | _PAGE_ACCESSED;
+		if (access & _PAGE_RW)
+			new_pte |= _PAGE_DIRTY;
+	} while (old_pte != __cmpxchg_u64((unsigned long *)ptep,
+					  old_pte, new_pte));
+	/*
+	 * PP bits. _PAGE_USER is already PP bit 0x2, so we only
+	 * need to add in 0x1 if it's a read-only user page
+	 */
+	rflags = new_pte & _PAGE_USER;
+	if ((new_pte & _PAGE_USER) && !((new_pte & _PAGE_RW) &&
+					(new_pte & _PAGE_DIRTY)))
+		rflags |= 0x1;
+	/*
+	 * _PAGE_EXEC -> HW_NO_EXEC since it's inverted
+	 */
+	rflags |= ((new_pte & _PAGE_EXEC) ? 0 : HPTE_R_N);
+	/*
+	 * Always add C and Memory coherence bit
+	 */
+	rflags |= HPTE_R_C | HPTE_R_M;
+	/*
+	 * Add in WIMG bits
+	 */
+	rflags |= (new_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
+				_PAGE_COHERENT | _PAGE_GUARDED));
+
+	if (!cpu_has_feature(CPU_FTR_NOEXECUTE) &&
+	    !cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
+		rflags = hash_page_do_lazy_icache(rflags, __pte(old_pte), trap);
+
+	vpn  = hpt_vpn(ea, vsid, ssize);
+	if (unlikely(old_pte & _PAGE_HASHPTE)) {
+		/*
+		 * There MIGHT be an HPTE for this pte
+		 */
+		hash = hpt_hash(vpn, shift, ssize);
+		if (old_pte & _PAGE_F_SECOND)
+			hash = ~hash;
+		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
+		slot += (old_pte & _PAGE_F_GIX) >> _PAGE_F_GIX_SHIFT;
+
+		if (ppc_md.hpte_updatepp(slot, rflags, vpn, MMU_PAGE_64K,
+					 MMU_PAGE_64K, ssize, flags) == -1)
+			old_pte &= ~_PAGE_HPTEFLAGS;
+	}
+
+	if (likely(!(old_pte & _PAGE_HASHPTE))) {
+
+		pa = pte_pfn(__pte(old_pte)) << PAGE_SHIFT;
+		hash = hpt_hash(vpn, shift, ssize);
+
+repeat:
+		hpte_group = ((hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL;
+
+		/* Insert into the hash table, primary slot */
+		slot = ppc_md.hpte_insert(hpte_group, vpn, pa, rflags, 0,
+				  MMU_PAGE_64K, MMU_PAGE_64K, ssize);
+		/*
+		 * Primary is full, try the secondary
+		 */
+		if (unlikely(slot == -1)) {
+			hpte_group = ((~hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL;
+			slot = ppc_md.hpte_insert(hpte_group, vpn, pa,
+						  rflags, HPTE_V_SECONDARY,
+						  MMU_PAGE_64K, MMU_PAGE_64K, ssize);
+			if (slot == -1) {
+				if (mftb() & 0x1)
+					hpte_group = ((hash & htab_hash_mask) *
+						      HPTES_PER_GROUP) & ~0x7UL;
+				ppc_md.hpte_remove(hpte_group);
+				/*
+				 * FIXME!! Should we retry the group from which we removed?
+				 */
+				goto repeat;
+			}
+		}
+		/*
+		 * Hypervisor failure. Restore the old pte and return -1,
+		 * similar to __hash_page_*
+		 */
+		if (unlikely(slot == -2)) {
+			*ptep = __pte(old_pte);
+			hash_failure_debug(ea, access, vsid, trap, ssize,
+					   MMU_PAGE_64K, MMU_PAGE_64K, old_pte);
+			return -1;
+		}
+		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HASHPTE;
+		new_pte |= (slot << _PAGE_F_GIX_SHIFT) & (_PAGE_F_SECOND | _PAGE_F_GIX);
+	}
+	*ptep = __pte(new_pte & ~_PAGE_BUSY);
+	return 0;
+}
diff --git a/arch/powerpc/mm/hash_low_64.S b/arch/powerpc/mm/hash_low_64.S
index 359839a57f26..f7d49cf0ccb7 100644
--- a/arch/powerpc/mm/hash_low_64.S
+++ b/arch/powerpc/mm/hash_low_64.S
@@ -328,292 +328,4 @@ htab_pte_insert_failure:
 	li	r3,-1
 	b	htab_bail
 
-#else /* CONFIG_PPC_64K_PAGES */
-
-/*****************************************************************************
- *                                                                           *
- *           64K SW & 64K HW in a 64K segment pages implementation           *
- *                                                                           *
- *****************************************************************************/
-
-_GLOBAL(__hash_page_64K)
-	mflr	r0
-	std	r0,16(r1)
-	stdu	r1,-STACKFRAMESIZE(r1)
-	/* Save all params that we need after a function call */
-	std	r6,STK_PARAM(R6)(r1)
-	std	r8,STK_PARAM(R8)(r1)
-	std	r9,STK_PARAM(R9)(r1)
-
-	/* Save non-volatile registers.
-	 * r31 will hold "old PTE"
-	 * r30 is "new PTE"
-	 * r29 is vpn
-	 * r28 is a hash value
-	 * r27 is hashtab mask (maybe dynamic patched instead ?)
-	 */
-	std	r27,STK_REG(R27)(r1)
-	std	r28,STK_REG(R28)(r1)
-	std	r29,STK_REG(R29)(r1)
-	std	r30,STK_REG(R30)(r1)
-	std	r31,STK_REG(R31)(r1)
-
-	/* Step 1:
-	 *
-	 * Check permissions, atomically mark the linux PTE busy
-	 * and hashed.
-	 */
-1:
-	ldarx	r31,0,r6
-	/* Check access rights (access & ~(pte_val(*ptep))) */
-	andc.	r0,r4,r31
-	bne-	ht64_wrong_access
-	/* Check if PTE is busy */
-	andi.	r0,r31,_PAGE_BUSY
-	/* If so, just bail out and refault if needed. Someone else
-	 * is changing this PTE anyway and might hash it.
-	 */
-	bne-	ht64_bail_ok
-BEGIN_FTR_SECTION
-	/* Check if PTE has the cache-inhibit bit set */
-	andi.	r0,r31,_PAGE_NO_CACHE
-	/* If so, bail out and refault as a 4k page */
-	bne-	ht64_bail_ok
-END_MMU_FTR_SECTION_IFCLR(MMU_FTR_CI_LARGE_PAGE)
-	/* Prepare new PTE value (turn access RW into DIRTY, then
-	 * add BUSY and ACCESSED)
-	 */
-	rlwinm	r30,r4,32-9+7,31-7,31-7	/* _PAGE_RW -> _PAGE_DIRTY */
-	or	r30,r30,r31
-	ori	r30,r30,_PAGE_BUSY | _PAGE_ACCESSED
-	/* Write the linux PTE atomically (setting busy) */
-	stdcx.	r30,0,r6
-	bne-	1b
-	isync
-
-	/* Step 2:
-	 *
-	 * Insert/Update the HPTE in the hash table. At this point,
-	 * r4 (access) is re-useable, we use it for the new HPTE flags
-	 */
-
-BEGIN_FTR_SECTION
-	cmpdi	r9,0			/* check segment size */
-	bne	3f
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_1T_SEGMENT)
-	/* Calc vpn and put it in r29 */
-	sldi	r29,r5,SID_SHIFT - VPN_SHIFT
-	rldicl  r28,r3,64 - VPN_SHIFT,64 - (SID_SHIFT - VPN_SHIFT)
-	or	r29,r28,r29
-
-	/* Calculate hash value for primary slot and store it in r28
-	 * r3 = va, r5 = vsid
-	 * r0 = (va >> 16) & ((1ul << (28 - 16)) -1)
-	 */
-	rldicl	r0,r3,64-16,52
-	xor	r28,r5,r0		/* hash */
-	b	4f
-
-3:	/* Calc vpn and put it in r29 */
-	sldi	r29,r5,SID_SHIFT_1T - VPN_SHIFT
-	rldicl  r28,r3,64 - VPN_SHIFT,64 - (SID_SHIFT_1T - VPN_SHIFT)
-	or	r29,r28,r29
-	/*
-	 * calculate hash value for primary slot and
-	 * store it in r28 for 1T segment
-	 * r3 = va, r5 = vsid
-	 */
-	sldi	r28,r5,25		/* vsid << 25 */
-	/* r0 = (va >> 16) & ((1ul << (40 - 16)) -1) */
-	rldicl	r0,r3,64-16,40
-	xor	r28,r28,r5		/* vsid ^ ( vsid << 25) */
-	xor	r28,r28,r0		/* hash */
-
-	/* Convert linux PTE bits into HW equivalents */
-4:	andi.	r3,r30,0x1fe		/* Get basic set of flags */
-	xori	r3,r3,HPTE_R_N		/* _PAGE_EXEC -> NOEXEC */
-	rlwinm	r0,r30,32-9+1,30,30	/* _PAGE_RW -> _PAGE_USER (r0) */
-	rlwinm	r4,r30,32-7+1,30,30	/* _PAGE_DIRTY -> _PAGE_USER (r4) */
-	and	r0,r0,r4		/* _PAGE_RW & _PAGE_DIRTY ->r0 bit 30*/
-	andc	r0,r30,r0		/* r0 = pte & ~r0 */
-	rlwimi	r3,r0,32-1,31,31	/* Insert result into PP lsb */
-	/*
-	 * Always add "C" bit for perf. Memory coherence is always enabled
-	 */
-	ori	r3,r3,HPTE_R_C | HPTE_R_M
-
-	/* We eventually do the icache sync here (maybe inline that
-	 * code rather than call a C function...)
-	 */
-BEGIN_FTR_SECTION
-	mr	r4,r30
-	mr	r5,r7
-	bl	hash_page_do_lazy_icache
-END_FTR_SECTION(CPU_FTR_NOEXECUTE|CPU_FTR_COHERENT_ICACHE, CPU_FTR_NOEXECUTE)
-
-	/* At this point, r3 contains new PP bits, save them in
-	 * place of "access" in the param area (sic)
-	 */
-	std	r3,STK_PARAM(R4)(r1)
-
-	/* Get htab_hash_mask */
-	ld	r4,htab_hash_mask@got(2)
-	ld	r27,0(r4)	/* htab_hash_mask -> r27 */
-
-	/* Check if we may already be in the hashtable, in this case, we
-	 * go to out-of-line code to try to modify the HPTE
-	 */
-	rldicl.	r0,r31,64-12,48
-	bne	ht64_modify_pte
-
-ht64_insert_pte:
-	/* Clear hpte bits in new pte (we also clear BUSY btw) and
-	 * add _PAGE_HPTE_SUB0
-	 */
-	lis	r0,_PAGE_HPTEFLAGS@h
-	ori	r0,r0,_PAGE_HPTEFLAGS@l
-	andc	r30,r30,r0
-	ori	r30,r30,_PAGE_HASHPTE
-	/* Phyical address in r5 */
-	rldicl	r5,r31,64-PTE_RPN_SHIFT,PTE_RPN_SHIFT
-	sldi	r5,r5,PAGE_SHIFT
-
-	/* Calculate primary group hash */
-	and	r0,r28,r27
-	rldicr	r3,r0,3,63-3	/* r0 = (hash & mask) << 3 */
-
-	/* Call ppc_md.hpte_insert */
-	ld	r6,STK_PARAM(R4)(r1)	/* Retrieve new pp bits */
-	mr	r4,r29			/* Retrieve vpn */
-	li	r7,0			/* !bolted, !secondary */
-	li	r8,MMU_PAGE_64K
-	li	r9,MMU_PAGE_64K		/* actual page size */
-	ld	r10,STK_PARAM(R9)(r1)	/* segment size */
-.globl ht64_call_hpte_insert1
-ht64_call_hpte_insert1:
-	bl	.			/* patched by htab_finish_init() */
-	cmpdi	0,r3,0
-	bge	ht64_pte_insert_ok	/* Insertion successful */
-	cmpdi	0,r3,-2			/* Critical failure */
-	beq-	ht64_pte_insert_failure
-
-	/* Now try secondary slot */
-
-	/* Phyical address in r5 */
-	rldicl	r5,r31,64-PTE_RPN_SHIFT,PTE_RPN_SHIFT
-	sldi	r5,r5,PAGE_SHIFT
-
-	/* Calculate secondary group hash */
-	andc	r0,r27,r28
-	rldicr	r3,r0,3,63-3	/* r0 = (~hash & mask) << 3 */
-
-	/* Call ppc_md.hpte_insert */
-	ld	r6,STK_PARAM(R4)(r1)	/* Retrieve new pp bits */
-	mr	r4,r29			/* Retrieve vpn */
-	li	r7,HPTE_V_SECONDARY	/* !bolted, secondary */
-	li	r8,MMU_PAGE_64K
-	li	r9,MMU_PAGE_64K		/* actual page size */
-	ld	r10,STK_PARAM(R9)(r1)	/* segment size */
-.globl ht64_call_hpte_insert2
-ht64_call_hpte_insert2:
-	bl	.			/* patched by htab_finish_init() */
-	cmpdi	0,r3,0
-	bge+	ht64_pte_insert_ok	/* Insertion successful */
-	cmpdi	0,r3,-2			/* Critical failure */
-	beq-	ht64_pte_insert_failure
-
-	/* Both are full, we need to evict something */
-	mftb	r0
-	/* Pick a random group based on TB */
-	andi.	r0,r0,1
-	mr	r5,r28
-	bne	2f
-	not	r5,r5
-2:	and	r0,r5,r27
-	rldicr	r3,r0,3,63-3	/* r0 = (hash & mask) << 3 */
-	/* Call ppc_md.hpte_remove */
-.globl ht64_call_hpte_remove
-ht64_call_hpte_remove:
-	bl	.			/* patched by htab_finish_init() */
-
-	/* Try all again */
-	b	ht64_insert_pte
-
-ht64_bail_ok:
-	li	r3,0
-	b	ht64_bail
-
-ht64_pte_insert_ok:
-	/* Insert slot number & secondary bit in PTE */
-	rldimi	r30,r3,12,63-15
-
-	/* Write out the PTE with a normal write
-	 * (maybe add eieio may be good still ?)
-	 */
-ht64_write_out_pte:
-	ld	r6,STK_PARAM(R6)(r1)
-	std	r30,0(r6)
-	li	r3, 0
-ht64_bail:
-	ld	r27,STK_REG(R27)(r1)
-	ld	r28,STK_REG(R28)(r1)
-	ld	r29,STK_REG(R29)(r1)
-	ld      r30,STK_REG(R30)(r1)
-	ld      r31,STK_REG(R31)(r1)
-	addi    r1,r1,STACKFRAMESIZE
-	ld      r0,16(r1)
-	mtlr    r0
-	blr
-
-ht64_modify_pte:
-	/* Keep PP bits in r4 and slot idx from the PTE around in r3 */
-	mr	r4,r3
-	rlwinm	r3,r31,32-12,29,31
-
-	/* Secondary group ? if yes, get a inverted hash value */
-	mr	r5,r28
-	andi.	r0,r31,_PAGE_F_SECOND
-	beq	1f
-	not	r5,r5
-1:
-	/* Calculate proper slot value for ppc_md.hpte_updatepp */
-	and	r0,r5,r27
-	rldicr	r0,r0,3,63-3	/* r0 = (hash & mask) << 3 */
-	add	r3,r0,r3	/* add slot idx */
-
-	/* Call ppc_md.hpte_updatepp */
-	mr	r5,r29			/* vpn */
-	li	r6,MMU_PAGE_64K		/* base page size */
-	li	r7,MMU_PAGE_64K		/* actual page size */
-	ld	r8,STK_PARAM(R9)(r1)	/* segment size */
-	ld	r9,STK_PARAM(R8)(r1)	/* get "flags" param */
-.globl ht64_call_hpte_updatepp
-ht64_call_hpte_updatepp:
-	bl	.			/* patched by htab_finish_init() */
-
-	/* if we failed because typically the HPTE wasn't really here
-	 * we try an insertion.
-	 */
-	cmpdi	0,r3,-1
-	beq-	ht64_insert_pte
-
-	/* Clear the BUSY bit and Write out the PTE */
-	li	r0,_PAGE_BUSY
-	andc	r30,r30,r0
-	b	ht64_write_out_pte
-
-ht64_wrong_access:
-	/* Bail out clearing reservation */
-	stdcx.	r31,0,r6
-	li	r3,1
-	b	ht64_bail
-
-ht64_pte_insert_failure:
-	/* Bail out restoring old PTE */
-	ld	r6,STK_PARAM(R6)(r1)
-	std	r31,0(r6)
-	li	r3,-1
-	b	ht64_bail
-
-
-#endif /* CONFIG_PPC_64K_PAGES */
+#endif
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 12eff232840b..60ad15d76052 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -633,28 +633,11 @@ extern u32 htab_call_hpte_insert1[];
 extern u32 htab_call_hpte_insert2[];
 extern u32 htab_call_hpte_remove[];
 extern u32 htab_call_hpte_updatepp[];
-extern u32 ht64_call_hpte_insert1[];
-extern u32 ht64_call_hpte_insert2[];
-extern u32 ht64_call_hpte_remove[];
-extern u32 ht64_call_hpte_updatepp[];
 
 static void __init htab_finish_init(void)
 {
-#ifdef CONFIG_PPC_64K_PAGES
-	patch_branch(ht64_call_hpte_insert1,
-		ppc_function_entry(ppc_md.hpte_insert),
-		BRANCH_SET_LINK);
-	patch_branch(ht64_call_hpte_insert2,
-		ppc_function_entry(ppc_md.hpte_insert),
-		BRANCH_SET_LINK);
-	patch_branch(ht64_call_hpte_remove,
-		ppc_function_entry(ppc_md.hpte_remove),
-		BRANCH_SET_LINK);
-	patch_branch(ht64_call_hpte_updatepp,
-		ppc_function_entry(ppc_md.hpte_updatepp),
-		BRANCH_SET_LINK);
-#else /* !CONFIG_PPC_64K_PAGES */
 
+#ifdef CONFIG_PPC_4K_PAGES
 	patch_branch(htab_call_hpte_insert1,
 		ppc_function_entry(ppc_md.hpte_insert),
 		BRANCH_SET_LINK);
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 24/35] powerpc/mm: Convert 4k insert from asm to C
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (22 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 23/35] powerpc/mm: Convert __hash_page_64K to C Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 25/35] powerpc/mm: Add helper for converting pte bit to hpte bits Aneesh Kumar K.V
                   ` (10 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

This is similar to the 64K insert; we may want to consolidate the two
paths later.

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
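Both paths compute their PTE group the same way; a sketch (not part of
the patch, with htab_hash_mask given an arbitrary example value):

#include <stdio.h>

#define HPTES_PER_GROUP	8UL

/* first slot of the 8-slot PTE group selected by a hash value */
static unsigned long pteg_slot(unsigned long hash, unsigned long mask)
{
	return ((hash & mask) * HPTES_PER_GROUP) & ~0x7UL;
}

int main(void)
{
	unsigned long htab_hash_mask = (1UL << 20) - 1;	/* example only */
	unsigned long hash = 0x123456789abcdefUL;

	printf("primary   group slot: 0x%lx\n",
	       pteg_slot(hash, htab_hash_mask));
	/* the secondary attempt complements the hash first */
	printf("secondary group slot: 0x%lx\n",
	       pteg_slot(~hash, htab_hash_mask));
	return 0;
}

If both groups are full, the code evicts a slot from one of them,
picked by mftb() & 0x1, and retries.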
 arch/powerpc/mm/Makefile        |   6 +-
 arch/powerpc/mm/hash64_4k.c     | 139 +++++++++++++++++
 arch/powerpc/mm/hash_low_64.S   | 331 ----------------------------------------
 arch/powerpc/mm/hash_utils_64.c |  26 ----
 4 files changed, 142 insertions(+), 360 deletions(-)
 create mode 100644 arch/powerpc/mm/hash64_4k.c
 delete mode 100644 arch/powerpc/mm/hash_low_64.S

diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
index f80ad1a76cc8..1ffeda85c086 100644
--- a/arch/powerpc/mm/Makefile
+++ b/arch/powerpc/mm/Makefile
@@ -14,11 +14,11 @@ obj-$(CONFIG_PPC_MMU_NOHASH)	+= mmu_context_nohash.o tlb_nohash.o \
 obj-$(CONFIG_PPC_BOOK3E)	+= tlb_low_$(CONFIG_WORD_SIZE)e.o
 hash64-$(CONFIG_PPC_NATIVE)	:= hash_native_64.o
 obj-$(CONFIG_PPC_STD_MMU_64)	+= hash_utils_64.o slb_low.o slb.o $(hash64-y)
-obj-$(CONFIG_PPC_STD_MMU_32)	+= ppc_mmu_32.o
-obj-$(CONFIG_PPC_STD_MMU)	+= hash_low_$(CONFIG_WORD_SIZE).o \
-				   tlb_hash$(CONFIG_WORD_SIZE).o \
+obj-$(CONFIG_PPC_STD_MMU_32)	+= ppc_mmu_32.o hash_low_32.o
+obj-$(CONFIG_PPC_STD_MMU)	+= tlb_hash$(CONFIG_WORD_SIZE).o \
 				   mmu_context_hash$(CONFIG_WORD_SIZE).o
 ifeq ($(CONFIG_PPC_STD_MMU_64),y)
+obj-$(CONFIG_PPC_4K_PAGES)	+= hash64_4k.o
 obj-$(CONFIG_PPC_64K_PAGES)	+= hash64_64k.o
 endif
 obj-$(CONFIG_PPC_ICSWX)		+= icswx.o
diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
new file mode 100644
index 000000000000..3b49c6f18741
--- /dev/null
+++ b/arch/powerpc/mm/hash64_4k.c
@@ -0,0 +1,139 @@
+/*
+ * Copyright IBM Corporation, 2015
+ * Author Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2 of the GNU Lesser General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ *
+ */
+
+#include <linux/mm.h>
+#include <asm/machdep.h>
+#include <asm/mmu.h>
+
+int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
+		   pte_t *ptep, unsigned long trap, unsigned long flags,
+		   int ssize, int subpg_prot)
+{
+	unsigned long hpte_group;
+	unsigned long rflags, pa;
+	unsigned long old_pte, new_pte;
+	unsigned long vpn, hash, slot;
+	unsigned long shift = mmu_psize_defs[MMU_PAGE_4K].shift;
+
+	/*
+	 * atomically mark the linux large page PTE busy and dirty
+	 */
+	do {
+		pte_t pte = READ_ONCE(*ptep);
+
+		old_pte = pte_val(pte);
+		/* If PTE busy, retry the access */
+		if (unlikely(old_pte & _PAGE_BUSY))
+			return 0;
+		/* If PTE permissions don't match, take page fault */
+		if (unlikely(access & ~old_pte))
+			return 1;
+		/*
+		 * Try to lock the PTE, add ACCESSED and DIRTY if it was
+		 * a write access.
+		 */
+		new_pte = old_pte | _PAGE_BUSY | _PAGE_ACCESSED | _PAGE_HASHPTE;
+		if (access & _PAGE_RW)
+			new_pte |= _PAGE_DIRTY;
+	} while (old_pte != __cmpxchg_u64((unsigned long *)ptep,
+					  old_pte, new_pte));
+	/*
+	 * PP bits. _PAGE_USER is already PP bit 0x2, so we only
+	 * need to add in 0x1 if it's a read-only user page
+	 */
+	rflags = new_pte & _PAGE_USER;
+	if ((new_pte & _PAGE_USER) && !((new_pte & _PAGE_RW) &&
+					(new_pte & _PAGE_DIRTY)))
+		rflags |= 0x1;
+	/*
+	 * _PAGE_EXEC -> HW_NO_EXEC since it's inverted
+	 */
+	rflags |= ((new_pte & _PAGE_EXEC) ? 0 : HPTE_R_N);
+	/*
+	 * Always add C and Memory coherence bit
+	 */
+	rflags |= HPTE_R_C | HPTE_R_M;
+	/*
+	 * Add in WIMG bits
+	 */
+	rflags |= (new_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
+				_PAGE_COHERENT | _PAGE_GUARDED));
+
+	if (!cpu_has_feature(CPU_FTR_NOEXECUTE) &&
+	    !cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
+		rflags = hash_page_do_lazy_icache(rflags, __pte(old_pte), trap);
+
+	vpn  = hpt_vpn(ea, vsid, ssize);
+	if (unlikely(old_pte & _PAGE_HASHPTE)) {
+		/*
+		 * There MIGHT be an HPTE for this pte
+		 */
+		hash = hpt_hash(vpn, shift, ssize);
+		if (old_pte & _PAGE_F_SECOND)
+			hash = ~hash;
+		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
+		slot += (old_pte & _PAGE_F_GIX) >> _PAGE_F_GIX_SHIFT;
+
+		if (ppc_md.hpte_updatepp(slot, rflags, vpn, MMU_PAGE_4K,
+					 MMU_PAGE_4K, ssize, flags) == -1)
+			old_pte &= ~_PAGE_HPTEFLAGS;
+	}
+
+	if (likely(!(old_pte & _PAGE_HASHPTE))) {
+
+		pa = pte_pfn(__pte(old_pte)) << PAGE_SHIFT;
+		hash = hpt_hash(vpn, shift, ssize);
+
+repeat:
+		hpte_group = ((hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL;
+
+		/* Insert into the hash table, primary slot */
+		slot = ppc_md.hpte_insert(hpte_group, vpn, pa, rflags, 0,
+				  MMU_PAGE_4K, MMU_PAGE_4K, ssize);
+		/*
+		 * Primary is full, try the secondary
+		 */
+		if (unlikely(slot == -1)) {
+			hpte_group = ((~hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL;
+			slot = ppc_md.hpte_insert(hpte_group, vpn, pa,
+						  rflags, HPTE_V_SECONDARY,
+						  MMU_PAGE_4K, MMU_PAGE_4K, ssize);
+			if (slot == -1) {
+				if (mftb() & 0x1)
+					hpte_group = ((hash & htab_hash_mask) *
+						      HPTES_PER_GROUP) & ~0x7UL;
+				ppc_md.hpte_remove(hpte_group);
+				/*
+				 * FIXME!! Should we retry the group from which we removed?
+				 */
+				goto repeat;
+			}
+		}
+		/*
+		 * Hypervisor failure. Restore old pte and return -1
+		 * similar to __hash_page_*
+		 */
+		if (unlikely(slot == -2)) {
+			*ptep = __pte(old_pte);
+			hash_failure_debug(ea, access, vsid, trap, ssize,
+					   MMU_PAGE_4K, MMU_PAGE_4K, old_pte);
+			return -1;
+		}
+		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HASHPTE;
+		new_pte |= (slot << _PAGE_F_GIX_SHIFT) & (_PAGE_F_SECOND | _PAGE_F_GIX);
+	}
+	*ptep = __pte(new_pte & ~_PAGE_BUSY);
+	return 0;
+}
diff --git a/arch/powerpc/mm/hash_low_64.S b/arch/powerpc/mm/hash_low_64.S
deleted file mode 100644
index f7d49cf0ccb7..000000000000
--- a/arch/powerpc/mm/hash_low_64.S
+++ /dev/null
@@ -1,331 +0,0 @@
-/*
- * ppc64 MMU hashtable management routines
- *
- * (c) Copyright IBM Corp. 2003, 2005
- *
- * Maintained by: Benjamin Herrenschmidt
- *                <benh@kernel.crashing.org>
- *
- * This file is covered by the GNU Public Licence v2 as
- * described in the kernel's COPYING file.
- */
-
-#include <asm/reg.h>
-#include <asm/pgtable.h>
-#include <asm/mmu.h>
-#include <asm/page.h>
-#include <asm/types.h>
-#include <asm/ppc_asm.h>
-#include <asm/asm-offsets.h>
-#include <asm/cputable.h>
-
-	.text
-
-/*
- * Stackframe:
- *		
- *         +-> Back chain			(SP + 256)
- *         |   General register save area	(SP + 112)
- *         |   Parameter save area		(SP + 48)
- *         |   TOC save area			(SP + 40)
- *         |   link editor doubleword		(SP + 32)
- *         |   compiler doubleword		(SP + 24)
- *         |   LR save area			(SP + 16)
- *         |   CR save area			(SP + 8)
- * SP ---> +-- Back chain			(SP + 0)
- */
-
-#ifndef CONFIG_PPC_64K_PAGES
-
-/*****************************************************************************
- *                                                                           *
- *           4K SW & 4K HW pages implementation                              *
- *                                                                           *
- *****************************************************************************/
-
-
-/*
- * _hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
- *		 pte_t *ptep, unsigned long trap, unsigned long flags,
- *		 int ssize)
- *
- * Adds a 4K page to the hash table in a segment of 4K pages only
- */
-
-_GLOBAL(__hash_page_4K)
-	mflr	r0
-	std	r0,16(r1)
-	stdu	r1,-STACKFRAMESIZE(r1)
-	/* Save all params that we need after a function call */
-	std	r6,STK_PARAM(R6)(r1)
-	std	r8,STK_PARAM(R8)(r1)
-	std	r9,STK_PARAM(R9)(r1)
-	
-	/* Save non-volatile registers.
-	 * r31 will hold "old PTE"
-	 * r30 is "new PTE"
-	 * r29 is vpn
-	 * r28 is a hash value
-	 * r27 is hashtab mask (maybe dynamic patched instead ?)
-	 */
-	std	r27,STK_REG(R27)(r1)
-	std	r28,STK_REG(R28)(r1)
-	std	r29,STK_REG(R29)(r1)
-	std	r30,STK_REG(R30)(r1)
-	std	r31,STK_REG(R31)(r1)
-	
-	/* Step 1:
-	 *
-	 * Check permissions, atomically mark the linux PTE busy
-	 * and hashed.
-	 */ 
-1:
-	ldarx	r31,0,r6
-	/* Check access rights (access & ~(pte_val(*ptep))) */
-	andc.	r0,r4,r31
-	bne-	htab_wrong_access
-	/* Check if PTE is busy */
-	andi.	r0,r31,_PAGE_BUSY
-	/* If so, just bail out and refault if needed. Someone else
-	 * is changing this PTE anyway and might hash it.
-	 */
-	bne-	htab_bail_ok
-
-	/* Prepare new PTE value (turn access RW into DIRTY, then
-	 * add BUSY,HASHPTE and ACCESSED)
-	 */
-	rlwinm	r30,r4,32-9+7,31-7,31-7	/* _PAGE_RW -> _PAGE_DIRTY */
-	or	r30,r30,r31
-	ori	r30,r30,_PAGE_BUSY | _PAGE_ACCESSED | _PAGE_HASHPTE
-	/* Write the linux PTE atomically (setting busy) */
-	stdcx.	r30,0,r6
-	bne-	1b
-	isync
-
-	/* Step 2:
-	 *
-	 * Insert/Update the HPTE in the hash table. At this point,
-	 * r4 (access) is re-useable, we use it for the new HPTE flags
-	 */
-
-BEGIN_FTR_SECTION
-	cmpdi	r9,0			/* check segment size */
-	bne	3f
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_1T_SEGMENT)
-	/* Calc vpn and put it in r29 */
-	sldi	r29,r5,SID_SHIFT - VPN_SHIFT
-	rldicl  r28,r3,64 - VPN_SHIFT,64 - (SID_SHIFT - VPN_SHIFT)
-	or	r29,r28,r29
-	/*
-	 * Calculate hash value for primary slot and store it in r28
-	 * r3 = va, r5 = vsid
-	 * r0 = (va >> 12) & ((1ul << (28 - 12)) -1)
-	 */
-	rldicl	r0,r3,64-12,48
-	xor	r28,r5,r0		/* hash */
-	b	4f
-
-3:	/* Calc vpn and put it in r29 */
-	sldi	r29,r5,SID_SHIFT_1T - VPN_SHIFT
-	rldicl  r28,r3,64 - VPN_SHIFT,64 - (SID_SHIFT_1T - VPN_SHIFT)
-	or	r29,r28,r29
-
-	/*
-	 * calculate hash value for primary slot and
-	 * store it in r28 for 1T segment
-	 * r3 = va, r5 = vsid
-	 */
-	sldi	r28,r5,25		/* vsid << 25 */
-	/* r0 =  (va >> 12) & ((1ul << (40 - 12)) -1) */
-	rldicl	r0,r3,64-12,36
-	xor	r28,r28,r5		/* vsid ^ ( vsid << 25) */
-	xor	r28,r28,r0		/* hash */
-
-	/* Convert linux PTE bits into HW equivalents */
-4:	andi.	r3,r30,0x1fe		/* Get basic set of flags */
-	xori	r3,r3,HPTE_R_N		/* _PAGE_EXEC -> NOEXEC */
-	rlwinm	r0,r30,32-9+1,30,30	/* _PAGE_RW -> _PAGE_USER (r0) */
-	rlwinm	r4,r30,32-7+1,30,30	/* _PAGE_DIRTY -> _PAGE_USER (r4) */
-	and	r0,r0,r4		/* _PAGE_RW & _PAGE_DIRTY ->r0 bit 30*/
-	andc	r0,r30,r0		/* r0 = pte & ~r0 */
-	rlwimi	r3,r0,32-1,31,31	/* Insert result into PP lsb */
-	/*
-	 * Always add "C" bit for perf. Memory coherence is always enabled
-	 */
-	ori	r3,r3,HPTE_R_C | HPTE_R_M
-
-	/* We eventually do the icache sync here (maybe inline that
-	 * code rather than call a C function...) 
-	 */
-BEGIN_FTR_SECTION
-	mr	r4,r30
-	mr	r5,r7
-	bl	hash_page_do_lazy_icache
-END_FTR_SECTION(CPU_FTR_NOEXECUTE|CPU_FTR_COHERENT_ICACHE, CPU_FTR_NOEXECUTE)
-
-	/* At this point, r3 contains new PP bits, save them in
-	 * place of "access" in the param area (sic)
-	 */
-	std	r3,STK_PARAM(R4)(r1)
-
-	/* Get htab_hash_mask */
-	ld	r4,htab_hash_mask@got(2)
-	ld	r27,0(r4)	/* htab_hash_mask -> r27 */
-
-	/* Check if we may already be in the hashtable, in this case, we
-	 * go to out-of-line code to try to modify the HPTE
-	 */
-	andi.	r0,r31,_PAGE_HASHPTE
-	bne	htab_modify_pte
-
-htab_insert_pte:
-	/* Clear hpte bits in new pte (we also clear BUSY btw) and
-	 * add _PAGE_HASHPTE
-	 */
-	lis	r0,_PAGE_HPTEFLAGS@h
-	ori	r0,r0,_PAGE_HPTEFLAGS@l
-	andc	r30,r30,r0
-	ori	r30,r30,_PAGE_HASHPTE
-
-	/* physical address r5 */
-	rldicl	r5,r31,64-PTE_RPN_SHIFT,PTE_RPN_SHIFT
-	sldi	r5,r5,PAGE_SHIFT
-
-	/* Calculate primary group hash */
-	and	r0,r28,r27
-	rldicr	r3,r0,3,63-3		/* r3 = (hash & mask) << 3 */
-
-	/* Call ppc_md.hpte_insert */
-	ld	r6,STK_PARAM(R4)(r1)	/* Retrieve new pp bits */
-	mr	r4,r29			/* Retrieve vpn */
-	li	r7,0			/* !bolted, !secondary */
-	li	r8,MMU_PAGE_4K		/* page size */
-	li	r9,MMU_PAGE_4K		/* actual page size */
-	ld	r10,STK_PARAM(R9)(r1)	/* segment size */
-.globl htab_call_hpte_insert1
-htab_call_hpte_insert1:
-	bl	.			/* Patched by htab_finish_init() */
-	cmpdi	0,r3,0
-	bge	htab_pte_insert_ok	/* Insertion successful */
-	cmpdi	0,r3,-2			/* Critical failure */
-	beq-	htab_pte_insert_failure
-
-	/* Now try secondary slot */
-	
-	/* physical address r5 */
-	rldicl	r5,r31,64-PTE_RPN_SHIFT,PTE_RPN_SHIFT
-	sldi	r5,r5,PAGE_SHIFT
-
-	/* Calculate secondary group hash */
-	andc	r0,r27,r28
-	rldicr	r3,r0,3,63-3	/* r0 = (~hash & mask) << 3 */
-	
-	/* Call ppc_md.hpte_insert */
-	ld	r6,STK_PARAM(R4)(r1)	/* Retrieve new pp bits */
-	mr	r4,r29			/* Retrieve vpn */
-	li	r7,HPTE_V_SECONDARY	/* !bolted, secondary */
-	li	r8,MMU_PAGE_4K		/* page size */
-	li	r9,MMU_PAGE_4K		/* actual page size */
-	ld	r10,STK_PARAM(R9)(r1)	/* segment size */
-.globl htab_call_hpte_insert2
-htab_call_hpte_insert2:
-	bl	.			/* Patched by htab_finish_init() */
-	cmpdi	0,r3,0
-	bge+	htab_pte_insert_ok	/* Insertion successful */
-	cmpdi	0,r3,-2			/* Critical failure */
-	beq-	htab_pte_insert_failure
-
-	/* Both are full, we need to evict something */
-	mftb	r0
-	/* Pick a random group based on TB */
-	andi.	r0,r0,1
-	mr	r5,r28
-	bne	2f
-	not	r5,r5
-2:	and	r0,r5,r27
-	rldicr	r3,r0,3,63-3	/* r0 = (hash & mask) << 3 */	
-	/* Call ppc_md.hpte_remove */
-.globl htab_call_hpte_remove
-htab_call_hpte_remove:
-	bl	.			/* Patched by htab_finish_init() */
-
-	/* Try all again */
-	b	htab_insert_pte	
-
-htab_bail_ok:
-	li	r3,0
-	b	htab_bail
-
-htab_pte_insert_ok:
-	/* Insert slot number & secondary bit in PTE */
-	rldimi	r30,r3,12,63-15
-		
-	/* Write out the PTE with a normal write
-	 * (maybe add eieio may be good still ?)
-	 */
-htab_write_out_pte:
-	ld	r6,STK_PARAM(R6)(r1)
-	std	r30,0(r6)
-	li	r3, 0
-htab_bail:
-	ld	r27,STK_REG(R27)(r1)
-	ld	r28,STK_REG(R28)(r1)
-	ld	r29,STK_REG(R29)(r1)
-	ld      r30,STK_REG(R30)(r1)
-	ld      r31,STK_REG(R31)(r1)
-	addi    r1,r1,STACKFRAMESIZE
-	ld      r0,16(r1)
-	mtlr    r0
-	blr
-
-htab_modify_pte:
-	/* Keep PP bits in r4 and slot idx from the PTE around in r3 */
-	mr	r4,r3
-	rlwinm	r3,r31,32-12,29,31
-
-	/* Secondary group ? if yes, get a inverted hash value */
-	mr	r5,r28
-	andi.	r0,r31,_PAGE_F_SECOND
-	beq	1f
-	not	r5,r5
-1:
-	/* Calculate proper slot value for ppc_md.hpte_updatepp */
-	and	r0,r5,r27
-	rldicr	r0,r0,3,63-3	/* r0 = (hash & mask) << 3 */
-	add	r3,r0,r3	/* add slot idx */
-
-	/* Call ppc_md.hpte_updatepp */
-	mr	r5,r29			/* vpn */
-	li	r6,MMU_PAGE_4K		/* base page size */
-	li	r7,MMU_PAGE_4K		/* actual page size */
-	ld	r8,STK_PARAM(R9)(r1)	/* segment size */
-	ld	r9,STK_PARAM(R8)(r1)	/* get "flags" param */
-.globl htab_call_hpte_updatepp
-htab_call_hpte_updatepp:
-	bl	.			/* Patched by htab_finish_init() */
-
-	/* if we failed because typically the HPTE wasn't really here
-	 * we try an insertion. 
-	 */
-	cmpdi	0,r3,-1
-	beq-	htab_insert_pte
-
-	/* Clear the BUSY bit and Write out the PTE */
-	li	r0,_PAGE_BUSY
-	andc	r30,r30,r0
-	b	htab_write_out_pte
-
-htab_wrong_access:
-	/* Bail out clearing reservation */
-	stdcx.	r31,0,r6
-	li	r3,1
-	b	htab_bail
-
-htab_pte_insert_failure:
-	/* Bail out restoring old PTE */
-	ld	r6,STK_PARAM(R6)(r1)
-	std	r31,0(r6)
-	li	r3,-1
-	b	htab_bail
-
-#endif
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 60ad15d76052..04d549527eaa 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -629,31 +629,6 @@ int remove_section_mapping(unsigned long start, unsigned long end)
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
 
-extern u32 htab_call_hpte_insert1[];
-extern u32 htab_call_hpte_insert2[];
-extern u32 htab_call_hpte_remove[];
-extern u32 htab_call_hpte_updatepp[];
-
-static void __init htab_finish_init(void)
-{
-
-#ifdef CONFIG_PPC_4K_PAGES
-	patch_branch(htab_call_hpte_insert1,
-		ppc_function_entry(ppc_md.hpte_insert),
-		BRANCH_SET_LINK);
-	patch_branch(htab_call_hpte_insert2,
-		ppc_function_entry(ppc_md.hpte_insert),
-		BRANCH_SET_LINK);
-	patch_branch(htab_call_hpte_remove,
-		ppc_function_entry(ppc_md.hpte_remove),
-		BRANCH_SET_LINK);
-	patch_branch(htab_call_hpte_updatepp,
-		ppc_function_entry(ppc_md.hpte_updatepp),
-		BRANCH_SET_LINK);
-#endif
-
-}
-
 static void __init htab_initialize(void)
 {
 	unsigned long table;
@@ -800,7 +775,6 @@ static void __init htab_initialize(void)
 					 mmu_linear_psize, mmu_kernel_ssize));
 	}
 
-	htab_finish_init();
 
 	DBG(" <- htab_initialize()\n");
 }
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 25/35] powerpc/mm: Add helper for converting pte bit to hpte bits
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (23 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 24/35] powerpc/mm: Convert 4k insert from asm " Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 26/35] powerpc/mm: Move WIMG update to helper Aneesh Kumar K.V
                   ` (9 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

Instead of open coding it in multiple code paths, export the helper
and add more documentation. Also make sure we don't make assumptions
about pte bit positions.
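
As a rough sketch of what the exported helper centralizes (illustrative
only; the real implementation is the hash_utils_64.c hunk below), the
PP-bit conversion behaves like this, with pp_bits being a hypothetical
name:

	/*
	 * Sketch: derive the hash PTE PP bits from the Linux pte flags.
	 * Kernel mappings use PP 0x0; user read/write maps to 0x2 and
	 * user read-only to 0x3.
	 */
	static unsigned long pp_bits(unsigned long pteflags)
	{
		unsigned long pp = 0;

		if (pteflags & _PAGE_USER) {
			pp |= 0x2;
			/* read-only unless both RW and DIRTY are set */
			if (!((pteflags & _PAGE_RW) && (pteflags & _PAGE_DIRTY)))
				pp |= 0x1;
		}
		return pp;
	}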

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash.h |  1 +
 arch/powerpc/mm/hash64_4k.c               | 13 +-----------
 arch/powerpc/mm/hash64_64k.c              | 35 +++----------------------------
 arch/powerpc/mm/hash_utils_64.c           | 22 ++++++++++++-------
 arch/powerpc/mm/hugepage-hash64.c         | 13 +-----------
 arch/powerpc/mm/hugetlbpage-hash64.c      |  4 +---
 6 files changed, 21 insertions(+), 67 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index e4ea9d73a541..9c212449b2e8 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -236,6 +236,7 @@ extern unsigned long pmd_hugepage_update(struct mm_struct *mm,
 					 pmd_t *pmdp,
 					 unsigned long clr,
 					 unsigned long set);
+extern unsigned long htab_convert_pte_flags(unsigned long pteflags);
 /* Atomic PTE updates */
 static inline unsigned long pte_update(struct mm_struct *mm,
 				       unsigned long addr,
diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
index 3b49c6f18741..ee863137035a 100644
--- a/arch/powerpc/mm/hash64_4k.c
+++ b/arch/powerpc/mm/hash64_4k.c
@@ -53,18 +53,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	 * PP bits. _PAGE_USER is already PP bit 0x2, so we only
 	 * need to add in 0x1 if it's a read-only user page
 	 */
-	rflags = new_pte & _PAGE_USER;
-	if ((new_pte & _PAGE_USER) && !((new_pte & _PAGE_RW) &&
-					(new_pte & _PAGE_DIRTY)))
-		rflags |= 0x1;
-	/*
-	 * _PAGE_EXEC -> HW_NO_EXEC since it's inverted
-	 */
-	rflags |= ((new_pte & _PAGE_EXEC) ? 0 : HPTE_R_N);
-	/*
-	 * Always add C and Memory coherence bit
-	 */
-	rflags |= HPTE_R_C | HPTE_R_M;
+	rflags = htab_convert_pte_flags(new_pte);
 	/*
 	 * Add in WIMG bits
 	 */
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index fc0898eb309d..b14280e9d850 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -85,22 +85,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	 * Handle the subpage protection bits
 	 */
 	subpg_pte = new_pte & ~subpg_prot;
-	/*
-	 * PP bits. _PAGE_USER is already PP bit 0x2, so we only
-	 * need to add in 0x1 if it's a read-only user page
-	 */
-	rflags = subpg_pte & _PAGE_USER;
-	if ((subpg_pte & _PAGE_USER) && !((subpg_pte & _PAGE_RW) &&
-					(subpg_pte & _PAGE_DIRTY)))
-		rflags |= 0x1;
-	/*
-	 * _PAGE_EXEC -> HW_NO_EXEC since it's inverted
-	 */
-	rflags |= ((subpg_pte & _PAGE_EXEC) ? 0 : HPTE_R_N);
-	/*
-	 * Always add C and Memory coherence bit
-	 */
-	rflags |= HPTE_R_C | HPTE_R_M;
+	rflags = htab_convert_pte_flags(subpg_pte);
 	/*
 	 * Add in WIMG bits
 	 */
@@ -271,22 +256,8 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 			new_pte |= _PAGE_DIRTY;
 	} while (old_pte != __cmpxchg_u64((unsigned long *)ptep,
 					  old_pte, new_pte));
-	/*
-	 * PP bits. _PAGE_USER is already PP bit 0x2, so we only
-	 * need to add in 0x1 if it's a read-only user page
-	 */
-	rflags = new_pte & _PAGE_USER;
-	if ((new_pte & _PAGE_USER) && !((new_pte & _PAGE_RW) &&
-					(new_pte & _PAGE_DIRTY)))
-		rflags |= 0x1;
-	/*
-	 * _PAGE_EXEC -> HW_NO_EXEC since it's inverted
-	 */
-	rflags |= ((new_pte & _PAGE_EXEC) ? 0 : HPTE_R_N);
-	/*
-	 * Always add C and Memory coherence bit
-	 */
-	rflags |= HPTE_R_C | HPTE_R_M;
+
+	rflags = htab_convert_pte_flags(new_pte);
 	/*
 	 * Add in WIMG bits
 	 */
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 04d549527eaa..3b5e547b965d 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -159,20 +159,26 @@ static struct mmu_psize_def mmu_psize_defaults_gp[] = {
 	},
 };
 
-static unsigned long htab_convert_pte_flags(unsigned long pteflags)
+unsigned long htab_convert_pte_flags(unsigned long pteflags)
 {
-	unsigned long rflags = pteflags & 0x1fa;
+	unsigned long rflags = 0;
 
 	/* _PAGE_EXEC -> NOEXEC */
 	if ((pteflags & _PAGE_EXEC) == 0)
 		rflags |= HPTE_R_N;
-
-	/* PP bits. PAGE_USER is already PP bit 0x2, so we only
-	 * need to add in 0x1 if it's a read-only user page
+	/*
+	 * PP bits:
+	 * Linux uses slb key 0 for kernel and 1 for user.
+	 * Kernel areas are mapped with PP bits 00,
+	 * and there is no kernel RO (_PAGE_KERNEL_RO).
+	 * User areas are mapped with 0x2, and read-only
+	 * user areas with 0x3.
 	 */
-	if ((pteflags & _PAGE_USER) && !((pteflags & _PAGE_RW) &&
-					 (pteflags & _PAGE_DIRTY)))
-		rflags |= 1;
+	if (pteflags & _PAGE_USER) {
+		rflags |= 0x2;
+		if (!((pteflags & _PAGE_RW) && (pteflags & _PAGE_DIRTY)))
+			rflags |= 0x1;
+	}
 	/*
 	 * Always add "C" bit for perf. Memory coherence is always enabled
 	 */
diff --git a/arch/powerpc/mm/hugepage-hash64.c b/arch/powerpc/mm/hugepage-hash64.c
index 4d87122cf6a7..91fcac6f989d 100644
--- a/arch/powerpc/mm/hugepage-hash64.c
+++ b/arch/powerpc/mm/hugepage-hash64.c
@@ -54,18 +54,7 @@ int __hash_page_thp(unsigned long ea, unsigned long access, unsigned long vsid,
 			new_pmd |= _PAGE_DIRTY;
 	} while (old_pmd != __cmpxchg_u64((unsigned long *)pmdp,
 					  old_pmd, new_pmd));
-	/*
-	 * PP bits. _PAGE_USER is already PP bit 0x2, so we only
-	 * need to add in 0x1 if it's a read-only user page
-	 */
-	rflags = new_pmd & _PAGE_USER;
-	if ((new_pmd & _PAGE_USER) && !((new_pmd & _PAGE_RW) &&
-					   (new_pmd & _PAGE_DIRTY)))
-		rflags |= 0x1;
-	/*
-	 * _PAGE_EXEC -> HW_NO_EXEC since it's inverted
-	 */
-	rflags |= ((new_pmd & _PAGE_EXEC) ? 0 : HPTE_R_N);
+	rflags = htab_convert_pte_flags(new_pmd);
 
 #if 0
 	if (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) {
diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
index 7584e8445512..304c8520506e 100644
--- a/arch/powerpc/mm/hugetlbpage-hash64.c
+++ b/arch/powerpc/mm/hugetlbpage-hash64.c
@@ -59,10 +59,8 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 			new_pte |= _PAGE_DIRTY;
 	} while(old_pte != __cmpxchg_u64((unsigned long *)ptep,
 					 old_pte, new_pte));
+	rflags = htab_convert_pte_flags(new_pte);
 
-	rflags = 0x2 | (!(new_pte & _PAGE_RW));
-	/* _PAGE_EXEC -> HW_NO_EXEC since it's inverted */
-	rflags |= ((new_pte & _PAGE_EXEC) ? 0 : HPTE_R_N);
 	sz = ((1UL) << shift);
 	if (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
 		/* No CPU has hugepages but lacks no execute, so we
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 26/35] powerpc/mm: Move WIMG update to helper.
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (24 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 25/35] powerpc/mm: Add helper for converting pte bit to hpte bits Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 27/35] powerpc/mm: Move hugetlb related headers Aneesh Kumar K.V
                   ` (8 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

The only difference here is that we apply the WIMG mapping early, so
the rflags passed to updatepp will also include the WIMG bits.
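
For reference, a sketch of the WIMG handling this moves into
htab_convert_pte_flags() (illustrative; add_wimg_bits is a hypothetical
name, the real hunk is in hash_utils_64.c below):

	/* Sketch: fold the Linux WIMG pte bits into the hash PTE rflags */
	static unsigned long add_wimg_bits(unsigned long pteflags,
					   unsigned long rflags)
	{
		if (pteflags & _PAGE_WRITETHRU)
			rflags |= HPTE_R_W;	/* W: write-through */
		if (pteflags & _PAGE_NO_CACHE)
			rflags |= HPTE_R_I;	/* I: cache inhibit */
		if (pteflags & _PAGE_GUARDED)
			rflags |= HPTE_R_G;	/* G: guarded */
		/* HPTE_R_M (memory coherence) is always set by the helper */
		return rflags;
	}

Since the helper now runs before hpte_updatepp(), the WIMG bits reach
updatepp as well.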

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/mm/hash64_4k.c          |  5 -----
 arch/powerpc/mm/hash64_64k.c         | 10 ----------
 arch/powerpc/mm/hash_utils_64.c      | 13 ++++++++++++-
 arch/powerpc/mm/hugepage-hash64.c    |  7 -------
 arch/powerpc/mm/hugetlbpage-hash64.c |  8 --------
 5 files changed, 12 insertions(+), 31 deletions(-)

diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
index ee863137035a..e7c04542ba62 100644
--- a/arch/powerpc/mm/hash64_4k.c
+++ b/arch/powerpc/mm/hash64_4k.c
@@ -54,11 +54,6 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	 * need to add in 0x1 if it's a read-only user page
 	 */
 	rflags = htab_convert_pte_flags(new_pte);
-	/*
-	 * Add in WIMG bits
-	 */
-	rflags |= (new_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
-				_PAGE_COHERENT | _PAGE_GUARDED));
 
 	if (!cpu_has_feature(CPU_FTR_NOEXECUTE) &&
 	    !cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index b14280e9d850..0762c1e08c88 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -86,11 +86,6 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	 */
 	subpg_pte = new_pte & ~subpg_prot;
 	rflags = htab_convert_pte_flags(subpg_pte);
-	/*
-	 * Add in WIMG bits
-	 */
-	rflags |= (subpg_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
-				_PAGE_COHERENT | _PAGE_GUARDED));
 
 	if (!cpu_has_feature(CPU_FTR_NOEXECUTE) &&
 	    !cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) {
@@ -258,11 +253,6 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 					  old_pte, new_pte));
 
 	rflags = htab_convert_pte_flags(new_pte);
-	/*
-	 * Add in WIMG bits
-	 */
-	rflags |= (new_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
-				_PAGE_COHERENT | _PAGE_GUARDED));
 
 	if (!cpu_has_feature(CPU_FTR_NOEXECUTE) &&
 	    !cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 3b5e547b965d..3d261bc6fef8 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -182,7 +182,18 @@ unsigned long htab_convert_pte_flags(unsigned long pteflags)
 	/*
 	 * Always add "C" bit for perf. Memory coherence is always enabled
 	 */
-	return rflags | HPTE_R_C | HPTE_R_M;
+	rflags |=  HPTE_R_C | HPTE_R_M;
+	/*
+	 * Add in WIG bits; HPTE_R_M was already set above
+	 */
+	if (pteflags & _PAGE_WRITETHRU)
+		rflags |= HPTE_R_W;
+	if (pteflags & _PAGE_NO_CACHE)
+		rflags |= HPTE_R_I;
+	if (pteflags & _PAGE_GUARDED)
+		rflags |= HPTE_R_G;
+
+	return rflags;
 }
 
 int htab_bolt_mapping(unsigned long vstart, unsigned long vend,
diff --git a/arch/powerpc/mm/hugepage-hash64.c b/arch/powerpc/mm/hugepage-hash64.c
index 91fcac6f989d..1f666de0110a 100644
--- a/arch/powerpc/mm/hugepage-hash64.c
+++ b/arch/powerpc/mm/hugepage-hash64.c
@@ -120,13 +120,6 @@ int __hash_page_thp(unsigned long ea, unsigned long access, unsigned long vsid,
 		pa = pmd_pfn(__pmd(old_pmd)) << PAGE_SHIFT;
 		new_pmd |= _PAGE_HASHPTE;
 
-		/* Add in WIMG bits */
-		rflags |= (new_pmd & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
-				      _PAGE_GUARDED));
-		/*
-		 * enable the memory coherence always
-		 */
-		rflags |= HPTE_R_M;
 repeat:
 		hpte_group = ((hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL;
 
diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
index 304c8520506e..0734e4daffef 100644
--- a/arch/powerpc/mm/hugetlbpage-hash64.c
+++ b/arch/powerpc/mm/hugetlbpage-hash64.c
@@ -91,14 +91,6 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 		/* clear HPTE slot informations in new PTE */
 		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HASHPTE;
 
-		/* Add in WIMG bits */
-		rflags |= (new_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
-				      _PAGE_COHERENT | _PAGE_GUARDED));
-		/*
-		 * enable the memory coherence always
-		 */
-		rflags |= HPTE_R_M;
-
 		slot = hpte_insert_repeating(hash, vpn, pa, rflags, 0,
 					     mmu_psize, ssize);
 
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 27/35] powerpc/mm: Move hugetlb related headers
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (25 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 26/35] powerpc/mm: Move WIMG update to helper Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 28/35] powerpc/mm: Move THP headers around Aneesh Kumar K.V
                   ` (7 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

W.r.t hugetlb, we support two formats for the pmd. With book3s_64 and
a 64K linux page size, we can have a pte at the pmd level, hence we
don't need to support hugepd there. For everything else hugepd is
supported and pmd_huge() is 0.
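
A sketch of the distinction (illustrative; is_hugepd_4k is a
hypothetical name, the real check is hugepd_ok() in the hash-4k.h hunk
below):

	/*
	 * With a 4K base page size a hugepd entry is a pointer to a
	 * hugepage directory: bottom two bits == 00 and a non-zero
	 * table-size field. A leaf huge pte instead has its bottom
	 * two bits != 00.
	 */
	static inline int is_hugepd_4k(hugepd_t hpd)
	{
		return ((hpd.pd & 0x3) == 0x0) &&
		       ((hpd.pd & HUGEPD_SHIFT_MASK) != 0);
	}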

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  | 31 ++++++++++++
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 51 +++++++++++++++++++
 arch/powerpc/include/asm/nohash/pgtable.h     | 25 ++++++++++
 arch/powerpc/include/asm/page.h               | 42 ++--------------
 arch/powerpc/mm/hugetlbpage-hash64.c          | 18 +++++++
 arch/powerpc/mm/hugetlbpage.c                 | 72 ---------------------------
 6 files changed, 129 insertions(+), 110 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index 75e8b9326e4b..b4d25529d179 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -93,6 +93,37 @@ extern struct page *pgd_page(pgd_t pgd);
 #define remap_4k_pfn(vma, addr, pfn, prot)	\
 	remap_pfn_range((vma), (addr), (pfn), PAGE_SIZE, (prot))
 
+#ifdef CONFIG_HUGETLB_PAGE
+/*
+ * For 4k page size, we support explicit hugepage via hugepd
+ */
+static inline int pmd_huge(pmd_t pmd)
+{
+	return 0;
+}
+
+static inline int pud_huge(pud_t pud)
+{
+	return 0;
+}
+
+static inline int pgd_huge(pgd_t pgd)
+{
+	return 0;
+}
+#define pgd_huge pgd_huge
+
+static inline int hugepd_ok(hugepd_t hpd)
+{
+	/*
+	 * hugepd pointer, bottom two bits == 00 and next 4 bits
+	 * indicate size of table
+	 */
+	return (((hpd.pd & 0x3) == 0x0) && ((hpd.pd & HUGEPD_SHIFT_MASK) != 0));
+}
+#define is_hugepd(hpd)		(hugepd_ok(hpd))
+#endif
+
 #endif /* !__ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_BOOK3S_64_HASH_4K_H */
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 46b5d0ab11de..1857d19de18e 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -119,6 +119,57 @@ extern bool __rpte_sub_valid(real_pte_t rpte, unsigned long index);
 #define pgd_pte(pgd)	(pud_pte(((pud_t){ pgd })))
 #define pte_pgd(pte)	((pgd_t)pte_pud(pte))
 
+#ifdef CONFIG_HUGETLB_PAGE
+/*
+ * We have PGD_INDEX_SIZE = 12 and PTE_INDEX_SIZE = 8, so that we can have
+ * 16GB hugepage pte in PGD and 16MB hugepage pte at PMD;
+ *
+ * Defined in such a way that we can optimize away code block at build time
+ * if CONFIG_HUGETLB_PAGE=n.
+ */
+static inline int pmd_huge(pmd_t pmd)
+{
+	/*
+	 * leaf pte for huge page, bottom two bits != 00
+	 */
+	return ((pmd_val(pmd) & 0x3) != 0x0);
+}
+
+static inline int pud_huge(pud_t pud)
+{
+	/*
+	 * leaf pte for huge page, bottom two bits != 00
+	 */
+	return ((pud_val(pud) & 0x3) != 0x0);
+}
+
+static inline int pgd_huge(pgd_t pgd)
+{
+	/*
+	 * leaf pte for huge page, bottom two bits != 00
+	 */
+	return ((pgd_val(pgd) & 0x3) != 0x0);
+}
+#define pgd_huge pgd_huge
+
+#ifdef CONFIG_DEBUG_VM
+extern int hugepd_ok(hugepd_t hpd);
+#define is_hugepd(hpd)               (hugepd_ok(hpd))
+#else
+/*
+ * With 64k page size, we have hugepage ptes in the pgd and pmd entries. We
+ * don't need to set up a hugepage directory for them. Our pte and page
+ * directory format makes this possible.
+ */
+static inline int hugepd_ok(hugepd_t hpd)
+{
+	return 0;
+}
+#define is_hugepd(pdep)			0
+#endif /* CONFIG_DEBUG_VM */
+
+#endif /* CONFIG_HUGETLB_PAGE */
+
 #endif	/* __ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_BOOK3S_64_HASH_64K_H */
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index c0c41a2409d2..1263c22d60d8 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -223,5 +223,30 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 				     unsigned long size, pgprot_t vma_prot);
 #define __HAVE_PHYS_MEM_ACCESS_PROT
 
+#ifdef CONFIG_HUGETLB_PAGE
+static inline int hugepd_ok(hugepd_t hpd)
+{
+	return (hpd.pd > 0);
+}
+
+static inline int pmd_huge(pmd_t pmd)
+{
+	return 0;
+}
+
+static inline int pud_huge(pud_t pud)
+{
+	return 0;
+}
+
+static inline int pgd_huge(pgd_t pgd)
+{
+	return 0;
+}
+#define pgd_huge		pgd_huge
+
+#define is_hugepd(hpd)		(hugepd_ok(hpd))
+#endif
+
 #endif /* __ASSEMBLY__ */
 #endif
diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index 9d2f38e1b21d..3e4002364aba 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -386,45 +386,11 @@ typedef unsigned long pgprot_t;
 
 typedef struct { signed long pd; } hugepd_t;
 
-#ifdef CONFIG_HUGETLB_PAGE
-#ifdef CONFIG_PPC_BOOK3S_64
-#ifdef CONFIG_PPC_64K_PAGES
-/*
- * With 64k page size, we have hugepage ptes in the pgd and pmd entries. We don't
- * need to setup hugepage directory for them. Our pte and page directory format
- * enable us to have this enabled. But to avoid errors when implementing new
- * features disable hugepd for 64K. We enable a debug version here, So we catch
- * wrong usage.
- */
-#ifdef CONFIG_DEBUG_VM
-extern int hugepd_ok(hugepd_t hpd);
-#else
-#define hugepd_ok(x)	(0)
-#endif
-#else
-static inline int hugepd_ok(hugepd_t hpd)
-{
-	/*
-	 * hugepd pointer, bottom two bits == 00 and next 4 bits
-	 * indicate size of table
-	 */
-	return (((hpd.pd & 0x3) == 0x0) && ((hpd.pd & HUGEPD_SHIFT_MASK) != 0));
-}
-#endif
-#else
-static inline int hugepd_ok(hugepd_t hpd)
-{
-	return (hpd.pd > 0);
-}
-#endif
-
-#define is_hugepd(hpd)               (hugepd_ok(hpd))
-#define pgd_huge pgd_huge
-int pgd_huge(pgd_t pgd);
-#else /* CONFIG_HUGETLB_PAGE */
-#define is_hugepd(pdep)			0
-#define pgd_huge(pgd)			0
+#ifndef CONFIG_HUGETLB_PAGE
+#define is_hugepd(pdep)		(0)
+#define pgd_huge(pgd)		(0)
 #endif /* CONFIG_HUGETLB_PAGE */
+
 #define __hugepd(x) ((hugepd_t) { (x) })
 
 struct page;
diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
index 0734e4daffef..e2138c7ae70f 100644
--- a/arch/powerpc/mm/hugetlbpage-hash64.c
+++ b/arch/powerpc/mm/hugetlbpage-hash64.c
@@ -114,3 +114,21 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 	*ptep = __pte(new_pte & ~_PAGE_BUSY);
 	return 0;
 }
+
+#if defined(CONFIG_PPC_64K_PAGES) && defined(CONFIG_DEBUG_VM)
+/*
+ * This enables us to catch the wrong page directory format
+ * Moved here so that we can use WARN() in the call.
+ */
+int hugepd_ok(hugepd_t hpd)
+{
+	bool is_hugepd;
+
+	/*
+	 * We should not find this format in page directory, warn otherwise.
+	 */
+	is_hugepd = (((hpd.pd & 0x3) == 0x0) && ((hpd.pd & HUGEPD_SHIFT_MASK) != 0));
+	WARN(is_hugepd, "Found wrong page directory format\n");
+	return 0;
+}
+#endif
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 9833fee493ec..bc72e542a83e 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -53,78 +53,6 @@ static unsigned nr_gpages;
 
 #define hugepd_none(hpd)	((hpd).pd == 0)
 
-#ifdef CONFIG_PPC_BOOK3S_64
-/*
- * At this point we do the placement change only for BOOK3S 64. This would
- * possibly work on other subarchs.
- */
-
-/*
- * We have PGD_INDEX_SIZ = 12 and PTE_INDEX_SIZE = 8, so that we can have
- * 16GB hugepage pte in PGD and 16MB hugepage pte at PMD;
- *
- * Defined in such a way that we can optimize away code block at build time
- * if CONFIG_HUGETLB_PAGE=n.
- */
-int pmd_huge(pmd_t pmd)
-{
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return ((pmd_val(pmd) & 0x3) != 0x0);
-}
-
-int pud_huge(pud_t pud)
-{
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return ((pud_val(pud) & 0x3) != 0x0);
-}
-
-int pgd_huge(pgd_t pgd)
-{
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return ((pgd_val(pgd) & 0x3) != 0x0);
-}
-
-#if defined(CONFIG_PPC_64K_PAGES) && defined(CONFIG_DEBUG_VM)
-/*
- * This enables us to catch the wrong page directory format
- * Moved here so that we can use WARN() in the call.
- */
-int hugepd_ok(hugepd_t hpd)
-{
-	bool is_hugepd;
-
-	/*
-	 * We should not find this format in page directory, warn otherwise.
-	 */
-	is_hugepd = (((hpd.pd & 0x3) == 0x0) && ((hpd.pd & HUGEPD_SHIFT_MASK) != 0));
-	WARN(is_hugepd, "Found wrong page directory format\n");
-	return 0;
-}
-#endif
-
-#else
-int pmd_huge(pmd_t pmd)
-{
-	return 0;
-}
-
-int pud_huge(pud_t pud)
-{
-	return 0;
-}
-
-int pgd_huge(pgd_t pgd)
-{
-	return 0;
-}
-#endif
-
 pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
 {
 	/* Only called for hugetlbfs pages, hence can ignore THP */
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 28/35] powerpc/mm: Move THP headers around
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (26 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 27/35] powerpc/mm: Move hugetlb related headers Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 29/35] powerpc/mm: Add a _PAGE_PTE bit Aneesh Kumar K.V
                   ` (6 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

We support THP only with book3s_64 and a 64K page size. Move the THP
details to hash-64k.h to make that explicit.
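
A sketch of how a THP pmd is told apart from an explicit hugetlb page
(illustrative; pmd_is_thp is a hypothetical name, the real check is
pmd_trans_huge() in the hash-64k.h hunk below):

	static inline int pmd_is_thp(pmd_t pmd)
	{
		/*
		 * A leaf huge entry has its bottom two bits != 00; a THP
		 * entry additionally carries _PAGE_THP_HUGE, which an
		 * explicit (hugetlbfs) huge page does not.
		 */
		return (pmd_val(pmd) & 0x3) && (pmd_val(pmd) & _PAGE_THP_HUGE);
	}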

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 126 +++++++++++++
 arch/powerpc/include/asm/book3s/64/hash.h     | 223 +++++------------------
 arch/powerpc/include/asm/nohash/64/pgtable.h  | 253 +-------------------------
 arch/powerpc/mm/hash_native_64.c              |  10 +
 arch/powerpc/mm/pgtable_64.c                  |   2 +-
 arch/powerpc/platforms/pseries/lpar.c         |  10 +
 6 files changed, 201 insertions(+), 423 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 1857d19de18e..7570677c11c3 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -170,6 +170,132 @@ static inline int hugepd_ok(hugepd_t hpd)
 
 #endif /* CONFIG_HUGETLB_PAGE */
 
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+extern unsigned long pmd_hugepage_update(struct mm_struct *mm,
+					 unsigned long addr,
+					 pmd_t *pmdp,
+					 unsigned long clr,
+					 unsigned long set);
+static inline char *get_hpte_slot_array(pmd_t *pmdp)
+{
+	/*
+	 * The hpte hindex is stored in the pgtable whose address is in the
+	 * second half of the PMD
+	 *
+	 * Order this load with the test for pmd_trans_huge in the caller
+	 */
+	smp_rmb();
+	return *(char **)(pmdp + PTRS_PER_PMD);
+
+
+}
+/*
+ * The linux hugepage PMD now includes the pmd entries followed by the address
+ * of the stashed pgtable_t. The stashed pgtable_t contains the hpte bits:
+ * [ 1 bit secondary | 3 bit hidx | 1 bit valid | 000]. We use one byte for
+ * each HPTE entry. With a 16MB hugepage and 64K HPTEs we need 256 entries and
+ * with 4K HPTEs we need 4096 entries. Both will fit in a 4K pgtable_t.
+ *
+ * The last three bits are intentionally left as zero. These memory locations
+ * are also used as normal page PTE pointers, so if we have any pointers
+ * left around while we collapse a hugepage, we need to make sure the
+ * _PAGE_PRESENT bit of those is zero when we look at them.
+ */
+static inline unsigned int hpte_valid(unsigned char *hpte_slot_array, int index)
+{
+	return (hpte_slot_array[index] >> 3) & 0x1;
+}
+
+static inline unsigned int hpte_hash_index(unsigned char *hpte_slot_array,
+					   int index)
+{
+	return hpte_slot_array[index] >> 4;
+}
+
+static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
+					unsigned int index, unsigned int hidx)
+{
+	hpte_slot_array[index] = hidx << 4 | 0x1 << 3;
+}
+
+/*
+ *
+ * For core kernel code by design pmd_trans_huge is never run on any hugetlbfs
+ * page. The hugetlbfs page table walking and mangling paths are totally
+ * separated from the core VM paths and they're differentiated by
+ *  VM_HUGETLB being set on vm_flags well before any pmd_trans_huge could run.
+ *
+ * pmd_trans_huge() is defined as false at build time if
+ * CONFIG_TRANSPARENT_HUGEPAGE=n to optimize away code blocks at build
+ * time in such case.
+ *
+ * For ppc64 we need to differentiate explicit hugepages from THP, because
+ * for THP we also track the subpage details at the pmd level. We don't do
+ * that for explicit huge pages.
+ *
+ */
+static inline int pmd_trans_huge(pmd_t pmd)
+{
+	/*
+	 * leaf pte for huge page, bottom two bits != 00
+	 */
+	return (pmd_val(pmd) & 0x3) && (pmd_val(pmd) & _PAGE_THP_HUGE);
+}
+
+static inline int pmd_trans_splitting(pmd_t pmd)
+{
+	if (pmd_trans_huge(pmd))
+		return pmd_val(pmd) & _PAGE_SPLITTING;
+	return 0;
+}
+
+static inline int pmd_large(pmd_t pmd)
+{
+	/*
+	 * leaf pte for huge page, bottom two bits != 00
+	 */
+	return ((pmd_val(pmd) & 0x3) != 0x0);
+}
+
+static inline pmd_t pmd_mknotpresent(pmd_t pmd)
+{
+	return __pmd(pmd_val(pmd) & ~_PAGE_PRESENT);
+}
+
+static inline pmd_t pmd_mksplitting(pmd_t pmd)
+{
+	return __pmd(pmd_val(pmd) | _PAGE_SPLITTING);
+}
+
+#define __HAVE_ARCH_PMD_SAME
+static inline int pmd_same(pmd_t pmd_a, pmd_t pmd_b)
+{
+	return (((pmd_val(pmd_a) ^ pmd_val(pmd_b)) & ~_PAGE_HPTEFLAGS) == 0);
+}
+
+static inline int __pmdp_test_and_clear_young(struct mm_struct *mm,
+					      unsigned long addr, pmd_t *pmdp)
+{
+	unsigned long old;
+
+	if ((pmd_val(*pmdp) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
+		return 0;
+	old = pmd_hugepage_update(mm, addr, pmdp, _PAGE_ACCESSED, 0);
+	return ((old & _PAGE_ACCESSED) != 0);
+}
+
+#define __HAVE_ARCH_PMDP_SET_WRPROTECT
+static inline void pmdp_set_wrprotect(struct mm_struct *mm, unsigned long addr,
+				      pmd_t *pmdp)
+{
+
+	if ((pmd_val(*pmdp) & _PAGE_RW) == 0)
+		return;
+
+	pmd_hugepage_update(mm, addr, pmdp, _PAGE_RW, 0);
+}
+
+#endif /*  CONFIG_TRANSPARENT_HUGEPAGE */
 #endif	/* __ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_BOOK3S_64_HASH_64K_H */
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 9c212449b2e8..42e1273adad1 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -2,6 +2,55 @@
 #define _ASM_POWERPC_BOOK3S_64_HASH_H
 #ifdef __KERNEL__
 
+/*
+ * Common bits between 4K and 64K pages in a linux-style PTE.
+ * These match the bits in the (hardware-defined) PowerPC PTE as closely
+ * as possible. Additional bits may be defined in pgtable-hash64-*.h
+ *
+ * Note: We only support user read/write permissions. The supervisor always
+ * has full read/write to pages above PAGE_OFFSET (pages below that
+ * always use the user access permissions).
+ *
+ * We could create separate kernel read-only if we used the 3 PP bits
+ * combinations that newer processors provide but we currently don't.
+ */
+#define _PAGE_PRESENT		0x00001 /* software: pte contains a translation */
+#define _PAGE_USER		0x00002 /* matches one of the PP bits */
+#define _PAGE_BIT_SWAP_TYPE	2
+#define _PAGE_EXEC		0x00004 /* No execute on POWER4 and newer (we invert) */
+#define _PAGE_GUARDED		0x00008
+/* We can derive Memory coherence from _PAGE_NO_CACHE */
+#define _PAGE_COHERENT		0x0
+#define _PAGE_NO_CACHE		0x00020 /* I: cache inhibit */
+#define _PAGE_WRITETHRU		0x00040 /* W: cache write-through */
+#define _PAGE_DIRTY		0x00080 /* C: page changed */
+#define _PAGE_ACCESSED		0x00100 /* R: page referenced */
+#define _PAGE_RW		0x00200 /* software: user write access allowed */
+#define _PAGE_HASHPTE		0x00400 /* software: pte has an associated HPTE */
+#define _PAGE_BUSY		0x00800 /* software: PTE & hash are busy */
+#define _PAGE_F_GIX		0x07000 /* full page: hidx bits */
+#define _PAGE_F_GIX_SHIFT	12
+#define _PAGE_F_SECOND		0x08000 /* Whether to use secondary hash or not */
+#define _PAGE_SPECIAL		0x10000 /* software: special page */
+
+/*
+ * THP pages can't be special. So use the _PAGE_SPECIAL
+ */
+#define _PAGE_SPLITTING _PAGE_SPECIAL
+
+/*
+ * We need to differentiate between explicit huge page and THP huge
+ * page, since a THP huge page also needs to track real subpage details
+ */
+#define _PAGE_THP_HUGE  _PAGE_4K_PFN
+
+/*
+ * set of bits not changed in pmd_modify.
+ */
+#define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS |		\
+			 _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPLITTING | \
+			 _PAGE_THP_HUGE)
+
 #ifdef CONFIG_PPC_64K_PAGES
 #include <asm/book3s/64/hash-64k.h>
 #else
@@ -57,36 +106,6 @@
 #define HAVE_ARCH_UNMAPPED_AREA
 #define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
 #endif /* CONFIG_PPC_MM_SLICES */
-/*
- * Common bits between 4K and 64K pages in a linux-style PTE.
- * These match the bits in the (hardware-defined) PowerPC PTE as closely
- * as possible. Additional bits may be defined in pgtable-hash64-*.h
- *
- * Note: We only support user read/write permissions. Supervisor always
- * have full read/write to pages above PAGE_OFFSET (pages below that
- * always use the user access permissions).
- *
- * We could create separate kernel read-only if we used the 3 PP bits
- * combinations that newer processors provide but we currently don't.
- */
-#define _PAGE_PRESENT		0x00001 /* software: pte contains a translation */
-#define _PAGE_USER		0x00002 /* matches one of the PP bits */
-#define _PAGE_BIT_SWAP_TYPE	2
-#define _PAGE_EXEC		0x00004 /* No execute on POWER4 and newer (we invert) */
-#define _PAGE_GUARDED		0x00008
-/* We can derive Memory coherence from _PAGE_NO_CACHE */
-#define _PAGE_COHERENT		0x0
-#define _PAGE_NO_CACHE		0x00020 /* I: cache inhibit */
-#define _PAGE_WRITETHRU		0x00040 /* W: cache write-through */
-#define _PAGE_DIRTY		0x00080 /* C: page changed */
-#define _PAGE_ACCESSED		0x00100 /* R: page referenced */
-#define _PAGE_RW		0x00200 /* software: user write access allowed */
-#define _PAGE_HASHPTE		0x00400 /* software: pte has an associated HPTE */
-#define _PAGE_BUSY		0x00800 /* software: PTE & hash are busy */
-#define _PAGE_F_GIX		0x07000 /* full page: hidx bits */
-#define _PAGE_F_GIX_SHIFT	12
-#define _PAGE_F_SECOND		0x08000 /* Whether to use secondary hash or not */
-#define _PAGE_SPECIAL		0x10000 /* software: special page */
 
 /* No separate kernel read-only */
 #define _PAGE_KERNEL_RW		(_PAGE_RW | _PAGE_DIRTY) /* user access blocked by key */
@@ -105,24 +124,6 @@
 
 /* Hash table based platforms need atomic updates of the linux PTE */
 #define PTE_ATOMIC_UPDATES	1
-
-/*
- * THP pages can't be special. So use the _PAGE_SPECIAL
- */
-#define _PAGE_SPLITTING _PAGE_SPECIAL
-
-/*
- * We need to differentiate between explicit huge page and THP huge
- * page, since THP huge page also need to track real subpage details
- */
-#define _PAGE_THP_HUGE  _PAGE_4K_PFN
-
-/*
- * set of bits not changed in pmd_modify.
- */
-#define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS |		\
-			 _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPLITTING | \
-			 _PAGE_THP_HUGE)
 #define _PTE_NONE_MASK	_PAGE_HPTEFLAGS
 /*
  * The mask convered by the RPN must be a ULL on 32-bit platforms with
@@ -231,11 +232,6 @@
 
 extern void hpte_need_flush(struct mm_struct *mm, unsigned long addr,
 			    pte_t *ptep, unsigned long pte, int huge);
-extern unsigned long pmd_hugepage_update(struct mm_struct *mm,
-					 unsigned long addr,
-					 pmd_t *pmdp,
-					 unsigned long clr,
-					 unsigned long set);
 extern unsigned long htab_convert_pte_flags(unsigned long pteflags);
 /* Atomic PTE updates */
 static inline unsigned long pte_update(struct mm_struct *mm,
@@ -361,127 +357,6 @@ static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
 #define __HAVE_ARCH_PTE_SAME
 #define pte_same(A,B)	(((pte_val(A) ^ pte_val(B)) & ~_PAGE_HPTEFLAGS) == 0)
 
-static inline char *get_hpte_slot_array(pmd_t *pmdp)
-{
-	/*
-	 * The hpte hindex is stored in the pgtable whose address is in the
-	 * second half of the PMD
-	 *
-	 * Order this load with the test for pmd_trans_huge in the caller
-	 */
-	smp_rmb();
-	return *(char **)(pmdp + PTRS_PER_PMD);
-
-
-}
-/*
- * The linux hugepage PMD now include the pmd entries followed by the address
- * to the stashed pgtable_t. The stashed pgtable_t contains the hpte bits.
- * [ 1 bit secondary | 3 bit hidx | 1 bit valid | 000]. We use one byte per
- * each HPTE entry. With 16MB hugepage and 64K HPTE we need 256 entries and
- * with 4K HPTE we need 4096 entries. Both will fit in a 4K pgtable_t.
- *
- * The last three bits are intentionally left to zero. This memory location
- * are also used as normal page PTE pointers. So if we have any pointers
- * left around while we collapse a hugepage, we need to make sure
- * _PAGE_PRESENT bit of that is zero when we look at them
- */
-static inline unsigned int hpte_valid(unsigned char *hpte_slot_array, int index)
-{
-	return (hpte_slot_array[index] >> 3) & 0x1;
-}
-
-static inline unsigned int hpte_hash_index(unsigned char *hpte_slot_array,
-					   int index)
-{
-	return hpte_slot_array[index] >> 4;
-}
-
-static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
-					unsigned int index, unsigned int hidx)
-{
-	hpte_slot_array[index] = hidx << 4 | 0x1 << 3;
-}
-
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-/*
- *
- * For core kernel code by design pmd_trans_huge is never run on any hugetlbfs
- * page. The hugetlbfs page table walking and mangling paths are totally
- * separated form the core VM paths and they're differentiated by
- *  VM_HUGETLB being set on vm_flags well before any pmd_trans_huge could run.
- *
- * pmd_trans_huge() is defined as false at build time if
- * CONFIG_TRANSPARENT_HUGEPAGE=n to optimize away code blocks at build
- * time in such case.
- *
- * For ppc64 we need to differntiate from explicit hugepages from THP, because
- * for THP we also track the subpage details at the pmd level. We don't do
- * that for explicit huge pages.
- *
- */
-static inline int pmd_trans_huge(pmd_t pmd)
-{
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return (pmd_val(pmd) & 0x3) && (pmd_val(pmd) & _PAGE_THP_HUGE);
-}
-
-static inline int pmd_trans_splitting(pmd_t pmd)
-{
-	if (pmd_trans_huge(pmd))
-		return pmd_val(pmd) & _PAGE_SPLITTING;
-	return 0;
-}
-
-#endif
-static inline int pmd_large(pmd_t pmd)
-{
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return ((pmd_val(pmd) & 0x3) != 0x0);
-}
-
-static inline pmd_t pmd_mknotpresent(pmd_t pmd)
-{
-	return __pmd(pmd_val(pmd) & ~_PAGE_PRESENT);
-}
-
-static inline pmd_t pmd_mksplitting(pmd_t pmd)
-{
-	return __pmd(pmd_val(pmd) | _PAGE_SPLITTING);
-}
-
-#define __HAVE_ARCH_PMD_SAME
-static inline int pmd_same(pmd_t pmd_a, pmd_t pmd_b)
-{
-	return (((pmd_val(pmd_a) ^ pmd_val(pmd_b)) & ~_PAGE_HPTEFLAGS) == 0);
-}
-
-static inline int __pmdp_test_and_clear_young(struct mm_struct *mm,
-					      unsigned long addr, pmd_t *pmdp)
-{
-	unsigned long old;
-
-	if ((pmd_val(*pmdp) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
-		return 0;
-	old = pmd_hugepage_update(mm, addr, pmdp, _PAGE_ACCESSED, 0);
-	return ((old & _PAGE_ACCESSED) != 0);
-}
-
-#define __HAVE_ARCH_PMDP_SET_WRPROTECT
-static inline void pmdp_set_wrprotect(struct mm_struct *mm, unsigned long addr,
-				      pmd_t *pmdp)
-{
-
-	if ((pmd_val(*pmdp) & _PAGE_RW) == 0)
-		return;
-
-	pmd_hugepage_update(mm, addr, pmdp, _PAGE_RW, 0);
-}
-
 /* Generic accessors to PTE bits */
 static inline int pte_write(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_RW);}
 static inline int pte_dirty(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_DIRTY); }
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index c24e03f22655..d635a924d652 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -154,6 +154,11 @@ static inline void pmd_clear(pmd_t *pmdp)
 	*pmdp = __pmd(0);
 }
 
+static inline pte_t pmd_pte(pmd_t pmd)
+{
+	return __pte(pmd_val(pmd));
+}
+
 #define pmd_none(pmd)		(!pmd_val(pmd))
 #define	pmd_bad(pmd)		(!is_kernel_addr(pmd_val(pmd)) \
 				 || (pmd_val(pmd) & PMD_BAD_BITS))
@@ -389,252 +394,4 @@ void pgtable_cache_add(unsigned shift, void (*ctor)(void *));
 void pgtable_cache_init(void);
 #endif /* __ASSEMBLY__ */
 
-/*
- * THP pages can't be special. So use the _PAGE_SPECIAL
- */
-#define _PAGE_SPLITTING _PAGE_SPECIAL
-
-/*
- * We need to differentiate between explicit huge page and THP huge
- * page, since THP huge page also need to track real subpage details
- */
-#define _PAGE_THP_HUGE  _PAGE_4K_PFN
-
-/*
- * set of bits not changed in pmd_modify.
- */
-#define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS |		\
-			 _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPLITTING | \
-			 _PAGE_THP_HUGE)
-
-#ifndef __ASSEMBLY__
-/*
- * The linux hugepage PMD now include the pmd entries followed by the address
- * to the stashed pgtable_t. The stashed pgtable_t contains the hpte bits.
- * [ 1 bit secondary | 3 bit hidx | 1 bit valid | 000]. We use one byte per
- * each HPTE entry. With 16MB hugepage and 64K HPTE we need 256 entries and
- * with 4K HPTE we need 4096 entries. Both will fit in a 4K pgtable_t.
- *
- * The last three bits are intentionally left to zero. This memory location
- * are also used as normal page PTE pointers. So if we have any pointers
- * left around while we collapse a hugepage, we need to make sure
- * _PAGE_PRESENT bit of that is zero when we look at them
- */
-static inline unsigned int hpte_valid(unsigned char *hpte_slot_array, int index)
-{
-	return (hpte_slot_array[index] >> 3) & 0x1;
-}
-
-static inline unsigned int hpte_hash_index(unsigned char *hpte_slot_array,
-					   int index)
-{
-	return hpte_slot_array[index] >> 4;
-}
-
-static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
-					unsigned int index, unsigned int hidx)
-{
-	hpte_slot_array[index] = hidx << 4 | 0x1 << 3;
-}
-
-struct page *realmode_pfn_to_page(unsigned long pfn);
-
-static inline char *get_hpte_slot_array(pmd_t *pmdp)
-{
-	/*
-	 * The hpte hindex is stored in the pgtable whose address is in the
-	 * second half of the PMD
-	 *
-	 * Order this load with the test for pmd_trans_huge in the caller
-	 */
-	smp_rmb();
-	return *(char **)(pmdp + PTRS_PER_PMD);
-
-
-}
-
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-extern void hpte_do_hugepage_flush(struct mm_struct *mm, unsigned long addr,
-				   pmd_t *pmdp, unsigned long old_pmd);
-extern pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot);
-extern pmd_t mk_pmd(struct page *page, pgprot_t pgprot);
-extern pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot);
-extern void set_pmd_at(struct mm_struct *mm, unsigned long addr,
-		       pmd_t *pmdp, pmd_t pmd);
-extern void update_mmu_cache_pmd(struct vm_area_struct *vma, unsigned long addr,
-				 pmd_t *pmd);
-/*
- *
- * For core kernel code by design pmd_trans_huge is never run on any hugetlbfs
- * page. The hugetlbfs page table walking and mangling paths are totally
- * separated form the core VM paths and they're differentiated by
- *  VM_HUGETLB being set on vm_flags well before any pmd_trans_huge could run.
- *
- * pmd_trans_huge() is defined as false at build time if
- * CONFIG_TRANSPARENT_HUGEPAGE=n to optimize away code blocks at build
- * time in such case.
- *
- * For ppc64 we need to differntiate from explicit hugepages from THP, because
- * for THP we also track the subpage details at the pmd level. We don't do
- * that for explicit huge pages.
- *
- */
-static inline int pmd_trans_huge(pmd_t pmd)
-{
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return (pmd_val(pmd) & 0x3) && (pmd_val(pmd) & _PAGE_THP_HUGE);
-}
-
-static inline int pmd_trans_splitting(pmd_t pmd)
-{
-	if (pmd_trans_huge(pmd))
-		return pmd_val(pmd) & _PAGE_SPLITTING;
-	return 0;
-}
-
-extern int has_transparent_hugepage(void);
-#else
-static inline void hpte_do_hugepage_flush(struct mm_struct *mm,
-					  unsigned long addr, pmd_t *pmdp,
-					  unsigned long old_pmd)
-{
-
-	WARN(1, "%s called with THP disabled\n", __func__);
-}
-#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
-
-static inline int pmd_large(pmd_t pmd)
-{
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return ((pmd_val(pmd) & 0x3) != 0x0);
-}
-
-static inline pte_t pmd_pte(pmd_t pmd)
-{
-	return __pte(pmd_val(pmd));
-}
-
-static inline pmd_t pte_pmd(pte_t pte)
-{
-	return __pmd(pte_val(pte));
-}
-
-static inline pte_t *pmdp_ptep(pmd_t *pmd)
-{
-	return (pte_t *)pmd;
-}
-
-#define pmd_pfn(pmd)		pte_pfn(pmd_pte(pmd))
-#define pmd_dirty(pmd)		pte_dirty(pmd_pte(pmd))
-#define pmd_young(pmd)		pte_young(pmd_pte(pmd))
-#define pmd_mkold(pmd)		pte_pmd(pte_mkold(pmd_pte(pmd)))
-#define pmd_wrprotect(pmd)	pte_pmd(pte_wrprotect(pmd_pte(pmd)))
-#define pmd_mkdirty(pmd)	pte_pmd(pte_mkdirty(pmd_pte(pmd)))
-#define pmd_mkyoung(pmd)	pte_pmd(pte_mkyoung(pmd_pte(pmd)))
-#define pmd_mkwrite(pmd)	pte_pmd(pte_mkwrite(pmd_pte(pmd)))
-
-#define __HAVE_ARCH_PMD_WRITE
-#define pmd_write(pmd)		pte_write(pmd_pte(pmd))
-
-static inline pmd_t pmd_mkhuge(pmd_t pmd)
-{
-	/* Do nothing, mk_pmd() does this part.  */
-	return pmd;
-}
-
-static inline pmd_t pmd_mknotpresent(pmd_t pmd)
-{
-	return __pmd(pmd_val(pmd) & ~_PAGE_PRESENT);
-}
-
-static inline pmd_t pmd_mksplitting(pmd_t pmd)
-{
-	return __pmd(pmd_val(pmd) | _PAGE_SPLITTING);
-}
-
-#define __HAVE_ARCH_PMD_SAME
-static inline int pmd_same(pmd_t pmd_a, pmd_t pmd_b)
-{
-	return (((pmd_val(pmd_a) ^ pmd_val(pmd_b)) & ~_PAGE_HPTEFLAGS) == 0);
-}
-
-#define __HAVE_ARCH_PMDP_SET_ACCESS_FLAGS
-extern int pmdp_set_access_flags(struct vm_area_struct *vma,
-				 unsigned long address, pmd_t *pmdp,
-				 pmd_t entry, int dirty);
-
-extern unsigned long pmd_hugepage_update(struct mm_struct *mm,
-					 unsigned long addr,
-					 pmd_t *pmdp,
-					 unsigned long clr,
-					 unsigned long set);
-
-static inline int __pmdp_test_and_clear_young(struct mm_struct *mm,
-					      unsigned long addr, pmd_t *pmdp)
-{
-	unsigned long old;
-
-	if ((pmd_val(*pmdp) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
-		return 0;
-	old = pmd_hugepage_update(mm, addr, pmdp, _PAGE_ACCESSED, 0);
-	return ((old & _PAGE_ACCESSED) != 0);
-}
-
-#define __HAVE_ARCH_PMDP_TEST_AND_CLEAR_YOUNG
-extern int pmdp_test_and_clear_young(struct vm_area_struct *vma,
-				     unsigned long address, pmd_t *pmdp);
-#define __HAVE_ARCH_PMDP_CLEAR_YOUNG_FLUSH
-extern int pmdp_clear_flush_young(struct vm_area_struct *vma,
-				  unsigned long address, pmd_t *pmdp);
-
-#define __HAVE_ARCH_PMDP_HUGE_GET_AND_CLEAR
-extern pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
-				     unsigned long addr, pmd_t *pmdp);
-
-#define __HAVE_ARCH_PMDP_SET_WRPROTECT
-static inline void pmdp_set_wrprotect(struct mm_struct *mm, unsigned long addr,
-				      pmd_t *pmdp)
-{
-
-	if ((pmd_val(*pmdp) & _PAGE_RW) == 0)
-		return;
-
-	pmd_hugepage_update(mm, addr, pmdp, _PAGE_RW, 0);
-}
-
-#define __HAVE_ARCH_PMDP_SPLITTING_FLUSH
-extern void pmdp_splitting_flush(struct vm_area_struct *vma,
-				 unsigned long address, pmd_t *pmdp);
-
-extern pmd_t pmdp_collapse_flush(struct vm_area_struct *vma,
-				 unsigned long address, pmd_t *pmdp);
-#define pmdp_collapse_flush pmdp_collapse_flush
-
-#define __HAVE_ARCH_PGTABLE_DEPOSIT
-extern void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
-				       pgtable_t pgtable);
-#define __HAVE_ARCH_PGTABLE_WITHDRAW
-extern pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp);
-
-#define __HAVE_ARCH_PMDP_INVALIDATE
-extern void pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
-			    pmd_t *pmdp);
-
-#define pmd_move_must_withdraw pmd_move_must_withdraw
-struct spinlock;
-static inline int pmd_move_must_withdraw(struct spinlock *new_pmd_ptl,
-					 struct spinlock *old_pmd_ptl)
-{
-	/*
-	 * Archs like ppc64 use pgtable to store per pmd
-	 * specific information. So when we switch the pmd,
-	 * we should also withdraw and deposit the pgtable
-	 */
-	return true;
-}
-#endif /* __ASSEMBLY__ */
 #endif /* _ASM_POWERPC_NOHASH_64_PGTABLE_H */
diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c
index c8822af10a58..8eaac81347fd 100644
--- a/arch/powerpc/mm/hash_native_64.c
+++ b/arch/powerpc/mm/hash_native_64.c
@@ -429,6 +429,7 @@ static void native_hpte_invalidate(unsigned long slot, unsigned long vpn,
 	local_irq_restore(flags);
 }
 
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
 static void native_hugepage_invalidate(unsigned long vsid,
 				       unsigned long addr,
 				       unsigned char *hpte_slot_array,
@@ -482,6 +483,15 @@ static void native_hugepage_invalidate(unsigned long vsid,
 	}
 	local_irq_restore(flags);
 }
+#else
+static void native_hugepage_invalidate(unsigned long vsid,
+				       unsigned long addr,
+				       unsigned char *hpte_slot_array,
+				       int psize, int ssize, int local)
+{
+	WARN(1, "%s called without THP support\n", __func__);
+}
+#endif
 
 static inline int __hpte_actual_psize(unsigned int lp, int psize)
 {
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index 3967e3cce03e..d42dd289abfe 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -359,7 +359,7 @@ struct page *pud_page(pud_t pud)
 struct page *pmd_page(pmd_t pmd)
 {
 	if (pmd_trans_huge(pmd) || pmd_huge(pmd))
-		return pfn_to_page(pmd_pfn(pmd));
+		return pte_page(pmd_pte(pmd));
 	return virt_to_page(pmd_page_vaddr(pmd));
 }
 
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index b7a67e3d2201..6d46547871aa 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -396,6 +396,7 @@ static void pSeries_lpar_hpte_invalidate(unsigned long slot, unsigned long vpn,
 	BUG_ON(lpar_rc != H_SUCCESS);
 }
 
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
 /*
  * Limit iterations holding pSeries_lpar_tlbie_lock to 3. We also need
  * to make sure that we avoid bouncing the hypervisor tlbie lock.
@@ -494,6 +495,15 @@ static void pSeries_lpar_hugepage_invalidate(unsigned long vsid,
 		__pSeries_lpar_hugepage_invalidate(slot_array, vpn_array,
 						   index, psize, ssize);
 }
+#else
+static void pSeries_lpar_hugepage_invalidate(unsigned long vsid,
+					     unsigned long addr,
+					     unsigned char *hpte_slot_array,
+					     int psize, int ssize, int local)
+{
+	WARN(1, "%s called without THP support\n", __func__);
+}
+#endif
 
 static void pSeries_lpar_hpte_removebolted(unsigned long ea,
 					   int psize, int ssize)
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 29/35] powerpc/mm: Add a _PAGE_PTE bit
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (27 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 28/35] powerpc/mm: Move THP headers around Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 30/35] powerpc/mm: Don't hardcode page table size Aneesh Kumar K.V
                   ` (5 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

For a pte entry we will have _PAGE_PTE set. Our pte page
addresses have a minimum alignment requirement of HUGEPD_SHIFT_MASK + 1,
and we use the lower 7 bits to indicate a hugepd. That is,

for pmd and pgd entries we can find:
1) _PAGE_PTE set -> a leaf pte
2) bits [2..6] non-zero -> a hugepd pointer.
   These bits also encode the size. We skip bit 1 (_PAGE_PRESENT).
3) otherwise, a pointer to the next table (see the sketch below).
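
A compact way to read those rules (a sketch only; the authoritative checks
are the hugepd_ok()/pmd_huge() changes below, and pmd_kind() is a made-up
name for illustration):

	static int pmd_kind(pmd_t pmd)
	{
		unsigned long v = pmd_val(pmd);	/* same scheme for pgd */

		if (v & _PAGE_PTE)
			return 1;	/* leaf pte for a huge page */
		if (v & HUGEPD_SHIFT_MASK)
			return 2;	/* hugepd pointer; bits also encode size */
		return 3;		/* pointer to next table (or empty) */
	}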

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |  9 ++++++---
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 23 +++++++++--------------
 arch/powerpc/include/asm/book3s/64/hash.h     | 13 +++++++------
 arch/powerpc/include/asm/book3s/64/pgtable.h  |  3 +--
 arch/powerpc/include/asm/pte-common.h         |  5 +++++
 arch/powerpc/mm/hugetlbpage.c                 |  4 ++--
 arch/powerpc/mm/pgtable.c                     |  4 ++++
 arch/powerpc/mm/pgtable_64.c                  |  7 +------
 8 files changed, 35 insertions(+), 33 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index b4d25529d179..e59832c94609 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -116,10 +116,13 @@ static inline int pgd_huge(pgd_t pgd)
 static inline int hugepd_ok(hugepd_t hpd)
 {
 	/*
-	 * hugepd pointer, bottom two bits == 00 and next 4 bits
-	 * indicate size of table
+	 * if it is not a pte and has the hugepd shift mask
+	 * set, then it is a hugepd directory pointer
 	 */
-	return (((hpd.pd & 0x3) == 0x0) && ((hpd.pd & HUGEPD_SHIFT_MASK) != 0));
+	if (!(hpd.pd & _PAGE_PTE) &&
+	    ((hpd.pd & HUGEPD_SHIFT_MASK) != 0))
+		return true;
+	return false;
 }
 #define is_hugepd(hpd)		(hugepd_ok(hpd))
 #endif
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 7570677c11c3..52110d7af659 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -130,25 +130,25 @@ extern bool __rpte_sub_valid(real_pte_t rpte, unsigned long index);
 static inline int pmd_huge(pmd_t pmd)
 {
 	/*
-	 * leaf pte for huge page, bottom two bits != 00
+	 * leaf pte for huge page
 	 */
-	return ((pmd_val(pmd) & 0x3) != 0x0);
+	return !!(pmd_val(pmd) & _PAGE_PTE);
 }
 
 static inline int pud_huge(pud_t pud)
 {
 	/*
-	 * leaf pte for huge page, bottom two bits != 00
+	 * leaf pte for huge page
 	 */
-	return ((pud_val(pud) & 0x3) != 0x0);
+	return !!(pud_val(pud) & _PAGE_PTE);
 }
 
 static inline int pgd_huge(pgd_t pgd)
 {
 	/*
-	 * leaf pte for huge page, bottom two bits != 00
+	 * leaf pte for huge page
 	 */
-	return ((pgd_val(pgd) & 0x3) != 0x0);
+	return !!(pgd_val(pgd) & _PAGE_PTE);
 }
 #define pgd_huge pgd_huge
 
@@ -236,10 +236,8 @@ static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
  */
 static inline int pmd_trans_huge(pmd_t pmd)
 {
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return (pmd_val(pmd) & 0x3) && (pmd_val(pmd) & _PAGE_THP_HUGE);
+	return !!((pmd_val(pmd) & (_PAGE_PTE | _PAGE_THP_HUGE)) ==
+		  (_PAGE_PTE | _PAGE_THP_HUGE));
 }
 
 static inline int pmd_trans_splitting(pmd_t pmd)
@@ -251,10 +249,7 @@ static inline int pmd_trans_splitting(pmd_t pmd)
 
 static inline int pmd_large(pmd_t pmd)
 {
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return ((pmd_val(pmd) & 0x3) != 0x0);
+	return !!(pmd_val(pmd) & _PAGE_PTE);
 }
 
 static inline pmd_t pmd_mknotpresent(pmd_t pmd)
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 42e1273adad1..8b929e531758 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -14,11 +14,12 @@
  * We could create separate kernel read-only if we used the 3 PP bits
  * combinations that newer processors provide but we currently don't.
  */
-#define _PAGE_PRESENT		0x00001 /* software: pte contains a translation */
-#define _PAGE_USER		0x00002 /* matches one of the PP bits */
+#define _PAGE_PTE		0x00001
+#define _PAGE_PRESENT		0x00002 /* software: pte contains a translation */
 #define _PAGE_BIT_SWAP_TYPE	2
-#define _PAGE_EXEC		0x00004 /* No execute on POWER4 and newer (we invert) */
-#define _PAGE_GUARDED		0x00008
+#define _PAGE_USER		0x00004 /* matches one of the PP bits */
+#define _PAGE_EXEC		0x00008 /* No execute on POWER4 and newer (we invert) */
+#define _PAGE_GUARDED		0x00010
 /* We can derive Memory coherence from _PAGE_NO_CACHE */
 #define _PAGE_COHERENT		0x0
 #define _PAGE_NO_CACHE		0x00020 /* I: cache inhibit */
@@ -49,7 +50,7 @@
  */
 #define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS |		\
 			 _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPLITTING | \
-			 _PAGE_THP_HUGE)
+			 _PAGE_THP_HUGE | _PAGE_PTE)
 
 #ifdef CONFIG_PPC_64K_PAGES
 #include <asm/book3s/64/hash-64k.h>
@@ -135,7 +136,7 @@
  * pgprot changes
  */
 #define _PAGE_CHG_MASK	(PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \
-			 _PAGE_ACCESSED | _PAGE_SPECIAL)
+			 _PAGE_ACCESSED | _PAGE_SPECIAL | _PAGE_PTE)
 /*
  * Mask of bits returned by pte_pgprot()
  */
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index f2ace2cac7bb..bb97b6a52b84 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -213,8 +213,7 @@ static inline int pmd_protnone(pmd_t pmd)
 
 static inline pmd_t pmd_mkhuge(pmd_t pmd)
 {
-	/* Do nothing, mk_pmd() does this part.  */
-	return pmd;
+	return __pmd(pmd_val(pmd) | (_PAGE_PTE | _PAGE_THP_HUGE));
 }
 
 #define __HAVE_ARCH_PMDP_SET_ACCESS_FLAGS
diff --git a/arch/powerpc/include/asm/pte-common.h b/arch/powerpc/include/asm/pte-common.h
index 71537a319fc8..1ec67b043065 100644
--- a/arch/powerpc/include/asm/pte-common.h
+++ b/arch/powerpc/include/asm/pte-common.h
@@ -40,6 +40,11 @@
 #else
 #define _PAGE_RW 0
 #endif
+
+#ifndef _PAGE_PTE
+#define _PAGE_PTE 0
+#endif
+
 #ifndef _PMD_PRESENT_MASK
 #define _PMD_PRESENT_MASK	_PMD_PRESENT
 #endif
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index bc72e542a83e..61b8b7ccea4f 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -894,8 +894,8 @@ void flush_dcache_icache_hugepage(struct page *page)
  * We have 4 cases for pgds and pmds:
  * (1) invalid (all zeroes)
  * (2) pointer to next table, as normal; bottom 6 bits == 0
- * (3) leaf pte for huge page, bottom two bits != 00
- * (4) hugepd pointer, bottom two bits == 00, next 4 bits indicate size of table
+ * (3) leaf pte for huge page _PAGE_PTE set
+ * (4) hugepd pointer, _PAGE_PTE = 0 and bits [2..6] indicate size of table
  *
  * So long as we atomically load page table pointers we are safe against teardown,
 * we can follow the address down to the page and take a ref on it.
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index 83dfcb55ffef..83dfd7925c72 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -179,6 +179,10 @@ void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
 	 */
 	VM_WARN_ON((pte_val(*ptep) & (_PAGE_PRESENT | _PAGE_USER)) ==
 		(_PAGE_PRESENT | _PAGE_USER));
+	/*
+	 * Add the pte bit when trying to set a pte
+	 */
+	pte = __pte(pte_val(pte) | _PAGE_PTE);
 
 	/* Note: mm->context.id might not yet have been assigned as
 	 * this context might not have been activated yet when this
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index d42dd289abfe..ea6bc31debb0 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -765,13 +765,8 @@ static pmd_t pmd_set_protbits(pmd_t pmd, pgprot_t pgprot)
 pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot)
 {
 	unsigned long pmdv;
-	/*
-	 * For a valid pte, we would have _PAGE_PRESENT always
-	 * set. We use this to check THP page at pmd level.
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
+
 	pmdv = pfn << PTE_RPN_SHIFT;
-	pmdv |= _PAGE_THP_HUGE;
 	return pmd_set_protbits(__pmd(pmdv), pgprot);
 }
 
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 30/35] powerpc/mm: Don't hardcode page table size
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (28 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 29/35] powerpc/mm: Add a _PAGE_PTE bit Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 31/35] powerpc/mm: Don't hardcode the hash pte slot shift Aneesh Kumar K.V
                   ` (4 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

The pte and pmd table sizes depend on config items, so don't
hardcode them. This makes sure we use the right value when
masking pmd entries and also when checking pmd_bad().
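
A minimal sketch of the idea (using the #defines added below;
pmd_ptepage_vaddr() is a made-up name for illustration):

	/* from this patch: PMD_MASKED_BITS == PTE_FRAG_SIZE - 1, with
	 * PTE_FRAG_SIZE = 1UL << PTE_FRAG_SIZE_SHIFT */
	static inline unsigned long pmd_ptepage_vaddr(unsigned long pmdv)
	{
		/* PTE fragments are PTE_FRAG_SIZE aligned, so strip the
		 * sub-fragment offset bits to reach the PTE page */
		return pmdv & ~PMD_MASKED_BITS;
	}

A hardcoded 0xfff would silently break as soon as PTE_FRAG_SIZE_SHIFT
changes, which a later patch in this series does for the nohash 64K config.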

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-64k.h    | 30 ++++++++++++++++++------
 arch/powerpc/include/asm/nohash/64/pgtable-64k.h | 21 +++++++++++++----
 arch/powerpc/include/asm/pgalloc-64.h            | 10 --------
 arch/powerpc/mm/init_64.c                        |  4 ----
 4 files changed, 40 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 52110d7af659..cca050aa1aa8 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -25,12 +25,6 @@
 #define PGDIR_SIZE	(1UL << PGDIR_SHIFT)
 #define PGDIR_MASK	(~(PGDIR_SIZE-1))
 
-/* Bits to mask out from a PMD to get to the PTE page */
-/* PMDs point to PTE table fragments which are 4K aligned.  */
-#define PMD_MASKED_BITS		0xfff
-/* Bits to mask out from a PGD/PUD to get to the PMD page */
-#define PUD_MASKED_BITS		0x1ff
-
 #define _PAGE_COMBO	0x00020000 /* this is a combo 4k page */
 #define _PAGE_4K_PFN	0x00040000 /* PFN is for a single 4k page */
 /*
@@ -49,6 +43,24 @@
  * of addressable physical space, or 46 bits for the special 4k PFNs.
  */
 #define PTE_RPN_SHIFT	(30)
+/*
+ * we support 16 fragments per PTE page of 64K size.
+ */
+#define PTE_FRAG_NR	16
+/*
+ * We use a 2K PTE page fragment and another 2K for storing
+ * real_pte_t hash index
+ */
+#define PTE_FRAG_SIZE_SHIFT  12
+#define PTE_FRAG_SIZE (1UL << PTE_FRAG_SIZE_SHIFT)
+
+/*
+ * Bits to mask out from a PMD to get to the PTE page
+ * PMDs point to PTE table fragments which are PTE_FRAG_SIZE aligned.
+ */
+#define PMD_MASKED_BITS		(PTE_FRAG_SIZE - 1)
+/* Bits to mask out from a PGD/PUD to get to the PMD page */
+#define PUD_MASKED_BITS		0x1ff
 
 #ifndef __ASSEMBLY__
 
@@ -112,8 +124,12 @@ extern bool __rpte_sub_valid(real_pte_t rpte, unsigned long index);
 		remap_pfn_range((vma), (addr), (pfn), PAGE_SIZE,	\
 			__pgprot(pgprot_val((prot)) | _PAGE_4K_PFN)))
 
-#define PTE_TABLE_SIZE	(sizeof(real_pte_t) << PTE_INDEX_SIZE)
+#define PTE_TABLE_SIZE	PTE_FRAG_SIZE
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#define PMD_TABLE_SIZE	((sizeof(pmd_t) << PMD_INDEX_SIZE) + (sizeof(unsigned long) << PMD_INDEX_SIZE))
+#else
 #define PMD_TABLE_SIZE	(sizeof(pmd_t) << PMD_INDEX_SIZE)
+#endif
 #define PGD_TABLE_SIZE	(sizeof(pgd_t) << PGD_INDEX_SIZE)
 
 #define pgd_pte(pgd)	(pud_pte(((pud_t){ pgd })))
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable-64k.h b/arch/powerpc/include/asm/nohash/64/pgtable-64k.h
index a44660d76096..2217de6454d6 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable-64k.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable-64k.h
@@ -9,8 +9,19 @@
 #define PUD_INDEX_SIZE	0
 #define PGD_INDEX_SIZE  12
 
+/*
+ * we support 16 fragments per PTE page of 64K size.
+ */
+#define PTE_FRAG_NR	16
+/*
+ * We use a 2K PTE page fragment and another 2K for storing
+ * real_pte_t hash index
+ */
+#define PTE_FRAG_SIZE_SHIFT  12
+#define PTE_FRAG_SIZE (1UL << PTE_FRAG_SIZE_SHIFT)
+
 #ifndef __ASSEMBLY__
-#define PTE_TABLE_SIZE	(sizeof(real_pte_t) << PTE_INDEX_SIZE)
+#define PTE_TABLE_SIZE	PTE_FRAG_SIZE
 #define PMD_TABLE_SIZE	(sizeof(pmd_t) << PMD_INDEX_SIZE)
 #define PGD_TABLE_SIZE	(sizeof(pgd_t) << PGD_INDEX_SIZE)
 #endif	/* __ASSEMBLY__ */
@@ -32,9 +43,11 @@
 #define PGDIR_SIZE	(1UL << PGDIR_SHIFT)
 #define PGDIR_MASK	(~(PGDIR_SIZE-1))
 
-/* Bits to mask out from a PMD to get to the PTE page */
-/* PMDs point to PTE table fragments which are 4K aligned.  */
-#define PMD_MASKED_BITS		0xfff
+/*
+ * Bits to mask out from a PMD to get to the PTE page
+ * PMDs point to PTE table fragments which are PTE_FRAG_SIZE aligned.
+ */
+#define PMD_MASKED_BITS		(PTE_FRAG_SIZE - 1)
 /* Bits to mask out from a PGD/PUD to get to the PMD page */
 #define PUD_MASKED_BITS		0x1ff
 
diff --git a/arch/powerpc/include/asm/pgalloc-64.h b/arch/powerpc/include/asm/pgalloc-64.h
index d8cde71f6734..69ef28a81733 100644
--- a/arch/powerpc/include/asm/pgalloc-64.h
+++ b/arch/powerpc/include/asm/pgalloc-64.h
@@ -163,16 +163,6 @@ static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
 }
 
 #else /* if CONFIG_PPC_64K_PAGES */
-/*
- * we support 16 fragments per PTE page.
- */
-#define PTE_FRAG_NR	16
-/*
- * We use a 2K PTE page fragment and another 2K for storing
- * real_pte_t hash index
- */
-#define PTE_FRAG_SIZE_SHIFT  12
-#define PTE_FRAG_SIZE (2 * PTRS_PER_PTE * sizeof(pte_t))
 
 extern pte_t *page_table_alloc(struct mm_struct *, unsigned long, int);
 extern void page_table_free(struct mm_struct *, unsigned long *, int);
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index d747dd7bc90b..379a6a90644b 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -87,11 +87,7 @@ static void pgd_ctor(void *addr)
 
 static void pmd_ctor(void *addr)
 {
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	memset(addr, 0, PMD_TABLE_SIZE * 2);
-#else
 	memset(addr, 0, PMD_TABLE_SIZE);
-#endif
 }
 
 struct kmem_cache *pgtable_cache[MAX_PGTABLE_INDEX_SIZE];
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 31/35] powerpc/mm: Don't hardcode the hash pte slot shift
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (29 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 30/35] powerpc/mm: Don't hardcode page table size Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 32/35] powerpc/nohash: Update 64K nohash config to have 32 pte fragment Aneesh Kumar K.V
                   ` (3 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

Use the #define instead of open-coding the same value.
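
For reference, the field being extracted (per the hash.h bit definitions
from earlier in this series, _PAGE_F_GIX is 0x7000 with _PAGE_F_GIX_SHIFT
of 12 and _PAGE_F_SECOND is 0x8000):

	/* the 4-bit value at _PAGE_F_GIX_SHIFT is the 3-bit hash group
	 * index plus the secondary-hash bit */
	hidx = (pte_val(rpte.pte) >> _PAGE_F_GIX_SHIFT) & 0xf;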

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 2 +-
 arch/powerpc/include/asm/book3s/64/pgtable.h  | 2 +-
 arch/powerpc/include/asm/nohash/64/pgtable.h  | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index cca050aa1aa8..9f9942998587 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -94,7 +94,7 @@ static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
 {
 	if ((pte_val(rpte.pte) & _PAGE_COMBO))
 		return (rpte.hidx >> (index<<2)) & 0xf;
-	return (pte_val(rpte.pte) >> 12) & 0xf;
+	return (pte_val(rpte.pte) >> _PAGE_F_GIX_SHIFT) & 0xf;
 }
 
 #define __rpte_to_pte(r)	((r).pte)
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index bb97b6a52b84..a2d4e0e37067 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -50,7 +50,7 @@
 #define __real_pte(e,p)		(e)
 #define __rpte_to_pte(r)	(__pte(r))
 #endif
-#define __rpte_to_hidx(r,index)	(pte_val(__rpte_to_pte(r)) >> 12)
+#define __rpte_to_hidx(r,index)	(pte_val(__rpte_to_pte(r)) >> _PAGE_F_GIX_SHIFT)
 
 #define pte_iterate_hashed_subpages(rpte, psize, va, index, shift)       \
 	do {							         \
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index d635a924d652..03c226965b46 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -121,7 +121,7 @@
 #define __real_pte(e,p)		(e)
 #define __rpte_to_pte(r)	(__pte(r))
 #endif
-#define __rpte_to_hidx(r,index)	(pte_val(__rpte_to_pte(r)) >> 12)
+#define __rpte_to_hidx(r,index)	(pte_val(__rpte_to_pte(r)) >> _PAGE_F_GIX_SHIFT)
 
 #define pte_iterate_hashed_subpages(rpte, psize, va, index, shift)       \
 	do {							         \
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 32/35] powerpc/nohash: Update 64K nohash config to have 32 pte fragment
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (30 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 31/35] powerpc/mm: Don't hardcode the hash pte slot shift Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 33/35] powerpc/nohash: we don't use real_pte_t for nohash Aneesh Kumar K.V
                   ` (2 subsequent siblings)
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

They don't need to track 4k subpage slot details and hence don't need
the second half of pgtable_t.
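
For the curious, the arithmetic (derived from the #defines in this series):
without the subpage slot tracking a fragment only needs the pte entries
themselves, PTRS_PER_PTE * sizeof(pte_t) = 256 * 8 = 2K. The old
PTE_FRAG_SIZE was 2 * PTRS_PER_PTE * sizeof(pte_t) = 4K, with the second
2K holding the real_pte_t hash index. So PTE_FRAG_SIZE_SHIFT drops from
12 to 11, and a 64K page now holds 64K / 2K = 32 fragments.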

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/nohash/64/pgtable-64k.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/64/pgtable-64k.h b/arch/powerpc/include/asm/nohash/64/pgtable-64k.h
index 2217de6454d6..570fb30be21c 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable-64k.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable-64k.h
@@ -10,14 +10,14 @@
 #define PGD_INDEX_SIZE  12
 
 /*
- * we support 16 fragments per PTE page of 64K size.
+ * we support 32 fragments per PTE page of 64K size
  */
-#define PTE_FRAG_NR	16
+#define PTE_FRAG_NR	32
 /*
  * We use a 2K PTE page fragment and another 2K for storing
  * real_pte_t hash index
  */
-#define PTE_FRAG_SIZE_SHIFT  12
+#define PTE_FRAG_SIZE_SHIFT  11
 #define PTE_FRAG_SIZE (1UL << PTE_FRAG_SIZE_SHIFT)
 
 #ifndef __ASSEMBLY__
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 33/35] powerpc/nohash: we don't use real_pte_t for nohash
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (31 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 32/35] powerpc/nohash: Update 64K nohash config to have 32 pte fragment Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:36 ` [PATCH V6 34/35] powerpc/mm: Use H_READ with H_READ_4 Aneesh Kumar K.V
  2015-12-01  3:37 ` [PATCH V6 35/35] powerpc/mm: Don't open code pgtable_t size Aneesh Kumar K.V
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

Remove the related functions and #defines

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/nohash/64/pgtable.h | 33 ----------------------------
 1 file changed, 33 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 03c226965b46..b9f734dd5b81 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -106,39 +106,6 @@
 #endif /* CONFIG_PPC_MM_SLICES */
 
 #ifndef __ASSEMBLY__
-
-/*
- * This is the default implementation of various PTE accessors, it's
- * used in all cases except Book3S with 64K pages where we have a
- * concept of sub-pages
- */
-#ifndef __real_pte
-
-#ifdef CONFIG_STRICT_MM_TYPECHECKS
-#define __real_pte(e,p)		((real_pte_t){(e)})
-#define __rpte_to_pte(r)	((r).pte)
-#else
-#define __real_pte(e,p)		(e)
-#define __rpte_to_pte(r)	(__pte(r))
-#endif
-#define __rpte_to_hidx(r,index)	(pte_val(__rpte_to_pte(r)) >> _PAGE_F_GIX_SHIFT)
-
-#define pte_iterate_hashed_subpages(rpte, psize, va, index, shift)       \
-	do {							         \
-		index = 0;					         \
-		shift = mmu_psize_defs[psize].shift;		         \
-
-#define pte_iterate_hashed_end() } while(0)
-
-/*
- * We expect this to be called only for user addresses or kernel virtual
- * addresses other than the linear mapping.
- */
-#define pte_pagesize_index(mm, addr, pte)	MMU_PAGE_4K
-
-#endif /* __real_pte */
-
-
 /* pte_clear moved to later in this file */
 
 #define PMD_BAD_BITS		(PTE_TABLE_SIZE-1)
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 34/35] powerpc/mm: Use H_READ with H_READ_4
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (32 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 33/35] powerpc/nohash: we don't use real_pte_t for nohash Aneesh Kumar K.V
@ 2015-12-01  3:36 ` Aneesh Kumar K.V
  2015-12-01  3:37 ` [PATCH V6 35/35] powerpc/mm: Don't open code pgtable_t size Aneesh Kumar K.V
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:36 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

This bulk-reads 4 hash pte slot entries with a single hcall and should
reduce the number of hypervisor calls in the lookup loop.
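
A sketch of the resulting lookup pattern (mirroring the loop added below;
hpte_group, want_v and the HPTE_V_* helpers are the existing ones):

	struct {
		unsigned long pteh;
		unsigned long ptel;
	} ptes[4];
	long rc;
	int j, slot = -1;

	/* one H_READ hcall fetches four consecutive HPTE slots */
	rc = plpar_pte_read_4(0, hpte_group, (void *)ptes);
	if (rc == H_SUCCESS) {
		for (j = 0; j < 4; j++) {
			if (HPTE_V_COMPARE(ptes[j].pteh, want_v) &&
			    (ptes[j].pteh & HPTE_V_VALID)) {
				slot = j;	/* offset within this group of 4 */
				break;
			}
		}
	}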

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/plpar_wrappers.h | 17 ++++++++++
 arch/powerpc/platforms/pseries/lpar.c     | 54 +++++++++++++++----------------
 2 files changed, 44 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h
index 67859edbf8fd..1b394247afc2 100644
--- a/arch/powerpc/include/asm/plpar_wrappers.h
+++ b/arch/powerpc/include/asm/plpar_wrappers.h
@@ -202,6 +202,23 @@ static inline long plpar_pte_read_raw(unsigned long flags, unsigned long ptex,
 }
 
 /*
+ * ptes must be 8*sizeof(unsigned long)
+ */
+static inline long plpar_pte_read_4(unsigned long flags, unsigned long ptex,
+				    unsigned long *ptes)
+
+{
+	long rc;
+	unsigned long retbuf[PLPAR_HCALL9_BUFSIZE];
+
+	rc = plpar_hcall9(H_READ, retbuf, flags | H_READ_4, ptex);
+
+	memcpy(ptes, retbuf, 8*sizeof(unsigned long));
+
+	return rc;
+}
+
+/*
  * plpar_pte_read_4_raw can be called in real mode.
  * ptes must be 8*sizeof(unsigned long)
  */
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 6d46547871aa..477290ad855e 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -315,48 +315,48 @@ static long pSeries_lpar_hpte_updatepp(unsigned long slot,
 	return 0;
 }
 
-static unsigned long pSeries_lpar_hpte_getword0(unsigned long slot)
+static long __pSeries_lpar_hpte_find(unsigned long want_v, unsigned long hpte_group)
 {
-	unsigned long dword0;
-	unsigned long lpar_rc;
-	unsigned long dummy_word1;
-	unsigned long flags;
+	long lpar_rc;
+	unsigned long i, j;
+	struct {
+		unsigned long pteh;
+		unsigned long ptel;
+	} ptes[4];
 
-	/* Read 1 pte at a time                        */
-	/* Do not need RPN to logical page translation */
-	/* No cross CEC PFT access                     */
-	flags = 0;
+	for (i = 0; i < HPTES_PER_GROUP; i += 4, hpte_group += 4) {
 
-	lpar_rc = plpar_pte_read(flags, slot, &dword0, &dummy_word1);
+		lpar_rc = plpar_pte_read_4(0, hpte_group, (void *)ptes);
+		if (lpar_rc != H_SUCCESS)
+			continue;
 
-	BUG_ON(lpar_rc != H_SUCCESS);
+		for (j = 0; j < 4; j++) {
+			if (HPTE_V_COMPARE(ptes[j].pteh, want_v) &&
+			    (ptes[j].pteh & HPTE_V_VALID))
+				return i + j;
+		}
+	}
 
-	return dword0;
+	return -1;
 }
 
 static long pSeries_lpar_hpte_find(unsigned long vpn, int psize, int ssize)
 {
-	unsigned long hash;
-	unsigned long i;
 	long slot;
-	unsigned long want_v, hpte_v;
+	unsigned long hash;
+	unsigned long want_v;
+	unsigned long hpte_group;
 
 	hash = hpt_hash(vpn, mmu_psize_defs[psize].shift, ssize);
 	want_v = hpte_encode_avpn(vpn, psize, ssize);
 
 	/* Bolted entries are always in the primary group */
-	slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
-	for (i = 0; i < HPTES_PER_GROUP; i++) {
-		hpte_v = pSeries_lpar_hpte_getword0(slot);
-
-		if (HPTE_V_COMPARE(hpte_v, want_v) && (hpte_v & HPTE_V_VALID))
-			/* HPTE matches */
-			return slot;
-		++slot;
-	}
-
-	return -1;
-} 
+	hpte_group = (hash & htab_hash_mask) * HPTES_PER_GROUP;
+	slot = __pSeries_lpar_hpte_find(want_v, hpte_group);
+	if (slot < 0)
+		return -1;
+	return hpte_group + slot;
+}
 
 static void pSeries_lpar_hpte_updateboltedpp(unsigned long newpp,
 					     unsigned long ea,
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 35/35] powerpc/mm: Don't open code pgtable_t size
  2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (33 preceding siblings ...)
  2015-12-01  3:36 ` [PATCH V6 34/35] powerpc/mm: Use H_READ with H_READ_4 Aneesh Kumar K.V
@ 2015-12-01  3:37 ` Aneesh Kumar K.V
  34 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-01  3:37 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

For transparent hugepages, the slot information of the base page size
hash ptes is stored in the pgtable_t. We need to make sure we don't
index beyond the pgtable_t size.
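
Concretely (as I read __hash_page_thp below): the old bound came from a
16M THP mapped with 4K hash ptes, 16M / 4K = 4096 subpage slots, one slot
byte per subpage. That happens to equal PTE_FRAG_SIZE (1UL << 12) in this
configuration, and using the #define keeps the BUG_ON bound correct if the
fragment size ever changes.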

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/mm/hugepage-hash64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/hugepage-hash64.c b/arch/powerpc/mm/hugepage-hash64.c
index 1f666de0110a..baf1301ded0c 100644
--- a/arch/powerpc/mm/hugepage-hash64.c
+++ b/arch/powerpc/mm/hugepage-hash64.c
@@ -71,7 +71,7 @@ int __hash_page_thp(unsigned long ea, unsigned long access, unsigned long vsid,
 	 */
 	shift = mmu_psize_defs[psize].shift;
 	index = (ea & ~HPAGE_PMD_MASK) >> shift;
-	BUG_ON(index >= 4096);
+	BUG_ON(index >= PTE_FRAG_SIZE);
 
 	vpn = hpt_vpn(ea, vsid, ssize);
 	hpte_slot_array = get_hpte_slot_array(pmdp);
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH V6 20/35] powerpc/mm: Don't track subpage valid bit in pte_t
  2015-12-01  3:36 ` [PATCH V6 20/35] powerpc/mm: Don't track subpage valid bit in pte_t Aneesh Kumar K.V
@ 2015-12-03  7:13   ` Anshuman Khandual
  2015-12-03 18:57     ` Aneesh Kumar K.V
  2016-02-18  1:11   ` Paul Mackerras
  1 sibling, 1 reply; 45+ messages in thread
From: Anshuman Khandual @ 2015-12-03  7:13 UTC (permalink / raw)
  To: Aneesh Kumar K.V, benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev

On 12/01/2015 09:06 AM, Aneesh Kumar K.V wrote:
> This frees up 11 bits in pte_t. In a later patch we also change
> the pte_t format so that we can start supporting migration ptes
> at the pmd level. We now track the 4k subpage valid bits as below.
> 
> If we have _PAGE_COMBO set, we override the _PAGE_F_GIX_SHIFT
> and _PAGE_F_SECOND. Together we have 4 bits, each of them
> used to indicate whether any of the 4 4k subpages in that group
> is valid. ie,
> 
> [ group 1 bit ]   [ group 2 bit ]  ..... [ group 4 ]
> [ subpage 1 - 4]  [ subpage 5- 8]  ..... [ subpage 13 - 16]
> 
> We still track each 4k subpage slot number and secondary hash
> information in the second half of pgtable_t. Removing the subpage
> tracking has some significant overhead on the aim9 and ebizzy benchmarks,
> and to support THP with 4K subpages we do need a pgtable_t of 4096 bytes.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

We removed 16 bits which used to track individual sub pages and added
back 4 bits to track the newly created sub page groups. So we save 12 bits
there. How is it 11 bits? Have we added back one more bit into pte_t
somewhere for sub page purposes that I am missing?

> ---
>  arch/powerpc/include/asm/book3s/64/hash-4k.h  | 10 +-------
>  arch/powerpc/include/asm/book3s/64/hash-64k.h | 35 ++++++---------------------
>  arch/powerpc/include/asm/book3s/64/hash.h     | 10 ++++----
>  arch/powerpc/mm/hash64_64k.c                  | 34 ++++++++++++++++++++++++--
>  arch/powerpc/mm/hash_low_64.S                 |  6 +----
>  arch/powerpc/mm/hugetlbpage-hash64.c          |  5 +---
>  arch/powerpc/mm/pgtable_64.c                  |  2 +-
>  7 files changed, 48 insertions(+), 54 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
> index 537eacecf6e9..75e8b9326e4b 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
> @@ -47,17 +47,9 @@
>  /* Bits to mask out from a PGD to get to the PUD page */
>  #define PGD_MASKED_BITS		0
>  
> -/* PTE bits */
> -#define _PAGE_HASHPTE	0x0400 /* software: pte has an associated HPTE */
> -#define _PAGE_SECONDARY 0x8000 /* software: HPTE is in secondary group */
> -#define _PAGE_GROUP_IX  0x7000 /* software: HPTE index within group */
> -#define _PAGE_F_SECOND  _PAGE_SECONDARY
> -#define _PAGE_F_GIX     _PAGE_GROUP_IX
> -#define _PAGE_SPECIAL	0x10000 /* software: special page */
> -
>  /* PTE flags to conserve for HPTE identification */
>  #define _PAGE_HPTEFLAGS (_PAGE_BUSY | _PAGE_HASHPTE | \
> -			 _PAGE_SECONDARY | _PAGE_GROUP_IX)
> +			 _PAGE_F_SECOND | _PAGE_F_GIX)
>  
>  /* shift to put page number into pte */
>  #define PTE_RPN_SHIFT	(17)
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> index ee073822145d..a268416ca4a4 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> @@ -31,33 +31,13 @@
>  /* Bits to mask out from a PGD/PUD to get to the PMD page */
>  #define PUD_MASKED_BITS		0x1ff
>  
> -/* Additional PTE bits (don't change without checking asm in hash_low.S) */
> -#define _PAGE_SPECIAL	0x00000400 /* software: special page */
> -#define _PAGE_HPTE_SUB	0x0ffff000 /* combo only: sub pages HPTE bits */
> -#define _PAGE_HPTE_SUB0	0x08000000 /* combo only: first sub page */
> -#define _PAGE_COMBO	0x10000000 /* this is a combo 4k page */
> -#define _PAGE_4K_PFN	0x20000000 /* PFN is for a single 4k page */
> -
> -/* For 64K page, we don't have a separate _PAGE_HASHPTE bit. Instead,
> - * we set that to be the whole sub-bits mask. The C code will only
> - * test this, so a multi-bit mask will work. For combo pages, this
> - * is equivalent as effectively, the old _PAGE_HASHPTE was an OR of
> - * all the sub bits. For real 64k pages, we now have the assembly set
> - * _PAGE_HPTE_SUB0 in addition to setting the HIDX bits which overlap
> - * that mask. This is fine as long as the HIDX bits are never set on
> - * a PTE that isn't hashed, which is the case today.
> - *
> - * A little nit is for the huge page C code, which does the hashing
> - * in C, we need to provide which bit to use.
> - */
> -#define _PAGE_HASHPTE	_PAGE_HPTE_SUB
> -
> -/* Note the full page bits must be in the same location as for normal
> - * 4k pages as the same assembly will be used to insert 64K pages
> - * whether the kernel has CONFIG_PPC_64K_PAGES or not
> +#define _PAGE_COMBO	0x00020000 /* this is a combo 4k page */
> +#define _PAGE_4K_PFN	0x00040000 /* PFN is for a single 4k page */
> +/*
> + * Used to track subpage group valid if _PAGE_COMBO is set
> + * This overloads _PAGE_F_GIX and _PAGE_F_SECOND
>   */
> -#define _PAGE_F_SECOND  0x00008000 /* full page: hidx bits */
> -#define _PAGE_F_GIX     0x00007000 /* full page: hidx bits */
> +#define _PAGE_COMBO_VALID	(_PAGE_F_GIX | _PAGE_F_SECOND)
>  
>  /* PTE flags to conserve for HPTE identification */
>  #define _PAGE_HPTEFLAGS (_PAGE_BUSY | _PAGE_HASHPTE | _PAGE_COMBO)
> @@ -103,8 +83,7 @@ static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
>  }
>  
>  #define __rpte_to_pte(r)	((r).pte)
> -#define __rpte_sub_valid(rpte, index) \
> -	(pte_val(rpte.pte) & (_PAGE_HPTE_SUB0 >> (index)))
> +extern bool __rpte_sub_valid(real_pte_t rpte, unsigned long index);
>  /*
>   * Trick: we set __end to va + 64k, which happens works for
>   * a 16M page as well as we want only one iteration
> diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
> index 48237e66e823..2f2034621a69 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash.h
> @@ -81,7 +81,12 @@
>  #define _PAGE_DIRTY		0x0080 /* C: page changed */
>  #define _PAGE_ACCESSED		0x0100 /* R: page referenced */
>  #define _PAGE_RW		0x0200 /* software: user write access allowed */
> +#define _PAGE_HASHPTE		0x0400 /* software: pte has an associated HPTE */
>  #define _PAGE_BUSY		0x0800 /* software: PTE & hash are busy */
> +#define _PAGE_F_GIX		0x7000 /* full page: hidx bits */
> +#define _PAGE_F_GIX_SHIFT	12
> +#define _PAGE_F_SECOND		0x8000 /* Whether to use secondary hash or not */
> +#define _PAGE_SPECIAL		0x10000 /* software: special page */
>  
>  /* No separate kernel read-only */
>  #define _PAGE_KERNEL_RW		(_PAGE_RW | _PAGE_DIRTY) /* user access blocked by key */
> @@ -210,11 +215,6 @@
>  
>  #define PMD_BAD_BITS		(PTE_TABLE_SIZE-1)
>  #define PUD_BAD_BITS		(PMD_TABLE_SIZE-1)
> -/*
> - * We save the slot number & secondary bit in the second half of the
> - * PTE page. We use the 8 bytes per each pte entry.
> - */
> -#define PTE_PAGE_HIDX_OFFSET (PTRS_PER_PTE * 8)
>  
>  #ifndef __ASSEMBLY__
>  #define	pmd_bad(pmd)		(!is_kernel_addr(pmd_val(pmd)) \
> diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
> index 9ffeae2cbb57..f1b86ba63430 100644
> --- a/arch/powerpc/mm/hash64_64k.c
> +++ b/arch/powerpc/mm/hash64_64k.c
> @@ -15,6 +15,35 @@
>  #include <linux/mm.h>
>  #include <asm/machdep.h>
>  #include <asm/mmu.h>
> +/*
> + * index from 0 - 15
> + */
> +bool __rpte_sub_valid(real_pte_t rpte, unsigned long index)
> +{
> +	unsigned long g_idx;
> +	unsigned long ptev = pte_val(rpte.pte);
> +
> +	g_idx = (ptev & _PAGE_COMBO_VALID) >> _PAGE_F_GIX_SHIFT;
> +	index = index >> 2;
> +	if (g_idx & (0x1 << index))
> +		return true;
> +	else
> +		return false;
> +}

This function checks the validity of the sub page group, not an individual
sub page index. Because we don't track individual sub page index validity
any more, I wonder if that is even possible.

> +/*
> + * index from 0 - 15
> + */
> +static unsigned long mark_subptegroup_valid(unsigned long ptev, unsigned long index)
> +{
> +	unsigned long g_idx;
> +
> +	if (!(ptev & _PAGE_COMBO))
> +		return ptev;
> +	index = index >> 2;
> +	g_idx = 0x1 << index;
> +
> +	return ptev | (g_idx << _PAGE_F_GIX_SHIFT);
> +}
>  
>  int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
>  		   pte_t *ptep, unsigned long trap, unsigned long flags,
> @@ -102,7 +131,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
>  	 */
>  	if (!(old_pte & _PAGE_COMBO)) {
>  		flush_hash_page(vpn, rpte, MMU_PAGE_64K, ssize, flags);
> -		old_pte &= ~_PAGE_HPTE_SUB;
> +		old_pte &= ~_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND;

Or use _PAGE_COMBO_VALID directly instead?
 
>  		goto htab_insert_hpte;
>  	}
>  	/*
> @@ -192,7 +221,8 @@ repeat:
>  	/* __real_pte use pte_val() any idea why ? FIXME!! */
>  	rpte.hidx &= ~(0xfUL << (subpg_index << 2));
>  	*hidxp = rpte.hidx  | (slot << (subpg_index << 2));
> -	new_pte |= (_PAGE_HPTE_SUB0 >> subpg_index);
> +	new_pte = mark_subptegroup_valid(new_pte, subpg_index);
> +	new_pte |=  _PAGE_HASHPTE;
>  	/*
>  	 * check __real_pte for details on matching smp_rmb()
>  	 */
> diff --git a/arch/powerpc/mm/hash_low_64.S b/arch/powerpc/mm/hash_low_64.S
> index 6b4d4c1d0628..359839a57f26 100644
> --- a/arch/powerpc/mm/hash_low_64.S
> +++ b/arch/powerpc/mm/hash_low_64.S
> @@ -285,7 +285,7 @@ htab_modify_pte:
>  
>  	/* Secondary group ? if yes, get a inverted hash value */
>  	mr	r5,r28
> -	andi.	r0,r31,_PAGE_SECONDARY
> +	andi.	r0,r31,_PAGE_F_SECOND
>  	beq	1f
>  	not	r5,r5
>  1:
> @@ -473,11 +473,7 @@ ht64_insert_pte:
>  	lis	r0,_PAGE_HPTEFLAGS@h
>  	ori	r0,r0,_PAGE_HPTEFLAGS@l
>  	andc	r30,r30,r0
> -#ifdef CONFIG_PPC_64K_PAGES
> -	oris	r30,r30,_PAGE_HPTE_SUB0@h
> -#else
>  	ori	r30,r30,_PAGE_HASHPTE
> -#endif
>  	/* Phyical address in r5 */
>  	rldicl	r5,r31,64-PTE_RPN_SHIFT,PTE_RPN_SHIFT
>  	sldi	r5,r5,PAGE_SHIFT
> diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
> index d94b1af53a93..7584e8445512 100644
> --- a/arch/powerpc/mm/hugetlbpage-hash64.c
> +++ b/arch/powerpc/mm/hugetlbpage-hash64.c
> @@ -91,11 +91,8 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
>  		pa = pte_pfn(__pte(old_pte)) << PAGE_SHIFT;
>  
>  		/* clear HPTE slot informations in new PTE */
> -#ifdef CONFIG_PPC_64K_PAGES
> -		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HPTE_SUB0;
> -#else
>  		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HASHPTE;
> -#endif
> +
>  		/* Add in WIMG bits */
>  		rflags |= (new_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
>  				      _PAGE_COHERENT | _PAGE_GUARDED));
> diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
> index d692ae31cfc7..3967e3cce03e 100644
> --- a/arch/powerpc/mm/pgtable_64.c
> +++ b/arch/powerpc/mm/pgtable_64.c
> @@ -625,7 +625,7 @@ void pmdp_splitting_flush(struct vm_area_struct *vma,
>  	"1:	ldarx	%0,0,%3\n\
>  		andi.	%1,%0,%6\n\
>  		bne-	1b \n\
> -		ori	%1,%0,%4 \n\
> +		oris	%1,%0,%4@h \n\

Why is this change required?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH V6 20/35] powerpc/mm: Don't track subpage valid bit in pte_t
  2015-12-03  7:13   ` Anshuman Khandual
@ 2015-12-03 18:57     ` Aneesh Kumar K.V
  0 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2015-12-03 18:57 UTC (permalink / raw)
  To: Anshuman Khandual, benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev

Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:

> On 12/01/2015 09:06 AM, Aneesh Kumar K.V wrote:
>> This frees up 11 bits in pte_t. In a later patch we also change
>> the pte_t format so that we can start supporting migration ptes
>> at the pmd level. We now track the 4k subpage valid bits as below.
>> 
>> If we have _PAGE_COMBO set, we override the _PAGE_F_GIX_SHIFT
>> and _PAGE_F_SECOND. Together we have 4 bits, each of them
>> used to indicate whether any of the 4 4k subpages in that group
>> is valid. ie,
>> 
>> [ group 1 bit ]   [ group 2 bit ]  ..... [ group 4 ]
>> [ subpage 1 - 4]  [ subpage 5- 8]  ..... [ subpage 13 - 16]
>> 
>> We still track each 4k subpage slot number and secondary hash
>> information in the second half of pgtable_t. Removing the subpage
>> tracking has some significant overhead on the aim9 and ebizzy benchmarks,
>> and to support THP with 4K subpages we do need a pgtable_t of 4096 bytes.
>> 
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>
> We removed 16 bits which used to track individual sub pages and added
> back 4 bits to track the newly created sub page groups. So we save 12 bits
> there. How is it 11 bits? Have we added back one more bit into pte_t
> somewhere for sub page purposes that I am missing?


We added back 5 bits (I guess you are missing _PAGE_HASHPTE):
_PAGE_HASHPTE is 1 bit, _PAGE_F_GIX is 3 bits and _PAGE_F_SECOND is
1 bit, so 16 - 5 = 11.


>
>> ---
>>  arch/powerpc/include/asm/book3s/64/hash-4k.h  | 10 +-------
>>  arch/powerpc/include/asm/book3s/64/hash-64k.h | 35 ++++++---------------------
>>  arch/powerpc/include/asm/book3s/64/hash.h     | 10 ++++----
>>  arch/powerpc/mm/hash64_64k.c                  | 34 ++++++++++++++++++++++++--

.....
......

> diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
>> index 9ffeae2cbb57..f1b86ba63430 100644
>> --- a/arch/powerpc/mm/hash64_64k.c
>> +++ b/arch/powerpc/mm/hash64_64k.c
>> @@ -15,6 +15,35 @@
>>  #include <linux/mm.h>
>>  #include <asm/machdep.h>
>>  #include <asm/mmu.h>
>> +/*
>> + * index from 0 - 15
>> + */
>> +bool __rpte_sub_valid(real_pte_t rpte, unsigned long index)
>> +{
>> +	unsigned long g_idx;
>> +	unsigned long ptev = pte_val(rpte.pte);
>> +
>> +	g_idx = (ptev & _PAGE_COMBO_VALID) >> _PAGE_F_GIX_SHIFT;
>> +	index = index >> 2;
>> +	if (g_idx & (0x1 << index))
>> +		return true;
>> +	else
>> +		return false;
>> +}
>
> This function checks the validity of the sub page group, not an individual
> sub page index. Because we don't track individual sub page index validity
> any more, I wonder if that is even possible.
>

That is correct. We are not tracking individual subpage validity after
this patch. That is explained in the commit message.
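
To make the mapping concrete: for subpage index 13 (0-based), index >> 2 = 3,
so __rpte_sub_valid() tests group bit 3, which covers subpage indexes 12-15;
any valid subpage in that range sets that one group bit.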


>> +/*
>> + * index from 0 - 15
>> + */
>> +static unsigned long mark_subptegroup_valid(unsigned long ptev, unsigned long index)
>> +{
>> +	unsigned long g_idx;
>> +
>> +	if (!(ptev & _PAGE_COMBO))
>> +		return ptev;
>> +	index = index >> 2;
>> +	g_idx = 0x1 << index;
>> +
>> +	return ptev | (g_idx << _PAGE_F_GIX_SHIFT);
>> +}
>>  
>>  int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
>>  		   pte_t *ptep, unsigned long trap, unsigned long flags,
>> @@ -102,7 +131,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
>>  	 */
>>  	if (!(old_pte & _PAGE_COMBO)) {
>>  		flush_hash_page(vpn, rpte, MMU_PAGE_64K, ssize, flags);
>> -		old_pte &= ~_PAGE_HPTE_SUB;
>> +		old_pte &= ~_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND;
>
> Or use _PAGE_COMBO_VALID directly instead?


Nope, we are not clearing a combo-related detail there; we are
invalidating an old 64K mapping. (Yes, the values match, but we
should not use _PAGE_COMBO_VALID there.)


>
>>  		goto htab_insert_hpte;
>>  	}
>>  	/*
>> @@ -192,7 +221,8 @@ repeat:
>>  	/* __real_pte use pte_val() any idea why ? FIXME!! */
>>  	rpte.hidx &= ~(0xfUL << (subpg_index << 2));
>>  	*hidxp = rpte.hidx  | (slot << (subpg_index << 2));
>> -	new_pte |= (_PAGE_HPTE_SUB0 >> subpg_index);
>> +	new_pte = mark_subptegroup_valid(new_pte, subpg_index);
>> +	new_pte |=  _PAGE_HASHPTE;
>>  	/*
>>  	 * check __real_pte for details on matching smp_rmb()
>>  	 */
>> diff --git a/arch/powerpc/mm/hash_low_64.S b/arch/powerpc/mm/hash_low_64.S
>> index 6b4d4c1d0628..359839a57f26 100644
>> --- a/arch/powerpc/mm/hash_low_64.S
>> +++ b/arch/powerpc/mm/hash_low_64.S
>> @@ -285,7 +285,7 @@ htab_modify_pte:
>>  
>>  	/* Secondary group ? if yes, get a inverted hash value */
>>  	mr	r5,r28
>> -	andi.	r0,r31,_PAGE_SECONDARY
>> +	andi.	r0,r31,_PAGE_F_SECOND
>>  	beq	1f
>>  	not	r5,r5
>>  1:
>> @@ -473,11 +473,7 @@ ht64_insert_pte:
>>  	lis	r0,_PAGE_HPTEFLAGS@h
>>  	ori	r0,r0,_PAGE_HPTEFLAGS@l
>>  	andc	r30,r30,r0
>> -#ifdef CONFIG_PPC_64K_PAGES
>> -	oris	r30,r30,_PAGE_HPTE_SUB0@h
>> -#else
>>  	ori	r30,r30,_PAGE_HASHPTE
>> -#endif
>>  	/* Phyical address in r5 */
>>  	rldicl	r5,r31,64-PTE_RPN_SHIFT,PTE_RPN_SHIFT
>>  	sldi	r5,r5,PAGE_SHIFT
>> diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
>> index d94b1af53a93..7584e8445512 100644
>> --- a/arch/powerpc/mm/hugetlbpage-hash64.c
>> +++ b/arch/powerpc/mm/hugetlbpage-hash64.c
>> @@ -91,11 +91,8 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
>>  		pa = pte_pfn(__pte(old_pte)) << PAGE_SHIFT;
>>  
>>  		/* clear HPTE slot informations in new PTE */
>> -#ifdef CONFIG_PPC_64K_PAGES
>> -		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HPTE_SUB0;
>> -#else
>>  		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HASHPTE;
>> -#endif
>> +
>>  		/* Add in WIMG bits */
>>  		rflags |= (new_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
>>  				      _PAGE_COHERENT | _PAGE_GUARDED));
>> diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
>> index d692ae31cfc7..3967e3cce03e 100644
>> --- a/arch/powerpc/mm/pgtable_64.c
>> +++ b/arch/powerpc/mm/pgtable_64.c
>> @@ -625,7 +625,7 @@ void pmdp_splitting_flush(struct vm_area_struct *vma,
>>  	"1:	ldarx	%0,0,%3\n\
>>  		andi.	%1,%0,%6\n\
>>  		bne-	1b \n\
>> -		ori	%1,%0,%4 \n\
>> +		oris	%1,%0,%4@h \n\
>
> Why is this change required?


Because _PAGE_SPLITTING is beyond 16 bits now?

-aneesh

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [V6,01/35] powerpc/mm: move pte headers to book3s directory
  2015-12-01  3:36 ` [PATCH V6 01/35] powerpc/mm: move pte headers to book3s directory Aneesh Kumar K.V
@ 2015-12-14 10:46   ` Michael Ellerman
  2016-02-18  2:52   ` [PATCH V6 01/35] " Balbir Singh
  1 sibling, 0 replies; 45+ messages in thread
From: Michael Ellerman @ 2015-12-14 10:46 UTC (permalink / raw)
  To: Aneesh Kumar K.V, benh, paulus, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev, Aneesh Kumar K.V

On Tue, 2015-01-12 at 03:36:26 UTC, "Aneesh Kumar K.V" wrote:
> Acked-by: Scott Wood <scottwood@freescale.com>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Entire series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/26b6a3d9bb48f8b4624a6228

cheers

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH V6 20/35] powerpc/mm: Don't track subpage valid bit in pte_t
  2015-12-01  3:36 ` [PATCH V6 20/35] powerpc/mm: Don't track subpage valid bit in pte_t Aneesh Kumar K.V
  2015-12-03  7:13   ` Anshuman Khandual
@ 2016-02-18  1:11   ` Paul Mackerras
  2016-02-18  1:49     ` Aneesh Kumar K.V
  1 sibling, 1 reply; 45+ messages in thread
From: Paul Mackerras @ 2016-02-18  1:11 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: benh, mpe, Scott Wood, Denis Kirjanov, linuxppc-dev

On Tue, Dec 01, 2015 at 09:06:45AM +0530, Aneesh Kumar K.V wrote:
> This frees up 11 bits in pte_t. In a later patch we also change
> the pte_t format so that we can start supporting migration ptes
> at the pmd level. We now track the 4k subpage valid bits as below.
> 
> If we have _PAGE_COMBO set, we override the _PAGE_F_GIX_SHIFT
> and _PAGE_F_SECOND. Together we have 4 bits, each of them
> used to indicate whether any of the 4 4k subpages in that group
> is valid. ie,
> 
> [ group 1 bit ]   [ group 2 bit ]  ..... [ group 4 ]
> [ subpage 1 - 4]  [ subpage 5- 8]  ..... [ subpage 13 - 16]
> 
> We still track each 4k subpage slot number and secondary hash
> information in the second half of pgtable_t. Removing the subpage
> tracking has some significant overhead on the aim9 and ebizzy benchmarks,
> and to support THP with 4K subpages we do need a pgtable_t of 4096 bytes.

I know this has already been applied, but this hunk looks wrong:

> @@ -102,7 +131,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
>  	 */
>  	if (!(old_pte & _PAGE_COMBO)) {
>  		flush_hash_page(vpn, rpte, MMU_PAGE_64K, ssize, flags);
> -		old_pte &= ~_PAGE_HPTE_SUB;
> +		old_pte &= ~_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND;

Shouldn't this be:

+		old_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);

instead?

Paul.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH V6 20/35] powerpc/mm: Don't track subpage valid bit in pte_t
  2016-02-18  1:11   ` Paul Mackerras
@ 2016-02-18  1:49     ` Aneesh Kumar K.V
  0 siblings, 0 replies; 45+ messages in thread
From: Aneesh Kumar K.V @ 2016-02-18  1:49 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: benh, mpe, Scott Wood, Denis Kirjanov, linuxppc-dev

Paul Mackerras <paulus@ozlabs.org> writes:

> On Tue, Dec 01, 2015 at 09:06:45AM +0530, Aneesh Kumar K.V wrote:
>> This frees up 11 bits in pte_t. In a later patch we also change
>> the pte_t format so that we can start supporting migration ptes
>> at the pmd level. We now track the 4k subpage valid bits as below.
>> 
>> If we have _PAGE_COMBO set, we override the _PAGE_F_GIX_SHIFT
>> and _PAGE_F_SECOND. Together we have 4 bits, each of them
>> used to indicate whether any of the 4 4k subpages in that group
>> is valid. ie,
>> 
>> [ group 1 bit ]   [ group 2 bit ]  ..... [ group 4 ]
>> [ subpage 1 - 4]  [ subpage 5- 8]  ..... [ subpage 13 - 16]
>> 
>> We still track each 4k subpage slot number and secondary hash
>> information in the second half of pgtable_t. Removing the subpage
>> tracking has some significant overhead on the aim9 and ebizzy benchmarks,
>> and to support THP with 4K subpages we do need a pgtable_t of 4096 bytes.
>
> I know this has already been applied, but this hunk looks wrong:
>
>> @@ -102,7 +131,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
>>  	 */
>>  	if (!(old_pte & _PAGE_COMBO)) {
>>  		flush_hash_page(vpn, rpte, MMU_PAGE_64K, ssize, flags);
>> -		old_pte &= ~_PAGE_HPTE_SUB;
>> +		old_pte &= ~_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND;
>
> Shouldn't this be:
>
> +		old_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);
>
> instead?

Thanks for checking this closely. Yes it should be what you suggested. I
will do a patch for this.

-aneesh

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH V6 01/35] powerpc/mm: move pte headers to book3s directory
  2015-12-01  3:36 ` [PATCH V6 01/35] powerpc/mm: move pte headers to book3s directory Aneesh Kumar K.V
  2015-12-14 10:46   ` [V6,01/35] " Michael Ellerman
@ 2016-02-18  2:52   ` Balbir Singh
  1 sibling, 0 replies; 45+ messages in thread
From: Balbir Singh @ 2016-02-18  2:52 UTC (permalink / raw)
  To: Aneesh Kumar K.V, benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev

On Tue, 2015-12-01 at 09:06 +0530, Aneesh Kumar K.V wrote:
> Acked-by: Scott Wood <scottwood@freescale.com>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/{pte-hash32.h => book3s/32/hash.h} | 0
>  arch/powerpc/include/asm/{pte-hash64.h => book3s/64/hash.h} | 0
>  arch/powerpc/include/asm/pgtable-ppc32.h                    | 2 +-
>  arch/powerpc/include/asm/pgtable-ppc64.h                    | 2 +-
>  4 files changed, 2 insertions(+), 2 deletions(-)
>  rename arch/powerpc/include/asm/{pte-hash32.h => book3s/32/hash.h} (100%)
>  rename arch/powerpc/include/asm/{pte-hash64.h => book3s/64/hash.h} (100%)

Why not pte-hash.h?

Balbir Singh.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH V6 04/35] powerpc/mm: make a separate copy for book3s (part 2)
  2015-12-01  3:36 ` [PATCH V6 04/35] powerpc/mm: make a separate copy for book3s (part 2) Aneesh Kumar K.V
@ 2016-02-18  2:54   ` Balbir Singh
  0 siblings, 0 replies; 45+ messages in thread
From: Balbir Singh @ 2016-02-18  2:54 UTC (permalink / raw)
  To: Aneesh Kumar K.V, benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev

On Tue, 2015-12-01 at 09:06 +0530, Aneesh Kumar K.V wrote:
> Keep it separate to make rebasing easier
> 
> Acked-by: Scott Wood <scottwood@freescale.com>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/32/pgtable.h | 6 +++---
>  arch/powerpc/include/asm/book3s/64/pgtable.h | 6 +++---
>  arch/powerpc/include/asm/pgtable-ppc32.h     | 2 --
>  arch/powerpc/include/asm/pgtable-ppc64.h     | 4 ----
>  4 files changed, 6 insertions(+), 12 deletions(-)
> 

One side effect is that someone could by mistake include both
asm/book3s/32/pgtable.h and asm/pgtable-ppc32.h and get
duplicate definitions, making the compiler complain.


Ditto for the 64-bit headers.

Balbir

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH V6 05/35] powerpc/mm: Move hash specific pte width and other defines to book3s
  2015-12-01  3:36 ` [PATCH V6 05/35] powerpc/mm: Move hash specific pte width and other defines to book3s Aneesh Kumar K.V
@ 2016-02-18  4:45   ` Balbir Singh
  0 siblings, 0 replies; 45+ messages in thread
From: Balbir Singh @ 2016-02-18  4:45 UTC (permalink / raw)
  To: Aneesh Kumar K.V, benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev



On 01/12/15 14:36, Aneesh Kumar K.V wrote:
> This further makes a copy of the pte defines in book3s/64/hash*.h. This
> removes the dependency on pgtable-ppc64-4k.h and pgtable-ppc64-64k.h.
>
> Acked-by: Scott Wood <scottwood@freescale.com>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/hash-4k.h  | 86 ++++++++++++++++++++++++++-
>  arch/powerpc/include/asm/book3s/64/hash-64k.h | 46 +++++++++++++-
>  arch/powerpc/include/asm/book3s/64/pgtable.h  |  6 +-
>  3 files changed, 129 insertions(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
> index c134e809aac3..f2c51cd61f69 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
> @@ -1,4 +1,51 @@
> -/* To be include by pgtable-hash64.h only */
> +#ifndef _ASM_POWERPC_BOOK3S_64_HASH_4K_H
> +#define _ASM_POWERPC_BOOK3S_64_HASH_4K_H
> +/*
> + * Entries per page directory level.  The PTE level must use a 64b record
> + * for each page table entry.  The PMD and PGD level use a 32b record for
> + * each entry by assuming that each entry is page aligned.
> + */

More clarity on this comment, please.
> +#define PTE_INDEX_SIZE  9
> +#define PMD_INDEX_SIZE  7
> +#define PUD_INDEX_SIZE  9
> +#define PGD_INDEX_SIZE  9
> +
We use 9+7+9+9+(12) bits?
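
Yes; assuming PAGE_SHIFT is 12 for the 4K config, the levels add up to a
46-bit address space (a standalone compile-time sketch, not kernel code):

#define PAGE_SHIFT	12	/* assumed: 4K base pages */
#define PTE_INDEX_SIZE	9
#define PMD_INDEX_SIZE	7
#define PUD_INDEX_SIZE	9
#define PGD_INDEX_SIZE	9

#define PMD_SHIFT	(PAGE_SHIFT + PTE_INDEX_SIZE)	/* 21 -> 2M   */
#define PUD_SHIFT	(PMD_SHIFT + PMD_INDEX_SIZE)	/* 28 -> 256M */
#define PGDIR_SHIFT	(PUD_SHIFT + PUD_INDEX_SIZE)	/* 37 -> 128G */

/* Total mappable VA bits: 12 + 9 + 7 + 9 + 9 = 46, i.e. 64TB. */
_Static_assert(PGDIR_SHIFT + PGD_INDEX_SIZE == 46,
	       "4K layout covers 46 bits");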
> +#ifndef __ASSEMBLY__
> +#define PTE_TABLE_SIZE	(sizeof(pte_t) << PTE_INDEX_SIZE)
> +#define PMD_TABLE_SIZE	(sizeof(pmd_t) << PMD_INDEX_SIZE)
> +#define PUD_TABLE_SIZE	(sizeof(pud_t) << PUD_INDEX_SIZE)
> +#define PGD_TABLE_SIZE	(sizeof(pgd_t) << PGD_INDEX_SIZE)
> +#endif	/* __ASSEMBLY__ */
> +
> +#define PTRS_PER_PTE	(1 << PTE_INDEX_SIZE)
> +#define PTRS_PER_PMD	(1 << PMD_INDEX_SIZE)
> +#define PTRS_PER_PUD	(1 << PUD_INDEX_SIZE)
> +#define PTRS_PER_PGD	(1 << PGD_INDEX_SIZE)
> +
> +/* PMD_SHIFT determines what a second-level page table entry can map */
> +#define PMD_SHIFT	(PAGE_SHIFT + PTE_INDEX_SIZE)
> +#define PMD_SIZE	(1UL << PMD_SHIFT)
> +#define PMD_MASK	(~(PMD_SIZE-1))
> +
> +/* With 4k base page size, hugepage PTEs go at the PMD level */
> +#define MIN_HUGEPTE_SHIFT	PMD_SHIFT
> +
> +/* PUD_SHIFT determines what a third-level page table entry can map */
> +#define PUD_SHIFT	(PMD_SHIFT + PMD_INDEX_SIZE)
> +#define PUD_SIZE	(1UL << PUD_SHIFT)
> +#define PUD_MASK	(~(PUD_SIZE-1))
> +
> +/* PGDIR_SHIFT determines what a fourth-level page table entry can map */
> +#define PGDIR_SHIFT	(PUD_SHIFT + PUD_INDEX_SIZE)
> +#define PGDIR_SIZE	(1UL << PGDIR_SHIFT)
> +#define PGDIR_MASK	(~(PGDIR_SIZE-1))
> +
> +/* Bits to mask out from a PMD to get to the PTE page */
> +#define PMD_MASKED_BITS		0
> +/* Bits to mask out from a PUD to get to the PMD page */
> +#define PUD_MASKED_BITS		0
> +/* Bits to mask out from a PGD to get to the PUD page */
> +#define PGD_MASKED_BITS		0
>  
I don't get why these are all 0?
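
A hedged reading, based on how the masks are consumed elsewhere in the
patch: the *_MASKED_BITS are the low bits of an upper-level entry that
are not part of the next table's address. In the 4K config an entry
holds just the table address, so nothing needs masking; in the 64K
header later in this patch, PMD_MASKED_BITS is 0xfff because PMDs point
at 4K-aligned PTE fragments whose low 12 bits carry other state. A small
sketch of the consumption pattern (demo names, not kernel code):

/* 4K config: nothing packed into the entry's low bits. */
#define PMD_MASKED_BITS_4K	0x000UL
/* 64K config: PTE fragments are 4K aligned; low 12 bits hold state. */
#define PMD_MASKED_BITS_64K	0xfffUL

static inline unsigned long pmd_page_vaddr_demo(unsigned long pmd_val,
						unsigned long masked_bits)
{
	/* Strip the non-address low bits, as pgd_page_vaddr() does above. */
	return pmd_val & ~masked_bits;
}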
>  /* PTE bits */
>  #define _PAGE_HASHPTE	0x0400 /* software: pte has an associated HPTE */
> @@ -15,3 +62,40 @@
>  /* shift to put page number into pte */
>  #define PTE_RPN_SHIFT	(17)
>  
> +#ifndef __ASSEMBLY__
> +/*
> + * 4-level page tables related bits
> + */
> +
> +#define pgd_none(pgd)		(!pgd_val(pgd))
> +#define pgd_bad(pgd)		(pgd_val(pgd) == 0)
> +#define pgd_present(pgd)	(pgd_val(pgd) != 0)
> +#define pgd_clear(pgdp)		(pgd_val(*(pgdp)) = 0)
> +#define pgd_page_vaddr(pgd)	(pgd_val(pgd) & ~PGD_MASKED_BITS)
> +
> +static inline pte_t pgd_pte(pgd_t pgd)
> +{
> +	return __pte(pgd_val(pgd));
> +}
> +
> +static inline pgd_t pte_pgd(pte_t pte)
> +{
> +	return __pgd(pte_val(pte));
> +}
> +extern struct page *pgd_page(pgd_t pgd);
> +
> +#define pud_offset(pgdp, addr)	\
> +  (((pud_t *) pgd_page_vaddr(*(pgdp))) + \
> +    (((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1)))
> +


> +#define pud_ERROR(e) \
> +	pr_err("%s:%d: bad pud %08lx.\n", __FILE__, __LINE__, pud_val(e))
> +
> +/*
> + * On all 4K setups, remap_4k_pfn() equates to remap_pfn_range() */
> +#define remap_4k_pfn(vma, addr, pfn, prot)	\
> +	remap_pfn_range((vma), (addr), (pfn), PAGE_SIZE, (prot))
> +
> +#endif /* !__ASSEMBLY__ */
> +
> +#endif /* _ASM_POWERPC_BOOK3S_64_HASH_4K_H */
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> index 4f4ec2ab45c9..ee073822145d 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> @@ -1,4 +1,35 @@
> -/* To be include by pgtable-hash64.h only */
> +#ifndef _ASM_POWERPC_BOOK3S_64_HASH_64K_H
> +#define _ASM_POWERPC_BOOK3S_64_HASH_64K_H
> +
> +#include <asm-generic/pgtable-nopud.h>
> +
> +#define PTE_INDEX_SIZE  8
> +#define PMD_INDEX_SIZE  10
> +#define PUD_INDEX_SIZE	0
> +#define PGD_INDEX_SIZE  12
> +

OK... so there is a PGD but no PUD?
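
Right: PUD_INDEX_SIZE of 0, together with the asm-generic/pgtable-nopud.h
include above, folds the PUD level away, so PGD entries stand in for
PUDs. The bit budget still sums to the same 46 bits as the 4K layout (a
compile-time sanity sketch, assuming PAGE_SHIFT is 16 for 64K pages):

#define PAGE_SHIFT	16	/* assumed: 64K base pages */
#define PTE_INDEX_SIZE	8
#define PMD_INDEX_SIZE	10
#define PUD_INDEX_SIZE	0	/* level folded by pgtable-nopud.h */
#define PGD_INDEX_SIZE	12

/* 16 + 8 + 10 + 0 + 12 = 46 bits of VA, the same 64TB span as 4K. */
_Static_assert(PAGE_SHIFT + PTE_INDEX_SIZE + PMD_INDEX_SIZE +
	       PUD_INDEX_SIZE + PGD_INDEX_SIZE == 46,
	       "64K layout covers 46 bits");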
> +#define PTRS_PER_PTE	(1 << PTE_INDEX_SIZE)
> +#define PTRS_PER_PMD	(1 << PMD_INDEX_SIZE)
> +#define PTRS_PER_PGD	(1 << PGD_INDEX_SIZE)
> +
> +/* With 4k base page size, hugepage PTEs go at the PMD level */
> +#define MIN_HUGEPTE_SHIFT	PAGE_SHIFT
> +

Typo: this is the 64K header, but the comment refers to the 4k base page size.

> +/* PMD_SHIFT determines what a second-level page table entry can map */
> +#define PMD_SHIFT	(PAGE_SHIFT + PTE_INDEX_SIZE)
> +#define PMD_SIZE	(1UL << PMD_SHIFT)
> +#define PMD_MASK	(~(PMD_SIZE-1))
> +
> +/* PGDIR_SHIFT determines what a third-level page table entry can map */
> +#define PGDIR_SHIFT	(PMD_SHIFT + PMD_INDEX_SIZE)
> +#define PGDIR_SIZE	(1UL << PGDIR_SHIFT)
> +#define PGDIR_MASK	(~(PGDIR_SIZE-1))
> +
> +/* Bits to mask out from a PMD to get to the PTE page */
> +/* PMDs point to PTE table fragments which are 4K aligned.  */
> +#define PMD_MASKED_BITS		0xfff
> +/* Bits to mask out from a PGD/PUD to get to the PMD page */
> +#define PUD_MASKED_BITS		0x1ff
>  
>  /* Additional PTE bits (don't change without checking asm in hash_low.S) */
>  #define _PAGE_SPECIAL	0x00000400 /* software: special page */
> @@ -74,8 +105,8 @@ static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
>  #define __rpte_to_pte(r)	((r).pte)
>  #define __rpte_sub_valid(rpte, index) \
>  	(pte_val(rpte.pte) & (_PAGE_HPTE_SUB0 >> (index)))
> -
> -/* Trick: we set __end to va + 64k, which happens works for
> +/*
> + * Trick: we set __end to va + 64k, which happens works for
>   * a 16M page as well as we want only one iteration
>   */
>  #define pte_iterate_hashed_subpages(rpte, psize, vpn, index, shift)	\
> @@ -99,4 +130,13 @@ static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
>  		remap_pfn_range((vma), (addr), (pfn), PAGE_SIZE,	\
>  			__pgprot(pgprot_val((prot)) | _PAGE_4K_PFN)))
>  
> +#define PTE_TABLE_SIZE	(sizeof(real_pte_t) << PTE_INDEX_SIZE)
> +#define PMD_TABLE_SIZE	(sizeof(pmd_t) << PMD_INDEX_SIZE)
> +#define PGD_TABLE_SIZE	(sizeof(pgd_t) << PGD_INDEX_SIZE)
> +
> +#define pgd_pte(pgd)	(pud_pte(((pud_t){ pgd })))
> +#define pte_pgd(pte)	((pgd_t)pte_pud(pte))
> +
>  #endif	/* __ASSEMBLY__ */
> +
> +#endif /* _ASM_POWERPC_BOOK3S_64_HASH_64K_H */
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index cdd5284d9eaa..2741ac6fbd3d 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -5,11 +5,7 @@
>   * the ppc64 hashed page table.
>   */
>  
> -#ifdef CONFIG_PPC_64K_PAGES
> -#include <asm/pgtable-ppc64-64k.h>
> -#else
> -#include <asm/pgtable-ppc64-4k.h>
> -#endif
> +#include <asm/book3s/64/hash.h>
>  #include <asm/barrier.h>
>  
>  #define FIRST_USER_ADDRESS	0UL


* Re: [PATCH V6 07/35] powerpc/mm: Don't have generic headers introduce functions touching pte bits
  2015-12-01  3:36 ` [PATCH V6 07/35] powerpc/mm: Don't have generic headers introduce functions touching pte bits Aneesh Kumar K.V
@ 2016-02-18  4:58   ` Balbir Singh
  0 siblings, 0 replies; 45+ messages in thread
From: Balbir Singh @ 2016-02-18  4:58 UTC (permalink / raw)
  To: Aneesh Kumar K.V, benh, paulus, mpe, Scott Wood, Denis Kirjanov
  Cc: linuxppc-dev



On 01/12/15 14:36, Aneesh Kumar K.V wrote:
> We are going to drop pte_common.h in the later patch. The idea is to
> enable hash code not require to define all PTE bits. Having PTE bits
> defined in pte_common.h made the code unnecessarily complex.
>
> Acked-by: Scott Wood <scottwood@freescale.com>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/pgtable.h          | 176 +++++++++++++++++++
>  .../include/asm/{pgtable.h => pgtable-book3e.h}    |  96 +----------
>  arch/powerpc/include/asm/pgtable.h                 | 192 +--------------------
>  3 files changed, 185 insertions(+), 279 deletions(-)
>  copy arch/powerpc/include/asm/{pgtable.h => pgtable-book3e.h} (70%)
>
> diff --git a/arch/powerpc/include/asm/book3s/pgtable.h b/arch/powerpc/include/asm/book3s/pgtable.h
> index 3818cc7bc9b7..fa270cfcf30a 100644
> --- a/arch/powerpc/include/asm/book3s/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/pgtable.h
> @@ -8,4 +8,180 @@
>  #endif
>  
>  #define FIRST_USER_ADDRESS	0UL
> +#ifndef __ASSEMBLY__
> +
> +/* Generic accessors to PTE bits */
> +static inline int pte_write(pte_t pte)
> +{
> +	return (pte_val(pte) & (_PAGE_RW | _PAGE_RO)) != _PAGE_RO;
> +}
> +static inline int pte_dirty(pte_t pte)		{ return pte_val(pte) & _PAGE_DIRTY; }
> +static inline int pte_young(pte_t pte)		{ return pte_val(pte) & _PAGE_ACCESSED; }
> +static inline int pte_special(pte_t pte)	{ return pte_val(pte) & _PAGE_SPECIAL; }
> +static inline int pte_none(pte_t pte)		{ return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; }
> +static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PAGE_PROT_BITS); }
> +
> +#ifdef CONFIG_NUMA_BALANCING
> +/*
> + * These work without NUMA balancing but the kernel does not care. See the
> + * comment in include/asm-generic/pgtable.h . On powerpc, this will only
> + * work for user pages and always return true for kernel pages.
> + */
> +static inline int pte_protnone(pte_t pte)
> +{
> +	return (pte_val(pte) &
> +		(_PAGE_PRESENT | _PAGE_USER)) == _PAGE_PRESENT;
> +}
> +
> +static inline int pmd_protnone(pmd_t pmd)
> +{
> +	return pte_protnone(pmd_pte(pmd));
> +}
> +#endif /* CONFIG_NUMA_BALANCING */
> +
> +static inline int pte_present(pte_t pte)
> +{
> +	return pte_val(pte) & _PAGE_PRESENT;
> +}
> +
> +/* Conversion functions: convert a page and protection to a page entry,
Comment style: for multi-line block comments, should the text start on the same line as the opening marker? (See the style note after this comment block.)
> + * and a page entry and page directory to the page they refer to.
> + *
> + * Even if PTEs can be unsigned long long, a PFN is always an unsigned
> + * long for now.
> + */
> +static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot) {
> +	return __pte(((pte_basic_t)(pfn) << PTE_RPN_SHIFT) |
> +		     pgprot_val(pgprot)); }
> +static inline unsigned long pte_pfn(pte_t pte)	{
> +	return pte_val(pte) >> PTE_RPN_SHIFT; }
> +
> +/* Generic modifiers for PTE bits */
> +static inline pte_t pte_wrprotect(pte_t pte) {
> +	pte_val(pte) &= ~(_PAGE_RW | _PAGE_HWWRITE);
> +	pte_val(pte) |= _PAGE_RO; return pte; }
> +static inline pte_t pte_mkclean(pte_t pte) {
> +	pte_val(pte) &= ~(_PAGE_DIRTY | _PAGE_HWWRITE); return pte; }
> +static inline pte_t pte_mkold(pte_t pte) {
> +	pte_val(pte) &= ~_PAGE_ACCESSED; return pte; }
> +static inline pte_t pte_mkwrite(pte_t pte) {
> +	pte_val(pte) &= ~_PAGE_RO;
> +	pte_val(pte) |= _PAGE_RW; return pte; }
> +static inline pte_t pte_mkdirty(pte_t pte) {
> +	pte_val(pte) |= _PAGE_DIRTY; return pte; }
> +static inline pte_t pte_mkyoung(pte_t pte) {
> +	pte_val(pte) |= _PAGE_ACCESSED; return pte; }
> +static inline pte_t pte_mkspecial(pte_t pte) {
> +	pte_val(pte) |= _PAGE_SPECIAL; return pte; }
> +static inline pte_t pte_mkhuge(pte_t pte) {
> +	return pte; }
> +static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
> +{
> +	pte_val(pte) = (pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot);
> +	return pte;
> +}
> +
> +
> +/* Insert a PTE, top-level function is out of line. It uses an inline
Comment style?
> + * low level function in the respective pgtable-* files
> + */
> +extern void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
> +		       pte_t pte);
> +
> +/* This low level function performs the actual PTE insertion
> + * Setting the PTE depends on the MMU type and other factors. It's
> + * an horrible mess that I'm not going to try to clean up now but
> + * I'm keeping it in one place rather than spread around
> + */
> +static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
> +				pte_t *ptep, pte_t pte, int percpu)
> +{
> +#if defined(CONFIG_PPC_STD_MMU_32) && defined(CONFIG_SMP) && !defined(CONFIG_PTE_64BIT)
> +	/* First case is 32-bit Hash MMU in SMP mode with 32-bit PTEs. We use the
> +	 * helper pte_update() which does an atomic update. We need to do that
> +	 * because a concurrent invalidation can clear _PAGE_HASHPTE. If it's a
> +	 * per-CPU PTE such as a kmap_atomic, we do a simple update preserving
> +	 * the hash bits instead (ie, same as the non-SMP case)
> +	 */
> +	if (percpu)
> +		*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
> +			      | (pte_val(pte) & ~_PAGE_HASHPTE));
> +	else
> +		pte_update(ptep, ~_PAGE_HASHPTE, pte_val(pte));
> +
> +#elif defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT)
> +	/* Second case is 32-bit with 64-bit PTE.  In this case, we
> +	 * can just store as long as we do the two halves in the right order
> +	 * with a barrier in between. This is possible because we take care,
> +	 * in the hash code, to pre-invalidate if the PTE was already hashed,
> +	 * which synchronizes us with any concurrent invalidation.
> +	 * In the percpu case, we also fallback to the simple update preserving
> +	 * the hash bits
> +	 */
> +	if (percpu) {
> +		*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
> +			      | (pte_val(pte) & ~_PAGE_HASHPTE));
> +		return;
> +	}
> +	if (pte_val(*ptep) & _PAGE_HASHPTE)
> +		flush_hash_entry(mm, ptep, addr);
> +	__asm__ __volatile__("\
> +		stw%U0%X0 %2,%0\n\
> +		eieio\n\
> +		stw%U0%X0 %L2,%1"
> +	: "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
> +	: "r" (pte) : "memory");
> +
> +#elif defined(CONFIG_PPC_STD_MMU_32)
> +	/* Third case is 32-bit hash table in UP mode, we need to preserve
> +	 * the _PAGE_HASHPTE bit since we may not have invalidated the previous
> +	 * translation in the hash yet (done in a subsequent flush_tlb_xxx())
> +	 * and see we need to keep track that this PTE needs invalidating
> +	 */
> +	*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
> +		      | (pte_val(pte) & ~_PAGE_HASHPTE));
> +
> +#else
> +	/* Anything else just stores the PTE normally. That covers all 64-bit
> +	 * cases, and 32-bit non-hash with 32-bit PTEs.
> +	 */
> +	*ptep = pte;
> +#endif
> +}
> +
> +
> +#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
> +extern int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address,
> +				 pte_t *ptep, pte_t entry, int dirty);
> +
> +/*
> + * Macro to mark a page protection value as "uncacheable".
> + */
> +
> +#define _PAGE_CACHE_CTL	(_PAGE_COHERENT | _PAGE_GUARDED | _PAGE_NO_CACHE | \
> +			 _PAGE_WRITETHRU)
> +
> +#define pgprot_noncached(prot)	  (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
> +				            _PAGE_NO_CACHE | _PAGE_GUARDED))
> +
> +#define pgprot_noncached_wc(prot) (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
> +				            _PAGE_NO_CACHE))
> +
> +#define pgprot_cached(prot)       (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
> +				            _PAGE_COHERENT))
> +
> +#define pgprot_cached_wthru(prot) (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
> +				            _PAGE_COHERENT | _PAGE_WRITETHRU))
> +
> +#define pgprot_cached_noncoherent(prot) \
> +		(__pgprot(pgprot_val(prot) & ~_PAGE_CACHE_CTL))
> +
> +#define pgprot_writecombine pgprot_noncached_wc
> +
> +struct file;
> +extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
> +				     unsigned long size, pgprot_t vma_prot);
> +#define __HAVE_PHYS_MEM_ACCESS_PROT
> +
> +#endif /* __ASSEMBLY__ */
>  #endif
> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable-book3e.h
> similarity index 70%
> copy from arch/powerpc/include/asm/pgtable.h
> copy to arch/powerpc/include/asm/pgtable-book3e.h
> index c304d0767919..a3221cff2e31 100644
> --- a/arch/powerpc/include/asm/pgtable.h
> +++ b/arch/powerpc/include/asm/pgtable-book3e.h
> @@ -1,41 +1,19 @@
> -#ifndef _ASM_POWERPC_PGTABLE_H
> -#define _ASM_POWERPC_PGTABLE_H
> -#ifdef __KERNEL__
> +#ifndef _ASM_POWERPC_PGTABLE_BOOK3E_H
> +#define _ASM_POWERPC_PGTABLE_BOOK3E_H
>  
> -#ifndef __ASSEMBLY__
> -#include <linux/mmdebug.h>
> -#include <linux/mmzone.h>
> -#include <asm/processor.h>		/* For TASK_SIZE */
> -#include <asm/mmu.h>
> -#include <asm/page.h>
> -
> -struct mm_struct;
> -
> -#endif /* !__ASSEMBLY__ */
> -
> -#ifdef CONFIG_PPC_BOOK3S
> -#include <asm/book3s/pgtable.h>
> -#else
>  #if defined(CONFIG_PPC64)
> -#  include <asm/pgtable-ppc64.h>
> +#include <asm/pgtable-ppc64.h>
>  #else
> -#  include <asm/pgtable-ppc32.h>
> +#include <asm/pgtable-ppc32.h>
>  #endif
> -#endif /* !CONFIG_PPC_BOOK3S */
> -
> -/*
> - * We save the slot number & secondary bit in the second half of the
> - * PTE page. We use the 8 bytes per each pte entry.
> - */
> -#define PTE_PAGE_HIDX_OFFSET (PTRS_PER_PTE * 8)
>  
>  #ifndef __ASSEMBLY__
>  
> -#include <asm/tlbflush.h>
> -
>  /* Generic accessors to PTE bits */
>  static inline int pte_write(pte_t pte)
> -{	return (pte_val(pte) & (_PAGE_RW | _PAGE_RO)) != _PAGE_RO; }
> +{
> +	return (pte_val(pte) & (_PAGE_RW | _PAGE_RO)) != _PAGE_RO;
> +}
>  static inline int pte_dirty(pte_t pte)		{ return pte_val(pte) & _PAGE_DIRTY; }
>  static inline int pte_young(pte_t pte)		{ return pte_val(pte) & _PAGE_ACCESSED; }
>  static inline int pte_special(pte_t pte)	{ return pte_val(pte) & _PAGE_SPECIAL; }
> @@ -77,10 +55,6 @@ static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot) {
>  static inline unsigned long pte_pfn(pte_t pte)	{
>  	return pte_val(pte) >> PTE_RPN_SHIFT; }
>  
> -/* Keep these as a macros to avoid include dependency mess */
> -#define pte_page(x)		pfn_to_page(pte_pfn(x))
> -#define mk_pte(page, pgprot)	pfn_pte(page_to_pfn(page), (pgprot))
> -
>  /* Generic modifiers for PTE bits */
>  static inline pte_t pte_wrprotect(pte_t pte) {
>  	pte_val(pte) &= ~(_PAGE_RW | _PAGE_HWWRITE);
> @@ -221,59 +195,5 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
>  				     unsigned long size, pgprot_t vma_prot);
>  #define __HAVE_PHYS_MEM_ACCESS_PROT
>  
> -/*
> - * ZERO_PAGE is a global shared page that is always zero: used
> - * for zero-mapped memory areas etc..
> - */
> -extern unsigned long empty_zero_page[];
> -#define ZERO_PAGE(vaddr) (virt_to_page(empty_zero_page))
> -
> -extern pgd_t swapper_pg_dir[];
> -
> -void limit_zone_pfn(enum zone_type zone, unsigned long max_pfn);
> -int dma_pfn_limit_to_zone(u64 pfn_limit);
> -extern void paging_init(void);
> -
> -/*
> - * kern_addr_valid is intended to indicate whether an address is a valid
> - * kernel address.  Most 32-bit archs define it as always true (like this)
> - * but most 64-bit archs actually perform a test.  What should we do here?
> - */
> -#define kern_addr_valid(addr)	(1)
> -
> -#include <asm-generic/pgtable.h>
> -
> -
> -/*
> - * This gets called at the end of handling a page fault, when
> - * the kernel has put a new PTE into the page table for the process.
> - * We use it to ensure coherency between the i-cache and d-cache
> - * for the page which has just been mapped in.
> - * On machines which use an MMU hash table, we use this to put a
> - * corresponding HPTE into the hash table ahead of time, instead of
> - * waiting for the inevitable extra hash-table miss exception.
> - */
> -extern void update_mmu_cache(struct vm_area_struct *, unsigned long, pte_t *);
> -
> -extern int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
> -		       unsigned long end, int write,
> -		       struct page **pages, int *nr);
> -#ifndef CONFIG_TRANSPARENT_HUGEPAGE
> -#define pmd_large(pmd)		0
> -#define has_transparent_hugepage() 0
> -#endif
> -pte_t *__find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
> -				   bool *is_thp, unsigned *shift);
> -static inline pte_t *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
> -					       bool *is_thp, unsigned *shift)
> -{
> -	if (!arch_irqs_disabled()) {
> -		pr_info("%s called with irq enabled\n", __func__);
> -		dump_stack();
> -	}
> -	return __find_linux_pte_or_hugepte(pgdir, ea, is_thp, shift);
> -}
Where's this function gone?
>
<snip>

Balbir Singh


end of thread

Thread overview: 45+ messages
2015-12-01  3:36 [PATCH V6 00/35] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 01/35] powerpc/mm: move pte headers to book3s directory Aneesh Kumar K.V
2015-12-14 10:46   ` [V6,01/35] " Michael Ellerman
2016-02-18  2:52   ` [PATCH V6 01/35] " Balbir Singh
2015-12-01  3:36 ` [PATCH V6 02/35] powerpc/mm: move pte headers to book3s directory (part 2) Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 03/35] powerpc/mm: make a separate copy for book3s Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 04/35] powerpc/mm: make a separate copy for book3s (part 2) Aneesh Kumar K.V
2016-02-18  2:54   ` Balbir Singh
2015-12-01  3:36 ` [PATCH V6 05/35] powerpc/mm: Move hash specific pte width and other defines to book3s Aneesh Kumar K.V
2016-02-18  4:45   ` Balbir Singh
2015-12-01  3:36 ` [PATCH V6 06/35] powerpc/mm: Delete booke bits from book3s Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 07/35] powerpc/mm: Don't have generic headers introduce functions touching pte bits Aneesh Kumar K.V
2016-02-18  4:58   ` Balbir Singh
2015-12-01  3:36 ` [PATCH V6 08/35] powerpc/mm: Drop pte-common.h from BOOK3S 64 Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 09/35] powerpc/mm: Don't use pte_val as lvalue Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 10/35] powerpc/mm: Don't use pmd_val, pud_val and pgd_val " Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 11/35] powerpc/mm: Move hash64 PTE bits from book3s/64/pgtable.h to hash.h Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 12/35] powerpc/mm: Move PTE bits from generic functions to hash64 functions Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 13/35] powerpc/booke: Move nohash headers (part 1) Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 14/35] powerpc/booke: Move nohash headers (part 2) Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 15/35] powerpc/booke: Move nohash headers (part 3) Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 16/35] powerpc/booke: Move nohash headers (part 4) Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 17/35] powerpc/booke: Move nohash headers (part 5) Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 18/35] powerpc/mm: Convert 4k hash insert to C Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 19/35] powerpc/mm: Remove the dependency on pte bit position in asm code Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 20/35] powerpc/mm: Don't track subpage valid bit in pte_t Aneesh Kumar K.V
2015-12-03  7:13   ` Anshuman Khandual
2015-12-03 18:57     ` Aneesh Kumar K.V
2016-02-18  1:11   ` Paul Mackerras
2016-02-18  1:49     ` Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 21/35] powerpc/mm: Remove pte_val usage for the second half of pgtable_t Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 22/35] powerpc/mm: Increase the width of #define Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 23/35] powerpc/mm: Convert __hash_page_64K to C Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 24/35] powerpc/mm: Convert 4k insert from asm " Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 25/35] powerpc/mm: Add helper for converting pte bit to hpte bits Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 26/35] powerpc/mm: Move WIMG update to helper Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 27/35] powerpc/mm: Move hugetlb related headers Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 28/35] powerpc/mm: Move THP headers around Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 29/35] powerpc/mm: Add a _PAGE_PTE bit Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 30/35] powerpc/mm: Don't hardcode page table size Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 31/35] powerpc/mm: Don't hardcode the hash pte slot shift Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 32/35] powerpc/nohash: Update 64K nohash config to have 32 pte fragement Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 33/35] powerpc/nohash: we don't use real_pte_t for nohash Aneesh Kumar K.V
2015-12-01  3:36 ` [PATCH V6 34/35] powerpc/mm: Use H_READ with H_READ_4 Aneesh Kumar K.V
2015-12-01  3:37 ` [PATCH V6 35/35] powerpc/mm: Don't open code pgtable_t size Aneesh Kumar K.V
