* [PATCH 0/3] powerpc/pseries: use H_BLOCK_REMOVE @ 2018-07-27 13:22 Laurent Dufour 2018-07-27 13:22 ` [PATCH 1/3] powerpc/pseries/mm: Introducing FW_FEATURE_BLOCK_REMOVE Laurent Dufour ` (3 more replies) 0 siblings, 4 replies; 7+ messages in thread From: Laurent Dufour @ 2018-07-27 13:22 UTC (permalink / raw) To: linuxppc-dev, linux-kernel On very large system we could see soft lockup fired when a process is exiting watchdog: BUG: soft lockup - CPU#851 stuck for 21s! [forkoff:215523] Modules linked in: pseries_rng rng_core xfs raid10 vmx_crypto btrfs libcrc32c xor zstd_decompress zstd_compress xxhash lzo_compress raid6_pq crc32c_vpmsum lpfc crc_t10dif crct10dif_generic crct10dif_common dm_multipath scsi_dh_rdac scsi_dh_alua autofs4 CPU: 851 PID: 215523 Comm: forkoff Not tainted 4.17.0 #1 NIP: c0000000000b995c LR: c0000000000b8f64 CTR: 000000000000aa18 REGS: c00006b0645b7610 TRAP: 0901 Not tainted (4.17.0) MSR: 800000010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 22042082 XER: 00000000 CFAR: 00000000006cf8f0 SOFTE: 0 GPR00: 0010000000000000 c00006b0645b7890 c000000000f99200 0000000000000000 GPR04: 8e000001a5a4de58 400249cf1bfd5480 8e000001a5a4de50 400249cf1bfd5480 GPR08: 8e000001a5a4de48 400249cf1bfd5480 8e000001a5a4de40 400249cf1bfd5480 GPR12: ffffffffffffffff c00000001e690800 NIP [c0000000000b995c] plpar_hcall9+0x44/0x7c LR [c0000000000b8f64] pSeries_lpar_flush_hash_range+0x324/0x3d0 Call Trace: [c00006b0645b7890] [8e000001a5a4dd20] 0x8e000001a5a4dd20 (unreliable) [c00006b0645b7a00] [c00000000006d5b0] flush_hash_range+0x60/0x110 [c00006b0645b7a50] [c000000000072a2c] __flush_tlb_pending+0x4c/0xd0 [c00006b0645b7a80] [c0000000002eaf44] unmap_page_range+0x984/0xbd0 [c00006b0645b7bc0] [c0000000002eb594] unmap_vmas+0x84/0x100 [c00006b0645b7c10] [c0000000002f8afc] exit_mmap+0xac/0x1f0 [c00006b0645b7cd0] [c0000000000f2638] mmput+0x98/0x1b0 [c00006b0645b7d00] [c0000000000fc9d0] do_exit+0x330/0xc00 [c00006b0645b7dc0] [c0000000000fd384] do_group_exit+0x64/0x100 [c00006b0645b7e00] [c0000000000fd44c] sys_exit_group+0x2c/0x30 [c00006b0645b7e30] [c00000000000b960] system_call+0x58/0x6c Instruction dump: 60000000 f8810028 7ca42b78 7cc53378 7ce63b78 7d074378 7d284b78 7d495378 e9410060 e9610068 e9810070 44000022 <7d806378> e9810028 f88c0000 f8ac0008 This happens when removing the PTE by calling the hypervisor using the H_BULK_REMOVE call. This call is processing up to 4 PTEs but is doing a tlbie for each PTE it is processing. This could lead to long time spent in the hypervisor (sometimes up to 4s) and soft lockup being raised because the scheduler is not called in zap_pte_range(). Since the Power7's time, the hypervisor is providing a new hcall H_BLOCK_REMOVE allowing processing up to 8 PTEs with one call to tlbie. By limiting the amount of tlbie generated, this reduces the time spent invalidating the PTEs. This hcall requires that the pages are "all within the same naturally aligned 8 page virtual address block". With this patch series applied, I couldn't see any soft lockup raised on the victim LPAR I was running the test one. This series is covering both normal pages and huge pages. Laurent Dufour (3): powerpc/pseries/mm: Introducing FW_FEATURE_BLOCK_REMOVE powerpc/pseries/mm: factorize PTE slot computation powerpc/pseries/mm: call H_BLOCK_REMOVE arch/powerpc/include/asm/firmware.h | 3 +- arch/powerpc/include/asm/hvcall.h | 1 + arch/powerpc/platforms/pseries/firmware.c | 1 + arch/powerpc/platforms/pseries/lpar.c | 250 ++++++++++++++++++++++++++---- 4 files changed, 228 insertions(+), 27 deletions(-) -- 2.7.4 ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/3] powerpc/pseries/mm: Introducing FW_FEATURE_BLOCK_REMOVE 2018-07-27 13:22 [PATCH 0/3] powerpc/pseries: use H_BLOCK_REMOVE Laurent Dufour @ 2018-07-27 13:22 ` Laurent Dufour 2018-07-27 13:22 ` [PATCH 2/3] powerpc/pseries/mm: factorize PTE slot computation Laurent Dufour ` (2 subsequent siblings) 3 siblings, 0 replies; 7+ messages in thread From: Laurent Dufour @ 2018-07-27 13:22 UTC (permalink / raw) To: linuxppc-dev, linux-kernel Cc: Aneesh Kumar K.V, Nicholas Piggin, Michael Ellerman, Paul Mackerras, Benjamin Herrenschmidt This feature tells if the hcall H_BLOCK_REMOVE is available. Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> --- arch/powerpc/include/asm/firmware.h | 3 ++- arch/powerpc/platforms/pseries/firmware.c | 1 + 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/firmware.h b/arch/powerpc/include/asm/firmware.h index 535add3f7791..360ba197f9d2 100644 --- a/arch/powerpc/include/asm/firmware.h +++ b/arch/powerpc/include/asm/firmware.h @@ -53,6 +53,7 @@ #define FW_FEATURE_PRRN ASM_CONST(0x0000000200000000) #define FW_FEATURE_DRMEM_V2 ASM_CONST(0x0000000400000000) #define FW_FEATURE_DRC_INFO ASM_CONST(0x0000000800000000) +#define FW_FEATURE_BLOCK_REMOVE ASM_CONST(0x0000001000000000) #ifndef __ASSEMBLY__ @@ -70,7 +71,7 @@ enum { FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY | FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN | FW_FEATURE_HPT_RESIZE | FW_FEATURE_DRMEM_V2 | - FW_FEATURE_DRC_INFO, + FW_FEATURE_DRC_INFO | FW_FEATURE_BLOCK_REMOVE, FW_FEATURE_PSERIES_ALWAYS = 0, FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL, FW_FEATURE_POWERNV_ALWAYS = 0, diff --git a/arch/powerpc/platforms/pseries/firmware.c b/arch/powerpc/platforms/pseries/firmware.c index a3bbeb43689e..1624501386f4 100644 --- a/arch/powerpc/platforms/pseries/firmware.c +++ b/arch/powerpc/platforms/pseries/firmware.c @@ -65,6 +65,7 @@ hypertas_fw_features_table[] = { {FW_FEATURE_SET_MODE, "hcall-set-mode"}, {FW_FEATURE_BEST_ENERGY, "hcall-best-energy-1*"}, {FW_FEATURE_HPT_RESIZE, "hcall-hpt-resize"}, + {FW_FEATURE_BLOCK_REMOVE, "hcall-block-remove"}, }; /* Build up the firmware features bitmask using the contents of -- 2.7.4 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/3] powerpc/pseries/mm: factorize PTE slot computation 2018-07-27 13:22 [PATCH 0/3] powerpc/pseries: use H_BLOCK_REMOVE Laurent Dufour 2018-07-27 13:22 ` [PATCH 1/3] powerpc/pseries/mm: Introducing FW_FEATURE_BLOCK_REMOVE Laurent Dufour @ 2018-07-27 13:22 ` Laurent Dufour 2018-07-27 13:22 ` [PATCH 3/3] powerpc/pseries/mm: call H_BLOCK_REMOVE Laurent Dufour 2018-07-27 14:10 ` [PATCH 0/3] powerpc/pseries: use H_BLOCK_REMOVE Laurent Dufour 3 siblings, 0 replies; 7+ messages in thread From: Laurent Dufour @ 2018-07-27 13:22 UTC (permalink / raw) To: linuxppc-dev, linux-kernel Cc: Aneesh Kumar K.V, Nicholas Piggin, Michael Ellerman, Paul Mackerras, Benjamin Herrenschmidt This part of code will be called also when dealing with H_BLOCK_REMOVE. Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> --- arch/powerpc/platforms/pseries/lpar.c | 27 ++++++++++++++++++++------- 1 file changed, 20 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c index 52eeff1297f4..96b8cd8a802d 100644 --- a/arch/powerpc/platforms/pseries/lpar.c +++ b/arch/powerpc/platforms/pseries/lpar.c @@ -547,6 +547,24 @@ static int pSeries_lpar_hpte_removebolted(unsigned long ea, return 0; } + +static inline unsigned long compute_slot(real_pte_t pte, + unsigned long vpn, + unsigned long index, + unsigned long shift, + int ssize) +{ + unsigned long slot, hash, hidx; + + hash = hpt_hash(vpn, shift, ssize); + hidx = __rpte_to_hidx(pte, index); + if (hidx & _PTEIDX_SECONDARY) + hash = ~hash; + slot = (hash & htab_hash_mask) * HPTES_PER_GROUP; + slot += hidx & _PTEIDX_GROUP_IX; + return slot; +} + /* * Take a spinlock around flushes to avoid bouncing the hypervisor tlbie * lock. @@ -559,7 +577,7 @@ static void pSeries_lpar_flush_hash_range(unsigned long number, int local) struct ppc64_tlb_batch *batch = this_cpu_ptr(&ppc64_tlb_batch); int lock_tlbie = !mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE); unsigned long param[PLPAR_HCALL9_BUFSIZE]; - unsigned long hash, index, shift, hidx, slot; + unsigned long index, shift, slot; real_pte_t pte; int psize, ssize; @@ -573,12 +591,7 @@ static void pSeries_lpar_flush_hash_range(unsigned long number, int local) vpn = batch->vpn[i]; pte = batch->pte[i]; pte_iterate_hashed_subpages(pte, psize, vpn, index, shift) { - hash = hpt_hash(vpn, shift, ssize); - hidx = __rpte_to_hidx(pte, index); - if (hidx & _PTEIDX_SECONDARY) - hash = ~hash; - slot = (hash & htab_hash_mask) * HPTES_PER_GROUP; - slot += hidx & _PTEIDX_GROUP_IX; + slot = compute_slot(pte, vpn, index, shift, ssize); if (!firmware_has_feature(FW_FEATURE_BULK_REMOVE)) { /* * lpar doesn't use the passed actual page size -- 2.7.4 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 3/3] powerpc/pseries/mm: call H_BLOCK_REMOVE 2018-07-27 13:22 [PATCH 0/3] powerpc/pseries: use H_BLOCK_REMOVE Laurent Dufour 2018-07-27 13:22 ` [PATCH 1/3] powerpc/pseries/mm: Introducing FW_FEATURE_BLOCK_REMOVE Laurent Dufour 2018-07-27 13:22 ` [PATCH 2/3] powerpc/pseries/mm: factorize PTE slot computation Laurent Dufour @ 2018-07-27 13:22 ` Laurent Dufour 2018-07-27 14:10 ` [PATCH 0/3] powerpc/pseries: use H_BLOCK_REMOVE Laurent Dufour 3 siblings, 0 replies; 7+ messages in thread From: Laurent Dufour @ 2018-07-27 13:22 UTC (permalink / raw) To: linuxppc-dev, linux-kernel Cc: Aneesh Kumar K.V, Nicholas Piggin, Michael Ellerman, Paul Mackerras, Benjamin Herrenschmidt This hypervisor call allows to remove up to 8 ptes with only call to tlbie. The virtual pages must be all within the same naturally aligned 8 page virtual address block and have the same page and segment size encodings. Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> --- arch/powerpc/include/asm/hvcall.h | 1 + arch/powerpc/platforms/pseries/lpar.c | 223 +++++++++++++++++++++++++++++++--- 2 files changed, 205 insertions(+), 19 deletions(-) diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h index 662c8347d699..e403d574651d 100644 --- a/arch/powerpc/include/asm/hvcall.h +++ b/arch/powerpc/include/asm/hvcall.h @@ -278,6 +278,7 @@ #define H_COP 0x304 #define H_GET_MPP_X 0x314 #define H_SET_MODE 0x31C +#define H_BLOCK_REMOVE 0x328 #define H_CLEAR_HPT 0x358 #define H_REQUEST_VMC 0x360 #define H_RESIZE_HPT_PREPARE 0x36C diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c index 96b8cd8a802d..41ed03245eb4 100644 --- a/arch/powerpc/platforms/pseries/lpar.c +++ b/arch/powerpc/platforms/pseries/lpar.c @@ -418,6 +418,73 @@ static void pSeries_lpar_hpte_invalidate(unsigned long slot, unsigned long vpn, BUG_ON(lpar_rc != H_SUCCESS); } + +/* + * As defined in the PAPR's section 14.5.4.1.8 + * The control mask doesn't include the returned reference and change bit from + * the processed PTE. + */ +#define HBLKR_AVPN 0x0100000000000000UL +#define HBLKR_CTRL_MASK 0xf800000000000000UL +#define HBLKR_CTRL_SUCCESS 0x8000000000000000UL +#define HBLKR_CTRL_ERRNOTFOUND 0x8800000000000000UL +#define HBLKR_CTRL_ERRBUSY 0xa000000000000000UL + +/** + * H_BLOCK_REMOVE caller. + * @idx should point to the latest @param entry set with a PTEX. + * If PTE cannot be processed because another CPUs has already locked that + * group, those entries are put back in @param starting at index 1. + * If entries has to be retried and @retry_busy is set to true, these entries + * are retried until success. If @retry_busy is set to false, the returned + * is the number of entries yet to process. + */ +static unsigned long call_block_remove(unsigned long idx, unsigned long *param, + bool retry_busy) +{ + unsigned long i, rc, new_idx; + unsigned long retbuf[PLPAR_HCALL9_BUFSIZE]; + +again: + new_idx = 0; + BUG_ON((idx < 2) || (idx > PLPAR_HCALL9_BUFSIZE)); + if (idx < PLPAR_HCALL9_BUFSIZE) + param[idx] = HBR_END; + + rc = plpar_hcall9(H_BLOCK_REMOVE, retbuf, + param[0], /* AVA */ + param[1], param[2], param[3], param[4], /* TS0-7 */ + param[5], param[6], param[7], param[8]); + if (rc == H_SUCCESS) + return 0; + + BUG_ON(rc != H_PARTIAL); + + /* Check that the unprocessed entries were 'not found' or 'busy' */ + for (i = 0; i < idx-1; i++) { + unsigned long ctrl = retbuf[i] & HBLKR_CTRL_MASK; + + if (ctrl == HBLKR_CTRL_ERRBUSY) { + param[++new_idx] = param[i+1]; + continue; + } + + BUG_ON(ctrl != HBLKR_CTRL_SUCCESS + && ctrl != HBLKR_CTRL_ERRNOTFOUND); + } + + /* + * If there were entries found busy, retry these entries if requested, + * of if all the entries have to be retried. + */ + if (new_idx && (retry_busy || new_idx == (PLPAR_HCALL9_BUFSIZE-1))) { + idx = new_idx + 1; + goto again; + } + + return new_idx; +} + #ifdef CONFIG_TRANSPARENT_HUGEPAGE /* * Limit iterations holding pSeries_lpar_tlbie_lock to 3. We also need @@ -425,17 +492,59 @@ static void pSeries_lpar_hpte_invalidate(unsigned long slot, unsigned long vpn, */ #define PPC64_HUGE_HPTE_BATCH 12 -static void __pSeries_lpar_hugepage_invalidate(unsigned long *slot, - unsigned long *vpn, int count, - int psize, int ssize) +static void hugepage_block_invalidate(unsigned long *slot, unsigned long *vpn, + int count, int psize, int ssize) { unsigned long param[PLPAR_HCALL9_BUFSIZE]; - int i = 0, pix = 0, rc; - unsigned long flags = 0; - int lock_tlbie = !mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE); + unsigned long shift, current_vpgb, vpgb; + int i, pix = 0; - if (lock_tlbie) - spin_lock_irqsave(&pSeries_lpar_tlbie_lock, flags); + shift = mmu_psize_defs[psize].shift; + + for (i = 0; i < count; i++) { + /* + * Shifting 3 bits more on the right to get a + * 8 pages aligned virtual addresse. + */ + vpgb = (vpn[i] >> (shift - VPN_SHIFT + 3)); + if (!pix || vpgb != current_vpgb) { + /* + * Need to start a new 8 pages block, flush + * the current one if needed. + */ + if (pix) + (void)call_block_remove(pix, param, true); + current_vpgb = vpgb; + param[0] = hpte_encode_avpn(vpn[i], psize, ssize); + pix = 1; + } + + param[pix++] = HBR_REQUEST | HBLKR_AVPN | slot[i]; + if (pix == PLPAR_HCALL9_BUFSIZE) { + pix = call_block_remove(pix, param, false); + /* + * pix = 0 means that all the entries were + * removed, we can start a new block. + * Otherwise, this means that there are entries + * to retry, and pix points to latest one, so + * we should increment it and try to continue + * the same block. + */ + if (!pix) + current_vpgb = 0; + else + pix++; + } + } + if (pix) + (void)call_block_remove(pix, param, true); +} + +static void hugepage_bulk_invalidate(unsigned long *slot, unsigned long *vpn, + int count, int psize, int ssize) +{ + unsigned long param[PLPAR_HCALL9_BUFSIZE]; + int i = 0, pix = 0, rc; for (i = 0; i < count; i++) { @@ -443,17 +552,6 @@ static void __pSeries_lpar_hugepage_invalidate(unsigned long *slot, pSeries_lpar_hpte_invalidate(slot[i], vpn[i], psize, 0, ssize, 0); } else { - param[pix] = HBR_REQUEST | HBR_AVPN | slot[i]; - param[pix+1] = hpte_encode_avpn(vpn[i], psize, ssize); - pix += 2; - if (pix == 8) { - rc = plpar_hcall9(H_BULK_REMOVE, param, - param[0], param[1], param[2], - param[3], param[4], param[5], - param[6], param[7]); - BUG_ON(rc != H_SUCCESS); - pix = 0; - } } } if (pix) { @@ -463,6 +561,23 @@ static void __pSeries_lpar_hugepage_invalidate(unsigned long *slot, param[6], param[7]); BUG_ON(rc != H_SUCCESS); } +} + +static inline void __pSeries_lpar_hugepage_invalidate(unsigned long *slot, + unsigned long *vpn, + int count, int psize, + int ssize) +{ + unsigned long flags = 0; + int lock_tlbie = !mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE); + + if (lock_tlbie) + spin_lock_irqsave(&pSeries_lpar_tlbie_lock, flags); + + if (firmware_has_feature(FW_FEATURE_BLOCK_REMOVE)) + hugepage_block_invalidate(slot, vpn, count, psize, ssize); + else + hugepage_bulk_invalidate(slot, vpn, count, psize, ssize); if (lock_tlbie) spin_unlock_irqrestore(&pSeries_lpar_tlbie_lock, flags); @@ -565,6 +680,70 @@ static inline unsigned long compute_slot(real_pte_t pte, return slot; } +/** + * The hcall H_BLOCK_REMOVE implies that the virtual pages to processed are + * "all within the same naturally aligned 8 page virtual address block". + */ +static void do_block_remove(unsigned long number, struct ppc64_tlb_batch *batch, + unsigned long *param) +{ + unsigned long vpn; + unsigned long i, pix = 0; + unsigned long index, shift, slot, current_vpgb, vpgb; + real_pte_t pte; + int psize, ssize; + + psize = batch->psize; + ssize = batch->ssize; + + for (i = 0; i < number; i++) { + vpn = batch->vpn[i]; + pte = batch->pte[i]; + pte_iterate_hashed_subpages(pte, psize, vpn, index, shift) { + /* + * Shifting 3 bits more on the right to get a + * 8 pages aligned virtual addresse. + */ + vpgb = (vpn >> (shift - VPN_SHIFT + 3)); + if (!pix || vpgb != current_vpgb) { + /* + * Need to start a new 8 pages block, flush + * the current one if needed. + */ + if (pix) + (void)call_block_remove(pix, param, + true); + current_vpgb = vpgb; + param[0] = hpte_encode_avpn(vpn, psize, + ssize); + pix = 1; + } + + slot = compute_slot(pte, vpn, index, shift, ssize); + param[pix++] = HBR_REQUEST | HBLKR_AVPN | slot; + + if (pix == PLPAR_HCALL9_BUFSIZE) { + pix = call_block_remove(pix, param, false); + /* + * pix = 0 means that all the entries were + * removed, we can start a new block. + * Otherwise, this means that there are entries + * to retry, and pix points to latest one, so + * we should increment it and try to continue + * the same block. + */ + if (!pix) + current_vpgb = 0; + else + pix++; + } + } pte_iterate_hashed_end(); + } + + if (pix > 1) + (void)call_block_remove(pix, param, true); +} + /* * Take a spinlock around flushes to avoid bouncing the hypervisor tlbie * lock. @@ -584,6 +763,11 @@ static void pSeries_lpar_flush_hash_range(unsigned long number, int local) if (lock_tlbie) spin_lock_irqsave(&pSeries_lpar_tlbie_lock, flags); + if (firmware_has_feature(FW_FEATURE_BLOCK_REMOVE)) { + do_block_remove(number, batch, param); + goto out; + } + psize = batch->psize; ssize = batch->ssize; pix = 0; @@ -622,6 +806,7 @@ static void pSeries_lpar_flush_hash_range(unsigned long number, int local) BUG_ON(rc != H_SUCCESS); } +out: if (lock_tlbie) spin_unlock_irqrestore(&pSeries_lpar_tlbie_lock, flags); } -- 2.7.4 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 0/3] powerpc/pseries: use H_BLOCK_REMOVE 2018-07-27 13:22 [PATCH 0/3] powerpc/pseries: use H_BLOCK_REMOVE Laurent Dufour ` (2 preceding siblings ...) 2018-07-27 13:22 ` [PATCH 3/3] powerpc/pseries/mm: call H_BLOCK_REMOVE Laurent Dufour @ 2018-07-27 14:10 ` Laurent Dufour 3 siblings, 0 replies; 7+ messages in thread From: Laurent Dufour @ 2018-07-27 14:10 UTC (permalink / raw) To: linuxppc-dev, linux-kernel Sorry for the noise, I forgot to add CC people in copy of this cover. A wall new thread has been resent : https://lkml.org/lkml/2018/7/27/651 On 27/07/2018 15:22, Laurent Dufour wrote: > On very large system we could see soft lockup fired when a process is exiting > > watchdog: BUG: soft lockup - CPU#851 stuck for 21s! [forkoff:215523] > Modules linked in: pseries_rng rng_core xfs raid10 vmx_crypto btrfs libcrc32c xor zstd_decompress zstd_compress xxhash lzo_compress raid6_pq crc32c_vpmsum lpfc crc_t10dif crct10dif_generic crct10dif_common dm_multipath scsi_dh_rdac scsi_dh_alua autofs4 > CPU: 851 PID: 215523 Comm: forkoff Not tainted 4.17.0 #1 > NIP: c0000000000b995c LR: c0000000000b8f64 CTR: 000000000000aa18 > REGS: c00006b0645b7610 TRAP: 0901 Not tainted (4.17.0) > MSR: 800000010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 22042082 XER: 00000000 > CFAR: 00000000006cf8f0 SOFTE: 0 > GPR00: 0010000000000000 c00006b0645b7890 c000000000f99200 0000000000000000 > GPR04: 8e000001a5a4de58 400249cf1bfd5480 8e000001a5a4de50 400249cf1bfd5480 > GPR08: 8e000001a5a4de48 400249cf1bfd5480 8e000001a5a4de40 400249cf1bfd5480 > GPR12: ffffffffffffffff c00000001e690800 > NIP [c0000000000b995c] plpar_hcall9+0x44/0x7c > LR [c0000000000b8f64] pSeries_lpar_flush_hash_range+0x324/0x3d0 > Call Trace: > [c00006b0645b7890] [8e000001a5a4dd20] 0x8e000001a5a4dd20 (unreliable) > [c00006b0645b7a00] [c00000000006d5b0] flush_hash_range+0x60/0x110 > [c00006b0645b7a50] [c000000000072a2c] __flush_tlb_pending+0x4c/0xd0 > [c00006b0645b7a80] [c0000000002eaf44] unmap_page_range+0x984/0xbd0 > [c00006b0645b7bc0] [c0000000002eb594] unmap_vmas+0x84/0x100 > [c00006b0645b7c10] [c0000000002f8afc] exit_mmap+0xac/0x1f0 > [c00006b0645b7cd0] [c0000000000f2638] mmput+0x98/0x1b0 > [c00006b0645b7d00] [c0000000000fc9d0] do_exit+0x330/0xc00 > [c00006b0645b7dc0] [c0000000000fd384] do_group_exit+0x64/0x100 > [c00006b0645b7e00] [c0000000000fd44c] sys_exit_group+0x2c/0x30 > [c00006b0645b7e30] [c00000000000b960] system_call+0x58/0x6c > Instruction dump: > 60000000 f8810028 7ca42b78 7cc53378 7ce63b78 7d074378 7d284b78 7d495378 > e9410060 e9610068 e9810070 44000022 <7d806378> e9810028 f88c0000 f8ac0008 > > This happens when removing the PTE by calling the hypervisor using the > H_BULK_REMOVE call. This call is processing up to 4 PTEs but is doing a > tlbie for each PTE it is processing. This could lead to long time spent in > the hypervisor (sometimes up to 4s) and soft lockup being raised because > the scheduler is not called in zap_pte_range(). > > Since the Power7's time, the hypervisor is providing a new hcall > H_BLOCK_REMOVE allowing processing up to 8 PTEs with one call to > tlbie. By limiting the amount of tlbie generated, this reduces the time > spent invalidating the PTEs. > > This hcall requires that the pages are "all within the same naturally > aligned 8 page virtual address block". > > With this patch series applied, I couldn't see any soft lockup raised on > the victim LPAR I was running the test one. > > This series is covering both normal pages and huge pages. > > Laurent Dufour (3): > powerpc/pseries/mm: Introducing FW_FEATURE_BLOCK_REMOVE > powerpc/pseries/mm: factorize PTE slot computation > powerpc/pseries/mm: call H_BLOCK_REMOVE > > arch/powerpc/include/asm/firmware.h | 3 +- > arch/powerpc/include/asm/hvcall.h | 1 + > arch/powerpc/platforms/pseries/firmware.c | 1 + > arch/powerpc/platforms/pseries/lpar.c | 250 ++++++++++++++++++++++++++---- > 4 files changed, 228 insertions(+), 27 deletions(-) > ^ permalink raw reply [flat|nested] 7+ messages in thread
* [resend] [PATCH 0/3] powerpc/pseries: use H_BLOCK_REMOVE @ 2018-07-27 13:51 Laurent Dufour 2018-07-27 13:51 ` [PATCH 1/3] powerpc/pseries/mm: Introducing FW_FEATURE_BLOCK_REMOVE Laurent Dufour 0 siblings, 1 reply; 7+ messages in thread From: Laurent Dufour @ 2018-07-27 13:51 UTC (permalink / raw) To: linuxppc-dev, linux-kernel; +Cc: aneesh.kumar, mpe, benh, paulus, npiggin [Resending so everyone is getting the cover letter] On very large system we could see soft lockup fired when a process is exiting watchdog: BUG: soft lockup - CPU#851 stuck for 21s! [forkoff:215523] Modules linked in: pseries_rng rng_core xfs raid10 vmx_crypto btrfs libcrc32c xor zstd_decompress zstd_compress xxhash lzo_compress raid6_pq crc32c_vpmsum lpfc crc_t10dif crct10dif_generic crct10dif_common dm_multipath scsi_dh_rdac scsi_dh_alua autofs4 CPU: 851 PID: 215523 Comm: forkoff Not tainted 4.17.0 #1 NIP: c0000000000b995c LR: c0000000000b8f64 CTR: 000000000000aa18 REGS: c00006b0645b7610 TRAP: 0901 Not tainted (4.17.0) MSR: 800000010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 22042082 XER: 00000000 CFAR: 00000000006cf8f0 SOFTE: 0 GPR00: 0010000000000000 c00006b0645b7890 c000000000f99200 0000000000000000 GPR04: 8e000001a5a4de58 400249cf1bfd5480 8e000001a5a4de50 400249cf1bfd5480 GPR08: 8e000001a5a4de48 400249cf1bfd5480 8e000001a5a4de40 400249cf1bfd5480 GPR12: ffffffffffffffff c00000001e690800 NIP [c0000000000b995c] plpar_hcall9+0x44/0x7c LR [c0000000000b8f64] pSeries_lpar_flush_hash_range+0x324/0x3d0 Call Trace: [c00006b0645b7890] [8e000001a5a4dd20] 0x8e000001a5a4dd20 (unreliable) [c00006b0645b7a00] [c00000000006d5b0] flush_hash_range+0x60/0x110 [c00006b0645b7a50] [c000000000072a2c] __flush_tlb_pending+0x4c/0xd0 [c00006b0645b7a80] [c0000000002eaf44] unmap_page_range+0x984/0xbd0 [c00006b0645b7bc0] [c0000000002eb594] unmap_vmas+0x84/0x100 [c00006b0645b7c10] [c0000000002f8afc] exit_mmap+0xac/0x1f0 [c00006b0645b7cd0] [c0000000000f2638] mmput+0x98/0x1b0 [c00006b0645b7d00] [c0000000000fc9d0] do_exit+0x330/0xc00 [c00006b0645b7dc0] [c0000000000fd384] do_group_exit+0x64/0x100 [c00006b0645b7e00] [c0000000000fd44c] sys_exit_group+0x2c/0x30 [c00006b0645b7e30] [c00000000000b960] system_call+0x58/0x6c Instruction dump: 60000000 f8810028 7ca42b78 7cc53378 7ce63b78 7d074378 7d284b78 7d495378 e9410060 e9610068 e9810070 44000022 <7d806378> e9810028 f88c0000 f8ac0008 This happens when removing the PTE by calling the hypervisor using the H_BULK_REMOVE call. This call is processing up to 4 PTEs but is doing a tlbie for each PTE it is processing. This could lead to long time spent in the hypervisor (sometimes up to 4s) and soft lockup being raised because the scheduler is not called in zap_pte_range(). Since the Power7's time, the hypervisor is providing a new hcall H_BLOCK_REMOVE allowing processing up to 8 PTEs with one call to tlbie. By limiting the amount of tlbie generated, this reduces the time spent invalidating the PTEs. This hcall requires that the pages are "all within the same naturally aligned 8 page virtual address block". With this patch series applied, I couldn't see any soft lockup raised on the victim LPAR I was running the test one. This series is covering both normal pages and huge pages. Laurent Dufour (3): powerpc/pseries/mm: Introducing FW_FEATURE_BLOCK_REMOVE powerpc/pseries/mm: factorize PTE slot computation powerpc/pseries/mm: call H_BLOCK_REMOVE arch/powerpc/include/asm/firmware.h | 3 +- arch/powerpc/include/asm/hvcall.h | 1 + arch/powerpc/platforms/pseries/firmware.c | 1 + arch/powerpc/platforms/pseries/lpar.c | 250 ++++++++++++++++++++++++++---- 4 files changed, 228 insertions(+), 27 deletions(-) -- 2.7.4 ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/3] powerpc/pseries/mm: Introducing FW_FEATURE_BLOCK_REMOVE 2018-07-27 13:51 [resend] " Laurent Dufour @ 2018-07-27 13:51 ` Laurent Dufour 2018-07-30 14:18 ` Aneesh Kumar K.V 0 siblings, 1 reply; 7+ messages in thread From: Laurent Dufour @ 2018-07-27 13:51 UTC (permalink / raw) To: linuxppc-dev, linux-kernel; +Cc: aneesh.kumar, mpe, benh, paulus, npiggin This feature tells if the hcall H_BLOCK_REMOVE is available. Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> --- arch/powerpc/include/asm/firmware.h | 3 ++- arch/powerpc/platforms/pseries/firmware.c | 1 + 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/firmware.h b/arch/powerpc/include/asm/firmware.h index 535add3f7791..360ba197f9d2 100644 --- a/arch/powerpc/include/asm/firmware.h +++ b/arch/powerpc/include/asm/firmware.h @@ -53,6 +53,7 @@ #define FW_FEATURE_PRRN ASM_CONST(0x0000000200000000) #define FW_FEATURE_DRMEM_V2 ASM_CONST(0x0000000400000000) #define FW_FEATURE_DRC_INFO ASM_CONST(0x0000000800000000) +#define FW_FEATURE_BLOCK_REMOVE ASM_CONST(0x0000001000000000) #ifndef __ASSEMBLY__ @@ -70,7 +71,7 @@ enum { FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY | FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN | FW_FEATURE_HPT_RESIZE | FW_FEATURE_DRMEM_V2 | - FW_FEATURE_DRC_INFO, + FW_FEATURE_DRC_INFO | FW_FEATURE_BLOCK_REMOVE, FW_FEATURE_PSERIES_ALWAYS = 0, FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL, FW_FEATURE_POWERNV_ALWAYS = 0, diff --git a/arch/powerpc/platforms/pseries/firmware.c b/arch/powerpc/platforms/pseries/firmware.c index a3bbeb43689e..1624501386f4 100644 --- a/arch/powerpc/platforms/pseries/firmware.c +++ b/arch/powerpc/platforms/pseries/firmware.c @@ -65,6 +65,7 @@ hypertas_fw_features_table[] = { {FW_FEATURE_SET_MODE, "hcall-set-mode"}, {FW_FEATURE_BEST_ENERGY, "hcall-best-energy-1*"}, {FW_FEATURE_HPT_RESIZE, "hcall-hpt-resize"}, + {FW_FEATURE_BLOCK_REMOVE, "hcall-block-remove"}, }; /* Build up the firmware features bitmask using the contents of -- 2.7.4 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/3] powerpc/pseries/mm: Introducing FW_FEATURE_BLOCK_REMOVE 2018-07-27 13:51 ` [PATCH 1/3] powerpc/pseries/mm: Introducing FW_FEATURE_BLOCK_REMOVE Laurent Dufour @ 2018-07-30 14:18 ` Aneesh Kumar K.V 0 siblings, 0 replies; 7+ messages in thread From: Aneesh Kumar K.V @ 2018-07-30 14:18 UTC (permalink / raw) To: Laurent Dufour, linuxppc-dev, linux-kernel; +Cc: mpe, benh, paulus, npiggin Laurent Dufour <ldufour@linux.vnet.ibm.com> writes: > This feature tells if the hcall H_BLOCK_REMOVE is available. > > Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> > Cc: Nicholas Piggin <npiggin@gmail.com> > Cc: Michael Ellerman <mpe@ellerman.id.au> > Cc: Paul Mackerras <paulus@samba.org> > Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> > Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> > --- > arch/powerpc/include/asm/firmware.h | 3 ++- > arch/powerpc/platforms/pseries/firmware.c | 1 + > 2 files changed, 3 insertions(+), 1 deletion(-) > > diff --git a/arch/powerpc/include/asm/firmware.h b/arch/powerpc/include/asm/firmware.h > index 535add3f7791..360ba197f9d2 100644 > --- a/arch/powerpc/include/asm/firmware.h > +++ b/arch/powerpc/include/asm/firmware.h > @@ -53,6 +53,7 @@ > #define FW_FEATURE_PRRN ASM_CONST(0x0000000200000000) > #define FW_FEATURE_DRMEM_V2 ASM_CONST(0x0000000400000000) > #define FW_FEATURE_DRC_INFO ASM_CONST(0x0000000800000000) > +#define FW_FEATURE_BLOCK_REMOVE ASM_CONST(0x0000001000000000) > > #ifndef __ASSEMBLY__ > > @@ -70,7 +71,7 @@ enum { > FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY | > FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN | > FW_FEATURE_HPT_RESIZE | FW_FEATURE_DRMEM_V2 | > - FW_FEATURE_DRC_INFO, > + FW_FEATURE_DRC_INFO | FW_FEATURE_BLOCK_REMOVE, > FW_FEATURE_PSERIES_ALWAYS = 0, > FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL, > FW_FEATURE_POWERNV_ALWAYS = 0, > diff --git a/arch/powerpc/platforms/pseries/firmware.c b/arch/powerpc/platforms/pseries/firmware.c > index a3bbeb43689e..1624501386f4 100644 > --- a/arch/powerpc/platforms/pseries/firmware.c > +++ b/arch/powerpc/platforms/pseries/firmware.c > @@ -65,6 +65,7 @@ hypertas_fw_features_table[] = { > {FW_FEATURE_SET_MODE, "hcall-set-mode"}, > {FW_FEATURE_BEST_ENERGY, "hcall-best-energy-1*"}, > {FW_FEATURE_HPT_RESIZE, "hcall-hpt-resize"}, > + {FW_FEATURE_BLOCK_REMOVE, "hcall-block-remove"}, > }; > > /* Build up the firmware features bitmask using the contents of > -- > 2.7.4 ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2018-07-30 14:19 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-07-27 13:22 [PATCH 0/3] powerpc/pseries: use H_BLOCK_REMOVE Laurent Dufour 2018-07-27 13:22 ` [PATCH 1/3] powerpc/pseries/mm: Introducing FW_FEATURE_BLOCK_REMOVE Laurent Dufour 2018-07-27 13:22 ` [PATCH 2/3] powerpc/pseries/mm: factorize PTE slot computation Laurent Dufour 2018-07-27 13:22 ` [PATCH 3/3] powerpc/pseries/mm: call H_BLOCK_REMOVE Laurent Dufour 2018-07-27 14:10 ` [PATCH 0/3] powerpc/pseries: use H_BLOCK_REMOVE Laurent Dufour 2018-07-27 13:51 [resend] " Laurent Dufour 2018-07-27 13:51 ` [PATCH 1/3] powerpc/pseries/mm: Introducing FW_FEATURE_BLOCK_REMOVE Laurent Dufour 2018-07-30 14:18 ` Aneesh Kumar K.V
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.