All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] faultaround updates
@ 2014-08-01 11:51 Kirill A. Shutemov
  2014-08-01 11:51 ` [PATCHv2 1/2] mm: close race between do_fault_around() and fault_around_bytes_set() Kirill A. Shutemov
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Kirill A. Shutemov @ 2014-08-01 11:51 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Dave Hansen, Andrey Ryabinin, Sasha Levin, David Rientjes,
	linux-mm, Kirill A. Shutemov

One fix and one tweak for faultaround code.

As alternative, we could just drop debugfs interface and make
fault_around_bytes constant.

Kirill A. Shutemov (2):
  mm: close race between do_fault_around() and fault_around_bytes_set()
  mm: mark fault_around_bytes __read_mostly

 mm/memory.c | 24 +++++++++---------------
 1 file changed, 9 insertions(+), 15 deletions(-)

-- 
2.0.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCHv2 1/2] mm: close race between do_fault_around() and fault_around_bytes_set()
  2014-08-01 11:51 [PATCH 0/2] faultaround updates Kirill A. Shutemov
@ 2014-08-01 11:51 ` Kirill A. Shutemov
  2014-08-01 21:36   ` Naoya Horiguchi
  2014-08-01 11:51 ` [PATCH 2/2] mm: mark fault_around_bytes __read_mostly Kirill A. Shutemov
  2014-08-01 21:32 ` [PATCH 0/2] faultaround updates David Rientjes
  2 siblings, 1 reply; 8+ messages in thread
From: Kirill A. Shutemov @ 2014-08-01 11:51 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Dave Hansen, Andrey Ryabinin, Sasha Levin, David Rientjes,
	linux-mm, Kirill A. Shutemov

Things can go wrong if fault_around_bytes will be changed under
do_fault_around(): between fault_around_mask() and fault_around_pages().

Let's read fault_around_bytes only once during do_fault_around() and
calculate mask based on the reading.

Note: fault_around_bytes can only be updated via debug interface. Also
I've tried but was not able to trigger a bad behaviour without the
patch. So I would not consider this patch as urgent.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/memory.c | 21 +++++++--------------
 1 file changed, 7 insertions(+), 14 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 6ea15ed23ec4..be43fd9606db 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2770,16 +2770,6 @@ void do_set_pte(struct vm_area_struct *vma, unsigned long address,
 
 static unsigned long fault_around_bytes = rounddown_pow_of_two(65536);
 
-static inline unsigned long fault_around_pages(void)
-{
-	return fault_around_bytes >> PAGE_SHIFT;
-}
-
-static inline unsigned long fault_around_mask(void)
-{
-	return ~(fault_around_bytes - 1) & PAGE_MASK;
-}
-
 #ifdef CONFIG_DEBUG_FS
 static int fault_around_bytes_get(void *data, u64 *val)
 {
@@ -2844,12 +2834,15 @@ late_initcall(fault_around_debugfs);
 static void do_fault_around(struct vm_area_struct *vma, unsigned long address,
 		pte_t *pte, pgoff_t pgoff, unsigned int flags)
 {
-	unsigned long start_addr;
+	unsigned long start_addr, nr_pages, mask;
 	pgoff_t max_pgoff;
 	struct vm_fault vmf;
 	int off;
 
-	start_addr = max(address & fault_around_mask(), vma->vm_start);
+	nr_pages = ACCESS_ONCE(fault_around_bytes) >> PAGE_SHIFT;
+	mask = ~(nr_pages * PAGE_SIZE - 1) & PAGE_MASK;
+
+	start_addr = max(address & mask, vma->vm_start);
 	off = ((address - start_addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
 	pte -= off;
 	pgoff -= off;
@@ -2861,7 +2854,7 @@ static void do_fault_around(struct vm_area_struct *vma, unsigned long address,
 	max_pgoff = pgoff - ((start_addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)) +
 		PTRS_PER_PTE - 1;
 	max_pgoff = min3(max_pgoff, vma_pages(vma) + vma->vm_pgoff - 1,
-			pgoff + fault_around_pages() - 1);
+			pgoff + nr_pages - 1);
 
 	/* Check if it makes any sense to call ->map_pages */
 	while (!pte_none(*pte)) {
@@ -2896,7 +2889,7 @@ static int do_read_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * something).
 	 */
 	if (vma->vm_ops->map_pages && !(flags & FAULT_FLAG_NONLINEAR) &&
-	    fault_around_pages() > 1) {
+	    fault_around_bytes >> PAGE_SHIFT > 1) {
 		pte = pte_offset_map_lock(mm, pmd, address, &ptl);
 		do_fault_around(vma, address, pte, pgoff, flags);
 		if (!pte_same(*pte, orig_pte))
-- 
2.0.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/2] mm: mark fault_around_bytes __read_mostly
  2014-08-01 11:51 [PATCH 0/2] faultaround updates Kirill A. Shutemov
  2014-08-01 11:51 ` [PATCHv2 1/2] mm: close race between do_fault_around() and fault_around_bytes_set() Kirill A. Shutemov
@ 2014-08-01 11:51 ` Kirill A. Shutemov
  2014-08-01 21:32 ` [PATCH 0/2] faultaround updates David Rientjes
  2 siblings, 0 replies; 8+ messages in thread
From: Kirill A. Shutemov @ 2014-08-01 11:51 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Dave Hansen, Andrey Ryabinin, Sasha Levin, David Rientjes,
	linux-mm, Kirill A. Shutemov

fault_around_bytes can only be changed via debugfs. Let's mark it
read-mostly.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Suggested-by: David Rientjes <rientjes@google.com>
Acked-by: David Rientjes <rientjes@google.com>
---
 mm/memory.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index be43fd9606db..281556eb4e62 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2768,7 +2768,8 @@ void do_set_pte(struct vm_area_struct *vma, unsigned long address,
 	update_mmu_cache(vma, address, pte);
 }
 
-static unsigned long fault_around_bytes = rounddown_pow_of_two(65536);
+static unsigned long fault_around_bytes __read_mostly =
+	rounddown_pow_of_two(65536);
 
 #ifdef CONFIG_DEBUG_FS
 static int fault_around_bytes_get(void *data, u64 *val)
-- 
2.0.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/2] faultaround updates
  2014-08-01 11:51 [PATCH 0/2] faultaround updates Kirill A. Shutemov
  2014-08-01 11:51 ` [PATCHv2 1/2] mm: close race between do_fault_around() and fault_around_bytes_set() Kirill A. Shutemov
  2014-08-01 11:51 ` [PATCH 2/2] mm: mark fault_around_bytes __read_mostly Kirill A. Shutemov
@ 2014-08-01 21:32 ` David Rientjes
  2014-08-02  8:39   ` Kirill A. Shutemov
  2 siblings, 1 reply; 8+ messages in thread
From: David Rientjes @ 2014-08-01 21:32 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Andrew Morton, Dave Hansen, Andrey Ryabinin, Sasha Levin, linux-mm

On Fri, 1 Aug 2014, Kirill A. Shutemov wrote:

> One fix and one tweak for faultaround code.
> 
> As alternative, we could just drop debugfs interface and make
> fault_around_bytes constant.
> 

If we can remove the debugfs interface, then it seems better than 
continuing to support it.  Any objections to removing it?

> Kirill A. Shutemov (2):
>   mm: close race between do_fault_around() and fault_around_bytes_set()
>   mm: mark fault_around_bytes __read_mostly
> 
>  mm/memory.c | 24 +++++++++---------------
>  1 file changed, 9 insertions(+), 15 deletions(-)
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCHv2 1/2] mm: close race between do_fault_around() and fault_around_bytes_set()
  2014-08-01 11:51 ` [PATCHv2 1/2] mm: close race between do_fault_around() and fault_around_bytes_set() Kirill A. Shutemov
@ 2014-08-01 21:36   ` Naoya Horiguchi
  0 siblings, 0 replies; 8+ messages in thread
From: Naoya Horiguchi @ 2014-08-01 21:36 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Andrew Morton, Dave Hansen, Andrey Ryabinin, Sasha Levin,
	David Rientjes, linux-mm

On Fri, Aug 01, 2014 at 02:51:08PM +0300, Kirill A. Shutemov wrote:
> Things can go wrong if fault_around_bytes will be changed under
> do_fault_around(): between fault_around_mask() and fault_around_pages().
> 
> Let's read fault_around_bytes only once during do_fault_around() and
> calculate mask based on the reading.
> 
> Note: fault_around_bytes can only be updated via debug interface. Also
> I've tried but was not able to trigger a bad behaviour without the
> patch. So I would not consider this patch as urgent.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  mm/memory.c | 21 +++++++--------------
>  1 file changed, 7 insertions(+), 14 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 6ea15ed23ec4..be43fd9606db 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2770,16 +2770,6 @@ void do_set_pte(struct vm_area_struct *vma, unsigned long address,
>  
>  static unsigned long fault_around_bytes = rounddown_pow_of_two(65536);
>  
> -static inline unsigned long fault_around_pages(void)
> -{
> -	return fault_around_bytes >> PAGE_SHIFT;
> -}
> -
> -static inline unsigned long fault_around_mask(void)
> -{
> -	return ~(fault_around_bytes - 1) & PAGE_MASK;
> -}
> -
>  #ifdef CONFIG_DEBUG_FS
>  static int fault_around_bytes_get(void *data, u64 *val)
>  {
> @@ -2844,12 +2834,15 @@ late_initcall(fault_around_debugfs);
>  static void do_fault_around(struct vm_area_struct *vma, unsigned long address,
>  		pte_t *pte, pgoff_t pgoff, unsigned int flags)
>  {
> -	unsigned long start_addr;
> +	unsigned long start_addr, nr_pages, mask;
>  	pgoff_t max_pgoff;
>  	struct vm_fault vmf;
>  	int off;
>  
> -	start_addr = max(address & fault_around_mask(), vma->vm_start);
> +	nr_pages = ACCESS_ONCE(fault_around_bytes) >> PAGE_SHIFT;
> +	mask = ~(nr_pages * PAGE_SIZE - 1) & PAGE_MASK;

If nr_pages never becomes 0, don't we need to do (& PAGE_MASK) ?

Thanks,
Naoya Horiguchi

> +
> +	start_addr = max(address & mask, vma->vm_start);
>  	off = ((address - start_addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
>  	pte -= off;
>  	pgoff -= off;
> @@ -2861,7 +2854,7 @@ static void do_fault_around(struct vm_area_struct *vma, unsigned long address,
>  	max_pgoff = pgoff - ((start_addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)) +
>  		PTRS_PER_PTE - 1;
>  	max_pgoff = min3(max_pgoff, vma_pages(vma) + vma->vm_pgoff - 1,
> -			pgoff + fault_around_pages() - 1);
> +			pgoff + nr_pages - 1);
>  
>  	/* Check if it makes any sense to call ->map_pages */
>  	while (!pte_none(*pte)) {
> @@ -2896,7 +2889,7 @@ static int do_read_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>  	 * something).
>  	 */
>  	if (vma->vm_ops->map_pages && !(flags & FAULT_FLAG_NONLINEAR) &&
> -	    fault_around_pages() > 1) {
> +	    fault_around_bytes >> PAGE_SHIFT > 1) {
>  		pte = pte_offset_map_lock(mm, pmd, address, &ptl);
>  		do_fault_around(vma, address, pte, pgoff, flags);
>  		if (!pte_same(*pte, orig_pte))
> -- 
> 2.0.1
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/2] faultaround updates
  2014-08-01 21:32 ` [PATCH 0/2] faultaround updates David Rientjes
@ 2014-08-02  8:39   ` Kirill A. Shutemov
  2014-08-04 22:20     ` Andrew Morton
  0 siblings, 1 reply; 8+ messages in thread
From: Kirill A. Shutemov @ 2014-08-02  8:39 UTC (permalink / raw)
  To: David Rientjes
  Cc: Kirill A. Shutemov, Andrew Morton, Dave Hansen, Andrey Ryabinin,
	Sasha Levin, linux-mm

On Fri, Aug 01, 2014 at 02:32:36PM -0700, David Rientjes wrote:
> On Fri, 1 Aug 2014, Kirill A. Shutemov wrote:
> 
> > One fix and one tweak for faultaround code.
> > 
> > As alternative, we could just drop debugfs interface and make
> > fault_around_bytes constant.
> > 
> 
> If we can remove the debugfs interface, then it seems better than 
> continuing to support it.  Any objections to removing it?

Andrew asked it initially. Up to him.

-- 
 Kirill A. Shutemov

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/2] faultaround updates
  2014-08-02  8:39   ` Kirill A. Shutemov
@ 2014-08-04 22:20     ` Andrew Morton
  0 siblings, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2014-08-04 22:20 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: David Rientjes, Kirill A. Shutemov, Dave Hansen, Andrey Ryabinin,
	Sasha Levin, linux-mm

On Sat, 2 Aug 2014 11:39:29 +0300 "Kirill A. Shutemov" <kirill@shutemov.name> wrote:

> On Fri, Aug 01, 2014 at 02:32:36PM -0700, David Rientjes wrote:
> > On Fri, 1 Aug 2014, Kirill A. Shutemov wrote:
> > 
> > > One fix and one tweak for faultaround code.
> > > 
> > > As alternative, we could just drop debugfs interface and make
> > > fault_around_bytes constant.
> > > 
> > 
> > If we can remove the debugfs interface, then it seems better than 
> > continuing to support it.  Any objections to removing it?
> 
> Andrew asked it initially. Up to him.

Well, we had a bunch of magic constants in there which may not be
optimized.  The idea is to make them tunable so that interested parties
can determine the best settings without having to rebuild the kernel. 
Once that's all done we can remove the tunable (because it's debugfs)
and hard-wire the optimised constants.

But I don't think anyone has done this tuning work yet.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 0/2] faultaround updates
@ 2014-07-29 11:33 Kirill A. Shutemov
  0 siblings, 0 replies; 8+ messages in thread
From: Kirill A. Shutemov @ 2014-07-29 11:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Dave Hansen, Andrey Ryabinin, Sasha Levin, David Rientjes,
	linux-mm, Kirill A. Shutemov

One fix and one tweak for faultaround code.

As alternative, we could just drop debugfs interface and make
fault_around_bytes constant.

Kirill A. Shutemov (2):
  mm: close race between do_fault_around() and fault_around_bytes_set()
  mm: mark fault_around_bytes __read_mostly

 mm/memory.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

-- 
2.0.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-08-04 22:21 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-01 11:51 [PATCH 0/2] faultaround updates Kirill A. Shutemov
2014-08-01 11:51 ` [PATCHv2 1/2] mm: close race between do_fault_around() and fault_around_bytes_set() Kirill A. Shutemov
2014-08-01 21:36   ` Naoya Horiguchi
2014-08-01 11:51 ` [PATCH 2/2] mm: mark fault_around_bytes __read_mostly Kirill A. Shutemov
2014-08-01 21:32 ` [PATCH 0/2] faultaround updates David Rientjes
2014-08-02  8:39   ` Kirill A. Shutemov
2014-08-04 22:20     ` Andrew Morton
  -- strict thread matches above, loose matches on Subject: below --
2014-07-29 11:33 Kirill A. Shutemov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.