linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/7] mm: vma->vm_flags diet
@ 2012-03-31  9:25 Konstantin Khlebnikov
  2012-03-31  9:29 ` [PATCH 1/7] mm, x86, PAT: rework linear pfn-mmap tracking Konstantin Khlebnikov
                   ` (7 more replies)
  0 siblings, 8 replies; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-03-31  9:25 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, linux-kernel; +Cc: Linus Torvalds

This patch set moves or kills some VM_* flags in the vma->vm_flags bit field;
as a result, four bits become free.

I am also working on reorganizing VM_RESERVED; it can probably be killed as well.
It lost its original swapout-protection meaning in 2.6 and is now used for other purposes.

---

Konstantin Khlebnikov (7):
      mm, x86, PAT: rework linear pfn-mmap tracking
      mm: introduce vma flag VM_ARCH_1
      mm: kill vma flag VM_CAN_NONLINEAR
      mm: kill vma flag VM_INSERTPAGE
      mm, drm/udl: fixup vma flags on mmap
      mm: kill vma flag VM_EXECUTABLE
      mm: move madvise vma flags to the end


 arch/powerpc/oprofile/cell/spu_task_sync.c |   15 ++----
 arch/tile/mm/elf.c                         |   12 ++---
 arch/x86/mm/pat.c                          |   25 +++++++---
 drivers/gpu/drm/udl/udl_drv.c              |    2 -
 drivers/gpu/drm/udl/udl_drv.h              |    1 
 drivers/gpu/drm/udl/udl_gem.c              |   14 ++++++
 drivers/oprofile/buffer_sync.c             |   17 +------
 drivers/staging/android/ashmem.c           |    1 
 fs/9p/vfs_file.c                           |    1 
 fs/btrfs/file.c                            |    2 -
 fs/ceph/addr.c                             |    2 -
 fs/cifs/file.c                             |    1 
 fs/ecryptfs/file.c                         |    1 
 fs/ext4/file.c                             |    2 -
 fs/fuse/file.c                             |    1 
 fs/gfs2/file.c                             |    2 -
 fs/nfs/file.c                              |    1 
 fs/nilfs2/file.c                           |    2 -
 fs/ocfs2/mmap.c                            |    2 -
 fs/ubifs/file.c                            |    1 
 fs/xfs/xfs_file.c                          |    2 -
 include/asm-generic/pgtable.h              |    4 +-
 include/linux/fs.h                         |    2 +
 include/linux/mm.h                         |   69 ++++++++++++----------------
 include/linux/mm_types.h                   |    1 
 include/linux/mman.h                       |    1 
 kernel/auditsc.c                           |   17 +------
 kernel/fork.c                              |   29 ++----------
 mm/filemap.c                               |    2 -
 mm/filemap_xip.c                           |    3 +
 mm/fremap.c                                |   14 +++---
 mm/huge_memory.c                           |   10 ++--
 mm/ksm.c                                   |    9 +++-
 mm/memory.c                                |   29 ++++++++----
 mm/mmap.c                                  |   32 +++----------
 mm/nommu.c                                 |   19 ++++----
 mm/shmem.c                                 |    3 -
 security/tomoyo/util.c                     |   14 +-----
 38 files changed, 158 insertions(+), 207 deletions(-)

-- 
Signature

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 1/7] mm, x86, PAT: rework linear pfn-mmap tracking
  2012-03-31  9:25 [PATCH 0/7] mm: vma->vm_flags diet Konstantin Khlebnikov
@ 2012-03-31  9:29 ` Konstantin Khlebnikov
  2012-03-31 17:09   ` [PATCH 1/7 v2] " Konstantin Khlebnikov
  2012-03-31  9:29 ` [PATCH 2/7] mm: introduce vma flag VM_ARCH_1 Konstantin Khlebnikov
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-03-31  9:29 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, linux-kernel
  Cc: Nick Piggin, Suresh Siddha, Pallipadi Venkatesh, Ingo Molnar,
	H. Peter Anvin, Linus Torvalds

This patch replaces the generic vma flag VM_PFN_AT_MMAP with the x86-only VM_PAT.

We can pass the mapping address from remap_pfn_range() into track_pfn_vma_new()
and collect all PAT-related logic together in arch/x86/.

This patch also restores the original, frustration-free is_cow_mapping() check in
remap_pfn_range(), as it was before commit v2.6.28-rc8-88-g3c8bb73
("x86: PAT: store vm_pgoff for all linear_over_vma_region mappings - v3").

The is_linear_pfn_mapping() checks can be removed from mm/huge_memory.c,
because that case is already handled by VM_PFNMAP in the VM_NO_THP bit mask.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Pallipadi Venkatesh <venkatesh.pallipadi@intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Ingo Molnar <mingo@redhat.com>
---
 arch/x86/mm/pat.c             |   25 +++++++++++++++++--------
 include/asm-generic/pgtable.h |    4 ++--
 include/linux/mm.h            |   15 +--------------
 mm/huge_memory.c              |    7 +++----
 mm/memory.c                   |   15 ++++++++-------
 5 files changed, 31 insertions(+), 35 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index f6ff57b..5c8eb19 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -665,7 +665,7 @@ int track_pfn_vma_copy(struct vm_area_struct *vma)
 	unsigned long vma_size = vma->vm_end - vma->vm_start;
 	pgprot_t pgprot;
 
-	if (is_linear_pfn_mapping(vma)) {
+	if (vma->vm_flags & VM_PAT) {
 		/*
 		 * reserve the whole chunk covered by vma. We need the
 		 * starting address and protection from pte.
@@ -690,16 +690,26 @@ int track_pfn_vma_copy(struct vm_area_struct *vma)
  * single reserve_pfn_range call.
  */
 int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
-			unsigned long pfn, unsigned long size)
+		unsigned long pfn, unsigned long addr, unsigned long size)
 {
 	unsigned long flags;
 	resource_size_t paddr;
 	unsigned long vma_size = vma->vm_end - vma->vm_start;
+	int ret;
 
-	if (is_linear_pfn_mapping(vma)) {
-		/* reserve the whole chunk starting from vm_pgoff */
-		paddr = (resource_size_t)vma->vm_pgoff << PAGE_SHIFT;
-		return reserve_pfn_range(paddr, vma_size, prot, 0);
+	if (addr == vma->vm_start && size == vma_size) {
+		/* reserve the whole chunk starting from pfn */
+		paddr = (resource_size_t)pfn << PAGE_SHIFT;
+		ret = reserve_pfn_range(paddr, vma_size, prot, 0);
+		if (!ret) {
+			vma->vm_flags |= VM_PAT;
+			/*
+			 * Save starting pfn in vm_pgoff for untrack_pfn_vma(),
+			 * remap_pfn_range() do this only for cow-mappings.
+			 */
+			vma->vm_pgoff = pfn;
+		}
+		return ret;
 	}
 
 	if (!pat_enabled)
@@ -724,11 +734,10 @@ void untrack_pfn_vma(struct vm_area_struct *vma, unsigned long pfn,
 	resource_size_t paddr;
 	unsigned long vma_size = vma->vm_end - vma->vm_start;
 
-	if (is_linear_pfn_mapping(vma)) {
+	if (vma->vm_flags & VM_PAT) {
 		/* free the whole chunk starting from vm_pgoff */
 		paddr = (resource_size_t)vma->vm_pgoff << PAGE_SHIFT;
 		free_pfn_range(paddr, vma_size);
-		return;
 	}
 }
 
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 125c54e..688a2a5 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -389,7 +389,7 @@ static inline void ptep_modify_prot_commit(struct mm_struct *mm,
  * for physical range indicated by pfn and size.
  */
 static inline int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
-					unsigned long pfn, unsigned long size)
+		unsigned long pfn, unsigned long addr, unsigned long size)
 {
 	return 0;
 }
@@ -420,7 +420,7 @@ static inline void untrack_pfn_vma(struct vm_area_struct *vma,
 }
 #else
 extern int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
-				unsigned long pfn, unsigned long size);
+		unsigned long pfn, unsigned long addr, unsigned long size);
 extern int track_pfn_vma_copy(struct vm_area_struct *vma);
 extern void untrack_pfn_vma(struct vm_area_struct *vma, unsigned long pfn,
 				unsigned long size);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index d8738a4..b8e5fe5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -117,7 +117,7 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_CAN_NONLINEAR 0x08000000	/* Has ->fault & does nonlinear pages */
 #define VM_MIXEDMAP	0x10000000	/* Can contain "struct page" and pure PFN pages */
 #define VM_SAO		0x20000000	/* Strong Access Ordering (powerpc) */
-#define VM_PFN_AT_MMAP	0x40000000	/* PFNMAP vma that is fully mapped at mmap time */
+#define VM_PAT		0x40000000	/* PAT reserves whole VMA at once (x86) */
 #define VM_MERGEABLE	0x80000000	/* KSM may merge identical pages */
 
 /* Bits set in the VMA until the stack is in its final location */
@@ -158,19 +158,6 @@ extern pgprot_t protection_map[16];
 #define FAULT_FLAG_RETRY_NOWAIT	0x10	/* Don't drop mmap_sem and wait when retrying */
 #define FAULT_FLAG_KILLABLE	0x20	/* The fault task is in SIGKILL killable region */
 
-/*
- * This interface is used by x86 PAT code to identify a pfn mapping that is
- * linear over entire vma. This is to optimize PAT code that deals with
- * marking the physical region with a particular prot. This is not for generic
- * mm use. Note also that this check will not work if the pfn mapping is
- * linear for a vma starting at physical address 0. In which case PAT code
- * falls back to slow path of reserving physical range page by page.
- */
-static inline int is_linear_pfn_mapping(struct vm_area_struct *vma)
-{
-	return !!(vma->vm_flags & VM_PFN_AT_MMAP);
-}
-
 static inline int is_pfn_mapping(struct vm_area_struct *vma)
 {
 	return !!(vma->vm_flags & VM_PFNMAP);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f0e5306..cf827da 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1650,7 +1650,7 @@ int khugepaged_enter_vma_merge(struct vm_area_struct *vma)
 	 * If is_pfn_mapping() is true is_learn_pfn_mapping() must be
 	 * true too, verify it here.
 	 */
-	VM_BUG_ON(is_linear_pfn_mapping(vma) || vma->vm_flags & VM_NO_THP);
+	VM_BUG_ON(vma->vm_flags & VM_NO_THP);
 	hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
 	hend = vma->vm_end & HPAGE_PMD_MASK;
 	if (hstart < hend)
@@ -1908,7 +1908,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 	 * If is_pfn_mapping() is true is_learn_pfn_mapping() must be
 	 * true too, verify it here.
 	 */
-	VM_BUG_ON(is_linear_pfn_mapping(vma) || vma->vm_flags & VM_NO_THP);
+	VM_BUG_ON(vma->vm_flags & VM_NO_THP);
 
 	pgd = pgd_offset(mm, address);
 	if (!pgd_present(*pgd))
@@ -2150,8 +2150,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
 		 * If is_pfn_mapping() is true is_learn_pfn_mapping()
 		 * must be true too, verify it here.
 		 */
-		VM_BUG_ON(is_linear_pfn_mapping(vma) ||
-			  vma->vm_flags & VM_NO_THP);
+		VM_BUG_ON(vma->vm_flags & VM_NO_THP);
 
 		hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
 		hend = vma->vm_end & HPAGE_PMD_MASK;
diff --git a/mm/memory.c b/mm/memory.c
index 6105f47..e6e4dfd 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2145,7 +2145,7 @@ int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 
 	if (addr < vma->vm_start || addr >= vma->vm_end)
 		return -EFAULT;
-	if (track_pfn_vma_new(vma, &pgprot, pfn, PAGE_SIZE))
+	if (track_pfn_vma_new(vma, &pgprot, pfn, addr, PAGE_SIZE))
 		return -EINVAL;
 
 	ret = insert_pfn(vma, addr, pfn, pgprot);
@@ -2285,23 +2285,24 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
 	 * There's a horrible special case to handle copy-on-write
 	 * behaviour that some programs depend on. We mark the "original"
 	 * un-COW'ed pages by matching them up with "vma->vm_pgoff".
+	 * See vm_normal_page() for details.
 	 */
-	if (addr == vma->vm_start && end == vma->vm_end) {
+
+	if (is_cow_mapping(vma->vm_flags)) {
+		if (addr != vma->vm_start || end != vma->vm_end)
+			return -EINVAL;
 		vma->vm_pgoff = pfn;
-		vma->vm_flags |= VM_PFN_AT_MMAP;
-	} else if (is_cow_mapping(vma->vm_flags))
-		return -EINVAL;
+	}
 
 	vma->vm_flags |= VM_IO | VM_RESERVED | VM_PFNMAP;
 
-	err = track_pfn_vma_new(vma, &prot, pfn, PAGE_ALIGN(size));
+	err = track_pfn_vma_new(vma, &prot, pfn, addr, PAGE_ALIGN(size));
 	if (err) {
 		/*
 		 * To indicate that track_pfn related cleanup is not
 		 * needed from higher level routine calling unmap_vmas
 		 */
 		vma->vm_flags &= ~(VM_IO | VM_RESERVED | VM_PFNMAP);
-		vma->vm_flags &= ~VM_PFN_AT_MMAP;
 		return -EINVAL;
 	}
 



* [PATCH 2/7] mm: introduce vma flag VM_ARCH_1
  2012-03-31  9:25 [PATCH 0/7] mm: vma->vm_flags diet Konstantin Khlebnikov
  2012-03-31  9:29 ` [PATCH 1/7] mm, x86, PAT: rework linear pfn-mmap tracking Konstantin Khlebnikov
@ 2012-03-31  9:29 ` Konstantin Khlebnikov
  2012-03-31 22:25   ` Benjamin Herrenschmidt
  2012-03-31  9:29 ` [PATCH 3/7] mm: kill vma flag VM_CAN_NONLINEAR Konstantin Khlebnikov
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-03-31  9:29 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, linux-kernel
  Cc: Andrea Arcangeli, Minchan Kim, Linus Torvalds

This patch shuffles some bits in vma->vm_flags.

before patch:

        0x00000200      0x01000000      0x20000000      0x40000000
x86     VM_NOHUGEPAGE   VM_HUGEPAGE     -               VM_PAT
powerpc -               -               VM_SAO          -
parisc  VM_GROWSUP      -               -               -
ia64    VM_GROWSUP      -               -               -
nommu   -               VM_MAPPED_COPY  -               -
others  -               -               -               -

after patch:

        0x00000200      0x01000000      0x20000000      0x40000000
x86     -               VM_PAT          VM_HUGEPAGE     VM_NOHUGEPAGE
powerpc -               VM_SAO          -               -
parisc  -               VM_GROWSUP      -               -
ia64    -               VM_GROWSUP      -               -
nommu   -               VM_MAPPED_COPY  -               -
others  -               VM_ARCH_1       -               -

And voila! One completely free bit.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
---
 include/linux/mm.h |   34 +++++++++++++++++++++-------------
 mm/huge_memory.c   |    2 +-
 mm/ksm.c           |    7 ++++++-
 3 files changed, 28 insertions(+), 15 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index b8e5fe5..a444f47 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -70,6 +70,8 @@ extern unsigned int kobjsize(const void *objp);
 /*
  * vm_flags in vm_area_struct, see mm_types.h.
  */
+#define VM_NONE		0x00000000
+
 #define VM_READ		0x00000001	/* currently active flags */
 #define VM_WRITE	0x00000002
 #define VM_EXEC		0x00000004
@@ -82,12 +84,6 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_MAYSHARE	0x00000080
 
 #define VM_GROWSDOWN	0x00000100	/* general info on the segment */
-#if defined(CONFIG_STACK_GROWSUP) || defined(CONFIG_IA64)
-#define VM_GROWSUP	0x00000200
-#else
-#define VM_GROWSUP	0x00000000
-#define VM_NOHUGEPAGE	0x00000200	/* MADV_NOHUGEPAGE marked this vma */
-#endif
 #define VM_PFNMAP	0x00000400	/* Page-ranges managed without "struct page", just pure PFN */
 #define VM_DENYWRITE	0x00000800	/* ETXTBSY on write attempts.. */
 
@@ -106,20 +102,32 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_NORESERVE	0x00200000	/* should the VM suppress accounting */
 #define VM_HUGETLB	0x00400000	/* Huge TLB Page VM */
 #define VM_NONLINEAR	0x00800000	/* Is non-linear (remap_file_pages) */
-#ifndef CONFIG_TRANSPARENT_HUGEPAGE
-#define VM_MAPPED_COPY	0x01000000	/* T if mapped copy of data (nommu mmap) */
-#else
-#define VM_HUGEPAGE	0x01000000	/* MADV_HUGEPAGE marked this vma */
-#endif
+#define VM_ARCH_1	0x01000000	/* Architecture-specific flag */
 #define VM_INSERTPAGE	0x02000000	/* The vma has had "vm_insert_page()" done on it */
 #define VM_NODUMP	0x04000000	/* Do not include in the core dump */
 
 #define VM_CAN_NONLINEAR 0x08000000	/* Has ->fault & does nonlinear pages */
 #define VM_MIXEDMAP	0x10000000	/* Can contain "struct page" and pure PFN pages */
-#define VM_SAO		0x20000000	/* Strong Access Ordering (powerpc) */
-#define VM_PAT		0x40000000	/* PAT reserves whole VMA at once (x86) */
+#define VM_HUGEPAGE	0x20000000	/* MADV_HUGEPAGE marked this vma */
+#define VM_NOHUGEPAGE	0x40000000	/* MADV_NOHUGEPAGE marked this vma */
 #define VM_MERGEABLE	0x80000000	/* KSM may merge identical pages */
 
+#if defined(CONFIG_X86)
+# define VM_PAT		VM_ARCH_1	/* PAT reserves whole VMA at once (x86) */
+#elif defined(CONFIG_PPC)
+# define VM_SAO		VM_ARCH_1	/* Strong Access Ordering (powerpc) */
+#elif defined(CONFIG_PARISC)
+# define VM_GROWSUP	VM_ARCH_1
+#elif defined(CONFIG_IA64)
+# define VM_GROWSUP	VM_ARCH_1
+#elif !defined(CONFIG_MMU)
+# define VM_MAPPED_COPY	VM_ARCH_1	/* T if mapped copy of data (nommu mmap) */
+#endif
+
+#ifndef VM_GROWSUP
+# define VM_GROWSUP	VM_NONE
+#endif
+
 /* Bits set in the VMA until the stack is in its final location */
 #define VM_STACK_INCOMPLETE_SETUP	(VM_RAND_READ | VM_SEQ_READ)
 
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index cf827da..6ea5477 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1482,7 +1482,7 @@ out:
 	return ret;
 }
 
-#define VM_NO_THP (VM_SPECIAL|VM_INSERTPAGE|VM_MIXEDMAP|VM_SAO| \
+#define VM_NO_THP (VM_SPECIAL|VM_INSERTPAGE|VM_MIXEDMAP| \
 		   VM_HUGETLB|VM_SHARED|VM_MAYSHARE)
 
 int hugepage_madvise(struct vm_area_struct *vma,
diff --git a/mm/ksm.c b/mm/ksm.c
index 47c8853..d1cbe2a 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1470,9 +1470,14 @@ int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
 		if (*vm_flags & (VM_MERGEABLE | VM_SHARED  | VM_MAYSHARE   |
 				 VM_PFNMAP    | VM_IO      | VM_DONTEXPAND |
 				 VM_RESERVED  | VM_HUGETLB | VM_INSERTPAGE |
-				 VM_NONLINEAR | VM_MIXEDMAP | VM_SAO))
+				 VM_NONLINEAR | VM_MIXEDMAP))
 			return 0;		/* just ignore the advice */
 
+#ifdef VM_SAO
+		if (*vm_flags & VM_SAO)
+			return 0;
+#endif
+
 		if (!test_bit(MMF_VM_MERGEABLE, &mm->flags)) {
 			err = __ksm_enter(mm);
 			if (err)



* [PATCH 3/7] mm: kill vma flag VM_CAN_NONLINEAR
  2012-03-31  9:25 [PATCH 0/7] mm: vma->vm_flags diet Konstantin Khlebnikov
  2012-03-31  9:29 ` [PATCH 1/7] mm, x86, PAT: rework linear pfn-mmap tracking Konstantin Khlebnikov
  2012-03-31  9:29 ` [PATCH 2/7] mm: introduce vma flag VM_ARCH_1 Konstantin Khlebnikov
@ 2012-03-31  9:29 ` Konstantin Khlebnikov
  2012-03-31 17:01   ` Linus Torvalds
  2012-03-31  9:29 ` [PATCH 4/7] mm: kill vma flag VM_INSERTPAGE Konstantin Khlebnikov
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-03-31  9:29 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, linux-kernel
  Cc: Nick Piggin, Ingo Molnar, Linus Torvalds, Alexander Viro

This patch moves the actual PTE filling for non-linear file mappings
into a dedicated vma operation: ->remap_pages().

A filesystem must now implement this method to get non-linear mapping support.
If the filesystem uses filemap_fault(), it can use generic_file_remap_pages() for this.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Ingo Molnar <mingo@redhat.com>
---
 drivers/staging/android/ashmem.c |    1 -
 fs/9p/vfs_file.c                 |    1 +
 fs/btrfs/file.c                  |    2 +-
 fs/ceph/addr.c                   |    2 +-
 fs/cifs/file.c                   |    1 +
 fs/ecryptfs/file.c               |    1 +
 fs/ext4/file.c                   |    2 +-
 fs/fuse/file.c                   |    1 +
 fs/gfs2/file.c                   |    2 +-
 fs/nfs/file.c                    |    1 +
 fs/nilfs2/file.c                 |    2 +-
 fs/ocfs2/mmap.c                  |    2 +-
 fs/ubifs/file.c                  |    1 +
 fs/xfs/xfs_file.c                |    2 +-
 include/linux/fs.h               |    2 ++
 include/linux/mm.h               |    6 +++---
 mm/filemap.c                     |    2 +-
 mm/filemap_xip.c                 |    3 ++-
 mm/fremap.c                      |   14 ++++++++------
 mm/mmap.c                        |    3 +--
 mm/nommu.c                       |    8 ++++++++
 mm/shmem.c                       |    3 +--
 22 files changed, 39 insertions(+), 23 deletions(-)

diff --git a/drivers/staging/android/ashmem.c b/drivers/staging/android/ashmem.c
index 9f1f27e..8b36d3d 100644
--- a/drivers/staging/android/ashmem.c
+++ b/drivers/staging/android/ashmem.c
@@ -329,7 +329,6 @@ static int ashmem_mmap(struct file *file, struct vm_area_struct *vma)
 	if (vma->vm_file)
 		fput(vma->vm_file);
 	vma->vm_file = asma->file;
-	vma->vm_flags |= VM_CAN_NONLINEAR;
 
 out:
 	mutex_unlock(&ashmem_mutex);
diff --git a/fs/9p/vfs_file.c b/fs/9p/vfs_file.c
index fc06fd2..34b84f0 100644
--- a/fs/9p/vfs_file.c
+++ b/fs/9p/vfs_file.c
@@ -735,6 +735,7 @@ v9fs_cached_file_write(struct file *filp, const char __user * data,
 static const struct vm_operations_struct v9fs_file_vm_ops = {
 	.fault = filemap_fault,
 	.page_mkwrite = v9fs_vm_page_mkwrite,
+	.remap_pages = generic_file_remap_pages,
 };
 
 
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index d83260d..29a8cfb 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1576,6 +1576,7 @@ out:
 static const struct vm_operations_struct btrfs_file_vm_ops = {
 	.fault		= filemap_fault,
 	.page_mkwrite	= btrfs_page_mkwrite,
+	.remap_pages	= generic_file_remap_pages,
 };
 
 static int btrfs_file_mmap(struct file	*filp, struct vm_area_struct *vma)
@@ -1587,7 +1588,6 @@ static int btrfs_file_mmap(struct file	*filp, struct vm_area_struct *vma)
 
 	file_accessed(filp);
 	vma->vm_ops = &btrfs_file_vm_ops;
-	vma->vm_flags |= VM_CAN_NONLINEAR;
 
 	return 0;
 }
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 173b1d2..1745051 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -1219,6 +1219,7 @@ out:
 static struct vm_operations_struct ceph_vmops = {
 	.fault		= filemap_fault,
 	.page_mkwrite	= ceph_page_mkwrite,
+	.remap_pages	= generic_file_remap_pages,
 };
 
 int ceph_mmap(struct file *file, struct vm_area_struct *vma)
@@ -1229,6 +1230,5 @@ int ceph_mmap(struct file *file, struct vm_area_struct *vma)
 		return -ENOEXEC;
 	file_accessed(file);
 	vma->vm_ops = &ceph_vmops;
-	vma->vm_flags |= VM_CAN_NONLINEAR;
 	return 0;
 }
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 460d87b..591757f 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -2557,6 +2557,7 @@ cifs_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
 static struct vm_operations_struct cifs_file_vm_ops = {
 	.fault = filemap_fault,
 	.page_mkwrite = cifs_page_mkwrite,
+	.remap_pages = generic_file_remap_pages,
 };
 
 int cifs_file_strict_mmap(struct file *file, struct vm_area_struct *vma)
diff --git a/fs/ecryptfs/file.c b/fs/ecryptfs/file.c
index 2b17f2f..367de3b 100644
--- a/fs/ecryptfs/file.c
+++ b/fs/ecryptfs/file.c
@@ -146,6 +146,7 @@ static void ecryptfs_vma_close(struct vm_area_struct *vma)
 static const struct vm_operations_struct ecryptfs_file_vm_ops = {
 	.close		= ecryptfs_vma_close,
 	.fault		= filemap_fault,
+	.remap_pages	= generic_file_remap_pages,
 };
 
 static int ecryptfs_file_mmap(struct file *file, struct vm_area_struct *vma)
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index cb70f18..adc9a39 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -144,6 +144,7 @@ ext4_file_write(struct kiocb *iocb, const struct iovec *iov,
 static const struct vm_operations_struct ext4_file_vm_ops = {
 	.fault		= filemap_fault,
 	.page_mkwrite   = ext4_page_mkwrite,
+	.remap_pages	= generic_file_remap_pages,
 };
 
 static int ext4_file_mmap(struct file *file, struct vm_area_struct *vma)
@@ -154,7 +155,6 @@ static int ext4_file_mmap(struct file *file, struct vm_area_struct *vma)
 		return -ENOEXEC;
 	file_accessed(file);
 	vma->vm_ops = &ext4_file_vm_ops;
-	vma->vm_flags |= VM_CAN_NONLINEAR;
 	return 0;
 }
 
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index a841868..804f0ac 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1331,6 +1331,7 @@ static const struct vm_operations_struct fuse_file_vm_ops = {
 	.close		= fuse_vma_close,
 	.fault		= filemap_fault,
 	.page_mkwrite	= fuse_page_mkwrite,
+	.remap_pages	= generic_file_remap_pages,
 };
 
 static int fuse_file_mmap(struct file *file, struct vm_area_struct *vma)
diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index 7683458..9f0804b 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -470,6 +470,7 @@ out:
 static const struct vm_operations_struct gfs2_vm_ops = {
 	.fault = filemap_fault,
 	.page_mkwrite = gfs2_page_mkwrite,
+	.remap_pages = generic_file_remap_pages,
 };
 
 /**
@@ -504,7 +505,6 @@ static int gfs2_mmap(struct file *file, struct vm_area_struct *vma)
 			return error;
 	}
 	vma->vm_ops = &gfs2_vm_ops;
-	vma->vm_flags |= VM_CAN_NONLINEAR;
 
 	return 0;
 }
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index aa9b709..31eae53 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -550,6 +550,7 @@ out:
 static const struct vm_operations_struct nfs_file_vm_ops = {
 	.fault = filemap_fault,
 	.page_mkwrite = nfs_vm_page_mkwrite,
+	.remap_pages = generic_file_remap_pages,
 };
 
 static int nfs_need_sync_write(struct file *filp, struct inode *inode)
diff --git a/fs/nilfs2/file.c b/fs/nilfs2/file.c
index 2660152..303cbb9 100644
--- a/fs/nilfs2/file.c
+++ b/fs/nilfs2/file.c
@@ -126,13 +126,13 @@ static int nilfs_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
 static const struct vm_operations_struct nilfs_file_vm_ops = {
 	.fault		= filemap_fault,
 	.page_mkwrite	= nilfs_page_mkwrite,
+	.remap_pages	= generic_file_remap_pages,
 };
 
 static int nilfs_file_mmap(struct file *file, struct vm_area_struct *vma)
 {
 	file_accessed(file);
 	vma->vm_ops = &nilfs_file_vm_ops;
-	vma->vm_flags |= VM_CAN_NONLINEAR;
 	return 0;
 }
 
diff --git a/fs/ocfs2/mmap.c b/fs/ocfs2/mmap.c
index 9cd4108..7aef3f4 100644
--- a/fs/ocfs2/mmap.c
+++ b/fs/ocfs2/mmap.c
@@ -171,6 +171,7 @@ out:
 static const struct vm_operations_struct ocfs2_file_vm_ops = {
 	.fault		= ocfs2_fault,
 	.page_mkwrite	= ocfs2_page_mkwrite,
+	.remap_pages	= generic_file_remap_pages,
 };
 
 int ocfs2_mmap(struct file *file, struct vm_area_struct *vma)
@@ -186,7 +187,6 @@ int ocfs2_mmap(struct file *file, struct vm_area_struct *vma)
 	ocfs2_inode_unlock(file->f_dentry->d_inode, lock_level);
 out:
 	vma->vm_ops = &ocfs2_file_vm_ops;
-	vma->vm_flags |= VM_CAN_NONLINEAR;
 	return 0;
 }
 
diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index 5c8f6dc..fc59462 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -1536,6 +1536,7 @@ out_unlock:
 static const struct vm_operations_struct ubifs_file_vm_ops = {
 	.fault        = filemap_fault,
 	.page_mkwrite = ubifs_vm_page_mkwrite,
+	.remap_pages = generic_file_remap_pages,
 };
 
 static int ubifs_file_mmap(struct file *file, struct vm_area_struct *vma)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 54a67dd..4381874 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -987,7 +987,6 @@ xfs_file_mmap(
 	struct vm_area_struct *vma)
 {
 	vma->vm_ops = &xfs_file_vm_ops;
-	vma->vm_flags |= VM_CAN_NONLINEAR;
 
 	file_accessed(filp);
 	return 0;
@@ -1041,4 +1040,5 @@ const struct file_operations xfs_dir_file_operations = {
 static const struct vm_operations_struct xfs_file_vm_ops = {
 	.fault		= filemap_fault,
 	.page_mkwrite	= xfs_vm_page_mkwrite,
+	.remap_pages	= generic_file_remap_pages,
 };
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 135693e..db55afd 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2356,6 +2356,8 @@ extern int sb_min_blocksize(struct super_block *, int);
 
 extern int generic_file_mmap(struct file *, struct vm_area_struct *);
 extern int generic_file_readonly_mmap(struct file *, struct vm_area_struct *);
+extern int generic_file_remap_pages(struct vm_area_struct *, unsigned long addr,
+		unsigned long size, pgoff_t pgoff);
 extern int file_read_actor(read_descriptor_t * desc, struct page *page, unsigned long offset, unsigned long size);
 int generic_write_checks(struct file *file, loff_t *pos, size_t *count, int isblk);
 extern ssize_t generic_file_aio_read(struct kiocb *, const struct iovec *, unsigned long, loff_t);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index a444f47..0dad037 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -106,7 +106,6 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_INSERTPAGE	0x02000000	/* The vma has had "vm_insert_page()" done on it */
 #define VM_NODUMP	0x04000000	/* Do not include in the core dump */
 
-#define VM_CAN_NONLINEAR 0x08000000	/* Has ->fault & does nonlinear pages */
 #define VM_MIXEDMAP	0x10000000	/* Can contain "struct page" and pure PFN pages */
 #define VM_HUGEPAGE	0x20000000	/* MADV_HUGEPAGE marked this vma */
 #define VM_NOHUGEPAGE	0x40000000	/* MADV_NOHUGEPAGE marked this vma */
@@ -177,8 +176,7 @@ static inline int is_pfn_mapping(struct vm_area_struct *vma)
  * of VM_FAULT_xxx flags that give details about how the fault was handled.
  *
  * pgoff should be used in favour of virtual_address, if possible. If pgoff
- * is used, one may set VM_CAN_NONLINEAR in the vma->vm_flags to get nonlinear
- * mapping support.
+ * is used, one may implement ->remap_pages to get nonlinear mapping support.
  */
 struct vm_fault {
 	unsigned int flags;		/* FAULT_FLAG_xxx flags */
@@ -236,6 +234,8 @@ struct vm_operations_struct {
 	int (*migrate)(struct vm_area_struct *vma, const nodemask_t *from,
 		const nodemask_t *to, unsigned long flags);
 #endif
+	int (*remap_pages)(struct vm_area_struct *vma, unsigned long addr,
+			   unsigned long size, pgoff_t pgoff);
 };
 
 struct mmu_gather;
diff --git a/mm/filemap.c b/mm/filemap.c
index 79c4b2b..34cce46 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1753,6 +1753,7 @@ EXPORT_SYMBOL(filemap_fault);
 
 const struct vm_operations_struct generic_file_vm_ops = {
 	.fault		= filemap_fault,
+	.remap_pages	= generic_file_remap_pages,
 };
 
 /* This is used for a general mmap of a disk file */
@@ -1765,7 +1766,6 @@ int generic_file_mmap(struct file * file, struct vm_area_struct * vma)
 		return -ENOEXEC;
 	file_accessed(file);
 	vma->vm_ops = &generic_file_vm_ops;
-	vma->vm_flags |= VM_CAN_NONLINEAR;
 	return 0;
 }
 
diff --git a/mm/filemap_xip.c b/mm/filemap_xip.c
index a4eb311..3c38d07 100644
--- a/mm/filemap_xip.c
+++ b/mm/filemap_xip.c
@@ -304,6 +304,7 @@ out:
 
 static const struct vm_operations_struct xip_file_vm_ops = {
 	.fault	= xip_file_fault,
+	.remap_pages = generic_file_remap_pages,
 };
 
 int xip_file_mmap(struct file * file, struct vm_area_struct * vma)
@@ -312,7 +313,7 @@ int xip_file_mmap(struct file * file, struct vm_area_struct * vma)
 
 	file_accessed(file);
 	vma->vm_ops = &xip_file_vm_ops;
-	vma->vm_flags |= VM_CAN_NONLINEAR | VM_MIXEDMAP;
+	vma->vm_flags |= VM_MIXEDMAP;
 	return 0;
 }
 EXPORT_SYMBOL_GPL(xip_file_mmap);
diff --git a/mm/fremap.c b/mm/fremap.c
index 9ed4fd4..c70190a 100644
--- a/mm/fremap.c
+++ b/mm/fremap.c
@@ -5,6 +5,7 @@
  *
  * started by Ingo Molnar, Copyright (C) 2002, 2003
  */
+#include <linux/export.h>
 #include <linux/backing-dev.h>
 #include <linux/mm.h>
 #include <linux/swap.h>
@@ -80,9 +81,10 @@ out:
 	return err;
 }
 
-static int populate_range(struct mm_struct *mm, struct vm_area_struct *vma,
-			unsigned long addr, unsigned long size, pgoff_t pgoff)
+int generic_file_remap_pages(struct vm_area_struct *vma, unsigned long addr,
+			     unsigned long size, pgoff_t pgoff)
 {
+	struct mm_struct *mm = vma->vm_mm;
 	int err;
 
 	do {
@@ -95,9 +97,9 @@ static int populate_range(struct mm_struct *mm, struct vm_area_struct *vma,
 		pgoff++;
 	} while (size);
 
-        return 0;
-
+	return 0;
 }
+EXPORT_SYMBOL(generic_file_remap_pages);
 
 /**
  * sys_remap_file_pages - remap arbitrary pages of an existing VM_SHARED vma
@@ -167,7 +169,7 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
 	if (vma->vm_private_data && !(vma->vm_flags & VM_NONLINEAR))
 		goto out;
 
-	if (!(vma->vm_flags & VM_CAN_NONLINEAR))
+	if (!vma->vm_ops->remap_pages)
 		goto out;
 
 	if (start < vma->vm_start || start + size > vma->vm_end)
@@ -229,7 +231,7 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
 	}
 
 	mmu_notifier_invalidate_range_start(mm, start, start + size);
-	err = populate_range(mm, vma, start, size, pgoff);
+	err = vma->vm_ops->remap_pages(vma, start, size, pgoff);
 	mmu_notifier_invalidate_range_end(mm, start, start + size);
 	if (!err && !(flags & MAP_NONBLOCK)) {
 		if (vma->vm_flags & VM_LOCKED) {
diff --git a/mm/mmap.c b/mm/mmap.c
index a7bf6a3..1a23d2c 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -649,8 +649,7 @@ again:			remove_next = 1 + (end > next->vm_end);
 static inline int is_mergeable_vma(struct vm_area_struct *vma,
 			struct file *file, unsigned long vm_flags)
 {
-	/* VM_CAN_NONLINEAR may get set later by f_op->mmap() */
-	if ((vma->vm_flags ^ vm_flags) & ~VM_CAN_NONLINEAR)
+	if (vma->vm_flags ^ vm_flags)
 		return 0;
 	if (vma->vm_file != file)
 		return 0;
diff --git a/mm/nommu.c b/mm/nommu.c
index f59e170..afa0a15 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1961,6 +1961,14 @@ int filemap_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 }
 EXPORT_SYMBOL(filemap_fault);
 
+int generic_file_remap_pages(struct vm_area_struct *vma, unsigned long addr,
+			     unsigned long size, pgoff_t pgoff)
+{
+	BUG();
+	return 0;
+}
+EXPORT_SYMBOL(generic_file_remap_pages);
+
 static int __access_remote_vm(struct task_struct *tsk, struct mm_struct *mm,
 		unsigned long addr, void *buf, int len, int write)
 {
diff --git a/mm/shmem.c b/mm/shmem.c
index f99ff3e..617621a 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1112,7 +1112,6 @@ static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
 {
 	file_accessed(file);
 	vma->vm_ops = &shmem_vm_ops;
-	vma->vm_flags |= VM_CAN_NONLINEAR;
 	return 0;
 }
 
@@ -2430,6 +2429,7 @@ static const struct vm_operations_struct shmem_vm_ops = {
 	.set_policy     = shmem_set_policy,
 	.get_policy     = shmem_get_policy,
 #endif
+	.remap_pages	= generic_file_remap_pages,
 };
 
 static struct dentry *shmem_mount(struct file_system_type *fs_type,
@@ -2623,7 +2623,6 @@ int shmem_zero_setup(struct vm_area_struct *vma)
 		fput(vma->vm_file);
 	vma->vm_file = file;
 	vma->vm_ops = &shmem_vm_ops;
-	vma->vm_flags |= VM_CAN_NONLINEAR;
 	return 0;
 }
 


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 4/7] mm: kill vma flag VM_INSERTPAGE
  2012-03-31  9:25 [PATCH 0/7] mm: vma->vm_flags diet Konstantin Khlebnikov
                   ` (2 preceding siblings ...)
  2012-03-31  9:29 ` [PATCH 3/7] mm: kill vma flag VM_CAN_NONLINEAR Konstantin Khlebnikov
@ 2012-03-31  9:29 ` Konstantin Khlebnikov
  2012-03-31  9:29 ` [PATCH 5/7] mm, drm/udl: fixup vma flags on mmap Konstantin Khlebnikov
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-03-31  9:29 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, linux-kernel
  Cc: Nick Piggin, Carsten Otte, Linus Torvalds, Peter Zijlstra

This patch merges VM_INSERTPAGE into VM_MIXEDMAP (and moves VM_MIXEDMAP next
to VM_PFNMAP). A VM_MIXEDMAP vma can already mix pure-pfn ptes, special ptes
and normal ptes anyway.

Side effects of this patch:
* copy_page_range() now always copies a VM_MIXEDMAP vma on fork (why not?)
* with HAVE_PTE_SPECIAL, non-special ptes now appear in VM_MIXEDMAP vmas.
  This seems fine; all code is ready for it.
* without HAVE_PTE_SPECIAL, vm_normal_page() will check pfn_valid() even for
  pages inserted via vm_insert_page()
* a small change in vma_wants_writenotify(); do_wp_page() can apparently handle it.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Carsten Otte <cotte@de.ibm.com>
---
 include/linux/mm.h |    3 +--
 mm/huge_memory.c   |    3 +--
 mm/ksm.c           |    2 +-
 mm/memory.c        |   14 ++++++++++++--
 mm/mmap.c          |    2 +-
 5 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0dad037..553d134 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -84,6 +84,7 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_MAYSHARE	0x00000080
 
 #define VM_GROWSDOWN	0x00000100	/* general info on the segment */
+#define VM_MIXEDMAP	0x00000200	/* Can contain "struct page" and pure PFN pages */
 #define VM_PFNMAP	0x00000400	/* Page-ranges managed without "struct page", just pure PFN */
 #define VM_DENYWRITE	0x00000800	/* ETXTBSY on write attempts.. */
 
@@ -103,10 +104,8 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HUGETLB	0x00400000	/* Huge TLB Page VM */
 #define VM_NONLINEAR	0x00800000	/* Is non-linear (remap_file_pages) */
 #define VM_ARCH_1	0x01000000	/* Architecture-specific flag */
-#define VM_INSERTPAGE	0x02000000	/* The vma has had "vm_insert_page()" done on it */
 #define VM_NODUMP	0x04000000	/* Do not include in the core dump */
 
-#define VM_MIXEDMAP	0x10000000	/* Can contain "struct page" and pure PFN pages */
 #define VM_HUGEPAGE	0x20000000	/* MADV_HUGEPAGE marked this vma */
 #define VM_NOHUGEPAGE	0x40000000	/* MADV_NOHUGEPAGE marked this vma */
 #define VM_MERGEABLE	0x80000000	/* KSM may merge identical pages */
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 6ea5477..65ed599 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1482,8 +1482,7 @@ out:
 	return ret;
 }
 
-#define VM_NO_THP (VM_SPECIAL|VM_INSERTPAGE|VM_MIXEDMAP| \
-		   VM_HUGETLB|VM_SHARED|VM_MAYSHARE)
+#define VM_NO_THP (VM_SPECIAL|VM_MIXEDMAP|VM_HUGETLB|VM_SHARED|VM_MAYSHARE)
 
 int hugepage_madvise(struct vm_area_struct *vma,
 		     unsigned long *vm_flags, int advice)
diff --git a/mm/ksm.c b/mm/ksm.c
index d1cbe2a..f9ccb16 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1469,7 +1469,7 @@ int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
 		 */
 		if (*vm_flags & (VM_MERGEABLE | VM_SHARED  | VM_MAYSHARE   |
 				 VM_PFNMAP    | VM_IO      | VM_DONTEXPAND |
-				 VM_RESERVED  | VM_HUGETLB | VM_INSERTPAGE |
+				 VM_RESERVED  | VM_HUGETLB |
 				 VM_NONLINEAR | VM_MIXEDMAP))
 			return 0;		/* just ignore the advice */
 
diff --git a/mm/memory.c b/mm/memory.c
index e6e4dfd..9b8db37 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1043,7 +1043,8 @@ int copy_page_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	 * readonly mappings. The tradeoff is that copy_page_range is more
 	 * efficient than faulting.
 	 */
-	if (!(vma->vm_flags & (VM_HUGETLB|VM_NONLINEAR|VM_PFNMAP|VM_INSERTPAGE))) {
+	if (!(vma->vm_flags & (VM_HUGETLB | VM_NONLINEAR |
+			       VM_PFNMAP | VM_MIXEDMAP))) {
 		if (!vma->anon_vma)
 			return 0;
 	}
@@ -2068,6 +2069,11 @@ out:
  * ask for a shared writable mapping!
  *
  * The page does not need to be reserved.
+ *
+ * Usually this function is called from f_op->mmap() handler
+ * under mm->mmap_sem write-lock, so it can change vma->vm_flags.
+ * Caller must set VM_MIXEDMAP on vma if it wants to call this
+ * function from other places, for example from page-fault handler.
  */
 int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
 			struct page *page)
@@ -2076,7 +2082,11 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
 		return -EFAULT;
 	if (!page_count(page))
 		return -EINVAL;
-	vma->vm_flags |= VM_INSERTPAGE;
+	if (!(vma->vm_flags & VM_MIXEDMAP)) {
+		VM_BUG_ON(down_read_trylock(&vma->vm_mm->mmap_sem));
+		VM_BUG_ON(vma->vm_flags & VM_PFNMAP);
+		vma->vm_flags |= VM_MIXEDMAP;
+	}
 	return insert_page(vma, addr, page, vma->vm_page_prot);
 }
 EXPORT_SYMBOL(vm_insert_page);
diff --git a/mm/mmap.c b/mm/mmap.c
index 1a23d2c..3d254ca 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1177,7 +1177,7 @@ int vma_wants_writenotify(struct vm_area_struct *vma)
 		return 0;
 
 	/* Specialty mapping? */
-	if (vm_flags & (VM_PFNMAP|VM_INSERTPAGE))
+	if (vm_flags & VM_PFNMAP)
 		return 0;
 
 	/* Can the mapping track the dirty pages? */


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 5/7] mm, drm/udl: fixup vma flags on mmap
  2012-03-31  9:25 [PATCH 0/7] mm: vma->vm_flags diet Konstantin Khlebnikov
                   ` (3 preceding siblings ...)
  2012-03-31  9:29 ` [PATCH 4/7] mm: kill vma flag VM_INSERTPAGE Konstantin Khlebnikov
@ 2012-03-31  9:29 ` Konstantin Khlebnikov
  2012-03-31  9:29 ` [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE Konstantin Khlebnikov
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-03-31  9:29 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, linux-kernel
  Cc: Dave Airlie, Linus Torvalds, dri-devel

The flag should be VM_MIXEDMAP, not VM_PFNMAP, because udl_gem_fault() inserts
pages via vm_insert_page(). Other drm/gem drivers already do this.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: dri-devel@lists.freedesktop.org
---
 drivers/gpu/drm/udl/udl_drv.c |    2 +-
 drivers/gpu/drm/udl/udl_drv.h |    1 +
 drivers/gpu/drm/udl/udl_gem.c |   14 ++++++++++++++
 3 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/udl/udl_drv.c b/drivers/gpu/drm/udl/udl_drv.c
index 5340c5f..5367390 100644
--- a/drivers/gpu/drm/udl/udl_drv.c
+++ b/drivers/gpu/drm/udl/udl_drv.c
@@ -47,7 +47,7 @@ static struct vm_operations_struct udl_gem_vm_ops = {
 static const struct file_operations udl_driver_fops = {
 	.owner = THIS_MODULE,
 	.open = drm_open,
-	.mmap = drm_gem_mmap,
+	.mmap = udl_drm_gem_mmap,
 	.poll = drm_poll,
 	.read = drm_read,
 	.unlocked_ioctl	= drm_ioctl,
diff --git a/drivers/gpu/drm/udl/udl_drv.h b/drivers/gpu/drm/udl/udl_drv.h
index 1612954..96820d0 100644
--- a/drivers/gpu/drm/udl/udl_drv.h
+++ b/drivers/gpu/drm/udl/udl_drv.h
@@ -121,6 +121,7 @@ struct udl_gem_object *udl_gem_alloc_object(struct drm_device *dev,
 
 int udl_gem_vmap(struct udl_gem_object *obj);
 void udl_gem_vunmap(struct udl_gem_object *obj);
+int udl_drm_gem_mmap(struct file *filp, struct vm_area_struct *vma);
 int udl_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf);
 
 int udl_handle_damage(struct udl_framebuffer *fb, int x, int y,
diff --git a/drivers/gpu/drm/udl/udl_gem.c b/drivers/gpu/drm/udl/udl_gem.c
index 852642d..92f19ef 100644
--- a/drivers/gpu/drm/udl/udl_gem.c
+++ b/drivers/gpu/drm/udl/udl_gem.c
@@ -71,6 +71,20 @@ int udl_dumb_destroy(struct drm_file *file, struct drm_device *dev,
 	return drm_gem_handle_delete(file, handle);
 }
 
+int udl_drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+	int ret;
+
+	ret = drm_gem_mmap(filp, vma);
+	if (ret)
+		return ret;
+
+	vma->vm_flags &= ~VM_PFNMAP;
+	vma->vm_flags |= VM_MIXEDMAP;
+
+	return ret;
+}
+
 int udl_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 {
 	struct udl_gem_object *obj = to_udl_bo(vma->vm_private_data);


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-03-31  9:25 [PATCH 0/7] mm: vma->vm_flags diet Konstantin Khlebnikov
                   ` (4 preceding siblings ...)
  2012-03-31  9:29 ` [PATCH 5/7] mm, drm/udl: fixup vma flags on mmap Konstantin Khlebnikov
@ 2012-03-31  9:29 ` Konstantin Khlebnikov
  2012-03-31 20:13   ` Oleg Nesterov
  2012-04-02 23:18   ` Matt Helsley
  2012-03-31  9:29 ` [PATCH 7/7] mm: move madvise vma flags to the end Konstantin Khlebnikov
  2012-03-31 14:06 ` [PATCH 0/7] mm: vma->vm_flags diet Andi Kleen
  7 siblings, 2 replies; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-03-31  9:29 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, linux-kernel
  Cc: Oleg Nesterov, Eric Paris, linux-security-module, oprofile-list,
	Matt Helsley, Linus Torvalds, Al Viro

Currently the kernel sets mm->exe_file during sys_execve() and then tracks the
number of vmas with the VM_EXECUTABLE flag in mm->num_exe_file_vmas; as soon as
this counter drops to zero, the kernel resets mm->exe_file to NULL. It also
resets mm->exe_file at the last mmput(), when mm->mm_users drops to zero.

A vma gets the VM_EXECUTABLE flag when a file is mapped with MAP_EXECUTABLE;
such vmas can appear only from sys_execve() or from vma splitting, because
sys_mmap ignores this flag. Usually a binfmt module sets mm->exe_file and mmaps
some executable vmas backed by this file; they hold mm->exe_file while the task
is running.

A comment from v2.6.25-6245-g925d1c4 ("procfs task exe symlink"),
where all this was introduced:

> The kernel implements readlink of /proc/pid/exe by getting the file from
> the first executable VMA.  Then the path to the file is reconstructed and
> reported as the result.
>
> Because of the VMA walk the code is slightly different on nommu systems.
> This patch avoids separate /proc/pid/exe code on nommu systems.  Instead of
> walking the VMAs to find the first executable file-backed VMA we store a
> reference to the exec'd file in the mm_struct.
>
> That reference would prevent the filesystem holding the executable file
> from being unmounted even after unmapping the VMAs.  So we track the number
> of VM_EXECUTABLE VMAs and drop the new reference when the last one is
> unmapped.  This avoids pinning the mounted filesystem.

So, this logic is hooked into every file mmap/unmap and vma split/merge just to
prevent one hypothetical case of pinning an fs against umounting: an mm that
has already unmapped all its executable files but is still alive. Does anyone
know a real-world example? The mm can be borrowed by swapoff or some
get_task_mm() user, but that is not a big problem.

Thus, we can remove all this stuff together with the VM_EXECUTABLE flag and
keep mm->exe_file alive until the final mmput().

After that, current->mm->exe_file can be accessed without any locks
(after checking current->mm and mm->exe_file for NULL).

Some code in security and oprofile still uses VM_EXECUTABLE to retrieve the
task's executable file; after this patch it uses mm->exe_file directly.
In tomoyo and audit, mm is always current->mm; oprofile uses get_task_mm().

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: linux-security-module@vger.kernel.org
Cc: oprofile-list@lists.sf.net
---
 arch/powerpc/oprofile/cell/spu_task_sync.c |   15 ++++----------
 arch/tile/mm/elf.c                         |   12 ++++--------
 drivers/oprofile/buffer_sync.c             |   17 +++-------------
 include/linux/mm.h                         |    4 ----
 include/linux/mm_types.h                   |    1 -
 include/linux/mman.h                       |    1 -
 kernel/auditsc.c                           |   17 ++--------------
 kernel/fork.c                              |   29 ++++------------------------
 mm/mmap.c                                  |   27 +++++---------------------
 mm/nommu.c                                 |   11 +----------
 security/tomoyo/util.c                     |   14 +++-----------
 11 files changed, 26 insertions(+), 122 deletions(-)

diff --git a/arch/powerpc/oprofile/cell/spu_task_sync.c b/arch/powerpc/oprofile/cell/spu_task_sync.c
index 642fca1..28f1af2 100644
--- a/arch/powerpc/oprofile/cell/spu_task_sync.c
+++ b/arch/powerpc/oprofile/cell/spu_task_sync.c
@@ -304,7 +304,7 @@ static inline unsigned long fast_get_dcookie(struct path *path)
 	return cookie;
 }
 
-/* Look up the dcookie for the task's first VM_EXECUTABLE mapping,
+/* Look up the dcookie for the task's mm->exe_file,
  * which corresponds loosely to "application name". Also, determine
  * the offset for the SPU ELF object.  If computed offset is
  * non-zero, it implies an embedded SPU object; otherwise, it's a
@@ -321,7 +321,6 @@ get_exec_dcookie_and_offset(struct spu *spu, unsigned int *offsetp,
 {
 	unsigned long app_cookie = 0;
 	unsigned int my_offset = 0;
-	struct file *app = NULL;
 	struct vm_area_struct *vma;
 	struct mm_struct *mm = spu->mm;
 
@@ -330,16 +329,10 @@ get_exec_dcookie_and_offset(struct spu *spu, unsigned int *offsetp,
 
 	down_read(&mm->mmap_sem);
 
-	for (vma = mm->mmap; vma; vma = vma->vm_next) {
-		if (!vma->vm_file)
-			continue;
-		if (!(vma->vm_flags & VM_EXECUTABLE))
-			continue;
-		app_cookie = fast_get_dcookie(&vma->vm_file->f_path);
+	if (mm->exe_file) {
+		app_cookie = fast_get_dcookie(&mm->exe_file->f_path);
 		pr_debug("got dcookie for %s\n",
-			 vma->vm_file->f_dentry->d_name.name);
-		app = vma->vm_file;
-		break;
+			 mm->exe_file->f_dentry->d_name.name);
 	}
 
 	for (vma = mm->mmap; vma; vma = vma->vm_next) {
diff --git a/arch/tile/mm/elf.c b/arch/tile/mm/elf.c
index 758b603..43e5279 100644
--- a/arch/tile/mm/elf.c
+++ b/arch/tile/mm/elf.c
@@ -39,16 +39,12 @@ static void sim_notify_exec(const char *binary_name)
 static int notify_exec(void)
 {
 	int retval = 0;  /* failure */
-	struct vm_area_struct *vma = current->mm->mmap;
-	while (vma) {
-		if ((vma->vm_flags & VM_EXECUTABLE) && vma->vm_file)
-			break;
-		vma = vma->vm_next;
-	}
-	if (vma) {
+	struct mm_struct *mm = current->mm;
+
+	if (mm->exe_file) {
 		char *buf = (char *) __get_free_page(GFP_KERNEL);
 		if (buf) {
-			char *path = d_path(&vma->vm_file->f_path,
+			char *path = d_path(&mm->exe_file->f_path,
 					    buf, PAGE_SIZE);
 			if (!IS_ERR(path)) {
 				sim_notify_exec(path);
diff --git a/drivers/oprofile/buffer_sync.c b/drivers/oprofile/buffer_sync.c
index f34b5b2..d93b2b6 100644
--- a/drivers/oprofile/buffer_sync.c
+++ b/drivers/oprofile/buffer_sync.c
@@ -216,7 +216,7 @@ static inline unsigned long fast_get_dcookie(struct path *path)
 }
 
 
-/* Look up the dcookie for the task's first VM_EXECUTABLE mapping,
+/* Look up the dcookie for the task's mm->exe_file,
  * which corresponds loosely to "application name". This is
  * not strictly necessary but allows oprofile to associate
  * shared-library samples with particular applications
@@ -224,21 +224,10 @@ static inline unsigned long fast_get_dcookie(struct path *path)
 static unsigned long get_exec_dcookie(struct mm_struct *mm)
 {
 	unsigned long cookie = NO_COOKIE;
-	struct vm_area_struct *vma;
-
-	if (!mm)
-		goto out;
 
-	for (vma = mm->mmap; vma; vma = vma->vm_next) {
-		if (!vma->vm_file)
-			continue;
-		if (!(vma->vm_flags & VM_EXECUTABLE))
-			continue;
-		cookie = fast_get_dcookie(&vma->vm_file->f_path);
-		break;
-	}
+	if (mm && mm->exe_file)
+		cookie = fast_get_dcookie(&mm->exe_file->f_path);
 
-out:
 	return cookie;
 }
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 553d134..3a4d721 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -88,7 +88,6 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_PFNMAP	0x00000400	/* Page-ranges managed without "struct page", just pure PFN */
 #define VM_DENYWRITE	0x00000800	/* ETXTBSY on write attempts.. */
 
-#define VM_EXECUTABLE	0x00001000
 #define VM_LOCKED	0x00002000
 #define VM_IO           0x00004000	/* Memory mapped I/O or similar */
 
@@ -1374,9 +1373,6 @@ extern void exit_mmap(struct mm_struct *);
 extern int mm_take_all_locks(struct mm_struct *mm);
 extern void mm_drop_all_locks(struct mm_struct *mm);
 
-/* From fs/proc/base.c. callers must _not_ hold the mm's exe_file_lock */
-extern void added_exe_file_vma(struct mm_struct *mm);
-extern void removed_exe_file_vma(struct mm_struct *mm);
 extern void set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file);
 extern struct file *get_mm_exe_file(struct mm_struct *mm);
 
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 3cc3062..b480c06 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -378,7 +378,6 @@ struct mm_struct {
 
 	/* store ref to file /proc/<pid>/exe symlink points to */
 	struct file *exe_file;
-	unsigned long num_exe_file_vmas;
 #ifdef CONFIG_MMU_NOTIFIER
 	struct mmu_notifier_mm *mmu_notifier_mm;
 #endif
diff --git a/include/linux/mman.h b/include/linux/mman.h
index 8b74e9b..77cec2f 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -86,7 +86,6 @@ calc_vm_flag_bits(unsigned long flags)
 {
 	return _calc_vm_trans(flags, MAP_GROWSDOWN,  VM_GROWSDOWN ) |
 	       _calc_vm_trans(flags, MAP_DENYWRITE,  VM_DENYWRITE ) |
-	       _calc_vm_trans(flags, MAP_EXECUTABLE, VM_EXECUTABLE) |
 	       _calc_vm_trans(flags, MAP_LOCKED,     VM_LOCKED    );
 }
 #endif /* __KERNEL__ */
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index af1de0f..aa27a00 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -1164,21 +1164,8 @@ static void audit_log_task_info(struct audit_buffer *ab, struct task_struct *tsk
 	get_task_comm(name, tsk);
 	audit_log_format(ab, " comm=");
 	audit_log_untrustedstring(ab, name);
-
-	if (mm) {
-		down_read(&mm->mmap_sem);
-		vma = mm->mmap;
-		while (vma) {
-			if ((vma->vm_flags & VM_EXECUTABLE) &&
-			    vma->vm_file) {
-				audit_log_d_path(ab, " exe=",
-						 &vma->vm_file->f_path);
-				break;
-			}
-			vma = vma->vm_next;
-		}
-		up_read(&mm->mmap_sem);
-	}
+	if (mm && mm->exe_file)
+		audit_log_d_path(ab, " exe=", &mm->exe_file->f_path);
 	audit_log_task_context(ab);
 }
 
diff --git a/kernel/fork.c b/kernel/fork.c
index b9372a0..40e4b49 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -587,26 +587,6 @@ void mmput(struct mm_struct *mm)
 }
 EXPORT_SYMBOL_GPL(mmput);
 
-/*
- * We added or removed a vma mapping the executable. The vmas are only mapped
- * during exec and are not mapped with the mmap system call.
- * Callers must hold down_write() on the mm's mmap_sem for these
- */
-void added_exe_file_vma(struct mm_struct *mm)
-{
-	mm->num_exe_file_vmas++;
-}
-
-void removed_exe_file_vma(struct mm_struct *mm)
-{
-	mm->num_exe_file_vmas--;
-	if ((mm->num_exe_file_vmas == 0) && mm->exe_file) {
-		fput(mm->exe_file);
-		mm->exe_file = NULL;
-	}
-
-}
-
 void set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file)
 {
 	if (new_exe_file)
@@ -614,20 +594,19 @@ void set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file)
 	if (mm->exe_file)
 		fput(mm->exe_file);
 	mm->exe_file = new_exe_file;
-	mm->num_exe_file_vmas = 0;
 }
 
+/*
+ * Caller must have mm->mm_users reference,
+ * for example current->mm or acquired by get_task_mm().
+ */
 struct file *get_mm_exe_file(struct mm_struct *mm)
 {
 	struct file *exe_file;
 
-	/* We need mmap_sem to protect against races with removal of
-	 * VM_EXECUTABLE vmas */
-	down_read(&mm->mmap_sem);
 	exe_file = mm->exe_file;
 	if (exe_file)
 		get_file(exe_file);
-	up_read(&mm->mmap_sem);
 	return exe_file;
 }
 
diff --git a/mm/mmap.c b/mm/mmap.c
index 3d254ca..2647bb7 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -230,11 +230,8 @@ static struct vm_area_struct *remove_vma(struct vm_area_struct *vma)
 	might_sleep();
 	if (vma->vm_ops && vma->vm_ops->close)
 		vma->vm_ops->close(vma);
-	if (vma->vm_file) {
+	if (vma->vm_file)
 		fput(vma->vm_file);
-		if (vma->vm_flags & VM_EXECUTABLE)
-			removed_exe_file_vma(vma->vm_mm);
-	}
 	mpol_put(vma_policy(vma));
 	kmem_cache_free(vm_area_cachep, vma);
 	return next;
@@ -616,11 +613,8 @@ again:			remove_next = 1 + (end > next->vm_end);
 		mutex_unlock(&mapping->i_mmap_mutex);
 
 	if (remove_next) {
-		if (file) {
+		if (file)
 			fput(file);
-			if (next->vm_flags & VM_EXECUTABLE)
-				removed_exe_file_vma(mm);
-		}
 		if (next->anon_vma)
 			anon_vma_merge(vma, next);
 		mm->map_count--;
@@ -1293,8 +1287,6 @@ munmap_back:
 		error = file->f_op->mmap(file, vma);
 		if (error)
 			goto unmap_and_free_vma;
-		if (vm_flags & VM_EXECUTABLE)
-			added_exe_file_vma(mm);
 
 		/* Can addr have changed??
 		 *
@@ -1969,11 +1961,8 @@ static int __split_vma(struct mm_struct * mm, struct vm_area_struct * vma,
 	if (anon_vma_clone(new, vma))
 		goto out_free_mpol;
 
-	if (new->vm_file) {
+	if (new->vm_file)
 		get_file(new->vm_file);
-		if (vma->vm_flags & VM_EXECUTABLE)
-			added_exe_file_vma(mm);
-	}
 
 	if (new->vm_ops && new->vm_ops->open)
 		new->vm_ops->open(new);
@@ -1991,11 +1980,8 @@ static int __split_vma(struct mm_struct * mm, struct vm_area_struct * vma,
 	/* Clean everything up if vma_adjust failed. */
 	if (new->vm_ops && new->vm_ops->close)
 		new->vm_ops->close(new);
-	if (new->vm_file) {
-		if (vma->vm_flags & VM_EXECUTABLE)
-			removed_exe_file_vma(mm);
+	if (new->vm_file)
 		fput(new->vm_file);
-	}
 	unlink_anon_vmas(new);
  out_free_mpol:
 	mpol_put(pol);
@@ -2377,11 +2363,8 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
 			new_vma->vm_start = addr;
 			new_vma->vm_end = addr + len;
 			new_vma->vm_pgoff = pgoff;
-			if (new_vma->vm_file) {
+			if (new_vma->vm_file)
 				get_file(new_vma->vm_file);
-				if (vma->vm_flags & VM_EXECUTABLE)
-					added_exe_file_vma(mm);
-			}
 			if (new_vma->vm_ops && new_vma->vm_ops->open)
 				new_vma->vm_ops->open(new_vma);
 			vma_link(mm, new_vma, prev, rb_link, rb_parent);
diff --git a/mm/nommu.c b/mm/nommu.c
index afa0a15..d617d5c 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -789,11 +789,8 @@ static void delete_vma(struct mm_struct *mm, struct vm_area_struct *vma)
 	kenter("%p", vma);
 	if (vma->vm_ops && vma->vm_ops->close)
 		vma->vm_ops->close(vma);
-	if (vma->vm_file) {
+	if (vma->vm_file)
 		fput(vma->vm_file);
-		if (vma->vm_flags & VM_EXECUTABLE)
-			removed_exe_file_vma(mm);
-	}
 	put_nommu_region(vma->vm_region);
 	kmem_cache_free(vm_area_cachep, vma);
 }
@@ -1287,10 +1284,6 @@ unsigned long do_mmap_pgoff(struct file *file,
 		get_file(file);
 		vma->vm_file = file;
 		get_file(file);
-		if (vm_flags & VM_EXECUTABLE) {
-			added_exe_file_vma(current->mm);
-			vma->vm_mm = current->mm;
-		}
 	}
 
 	down_write(&nommu_region_sem);
@@ -1443,8 +1436,6 @@ error:
 	kmem_cache_free(vm_region_jar, region);
 	if (vma->vm_file)
 		fput(vma->vm_file);
-	if (vma->vm_flags & VM_EXECUTABLE)
-		removed_exe_file_vma(vma->vm_mm);
 	kmem_cache_free(vm_area_cachep, vma);
 	kleave(" = %d", ret);
 	return ret;
diff --git a/security/tomoyo/util.c b/security/tomoyo/util.c
index 867558c..b929dd3 100644
--- a/security/tomoyo/util.c
+++ b/security/tomoyo/util.c
@@ -949,19 +949,11 @@ bool tomoyo_path_matches_pattern(const struct tomoyo_path_info *filename,
 const char *tomoyo_get_exe(void)
 {
 	struct mm_struct *mm = current->mm;
-	struct vm_area_struct *vma;
 	const char *cp = NULL;
 
-	if (!mm)
-		return NULL;
-	down_read(&mm->mmap_sem);
-	for (vma = mm->mmap; vma; vma = vma->vm_next) {
-		if ((vma->vm_flags & VM_EXECUTABLE) && vma->vm_file) {
-			cp = tomoyo_realpath_from_path(&vma->vm_file->f_path);
-			break;
-		}
-	}
-	up_read(&mm->mmap_sem);
+	if (mm && mm->exe_file)
+		cp = tomoyo_realpath_from_path(&mm->exe_file->f_path);
+
 	return cp;
 }
 


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 7/7] mm: move madvise vma flags to the end
  2012-03-31  9:25 [PATCH 0/7] mm: vma->vm_flags diet Konstantin Khlebnikov
                   ` (5 preceding siblings ...)
  2012-03-31  9:29 ` [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE Konstantin Khlebnikov
@ 2012-03-31  9:29 ` Konstantin Khlebnikov
  2012-03-31 14:06 ` [PATCH 0/7] mm: vma->vm_flags diet Andi Kleen
  7 siblings, 0 replies; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-03-31  9:29 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, linux-kernel; +Cc: Linus Torvalds

Let's collect them together.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 include/linux/mm.h |    9 ++++-----
 1 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3a4d721..5e89a4f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -91,10 +91,6 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_LOCKED	0x00002000
 #define VM_IO           0x00004000	/* Memory mapped I/O or similar */
 
-					/* Used by sys_madvise() */
-#define VM_SEQ_READ	0x00008000	/* App will access data sequentially */
-#define VM_RAND_READ	0x00010000	/* App will not benefit from clustered reads */
-
 #define VM_DONTCOPY	0x00020000      /* Do not copy this vma on fork */
 #define VM_DONTEXPAND	0x00040000	/* Cannot expand with mremap() */
 #define VM_RESERVED	0x00080000	/* Count as reserved_vm like IO */
@@ -103,8 +99,11 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HUGETLB	0x00400000	/* Huge TLB Page VM */
 #define VM_NONLINEAR	0x00800000	/* Is non-linear (remap_file_pages) */
 #define VM_ARCH_1	0x01000000	/* Architecture-specific flag */
-#define VM_NODUMP	0x04000000	/* Do not include in the core dump */
 
+					/* Used by sys_madvise() */
+#define VM_NODUMP	0x04000000	/* Do not include in the core dump */
+#define VM_SEQ_READ	0x08000000	/* App will access data sequentially */
+#define VM_RAND_READ	0x10000000	/* App will not benefit from clustered reads */
 #define VM_HUGEPAGE	0x20000000	/* MADV_HUGEPAGE marked this vma */
 #define VM_NOHUGEPAGE	0x40000000	/* MADV_NOHUGEPAGE marked this vma */
 #define VM_MERGEABLE	0x80000000	/* KSM may merge identical pages */


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 0/7] mm: vma->vm_flags diet
  2012-03-31  9:25 [PATCH 0/7] mm: vma->vm_flags diet Konstantin Khlebnikov
                   ` (6 preceding siblings ...)
  2012-03-31  9:29 ` [PATCH 7/7] mm: move madvise vma flags to the end Konstantin Khlebnikov
@ 2012-03-31 14:06 ` Andi Kleen
  7 siblings, 0 replies; 52+ messages in thread
From: Andi Kleen @ 2012-03-31 14:06 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, Andrew Morton, linux-kernel, Linus Torvalds

Konstantin Khlebnikov <khlebnikov@openvz.org> writes:

> This patch-set moves/kills some VM_* flags in vma->vm_flags bit-field,
> as result there appears four free bits.
>
> Also I'm working on VM_RESERVED reorganization, probably it also can be killed.
> It lost original swapout-protection sense in 2.6 and now is used for other purposes.

Great, I ran into this problem recently too: I wanted to add a new bit,
but there was none.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 3/7] mm: kill vma flag VM_CAN_NONLINEAR
  2012-03-31  9:29 ` [PATCH 3/7] mm: kill vma flag VM_CAN_NONLINEAR Konstantin Khlebnikov
@ 2012-03-31 17:01   ` Linus Torvalds
  0 siblings, 0 replies; 52+ messages in thread
From: Linus Torvalds @ 2012-03-31 17:01 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, Andrew Morton, linux-kernel, Nick Piggin, Ingo Molnar,
	Alexander Viro

On Sat, Mar 31, 2012 at 2:29 AM, Konstantin Khlebnikov
<khlebnikov@openvz.org> wrote:
> This patch moves actual ptes filling for non-linear file mappings
> into special vma operation: ->remap_pages().
>
> Now fs must implement this method to get non-linear mappings support.
> If fs uses filemap_fault() then it can use generic_file_remap_pages() for this.

Me likee.

The other patches in the series look ok too, but this one in
particular is definitely the right thing, and an example of how people
have just used vm_flags bits for all the wrong reasons.

                  Linus

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 1/7 v2] mm, x86, PAT: rework linear pfn-mmap tracking
  2012-03-31  9:29 ` [PATCH 1/7] mm, x86, PAT: rework linear pfn-mmap tracking Konstantin Khlebnikov
@ 2012-03-31 17:09   ` Konstantin Khlebnikov
  2012-04-03  0:46     ` [x86 PAT PATCH 0/2] x86 PAT vm_flag code refactoring Suresh Siddha
  0 siblings, 1 reply; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-03-31 17:09 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, linux-kernel
  Cc: Andi Kleen, Suresh Siddha, Pallipadi Venkatesh, Ingo Molnar,
	H. Peter Anvin, Linus Torvalds, Nick Piggin

This patch replaces the generic vma flag VM_PFN_AT_MMAP with the x86-only VM_PAT.

We can pass the mapping address from remap_pfn_range() into track_pfn_vma_new()
and collect all PAT-related logic together in arch/x86/.

This patch also restores the original, frustration-free is_cow_mapping() check
in remap_pfn_range(), as it was before commit v2.6.28-rc8-88-g3c8bb73
("x86: PAT: store vm_pgoff for all linear_over_vma_region mappings - v3").

The is_linear_pfn_mapping() checks can be removed from mm/huge_memory.c,
because that case is already handled by VM_PFNMAP in the VM_NO_THP bit-mask.

v2: Do not use batched pfn reserving for a single-page VMA. It is not optimal
and breaks something: I saw glitches on the screen with the i915/drm driver.
With this version the glitches are gone, and I see the same regions in
/sys/kernel/debug/x86/pat_memtype_list as before the patch. So please review
this carefully; probably I'm wrong somewhere, or I have triggered some hidden bug.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Pallipadi Venkatesh <venkatesh.pallipadi@intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
---
 arch/x86/mm/pat.c             |   28 ++++++++++++++++++++--------
 include/asm-generic/pgtable.h |    4 ++--
 include/linux/mm.h            |   15 +--------------
 mm/huge_memory.c              |    7 +++----
 mm/memory.c                   |   15 ++++++++-------
 5 files changed, 34 insertions(+), 35 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index f6ff57b..4632518 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -665,7 +665,7 @@ int track_pfn_vma_copy(struct vm_area_struct *vma)
 	unsigned long vma_size = vma->vm_end - vma->vm_start;
 	pgprot_t pgprot;
 
-	if (is_linear_pfn_mapping(vma)) {
+	if (vma->vm_flags & VM_PAT) {
 		/*
 		 * reserve the whole chunk covered by vma. We need the
 		 * starting address and protection from pte.
@@ -690,16 +690,29 @@ int track_pfn_vma_copy(struct vm_area_struct *vma)
  * single reserve_pfn_range call.
  */
 int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
-			unsigned long pfn, unsigned long size)
+		unsigned long pfn, unsigned long addr, unsigned long size)
 {
 	unsigned long flags;
 	resource_size_t paddr;
 	unsigned long vma_size = vma->vm_end - vma->vm_start;
+	int ret;
 
-	if (is_linear_pfn_mapping(vma)) {
-		/* reserve the whole chunk starting from vm_pgoff */
-		paddr = (resource_size_t)vma->vm_pgoff << PAGE_SHIFT;
-		return reserve_pfn_range(paddr, vma_size, prot, 0);
+	/*
+	 * Use batched PFN reserving for a linear VMA if it is bigger than one page.
+	 */
+	if (addr == vma->vm_start && size == vma_size && size > PAGE_SIZE) {
+		/* reserve the whole chunk starting from pfn */
+		paddr = (resource_size_t)pfn << PAGE_SHIFT;
+		ret = reserve_pfn_range(paddr, vma_size, prot, 0);
+		if (!ret) {
+			vma->vm_flags |= VM_PAT;
+			/*
+			 * Save starting pfn in vm_pgoff for untrack_pfn_vma(),
+			 * remap_pfn_range() does this only for cow mappings.
+			 */
+			vma->vm_pgoff = pfn;
+		}
+		return ret;
 	}
 
 	if (!pat_enabled)
@@ -724,11 +737,10 @@ void untrack_pfn_vma(struct vm_area_struct *vma, unsigned long pfn,
 	resource_size_t paddr;
 	unsigned long vma_size = vma->vm_end - vma->vm_start;
 
-	if (is_linear_pfn_mapping(vma)) {
+	if (vma->vm_flags & VM_PAT) {
 		/* free the whole chunk starting from vm_pgoff */
 		paddr = (resource_size_t)vma->vm_pgoff << PAGE_SHIFT;
 		free_pfn_range(paddr, vma_size);
-		return;
 	}
 }
 
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 125c54e..688a2a5 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -389,7 +389,7 @@ static inline void ptep_modify_prot_commit(struct mm_struct *mm,
  * for physical range indicated by pfn and size.
  */
 static inline int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
-					unsigned long pfn, unsigned long size)
+		unsigned long pfn, unsigned long addr, unsigned long size)
 {
 	return 0;
 }
@@ -420,7 +420,7 @@ static inline void untrack_pfn_vma(struct vm_area_struct *vma,
 }
 #else
 extern int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
-				unsigned long pfn, unsigned long size);
+		unsigned long pfn, unsigned long addr, unsigned long size);
 extern int track_pfn_vma_copy(struct vm_area_struct *vma);
 extern void untrack_pfn_vma(struct vm_area_struct *vma, unsigned long pfn,
 				unsigned long size);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index d8738a4..b8e5fe5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -117,7 +117,7 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_CAN_NONLINEAR 0x08000000	/* Has ->fault & does nonlinear pages */
 #define VM_MIXEDMAP	0x10000000	/* Can contain "struct page" and pure PFN pages */
 #define VM_SAO		0x20000000	/* Strong Access Ordering (powerpc) */
-#define VM_PFN_AT_MMAP	0x40000000	/* PFNMAP vma that is fully mapped at mmap time */
+#define VM_PAT		0x40000000	/* PAT reserves whole VMA at once (x86) */
 #define VM_MERGEABLE	0x80000000	/* KSM may merge identical pages */
 
 /* Bits set in the VMA until the stack is in its final location */
@@ -158,19 +158,6 @@ extern pgprot_t protection_map[16];
 #define FAULT_FLAG_RETRY_NOWAIT	0x10	/* Don't drop mmap_sem and wait when retrying */
 #define FAULT_FLAG_KILLABLE	0x20	/* The fault task is in SIGKILL killable region */
 
-/*
- * This interface is used by x86 PAT code to identify a pfn mapping that is
- * linear over entire vma. This is to optimize PAT code that deals with
- * marking the physical region with a particular prot. This is not for generic
- * mm use. Note also that this check will not work if the pfn mapping is
- * linear for a vma starting at physical address 0. In which case PAT code
- * falls back to slow path of reserving physical range page by page.
- */
-static inline int is_linear_pfn_mapping(struct vm_area_struct *vma)
-{
-	return !!(vma->vm_flags & VM_PFN_AT_MMAP);
-}
-
 static inline int is_pfn_mapping(struct vm_area_struct *vma)
 {
 	return !!(vma->vm_flags & VM_PFNMAP);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f0e5306..cf827da 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1650,7 +1650,7 @@ int khugepaged_enter_vma_merge(struct vm_area_struct *vma)
 	 * If is_pfn_mapping() is true is_learn_pfn_mapping() must be
 	 * true too, verify it here.
 	 */
-	VM_BUG_ON(is_linear_pfn_mapping(vma) || vma->vm_flags & VM_NO_THP);
+	VM_BUG_ON(vma->vm_flags & VM_NO_THP);
 	hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
 	hend = vma->vm_end & HPAGE_PMD_MASK;
 	if (hstart < hend)
@@ -1908,7 +1908,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 	 * If is_pfn_mapping() is true is_learn_pfn_mapping() must be
 	 * true too, verify it here.
 	 */
-	VM_BUG_ON(is_linear_pfn_mapping(vma) || vma->vm_flags & VM_NO_THP);
+	VM_BUG_ON(vma->vm_flags & VM_NO_THP);
 
 	pgd = pgd_offset(mm, address);
 	if (!pgd_present(*pgd))
@@ -2150,8 +2150,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
 		 * If is_pfn_mapping() is true is_learn_pfn_mapping()
 		 * must be true too, verify it here.
 		 */
-		VM_BUG_ON(is_linear_pfn_mapping(vma) ||
-			  vma->vm_flags & VM_NO_THP);
+		VM_BUG_ON(vma->vm_flags & VM_NO_THP);
 
 		hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
 		hend = vma->vm_end & HPAGE_PMD_MASK;
diff --git a/mm/memory.c b/mm/memory.c
index 6105f47..e6e4dfd 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2145,7 +2145,7 @@ int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 
 	if (addr < vma->vm_start || addr >= vma->vm_end)
 		return -EFAULT;
-	if (track_pfn_vma_new(vma, &pgprot, pfn, PAGE_SIZE))
+	if (track_pfn_vma_new(vma, &pgprot, pfn, addr, PAGE_SIZE))
 		return -EINVAL;
 
 	ret = insert_pfn(vma, addr, pfn, pgprot);
@@ -2285,23 +2285,24 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
 	 * There's a horrible special case to handle copy-on-write
 	 * behaviour that some programs depend on. We mark the "original"
 	 * un-COW'ed pages by matching them up with "vma->vm_pgoff".
+	 * See vm_normal_page() for details.
 	 */
-	if (addr == vma->vm_start && end == vma->vm_end) {
+
+	if (is_cow_mapping(vma->vm_flags)) {
+		if (addr != vma->vm_start || end != vma->vm_end)
+			return -EINVAL;
 		vma->vm_pgoff = pfn;
-		vma->vm_flags |= VM_PFN_AT_MMAP;
-	} else if (is_cow_mapping(vma->vm_flags))
-		return -EINVAL;
+	}
 
 	vma->vm_flags |= VM_IO | VM_RESERVED | VM_PFNMAP;
 
-	err = track_pfn_vma_new(vma, &prot, pfn, PAGE_ALIGN(size));
+	err = track_pfn_vma_new(vma, &prot, pfn, addr, PAGE_ALIGN(size));
 	if (err) {
 		/*
 		 * To indicate that track_pfn related cleanup is not
 		 * needed from higher level routine calling unmap_vmas
 		 */
 		vma->vm_flags &= ~(VM_IO | VM_RESERVED | VM_PFNMAP);
-		vma->vm_flags &= ~VM_PFN_AT_MMAP;
 		return -EINVAL;
 	}
 


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-03-31  9:29 ` [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE Konstantin Khlebnikov
@ 2012-03-31 20:13   ` Oleg Nesterov
  2012-03-31 20:39     ` Cyrill Gorcunov
  2012-04-02 23:04     ` Matt Helsley
  2012-04-02 23:18   ` Matt Helsley
  1 sibling, 2 replies; 52+ messages in thread
From: Oleg Nesterov @ 2012-03-31 20:13 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, Andrew Morton, linux-kernel, Eric Paris,
	linux-security-module, oprofile-list, Matt Helsley,
	Linus Torvalds, Al Viro, Cyrill Gorcunov

On 03/31, Konstantin Khlebnikov wrote:
>
> comment from v2.6.25-6245-g925d1c4 ("procfs task exe symlink"),
> where all this stuff was introduced:
>
> > ...
> > This avoids pinning the mounted filesystem.
>
> So, this logic is hooked into every file mmap/munmap and vma split/merge just to
> keep a hypothetical mounted fs from being pinned against umount by an mm which
> has already unmapped all its executable files but is still alive. Does anyone
> know any real-world example?

This is the question to Matt.

> keep mm->exe_file alive till final mmput().

Please see the recent discussion, http://marc.info/?t=133096188900012

(just in case, the patch itself was deadly wrong, don't look at it ;)

> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -378,7 +378,6 @@ struct mm_struct {
>
>  	/* store ref to file /proc/<pid>/exe symlink points to */
>  	struct file *exe_file;
> -	unsigned long num_exe_file_vmas;

Add Cyrill. This conflicts with
c-r-prctl-add-ability-to-set-new-mm_struct-exe_file.patch in -mm.

Oleg.


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-03-31 20:13   ` Oleg Nesterov
@ 2012-03-31 20:39     ` Cyrill Gorcunov
  2012-04-02  9:46       ` Konstantin Khlebnikov
  2012-04-02 23:04     ` Matt Helsley
  1 sibling, 1 reply; 52+ messages in thread
From: Cyrill Gorcunov @ 2012-03-31 20:39 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel,
	Eric Paris, linux-security-module, oprofile-list, Matt Helsley,
	Linus Torvalds, Al Viro

On Sat, Mar 31, 2012 at 10:13:24PM +0200, Oleg Nesterov wrote:
> 
> Add Cyrill. This conflicts with
> c-r-prctl-add-ability-to-set-new-mm_struct-exe_file.patch in -mm.

Thanks for CC'ing, Oleg. I think if this series goes in it won't
be a problem to update my patch accordingly.

	Cyrill

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/7] mm: introduce vma flag VM_ARCH_1
  2012-03-31  9:29 ` [PATCH 2/7] mm: introduce vma flag VM_ARCH_1 Konstantin Khlebnikov
@ 2012-03-31 22:25   ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 52+ messages in thread
From: Benjamin Herrenschmidt @ 2012-03-31 22:25 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, Andrew Morton, linux-kernel, Andrea Arcangeli,
	Minchan Kim, Linus Torvalds

On Sat, 2012-03-31 at 13:29 +0400, Konstantin Khlebnikov wrote:
> This patch shuffles some bits in vma->vm_flags
> 
> before patch:
> 
>         0x00000200      0x01000000      0x20000000      0x40000000
> x86     VM_NOHUGEPAGE   VM_HUGEPAGE     -               VM_PAT
> powerpc -               -               VM_SAO          -
> parisc  VM_GROWSUP      -               -               -
> ia64    VM_GROWSUP      -               -               -
> nommu   -               VM_MAPPED_COPY  -               -
> others  -               -               -               -
> 
> after patch:
> 
>         0x00000200      0x01000000      0x20000000      0x40000000
> x86     -               VM_PAT          VM_HUGEPAGE     VM_NOHUGEPAGE
> powerpc -               VM_SAO          -               -
> parisc  -               VM_GROWSUP      -               -
> ia64    -               VM_GROWSUP      -               -
> nommu   -               VM_MAPPED_COPY  -               -
> others  -               VM_ARCH_1       -               -
> 
> And voila! One completely free bit.

Great :-) Let me know when you free VM_ARCH_2 as well; I have a good use
for it too :-)

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-03-31 20:39     ` Cyrill Gorcunov
@ 2012-04-02  9:46       ` Konstantin Khlebnikov
  2012-04-02  9:54         ` Cyrill Gorcunov
  2012-04-02 14:48         ` Oleg Nesterov
  0 siblings, 2 replies; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-02  9:46 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, Eric Paris,
	linux-security-module, oprofile-list, Matt Helsley,
	Linus Torvalds, Al Viro

Cyrill Gorcunov wrote:
> On Sat, Mar 31, 2012 at 10:13:24PM +0200, Oleg Nesterov wrote:
>>
>> Add Cyrill. This conflicts with
>> c-r-prctl-add-ability-to-set-new-mm_struct-exe_file.patch in -mm.
>
> Thanks for CC'ing, Oleg. I think if this series goes in it won't
> be a problem to update my patch accordingly.

In this patch I leave mm->exe_file lockless.
After exec/fork we can change it only for the current task, and only if mm->mm_users == 1.

something like this:

task_lock(current);
if (atomic_read(&current->mm->mm_users) == 1)
	set_mm_exe_file(current->mm, new_file);
else
	ret = -EBUSY;
task_unlock(current);

task_lock() protects this code against get_task_mm()

^ permalink raw reply	[flat|nested] 52+ messages in thread
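For reference, the reason task_lock() closes the race: get_task_mm() bumps mm_users under the same lock. A sketch of that helper as it looked in kernel/fork.c around this time (from memory, so verify against the tree; not standalone-runnable):

```c
/* Sketch of get_task_mm(): the mm_users increment happens under
 * task_lock(task), so holding task_lock(current) while checking
 * mm_users == 1 excludes a concurrent get_task_mm(). */
struct mm_struct *get_task_mm(struct task_struct *task)
{
	struct mm_struct *mm;

	task_lock(task);
	mm = task->mm;
	if (mm) {
		if (task->flags & PF_KTHREAD)
			mm = NULL;	/* kernel threads have no user mm */
		else
			atomic_inc(&mm->mm_users);
	}
	task_unlock(task);
	return mm;
}
```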

* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-04-02  9:46       ` Konstantin Khlebnikov
@ 2012-04-02  9:54         ` Cyrill Gorcunov
  2012-04-02 10:13           ` Konstantin Khlebnikov
  2012-04-02 14:48         ` Oleg Nesterov
  1 sibling, 1 reply; 52+ messages in thread
From: Cyrill Gorcunov @ 2012-04-02  9:54 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, Eric Paris,
	linux-security-module, oprofile-list, Matt Helsley,
	Linus Torvalds, Al Viro

On Mon, Apr 02, 2012 at 01:46:03PM +0400, Konstantin Khlebnikov wrote:
> Cyrill Gorcunov wrote:
> >On Sat, Mar 31, 2012 at 10:13:24PM +0200, Oleg Nesterov wrote:
> >>
> >>Add Cyrill. This conflicts with
> >>c-r-prctl-add-ability-to-set-new-mm_struct-exe_file.patch in -mm.
> >
> >Thanks for CC'ing, Oleg. I think if this series goes in it won't
> >be a problem to update my patch accordingly.
> 
> In this patch I leave mm->exe_file lockless.
> After exec/fork we can change it only for current task and only if mm->mm_users == 1.
> 
> something like this:
> 
> task_lock(current);
> if (atomic_read(&current->mm->mm_users) == 1)
> 	set_mm_exe_file(current->mm, new_file);
> else
> 	ret = -EBUSY;
> task_unlock(current);
> 
> task_lock() protects this code against get_task_mm()

I see. Konstantin, the question is which way of updating the patch in
linux-next is more convenient. The c-r-prctl-add-ability-to-set-new-mm_struct-exe_file.patch
is in -mm already, so I should either wait until Andrew picks your series up and
send an updating patch on top, or I could fetch your series, update my patch and
send it here as a reply. Hmm?

	Cyrill

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-04-02  9:54         ` Cyrill Gorcunov
@ 2012-04-02 10:13           ` Konstantin Khlebnikov
  0 siblings, 0 replies; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-02 10:13 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, Eric Paris,
	linux-security-module, oprofile-list, Matt Helsley,
	Linus Torvalds, Al Viro

Cyrill Gorcunov wrote:
> On Mon, Apr 02, 2012 at 01:46:03PM +0400, Konstantin Khlebnikov wrote:
>> Cyrill Gorcunov wrote:
>>> On Sat, Mar 31, 2012 at 10:13:24PM +0200, Oleg Nesterov wrote:
>>>>
>>>> Add Cyrill. This conflicts with
>>>> c-r-prctl-add-ability-to-set-new-mm_struct-exe_file.patch in -mm.
>>>
>>> Thanks for CC'ing, Oleg. I think if this series goes in it won't
>>> be a problem to update my patch accordingly.
>>
>> In this patch I leave mm->exe_file lockless.
>> After exec/fork we can change it only for current task and only if mm->mm_users == 1.
>>
>> something like this:
>>
>> task_lock(current);
>> if (atomic_read(&current->mm->mm_users) == 1)
>> 	set_mm_exe_file(current->mm, new_file);
>> else
>> 	ret = -EBUSY;
>> task_unlock(current);
>>
>> task_lock() protects this code against get_task_mm()
>
> I see. Konstantin, the question is which way of updating the patch in
> linux-next is more convenient. The c-r-prctl-add-ability-to-set-new-mm_struct-exe_file.patch
> is in -mm already, so I should either wait until Andrew picks your series up and
> send an updating patch on top, or I could fetch your series, update my patch and
> send it here as a reply. Hmm?

Let's wait for Andrew's response. And maybe somebody disagrees with my changes.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-04-02  9:46       ` Konstantin Khlebnikov
  2012-04-02  9:54         ` Cyrill Gorcunov
@ 2012-04-02 14:48         ` Oleg Nesterov
  2012-04-02 16:02           ` Cyrill Gorcunov
  2012-04-02 16:19           ` Konstantin Khlebnikov
  1 sibling, 2 replies; 52+ messages in thread
From: Oleg Nesterov @ 2012-04-02 14:48 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Cyrill Gorcunov, linux-mm, Andrew Morton, linux-kernel,
	Eric Paris, linux-security-module@vger.kernel.org

On 04/02, Konstantin Khlebnikov wrote:
>
> In this patch I leave mm->exe_file lockless.
> After exec/fork we can change it only for current task and only if mm->mm_users == 1.
>
> something like this:
>
> task_lock(current);

OK, this protects against the race with get_task_mm()

> if (atomic_read(&current->mm->mm_users) == 1)

this means PR_SET_MM_EXE_FILE can fail simply because someone did
get_task_mm(). Or the caller is multithreaded.

> 	set_mm_exe_file(current->mm, new_file);

No, fput() can sleep.

Oleg.


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-04-02 14:48         ` Oleg Nesterov
@ 2012-04-02 16:02           ` Cyrill Gorcunov
  2012-04-02 16:19           ` Konstantin Khlebnikov
  1 sibling, 0 replies; 52+ messages in thread
From: Cyrill Gorcunov @ 2012-04-02 16:02 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel, Eric Paris

On Mon, Apr 02, 2012 at 04:48:21PM +0200, Oleg Nesterov wrote:
> On 04/02, Konstantin Khlebnikov wrote:
> >
> > In this patch I leave mm->exe_file lockless.
> > After exec/fork we can change it only for current task and only if mm->mm_users == 1.
> >
> > something like this:
> >
> > task_lock(current);
> 
> OK, this protects against the race with get_task_mm()
> 
> > if (atomic_read(&current->mm->mm_users) == 1)
> 
> this means PR_SET_MM_EXE_FILE can fail simply because someone did
> get_task_mm(). Or the caller is multithreaded.

So it leads to the same question -- do we *really* need PR_SET_MM_EXE_FILE
to be a one-shot action? Yeah, I know, we agreed that one-shot is better than
anything else from a sysadmin perspective and such, but maybe I could introduce
a special capability bit for c/r and allow a program which has such a cap to
modify the exe file without checking mm_users?

/me hides

> 
> > 	set_mm_exe_file(current->mm, new_file);
> 
> No, fput() can sleep.

Sure, it was just "something like" as Konstantin stated, thanks anyway ;)

	Cyrill

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-04-02 14:48         ` Oleg Nesterov
  2012-04-02 16:02           ` Cyrill Gorcunov
@ 2012-04-02 16:19           ` Konstantin Khlebnikov
  2012-04-02 16:27             ` Cyrill Gorcunov
  1 sibling, 1 reply; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-02 16:19 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Cyrill Gorcunov, linux-mm, Andrew Morton, linux-kernel, Eric Paris

Oleg Nesterov wrote:
> On 04/02, Konstantin Khlebnikov wrote:
>>
>> In this patch I leave mm->exe_file lockless.
>> After exec/fork we can change it only for current task and only if mm->mm_users == 1.
>>
>> something like this:
>>
>> task_lock(current);
>
> OK, this protects against the race with get_task_mm()
>
>> if (atomic_read(&current->mm->mm_users) == 1)
>
> this means PR_SET_MM_EXE_FILE can fail simply because someone did
> get_task_mm(). Or the caller is multithreaded.

This is sad; it seems we should keep mm->exe_file protected by mm->mmap_sem.
So, I'll rework this patch...

>
>> 	set_mm_exe_file(current->mm, new_file);
>
> No, fput() can sleep.

Yep

>
> Oleg.
>


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-04-02 16:19           ` Konstantin Khlebnikov
@ 2012-04-02 16:27             ` Cyrill Gorcunov
  2012-04-02 17:14               ` Konstantin Khlebnikov
  0 siblings, 1 reply; 52+ messages in thread
From: Cyrill Gorcunov @ 2012-04-02 16:27 UTC (permalink / raw)
  To: Konstantin Khlebnikov, Oleg Nesterov
  Cc: linux-mm, Andrew Morton, linux-kernel, Eric Paris

On Mon, Apr 02, 2012 at 08:19:59PM +0400, Konstantin Khlebnikov wrote:
> Oleg Nesterov wrote:
> >On 04/02, Konstantin Khlebnikov wrote:
> >>
> >>In this patch I leave mm->exe_file lockless.
> >>After exec/fork we can change it only for current task and only if mm->mm_users == 1.
> >>
> >>something like this:
> >>
> >>task_lock(current);
> >
> >OK, this protects against the race with get_task_mm()
> >
> >>if (atomic_read(&current->mm->mm_users) == 1)
> >
> >this means PR_SET_MM_EXE_FILE can fail simply because someone did
> >get_task_mm(). Or the caller is multithreaded.
> 
> This is sad; it seems we should keep mm->exe_file protected by mm->mmap_sem.
> So, I'll rework this patch...

Ah, it's about locking. I misunderstood it at first.
Oleg, forget about my email then.

	Cyrill

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-04-02 16:27             ` Cyrill Gorcunov
@ 2012-04-02 17:14               ` Konstantin Khlebnikov
  2012-04-02 18:05                 ` Cyrill Gorcunov
  0 siblings, 1 reply; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-02 17:14 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, Eric Paris

[-- Attachment #1: Type: text/plain, Size: 900 bytes --]

Cyrill Gorcunov wrote:
> On Mon, Apr 02, 2012 at 08:19:59PM +0400, Konstantin Khlebnikov wrote:
>> Oleg Nesterov wrote:
>>> On 04/02, Konstantin Khlebnikov wrote:
>>>>
>>>> In this patch I leave mm->exe_file lockless.
>>>> After exec/fork we can change it only for current task and only if mm->mm_users == 1.
>>>>
>>>> something like this:
>>>>
>>>> task_lock(current);
>>>
>>> OK, this protects against the race with get_task_mm()
>>>
>>>> if (atomic_read(&current->mm->mm_users) == 1)
>>>
>>> this means PR_SET_MM_EXE_FILE can fail simply because someone did
>>> get_task_mm(). Or the caller is multithreaded.
>>
>> This is sad; it seems we should keep mm->exe_file protected by mm->mmap_sem.
>> So, I'll rework this patch...
>
> Ah, it's about locking. I misunderstood it at first.
> Oleg, forget about my email then.

Yes, it's about locking. Please review the attached patch for your code.

[-- Attachment #2: diff-pr-set-mm-exe-file-without-vm_executable --]
[-- Type: text/plain, Size: 2121 bytes --]

diff --git a/include/linux/sched.h b/include/linux/sched.h
index cff94cd..4a41270 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -437,6 +437,7 @@ extern int get_dumpable(struct mm_struct *mm);
 					/* leave room for more dump flags */
 #define MMF_VM_MERGEABLE	16	/* KSM may merge identical pages */
 #define MMF_VM_HUGEPAGE		17	/* set when VM_HUGEPAGE is set on vma */
+#define MMF_EXE_FILE_CHANGED	18	/* see prctl(PR_SET_MM_EXE_FILE) */
 
 #define MMF_INIT_MASK		(MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK)
 
diff --git a/kernel/sys.c b/kernel/sys.c
index da660f3..b217069 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1714,17 +1714,11 @@ static bool vma_flags_mismatch(struct vm_area_struct *vma,
 
 static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
 {
+	struct vm_area_struct *vma;
 	struct file *exe_file;
 	struct dentry *dentry;
 	int err;
 
-	/*
-	 * Setting new mm::exe_file is only allowed when no VM_EXECUTABLE vma's
-	 * remain. So perform a quick test first.
-	 */
-	if (mm->num_exe_file_vmas)
-		return -EBUSY;
-
 	exe_file = fget(fd);
 	if (!exe_file)
 		return -EBADF;
@@ -1745,17 +1739,28 @@ static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
 	if (err)
 		goto exit;
 
+	down_write(&mm->mmap_sem);
+	/*
+	 * Forbid changing mm->exe_file if any other files are mapped.
+	 */
+	err = -EEXIST;
+	for (vma = mm->mmap; vma; vma = vma->vm_next) {
+		if (vma->vm_file &&
+		    !path_equal(&vma->vm_file->f_path, &exe_file->f_path))
+			goto out_unlock;
+	}
 	/*
 	 * The symlink can be changed only once, just to disallow arbitrary
 	 * transitions malicious software might bring in. This means one
 	 * could make a snapshot over all processes running and monitor
 	 * /proc/pid/exe changes to notice unusual activity if needed.
 	 */
-	down_write(&mm->mmap_sem);
-	if (likely(!mm->exe_file))
-		set_mm_exe_file(mm, exe_file);
-	else
-		err = -EBUSY;
+	err = -EBUSY;
+	if (test_and_set_bit(MMF_EXE_FILE_CHANGED, &mm->flags))
+		goto out_unlock;
+	set_mm_exe_file(mm, exe_file);
+	err = 0;
+out_unlock:
 	up_write(&mm->mmap_sem);
 
 exit:

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-04-02 17:14               ` Konstantin Khlebnikov
@ 2012-04-02 18:05                 ` Cyrill Gorcunov
  0 siblings, 0 replies; 52+ messages in thread
From: Cyrill Gorcunov @ 2012-04-02 18:05 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, Eric Paris

On Mon, Apr 02, 2012 at 09:14:44PM +0400, Konstantin Khlebnikov wrote:
...
> >
> >Ah, it's about locking. I misundertand it at first.
> >Oleg, forget about my email then.
> 
> Yes, it's about locking. Please review the attached patch for your code.

Thanks a lot, Konstantin! This should do the trick.

> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index cff94cd..4a41270 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -437,6 +437,7 @@ extern int get_dumpable(struct mm_struct *mm);
>  					/* leave room for more dump flags */
>  #define MMF_VM_MERGEABLE	16	/* KSM may merge identical pages */
>  #define MMF_VM_HUGEPAGE		17	/* set when VM_HUGEPAGE is set on vma */
> +#define MMF_EXE_FILE_CHANGED	18	/* see prctl(PR_SET_MM_EXE_FILE) */
>  
>  #define MMF_INIT_MASK		(MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK)
>  
> diff --git a/kernel/sys.c b/kernel/sys.c
> index da660f3..b217069 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -1714,17 +1714,11 @@ static bool vma_flags_mismatch(struct vm_area_struct *vma,
>  
>  static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
>  {
> +	struct vm_area_struct *vma;
>  	struct file *exe_file;
>  	struct dentry *dentry;
>  	int err;
>  
> -	/*
> -	 * Setting new mm::exe_file is only allowed when no VM_EXECUTABLE vma's
> -	 * remain. So perform a quick test first.
> -	 */
> -	if (mm->num_exe_file_vmas)
> -		return -EBUSY;
> -
>  	exe_file = fget(fd);
>  	if (!exe_file)
>  		return -EBADF;
> @@ -1745,17 +1739,28 @@ static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
>  	if (err)
>  		goto exit;
>  
> +	down_write(&mm->mmap_sem);
> +	/*
> +	 * Forbid changing mm->exe_file if any other files are mapped.
> +	 */
> +	err = -EEXIST;
> +	for (vma = mm->mmap; vma; vma = vma->vm_next) {
> +		if (vma->vm_file &&
> +		    !path_equal(&vma->vm_file->f_path, &exe_file->f_path))
> +			goto out_unlock;
> +	}

If I understand correctly, this snippet emulates the old behaviour (i.e. as
it was with num_exe_file_vmas), so -EBUSY might be more appropriate?
But it's really a small nit, I think. Thanks again.

	Cyrill

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-03-31 20:13   ` Oleg Nesterov
  2012-03-31 20:39     ` Cyrill Gorcunov
@ 2012-04-02 23:04     ` Matt Helsley
  2012-04-03  5:10       ` Konstantin Khlebnikov
  1 sibling, 1 reply; 52+ messages in thread
From: Matt Helsley @ 2012-04-02 23:04 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel,
	Eric Paris, linux-security-module, oprofile-list, Matt Helsley,
	Linus Torvalds, Al Viro, Cyrill Gorcunov

On Sat, Mar 31, 2012 at 10:13:24PM +0200, Oleg Nesterov wrote:
> On 03/31, Konstantin Khlebnikov wrote:
> >
> > comment from v2.6.25-6245-g925d1c4 ("procfs task exe symlink"),
> > where all this stuff was introduced:
> >
> > > ...
> > > This avoids pinning the mounted filesystem.
> >
> > So, this logic is hooked into every file mmap/munmap and vma split/merge just to
> > keep a hypothetical mounted fs from being pinned against umount by an mm which
> > has already unmapped all its executable files but is still alive. Does anyone
> > know any real-world example?
> 
> This is the question to Matt.

This is where I got the scenario:

https://lkml.org/lkml/2007/7/12/398

Cheers,
	-Matt Helsley

PS: I seem to keep coming back to this so I hope folks don't mind if I leave
some more references to make (re)searching this topic easier:

Thread with Cyrill Gorcunov discussing c/r of symlink:
https://lkml.org/lkml/2012/3/16/448

Thread with Oleg Nesterov re: cleanups:
https://lkml.org/lkml/2012/3/5/240

Thread with Alexey Dobriyan re: cleanups:
https://lkml.org/lkml/2009/6/4/625

mainline commit 925d1c401fa6cfd0df5d2e37da8981494ccdec07
Date:   Tue Apr 29 01:01:36 2008 -0700

	procfs task exe symlink


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-03-31  9:29 ` [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE Konstantin Khlebnikov
  2012-03-31 20:13   ` Oleg Nesterov
@ 2012-04-02 23:18   ` Matt Helsley
  2012-04-03  5:06     ` Konstantin Khlebnikov
  1 sibling, 1 reply; 52+ messages in thread
From: Matt Helsley @ 2012-04-02 23:18 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, Andrew Morton, linux-kernel, Oleg Nesterov, Eric Paris,
	linux-security-module, oprofile-list, Matt Helsley,
	Linus Torvalds, Al Viro

On Sat, Mar 31, 2012 at 01:29:29PM +0400, Konstantin Khlebnikov wrote:
> Currently the kernel sets mm->exe_file during sys_execve() and then tracks the
> number of vmas with the VM_EXECUTABLE flag in mm->num_exe_file_vmas; as soon as
> this counter drops to zero the kernel resets mm->exe_file to NULL. It also resets
> mm->exe_file at the final mmput(), when mm->mm_users drops to zero.
> 
> A vma with the VM_EXECUTABLE flag appears after mapping a file with the flag
> MAP_EXECUTABLE; such vmas can appear only at sys_execve() or after vma splitting,
> because sys_mmap ignores this flag. Usually the binfmt module sets mm->exe_file
> and mmaps some executable vmas with this file; they hold mm->exe_file while the
> task is running.
> 
> comment from v2.6.25-6245-g925d1c4 ("procfs task exe symlink"),
> where all this stuff was introduced:
> 
> > The kernel implements readlink of /proc/pid/exe by getting the file from
> > the first executable VMA.  Then the path to the file is reconstructed and
> > reported as the result.
> >
> > Because of the VMA walk the code is slightly different on nommu systems.
> > This patch avoids separate /proc/pid/exe code on nommu systems.  Instead of
> > walking the VMAs to find the first executable file-backed VMA we store a
> > reference to the exec'd file in the mm_struct.
> >
> > That reference would prevent the filesystem holding the executable file
> > from being unmounted even after unmapping the VMAs.  So we track the number
> > of VM_EXECUTABLE VMAs and drop the new reference when the last one is
> > unmapped.  This avoids pinning the mounted filesystem.
> 
> So, this logic is hooked into every file mmap/munmap and vma split/merge just to
> prevent a hypothetical case: a still-alive mm that has already unmapped all of its
> executable files pinning the filesystem against unmounting. Does anyone know a real-world example?
> The mm can be borrowed by swapoff or some get_task_mm() user, but that is not a big problem.
> 
> Thus, we can remove all this stuff together with the VM_EXECUTABLE flag and
> keep mm->exe_file alive until the final mmput().
> 
> After that we can access current->mm->exe_file without any locks
> (after checking current->mm and mm->exe_file for NULL).
> 
> Some code in the security and oprofile subsystems still uses VM_EXECUTABLE to retrieve a
> task's executable file; after this patch it uses mm->exe_file directly.
> In tomoyo and audit the mm is always current->mm; oprofile uses get_task_mm().

Perhaps I'm missing something but it seems like you ought to split
this into two patches. The first could fix up the cell, tile, etc. arch
code to use the exe_file reference rather than walk the VMAs. Then the
second patch could remove the unusual logic used to allow userspace to unpin
the mount and we could continue to discuss that separately. It would
also make the git log somewhat cleaner I think...

Cheers,
	-Matt Helsley <matthltc@us.ibm.com>

> 
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
> Cc: Matt Helsley <matthltc@us.ibm.com>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Cc: Eric Paris <eparis@redhat.com>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: linux-security-module@vger.kernel.org
> Cc: oprofile-list@lists.sf.net
> ---
>  arch/powerpc/oprofile/cell/spu_task_sync.c |   15 ++++----------
>  arch/tile/mm/elf.c                         |   12 ++++--------
>  drivers/oprofile/buffer_sync.c             |   17 +++-------------
>  include/linux/mm.h                         |    4 ----
>  include/linux/mm_types.h                   |    1 -
>  include/linux/mman.h                       |    1 -
>  kernel/auditsc.c                           |   17 ++--------------
>  kernel/fork.c                              |   29 ++++------------------------
>  mm/mmap.c                                  |   27 +++++---------------------
>  mm/nommu.c                                 |   11 +----------
>  security/tomoyo/util.c                     |   14 +++-----------
>  11 files changed, 26 insertions(+), 122 deletions(-)
> 
> diff --git a/arch/powerpc/oprofile/cell/spu_task_sync.c b/arch/powerpc/oprofile/cell/spu_task_sync.c
> index 642fca1..28f1af2 100644
> --- a/arch/powerpc/oprofile/cell/spu_task_sync.c
> +++ b/arch/powerpc/oprofile/cell/spu_task_sync.c
> @@ -304,7 +304,7 @@ static inline unsigned long fast_get_dcookie(struct path *path)
>  	return cookie;
>  }
> 
> -/* Look up the dcookie for the task's first VM_EXECUTABLE mapping,
> +/* Look up the dcookie for the task's mm->exe_file,
>   * which corresponds loosely to "application name". Also, determine
>   * the offset for the SPU ELF object.  If computed offset is
>   * non-zero, it implies an embedded SPU object; otherwise, it's a
> @@ -321,7 +321,6 @@ get_exec_dcookie_and_offset(struct spu *spu, unsigned int *offsetp,
>  {
>  	unsigned long app_cookie = 0;
>  	unsigned int my_offset = 0;
> -	struct file *app = NULL;
>  	struct vm_area_struct *vma;
>  	struct mm_struct *mm = spu->mm;
> 
> @@ -330,16 +329,10 @@ get_exec_dcookie_and_offset(struct spu *spu, unsigned int *offsetp,
> 
>  	down_read(&mm->mmap_sem);
> 
> -	for (vma = mm->mmap; vma; vma = vma->vm_next) {
> -		if (!vma->vm_file)
> -			continue;
> -		if (!(vma->vm_flags & VM_EXECUTABLE))
> -			continue;
> -		app_cookie = fast_get_dcookie(&vma->vm_file->f_path);
> +	if (mm->exe_file) {
> +		app_cookie = fast_get_dcookie(&mm->exe_file->f_path);
>  		pr_debug("got dcookie for %s\n",
> -			 vma->vm_file->f_dentry->d_name.name);
> -		app = vma->vm_file;
> -		break;
> +			 mm->exe_file->f_dentry->d_name.name);
>  	}
> 
>  	for (vma = mm->mmap; vma; vma = vma->vm_next) {
> diff --git a/arch/tile/mm/elf.c b/arch/tile/mm/elf.c
> index 758b603..43e5279 100644
> --- a/arch/tile/mm/elf.c
> +++ b/arch/tile/mm/elf.c
> @@ -39,16 +39,12 @@ static void sim_notify_exec(const char *binary_name)
>  static int notify_exec(void)
>  {
>  	int retval = 0;  /* failure */
> -	struct vm_area_struct *vma = current->mm->mmap;
> -	while (vma) {
> -		if ((vma->vm_flags & VM_EXECUTABLE) && vma->vm_file)
> -			break;
> -		vma = vma->vm_next;
> -	}
> -	if (vma) {
> +	struct mm_struct *mm = current->mm;
> +
> +	if (mm->exe_file) {
>  		char *buf = (char *) __get_free_page(GFP_KERNEL);
>  		if (buf) {
> -			char *path = d_path(&vma->vm_file->f_path,
> +			char *path = d_path(&mm->exe_file->f_path,
>  					    buf, PAGE_SIZE);
>  			if (!IS_ERR(path)) {
>  				sim_notify_exec(path);
> diff --git a/drivers/oprofile/buffer_sync.c b/drivers/oprofile/buffer_sync.c
> index f34b5b2..d93b2b6 100644
> --- a/drivers/oprofile/buffer_sync.c
> +++ b/drivers/oprofile/buffer_sync.c
> @@ -216,7 +216,7 @@ static inline unsigned long fast_get_dcookie(struct path *path)
>  }
> 
> 
> -/* Look up the dcookie for the task's first VM_EXECUTABLE mapping,
> +/* Look up the dcookie for the task's mm->exe_file,
>   * which corresponds loosely to "application name". This is
>   * not strictly necessary but allows oprofile to associate
>   * shared-library samples with particular applications
> @@ -224,21 +224,10 @@ static inline unsigned long fast_get_dcookie(struct path *path)
>  static unsigned long get_exec_dcookie(struct mm_struct *mm)
>  {
>  	unsigned long cookie = NO_COOKIE;
> -	struct vm_area_struct *vma;
> -
> -	if (!mm)
> -		goto out;
> 
> -	for (vma = mm->mmap; vma; vma = vma->vm_next) {
> -		if (!vma->vm_file)
> -			continue;
> -		if (!(vma->vm_flags & VM_EXECUTABLE))
> -			continue;
> -		cookie = fast_get_dcookie(&vma->vm_file->f_path);
> -		break;
> -	}
> +	if (mm && mm->exe_file)
> +		cookie = fast_get_dcookie(&mm->exe_file->f_path);
> 
> -out:
>  	return cookie;
>  }
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 553d134..3a4d721 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -88,7 +88,6 @@ extern unsigned int kobjsize(const void *objp);
>  #define VM_PFNMAP	0x00000400	/* Page-ranges managed without "struct page", just pure PFN */
>  #define VM_DENYWRITE	0x00000800	/* ETXTBSY on write attempts.. */
> 
> -#define VM_EXECUTABLE	0x00001000
>  #define VM_LOCKED	0x00002000
>  #define VM_IO           0x00004000	/* Memory mapped I/O or similar */
> 
> @@ -1374,9 +1373,6 @@ extern void exit_mmap(struct mm_struct *);
>  extern int mm_take_all_locks(struct mm_struct *mm);
>  extern void mm_drop_all_locks(struct mm_struct *mm);
> 
> -/* From fs/proc/base.c. callers must _not_ hold the mm's exe_file_lock */
> -extern void added_exe_file_vma(struct mm_struct *mm);
> -extern void removed_exe_file_vma(struct mm_struct *mm);
>  extern void set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file);
>  extern struct file *get_mm_exe_file(struct mm_struct *mm);
> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 3cc3062..b480c06 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -378,7 +378,6 @@ struct mm_struct {
> 
>  	/* store ref to file /proc/<pid>/exe symlink points to */
>  	struct file *exe_file;
> -	unsigned long num_exe_file_vmas;
>  #ifdef CONFIG_MMU_NOTIFIER
>  	struct mmu_notifier_mm *mmu_notifier_mm;
>  #endif
> diff --git a/include/linux/mman.h b/include/linux/mman.h
> index 8b74e9b..77cec2f 100644
> --- a/include/linux/mman.h
> +++ b/include/linux/mman.h
> @@ -86,7 +86,6 @@ calc_vm_flag_bits(unsigned long flags)
>  {
>  	return _calc_vm_trans(flags, MAP_GROWSDOWN,  VM_GROWSDOWN ) |
>  	       _calc_vm_trans(flags, MAP_DENYWRITE,  VM_DENYWRITE ) |
> -	       _calc_vm_trans(flags, MAP_EXECUTABLE, VM_EXECUTABLE) |
>  	       _calc_vm_trans(flags, MAP_LOCKED,     VM_LOCKED    );
>  }
>  #endif /* __KERNEL__ */
> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> index af1de0f..aa27a00 100644
> --- a/kernel/auditsc.c
> +++ b/kernel/auditsc.c
> @@ -1164,21 +1164,8 @@ static void audit_log_task_info(struct audit_buffer *ab, struct task_struct *tsk
>  	get_task_comm(name, tsk);
>  	audit_log_format(ab, " comm=");
>  	audit_log_untrustedstring(ab, name);
> -
> -	if (mm) {
> -		down_read(&mm->mmap_sem);
> -		vma = mm->mmap;
> -		while (vma) {
> -			if ((vma->vm_flags & VM_EXECUTABLE) &&
> -			    vma->vm_file) {
> -				audit_log_d_path(ab, " exe=",
> -						 &vma->vm_file->f_path);
> -				break;
> -			}
> -			vma = vma->vm_next;
> -		}
> -		up_read(&mm->mmap_sem);
> -	}
> +	if (mm && mm->exe_file)
> +		audit_log_d_path(ab, " exe=", &mm->exe_file->f_path);
>  	audit_log_task_context(ab);
>  }
> 
> diff --git a/kernel/fork.c b/kernel/fork.c
> index b9372a0..40e4b49 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -587,26 +587,6 @@ void mmput(struct mm_struct *mm)
>  }
>  EXPORT_SYMBOL_GPL(mmput);
> 
> -/*
> - * We added or removed a vma mapping the executable. The vmas are only mapped
> - * during exec and are not mapped with the mmap system call.
> - * Callers must hold down_write() on the mm's mmap_sem for these
> - */
> -void added_exe_file_vma(struct mm_struct *mm)
> -{
> -	mm->num_exe_file_vmas++;
> -}
> -
> -void removed_exe_file_vma(struct mm_struct *mm)
> -{
> -	mm->num_exe_file_vmas--;
> -	if ((mm->num_exe_file_vmas == 0) && mm->exe_file) {
> -		fput(mm->exe_file);
> -		mm->exe_file = NULL;
> -	}
> -
> -}
> -
>  void set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file)
>  {
>  	if (new_exe_file)
> @@ -614,20 +594,19 @@ void set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file)
>  	if (mm->exe_file)
>  		fput(mm->exe_file);
>  	mm->exe_file = new_exe_file;
> -	mm->num_exe_file_vmas = 0;
>  }
> 
> +/*
> + * Caller must have mm->mm_users reference,
> + * for example current->mm or acquired by get_task_mm().
> + */
>  struct file *get_mm_exe_file(struct mm_struct *mm)
>  {
>  	struct file *exe_file;
> 
> -	/* We need mmap_sem to protect against races with removal of
> -	 * VM_EXECUTABLE vmas */
> -	down_read(&mm->mmap_sem);
>  	exe_file = mm->exe_file;
>  	if (exe_file)
>  		get_file(exe_file);
> -	up_read(&mm->mmap_sem);
>  	return exe_file;
>  }
> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 3d254ca..2647bb7 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -230,11 +230,8 @@ static struct vm_area_struct *remove_vma(struct vm_area_struct *vma)
>  	might_sleep();
>  	if (vma->vm_ops && vma->vm_ops->close)
>  		vma->vm_ops->close(vma);
> -	if (vma->vm_file) {
> +	if (vma->vm_file)
>  		fput(vma->vm_file);
> -		if (vma->vm_flags & VM_EXECUTABLE)
> -			removed_exe_file_vma(vma->vm_mm);
> -	}
>  	mpol_put(vma_policy(vma));
>  	kmem_cache_free(vm_area_cachep, vma);
>  	return next;
> @@ -616,11 +613,8 @@ again:			remove_next = 1 + (end > next->vm_end);
>  		mutex_unlock(&mapping->i_mmap_mutex);
> 
>  	if (remove_next) {
> -		if (file) {
> +		if (file)
>  			fput(file);
> -			if (next->vm_flags & VM_EXECUTABLE)
> -				removed_exe_file_vma(mm);
> -		}
>  		if (next->anon_vma)
>  			anon_vma_merge(vma, next);
>  		mm->map_count--;
> @@ -1293,8 +1287,6 @@ munmap_back:
>  		error = file->f_op->mmap(file, vma);
>  		if (error)
>  			goto unmap_and_free_vma;
> -		if (vm_flags & VM_EXECUTABLE)
> -			added_exe_file_vma(mm);
> 
>  		/* Can addr have changed??
>  		 *
> @@ -1969,11 +1961,8 @@ static int __split_vma(struct mm_struct * mm, struct vm_area_struct * vma,
>  	if (anon_vma_clone(new, vma))
>  		goto out_free_mpol;
> 
> -	if (new->vm_file) {
> +	if (new->vm_file)
>  		get_file(new->vm_file);
> -		if (vma->vm_flags & VM_EXECUTABLE)
> -			added_exe_file_vma(mm);
> -	}
> 
>  	if (new->vm_ops && new->vm_ops->open)
>  		new->vm_ops->open(new);
> @@ -1991,11 +1980,8 @@ static int __split_vma(struct mm_struct * mm, struct vm_area_struct * vma,
>  	/* Clean everything up if vma_adjust failed. */
>  	if (new->vm_ops && new->vm_ops->close)
>  		new->vm_ops->close(new);
> -	if (new->vm_file) {
> -		if (vma->vm_flags & VM_EXECUTABLE)
> -			removed_exe_file_vma(mm);
> +	if (new->vm_file)
>  		fput(new->vm_file);
> -	}
>  	unlink_anon_vmas(new);
>   out_free_mpol:
>  	mpol_put(pol);
> @@ -2377,11 +2363,8 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
>  			new_vma->vm_start = addr;
>  			new_vma->vm_end = addr + len;
>  			new_vma->vm_pgoff = pgoff;
> -			if (new_vma->vm_file) {
> +			if (new_vma->vm_file)
>  				get_file(new_vma->vm_file);
> -				if (vma->vm_flags & VM_EXECUTABLE)
> -					added_exe_file_vma(mm);
> -			}
>  			if (new_vma->vm_ops && new_vma->vm_ops->open)
>  				new_vma->vm_ops->open(new_vma);
>  			vma_link(mm, new_vma, prev, rb_link, rb_parent);
> diff --git a/mm/nommu.c b/mm/nommu.c
> index afa0a15..d617d5c 100644
> --- a/mm/nommu.c
> +++ b/mm/nommu.c
> @@ -789,11 +789,8 @@ static void delete_vma(struct mm_struct *mm, struct vm_area_struct *vma)
>  	kenter("%p", vma);
>  	if (vma->vm_ops && vma->vm_ops->close)
>  		vma->vm_ops->close(vma);
> -	if (vma->vm_file) {
> +	if (vma->vm_file)
>  		fput(vma->vm_file);
> -		if (vma->vm_flags & VM_EXECUTABLE)
> -			removed_exe_file_vma(mm);
> -	}
>  	put_nommu_region(vma->vm_region);
>  	kmem_cache_free(vm_area_cachep, vma);
>  }
> @@ -1287,10 +1284,6 @@ unsigned long do_mmap_pgoff(struct file *file,
>  		get_file(file);
>  		vma->vm_file = file;
>  		get_file(file);
> -		if (vm_flags & VM_EXECUTABLE) {
> -			added_exe_file_vma(current->mm);
> -			vma->vm_mm = current->mm;
> -		}
>  	}
> 
>  	down_write(&nommu_region_sem);
> @@ -1443,8 +1436,6 @@ error:
>  	kmem_cache_free(vm_region_jar, region);
>  	if (vma->vm_file)
>  		fput(vma->vm_file);
> -	if (vma->vm_flags & VM_EXECUTABLE)
> -		removed_exe_file_vma(vma->vm_mm);
>  	kmem_cache_free(vm_area_cachep, vma);
>  	kleave(" = %d", ret);
>  	return ret;
> diff --git a/security/tomoyo/util.c b/security/tomoyo/util.c
> index 867558c..b929dd3 100644
> --- a/security/tomoyo/util.c
> +++ b/security/tomoyo/util.c
> @@ -949,19 +949,11 @@ bool tomoyo_path_matches_pattern(const struct tomoyo_path_info *filename,
>  const char *tomoyo_get_exe(void)
>  {
>  	struct mm_struct *mm = current->mm;
> -	struct vm_area_struct *vma;
>  	const char *cp = NULL;
> 
> -	if (!mm)
> -		return NULL;
> -	down_read(&mm->mmap_sem);
> -	for (vma = mm->mmap; vma; vma = vma->vm_next) {
> -		if ((vma->vm_flags & VM_EXECUTABLE) && vma->vm_file) {
> -			cp = tomoyo_realpath_from_path(&vma->vm_file->f_path);
> -			break;
> -		}
> -	}
> -	up_read(&mm->mmap_sem);
> +	if (mm && mm->exe_file)
> +		cp = tomoyo_realpath_from_path(&mm->exe_file->f_path);
> +
>  	return cp;
>  }
> 
> 



* [x86 PAT PATCH 0/2] x86 PAT vm_flag code refactoring
  2012-03-31 17:09   ` [PATCH 1/7 v2] " Konstantin Khlebnikov
@ 2012-04-03  0:46     ` Suresh Siddha
  2012-04-03  0:46       ` [x86 PAT PATCH 1/2] x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn vma routines Suresh Siddha
                         ` (2 more replies)
  0 siblings, 3 replies; 52+ messages in thread
From: Suresh Siddha @ 2012-04-03  0:46 UTC (permalink / raw)
  To: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel
  Cc: Suresh Siddha, Andi Kleen, Pallipadi Venkatesh, Ingo Molnar,
	H. Peter Anvin, Linus Torvalds, Nick Piggin

Konstantin,

On Sat, 2012-03-31 at 21:09 +0400, Konstantin Khlebnikov wrote:
> v2: Do not use batched pfn reserving for single-page VMAs. It is not optimal
> and breaks something: I see glitches on the screen with the i915/drm driver.
> With this version the glitches are gone, and I see the same regions in
> /sys/kernel/debug/x86/pat_memtype_list as before the patch. So, please review this
> carefully; probably I'm wrong somewhere, or I have triggered some hidden bug.

Actually it is not a hidden bug. In the original code, we were setting
VM_PFN_AT_MMAP only for remap_pfn_range(), not for vm_insert_pfn().
Also, the value of 'vm_pgoff' depends on the driver/mmap_region() in the
vm_insert_pfn() case. But with your proposed code, you were setting
VM_PAT for single-page VMAs as well and ended up using the wrong vm_pgoff in
untrack_pfn_vma().

We can simplify the track/untrack pfn routines and can remove the
dependency on vm_pgoff completely. Am appending a patch which does this
and also modified your x86 PAT patch based on this. Can you please
check and if you are ok, merge these bits with the rest of your patches.

thanks,
suresh
---

Konstantin Khlebnikov (1):
  mm, x86, PAT: rework linear pfn-mmap tracking

Suresh Siddha (1):
  x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn
    vma routines

 arch/x86/mm/pat.c             |   38 ++++++++++++++++++++++++--------------
 include/asm-generic/pgtable.h |    4 ++--
 include/linux/mm.h            |   15 +--------------
 mm/huge_memory.c              |    7 +++----
 mm/memory.c                   |   15 ++++++++-------
 5 files changed, 38 insertions(+), 41 deletions(-)

-- 
1.7.6.5



* [x86 PAT PATCH 1/2] x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn vma routines
  2012-04-03  0:46     ` [x86 PAT PATCH 0/2] x86 PAT vm_flag code refactoring Suresh Siddha
@ 2012-04-03  0:46       ` Suresh Siddha
  2012-04-03  5:37         ` Konstantin Khlebnikov
  2012-04-03  0:46       ` [x86 PAT PATCH 2/2] " Suresh Siddha
  2012-04-03  6:03       ` [x86 PAT PATCH 0/2] x86 PAT vm_flag code refactoring Konstantin Khlebnikov
  2 siblings, 1 reply; 52+ messages in thread
From: Suresh Siddha @ 2012-04-03  0:46 UTC (permalink / raw)
  To: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel
  Cc: Suresh Siddha, Andi Kleen, Pallipadi Venkatesh, Ingo Molnar,
	H. Peter Anvin, Linus Torvalds, Nick Piggin,
	Konstantin Khlebnikov

The 'pfn' argument of track_pfn_vma_new() can be used to reserve the attribute
for the pfn range; there is no need to depend on 'vm_pgoff'.

Similarly, untrack_pfn_vma() can rely on the 'pfn' argument when it
is non-zero, or use follow_phys() to get the starting value of the pfn
range.

Also, the non-zero 'size' argument can be used instead of recomputing
it from the vma.

This cleanup also prepares the ground for the track/untrack pfn vma routines
to take over ownership of setting the PAT-specific vm_flag in the 'vma'.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Venkatesh Pallipadi <venki@google.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 arch/x86/mm/pat.c |   30 +++++++++++++++++-------------
 1 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index f6ff57b..617f42b 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -693,14 +693,10 @@ int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
 			unsigned long pfn, unsigned long size)
 {
 	unsigned long flags;
-	resource_size_t paddr;
-	unsigned long vma_size = vma->vm_end - vma->vm_start;
 
-	if (is_linear_pfn_mapping(vma)) {
-		/* reserve the whole chunk starting from vm_pgoff */
-		paddr = (resource_size_t)vma->vm_pgoff << PAGE_SHIFT;
-		return reserve_pfn_range(paddr, vma_size, prot, 0);
-	}
+	/* reserve the whole chunk starting from pfn */
+	if (is_linear_pfn_mapping(vma))
+		return reserve_pfn_range(pfn, size, prot, 0);
 
 	if (!pat_enabled)
 		return 0;
@@ -716,20 +712,28 @@ int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
 /*
  * untrack_pfn_vma is called while unmapping a pfnmap for a region.
  * untrack can be called for a specific region indicated by pfn and size or
- * can be for the entire vma (in which case size can be zero).
+ * can be for the entire vma (in which case pfn, size are zero).
  */
 void untrack_pfn_vma(struct vm_area_struct *vma, unsigned long pfn,
 			unsigned long size)
 {
 	resource_size_t paddr;
-	unsigned long vma_size = vma->vm_end - vma->vm_start;
+	unsigned long prot;
 
-	if (is_linear_pfn_mapping(vma)) {
-		/* free the whole chunk starting from vm_pgoff */
-		paddr = (resource_size_t)vma->vm_pgoff << PAGE_SHIFT;
-		free_pfn_range(paddr, vma_size);
+	if (!is_linear_pfn_mapping(vma))
 		return;
+
+	/* free the chunk starting from pfn or the whole chunk */
+	paddr = (resource_size_t)pfn;
+	if (!paddr && !size) {
+		if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr)) {
+			WARN_ON_ONCE(1);
+			return;
+		}
+
+		size = vma->vm_end - vma->vm_start;
 	}
+	free_pfn_range(paddr, size);
 }
 
 pgprot_t pgprot_writecombine(pgprot_t prot)
-- 
1.7.6.5



* [x86 PAT PATCH 2/2] mm, x86, PAT: rework linear pfn-mmap tracking
  2012-04-03  0:46     ` [x86 PAT PATCH 0/2] x86 PAT vm_flag code refactoring Suresh Siddha
  2012-04-03  0:46       ` [x86 PAT PATCH 1/2] x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn vma routines Suresh Siddha
@ 2012-04-03  0:46       ` Suresh Siddha
  2012-04-03  5:48         ` Konstantin Khlebnikov
  2012-04-03  6:03       ` [x86 PAT PATCH 0/2] x86 PAT vm_flag code refactoring Konstantin Khlebnikov
  2 siblings, 1 reply; 52+ messages in thread
From: Suresh Siddha @ 2012-04-03  0:46 UTC (permalink / raw)
  To: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel
  Cc: Konstantin Khlebnikov, Andi Kleen, Suresh Siddha,
	Pallipadi Venkatesh, Ingo Molnar, H. Peter Anvin, Linus Torvalds,
	Nick Piggin

From: Konstantin Khlebnikov <khlebnikov@openvz.org>

This patch replaces generic vma-flag VM_PFN_AT_MMAP with x86-only VM_PAT.

We can toss mapping address from remap_pfn_range() into track_pfn_vma_new(),
and collect all PAT-related logic together in arch/x86/.

This patch also restores the original frustration-free is_cow_mapping() check in
remap_pfn_range(), as it was before commit v2.6.28-rc8-88-g3c8bb73
("x86: PAT: store vm_pgoff for all linear_over_vma_region mappings - v3").

The is_linear_pfn_mapping() checks can be removed from mm/huge_memory.c,
because that case is already handled by the VM_PFNMAP bit in the VM_NO_THP mask.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Venkatesh Pallipadi <venki@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Ingo Molnar <mingo@redhat.com>
---
 arch/x86/mm/pat.c             |   16 +++++++++++-----
 include/asm-generic/pgtable.h |    4 ++--
 include/linux/mm.h            |   15 +--------------
 mm/huge_memory.c              |    7 +++----
 mm/memory.c                   |   15 ++++++++-------
 5 files changed, 25 insertions(+), 32 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 617f42b..cde7e19 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -665,7 +665,7 @@ int track_pfn_vma_copy(struct vm_area_struct *vma)
 	unsigned long vma_size = vma->vm_end - vma->vm_start;
 	pgprot_t pgprot;
 
-	if (is_linear_pfn_mapping(vma)) {
+	if (vma->vm_flags & VM_PAT) {
 		/*
 		 * reserve the whole chunk covered by vma. We need the
 		 * starting address and protection from pte.
@@ -690,13 +690,19 @@ int track_pfn_vma_copy(struct vm_area_struct *vma)
  * single reserve_pfn_range call.
  */
 int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
-			unsigned long pfn, unsigned long size)
+		      unsigned long addr, unsigned long pfn, unsigned long size)
 {
 	unsigned long flags;
 
 	/* reserve the whole chunk starting from pfn */
-	if (is_linear_pfn_mapping(vma))
-		return reserve_pfn_range(pfn, size, prot, 0);
+	if (addr == vma->vm_start && size == (vma->vm_end - vma->vm_start)) {
+		int ret;
+
+		ret = reserve_pfn_range(pfn, size, prot, 0);
+		if (!ret)
+			vma->vm_flags |= VM_PAT;
+		return ret;
+	}
 
 	if (!pat_enabled)
 		return 0;
@@ -720,7 +726,7 @@ void untrack_pfn_vma(struct vm_area_struct *vma, unsigned long pfn,
 	resource_size_t paddr;
 	unsigned long prot;
 
-	if (!is_linear_pfn_mapping(vma))
+	if (!(vma->vm_flags & VM_PAT))
 		return;
 
 	/* free the chunk starting from pfn or the whole chunk */
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 125c54e..688a2a5 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -389,7 +389,7 @@ static inline void ptep_modify_prot_commit(struct mm_struct *mm,
  * for physical range indicated by pfn and size.
  */
 static inline int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
-					unsigned long pfn, unsigned long size)
+		unsigned long pfn, unsigned long addr, unsigned long size)
 {
 	return 0;
 }
@@ -420,7 +420,7 @@ static inline void untrack_pfn_vma(struct vm_area_struct *vma,
 }
 #else
 extern int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
-				unsigned long pfn, unsigned long size);
+		unsigned long pfn, unsigned long addr, unsigned long size);
 extern int track_pfn_vma_copy(struct vm_area_struct *vma);
 extern void untrack_pfn_vma(struct vm_area_struct *vma, unsigned long pfn,
 				unsigned long size);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index d8738a4..b8e5fe5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -117,7 +117,7 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_CAN_NONLINEAR 0x08000000	/* Has ->fault & does nonlinear pages */
 #define VM_MIXEDMAP	0x10000000	/* Can contain "struct page" and pure PFN pages */
 #define VM_SAO		0x20000000	/* Strong Access Ordering (powerpc) */
-#define VM_PFN_AT_MMAP	0x40000000	/* PFNMAP vma that is fully mapped at mmap time */
+#define VM_PAT		0x40000000	/* PAT reserves whole VMA at once (x86) */
 #define VM_MERGEABLE	0x80000000	/* KSM may merge identical pages */
 
 /* Bits set in the VMA until the stack is in its final location */
@@ -158,19 +158,6 @@ extern pgprot_t protection_map[16];
 #define FAULT_FLAG_RETRY_NOWAIT	0x10	/* Don't drop mmap_sem and wait when retrying */
 #define FAULT_FLAG_KILLABLE	0x20	/* The fault task is in SIGKILL killable region */
 
-/*
- * This interface is used by x86 PAT code to identify a pfn mapping that is
- * linear over entire vma. This is to optimize PAT code that deals with
- * marking the physical region with a particular prot. This is not for generic
- * mm use. Note also that this check will not work if the pfn mapping is
- * linear for a vma starting at physical address 0. In which case PAT code
- * falls back to slow path of reserving physical range page by page.
- */
-static inline int is_linear_pfn_mapping(struct vm_area_struct *vma)
-{
-	return !!(vma->vm_flags & VM_PFN_AT_MMAP);
-}
-
 static inline int is_pfn_mapping(struct vm_area_struct *vma)
 {
 	return !!(vma->vm_flags & VM_PFNMAP);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f0e5306..cf827da 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1650,7 +1650,7 @@ int khugepaged_enter_vma_merge(struct vm_area_struct *vma)
 	 * If is_pfn_mapping() is true is_learn_pfn_mapping() must be
 	 * true too, verify it here.
 	 */
-	VM_BUG_ON(is_linear_pfn_mapping(vma) || vma->vm_flags & VM_NO_THP);
+	VM_BUG_ON(vma->vm_flags & VM_NO_THP);
 	hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
 	hend = vma->vm_end & HPAGE_PMD_MASK;
 	if (hstart < hend)
@@ -1908,7 +1908,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 	 * If is_pfn_mapping() is true is_learn_pfn_mapping() must be
 	 * true too, verify it here.
 	 */
-	VM_BUG_ON(is_linear_pfn_mapping(vma) || vma->vm_flags & VM_NO_THP);
+	VM_BUG_ON(vma->vm_flags & VM_NO_THP);
 
 	pgd = pgd_offset(mm, address);
 	if (!pgd_present(*pgd))
@@ -2150,8 +2150,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
 		 * If is_pfn_mapping() is true is_learn_pfn_mapping()
 		 * must be true too, verify it here.
 		 */
-		VM_BUG_ON(is_linear_pfn_mapping(vma) ||
-			  vma->vm_flags & VM_NO_THP);
+		VM_BUG_ON(vma->vm_flags & VM_NO_THP);
 
 		hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
 		hend = vma->vm_end & HPAGE_PMD_MASK;
diff --git a/mm/memory.c b/mm/memory.c
index 6105f47..e6e4dfd 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2145,7 +2145,7 @@ int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 
 	if (addr < vma->vm_start || addr >= vma->vm_end)
 		return -EFAULT;
-	if (track_pfn_vma_new(vma, &pgprot, pfn, PAGE_SIZE))
+	if (track_pfn_vma_new(vma, &pgprot, pfn, addr, PAGE_SIZE))
 		return -EINVAL;
 
 	ret = insert_pfn(vma, addr, pfn, pgprot);
@@ -2285,23 +2285,24 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
 	 * There's a horrible special case to handle copy-on-write
 	 * behaviour that some programs depend on. We mark the "original"
 	 * un-COW'ed pages by matching them up with "vma->vm_pgoff".
+	 * See vm_normal_page() for details.
 	 */
-	if (addr == vma->vm_start && end == vma->vm_end) {
+
+	if (is_cow_mapping(vma->vm_flags)) {
+		if (addr != vma->vm_start || end != vma->vm_end)
+			return -EINVAL;
 		vma->vm_pgoff = pfn;
-		vma->vm_flags |= VM_PFN_AT_MMAP;
-	} else if (is_cow_mapping(vma->vm_flags))
-		return -EINVAL;
+	}
 
 	vma->vm_flags |= VM_IO | VM_RESERVED | VM_PFNMAP;
 
-	err = track_pfn_vma_new(vma, &prot, pfn, PAGE_ALIGN(size));
+	err = track_pfn_vma_new(vma, &prot, pfn, addr, PAGE_ALIGN(size));
 	if (err) {
 		/*
 		 * To indicate that track_pfn related cleanup is not
 		 * needed from higher level routine calling unmap_vmas
 		 */
 		vma->vm_flags &= ~(VM_IO | VM_RESERVED | VM_PFNMAP);
-		vma->vm_flags &= ~VM_PFN_AT_MMAP;
 		return -EINVAL;
 	}
 
-- 
1.7.6.5



* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-04-02 23:18   ` Matt Helsley
@ 2012-04-03  5:06     ` Konstantin Khlebnikov
  2012-04-06 22:48       ` Andrew Morton
  0 siblings, 1 reply; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-03  5:06 UTC (permalink / raw)
  To: Matt Helsley
  Cc: linux-mm, Andrew Morton, linux-kernel, Oleg Nesterov, Eric Paris,
	linux-security-module, oprofile-list, Linus Torvalds, Al Viro

Matt Helsley wrote:
> On Sat, Mar 31, 2012 at 01:29:29PM +0400, Konstantin Khlebnikov wrote:
>> Currently the kernel sets mm->exe_file during sys_execve() and then tracks the
>> number of vmas with the VM_EXECUTABLE flag in mm->num_exe_file_vmas; as soon as
>> this counter drops to zero the kernel resets mm->exe_file to NULL. It also resets
>> mm->exe_file at the last mmput(), when mm->mm_users drops to zero.
>>
>> A vma with the VM_EXECUTABLE flag appears after mapping a file with MAP_EXECUTABLE;
>> such vmas can appear only at sys_execve() or after vma splitting, because
>> sys_mmap ignores this flag. Usually the binfmt module sets mm->exe_file and mmaps
>> some executable vmas with this file; they hold mm->exe_file while the task is running.
>>
>> comment from v2.6.25-6245-g925d1c4 ("procfs task exe symlink"),
>> where all this stuff was introduced:
>>
>>> The kernel implements readlink of /proc/pid/exe by getting the file from
>>> the first executable VMA.  Then the path to the file is reconstructed and
>>> reported as the result.
>>>
>>> Because of the VMA walk the code is slightly different on nommu systems.
>>> This patch avoids separate /proc/pid/exe code on nommu systems.  Instead of
>>> walking the VMAs to find the first executable file-backed VMA we store a
>>> reference to the exec'd file in the mm_struct.
>>>
>>> That reference would prevent the filesystem holding the executable file
>>> from being unmounted even after unmapping the VMAs.  So we track the number
>>> of VM_EXECUTABLE VMAs and drop the new reference when the last one is
>>> unmapped.  This avoids pinning the mounted filesystem.
>>
>> So, this logic is hooked into every file mmap/munmap and vma split/merge just to
>> prevent a hypothetical pinning of a filesystem against umounting by an mm which has
>> already unmapped all its executable files but is still alive. Does anyone know any
>> real-world example? The mm can be borrowed by swapoff or some get_task_mm() user,
>> but that is not a big problem.
>>
>> Thus, we can remove all this stuff together with the VM_EXECUTABLE flag and
>> keep mm->exe_file alive until the final mmput().
>>
>> After that we can access current->mm->exe_file without any locks
>> (after checking current->mm and mm->exe_file for NULL).
>>
>> Some code around security and oprofile still uses VM_EXECUTABLE to retrieve the
>> task's executable file; after this patch it will use mm->exe_file directly.
>> In tomoyo and audit the mm is always current->mm; oprofile uses get_task_mm().
>
> Perhaps I'm missing something but it seems like you ought to split
> this into two patches. The first could fix up the cell, tile, etc. arch
> code to use the exe_file reference rather than walk the VMAs. Then the
> second patch could remove the unusual logic used to allow userspace to unpin
> the mount and we could continue to discuss that separately. It would
> also make the git log somewhat cleaner I think...

Ok, I'll resend this patch as an independent patch-set;
in any case I need to bring the mm->mmap_sem locking back.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-04-02 23:04     ` Matt Helsley
@ 2012-04-03  5:10       ` Konstantin Khlebnikov
  2012-04-03 18:16         ` Matt Helsley
  0 siblings, 1 reply; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-03  5:10 UTC (permalink / raw)
  To: Matt Helsley
  Cc: Oleg Nesterov, linux-mm, Andrew Morton, linux-kernel, Eric Paris,
	linux-security-module, oprofile-list, Linus Torvalds, Al Viro,
	Cyrill Gorcunov

Matt Helsley wrote:
> On Sat, Mar 31, 2012 at 10:13:24PM +0200, Oleg Nesterov wrote:
>> On 03/31, Konstantin Khlebnikov wrote:
>>>
>>> comment from v2.6.25-6245-g925d1c4 ("procfs task exe symlink"),
>>> where all this stuff was introduced:
>>>
>>>> ...
>>>> This avoids pinning the mounted filesystem.
>>>
>>> So, this logic is hooked into every file mmap/munmap and vma split/merge just to
>>> prevent a hypothetical pinning of a filesystem against umounting by an mm which has
>>> already unmapped all its executable files but is still alive. Does anyone know any
>>> real-world example?
>>
>> This is the question to Matt.
>
> This is where I got the scenario:
>
> https://lkml.org/lkml/2007/7/12/398

Cyrill Gorcunov's patch "c/r: prctl: add ability to set new mm_struct::exe_file"
gives userspace the ability to unpin the vfsmount explicitly.

https://lkml.org/lkml/2012/3/16/449

>
> Cheers,
> 	-Matt Helsley
>
> PS: I seem to keep coming back to this so I hope folks don't mind if I leave
> some more references to make (re)searching this topic easier:
>
> Thread with Cyrill Gorcunov discussing c/r of symlink:
> https://lkml.org/lkml/2012/3/16/448
>
> Thread with Oleg Nesterov re: cleanups:
> https://lkml.org/lkml/2012/3/5/240
>
> Thread with Alexey Dobriyan re: cleanups:
> https://lkml.org/lkml/2009/6/4/625
>
> mainline commit 925d1c401fa6cfd0df5d2e37da8981494ccdec07
> Date:   Tue Apr 29 01:01:36 2008 -0700
>
> 	procfs task exe symlink
>


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [x86 PAT PATCH 1/2] x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn vma routines
  2012-04-03  0:46       ` [x86 PAT PATCH 1/2] x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn vma routines Suresh Siddha
@ 2012-04-03  5:37         ` Konstantin Khlebnikov
  2012-04-03 23:31           ` Suresh Siddha
  0 siblings, 1 reply; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-03  5:37 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel,
	Andi Kleen, Pallipadi Venkatesh, Ingo Molnar, H. Peter Anvin,
	Linus Torvalds, Nick Piggin

Suresh Siddha wrote:
> The 'pfn' argument of track_pfn_vma_new() can be used for reserving the attribute
> for the pfn range. There is no need to depend on 'vm_pgoff'.
>
> Similarly, untrack_pfn_vma() can depend on the 'pfn' argument if it
> is non-zero, or can use follow_phys() to get the starting value of the pfn
> range.
>
> Also, the non-zero 'size' argument can be used instead of recomputing
> it from the vma.
>
> This cleanup also prepares the ground for the track/untrack pfn vma routines
> to take over the ownership of setting PAT specific vm_flag in the 'vma'.
>
> Signed-off-by: Suresh Siddha<suresh.b.siddha@intel.com>
> Cc: Venkatesh Pallipadi<venki@google.com>
> Cc: Konstantin Khlebnikov<khlebnikov@openvz.org>
> ---
>   arch/x86/mm/pat.c |   30 +++++++++++++++++-------------
>   1 files changed, 17 insertions(+), 13 deletions(-)
>
> diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
> index f6ff57b..617f42b 100644
> --- a/arch/x86/mm/pat.c
> +++ b/arch/x86/mm/pat.c
> @@ -693,14 +693,10 @@ int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
>   			unsigned long pfn, unsigned long size)
>   {
>   	unsigned long flags;
> -	resource_size_t paddr;
> -	unsigned long vma_size = vma->vm_end - vma->vm_start;
>
> -	if (is_linear_pfn_mapping(vma)) {
> -		/* reserve the whole chunk starting from vm_pgoff */
> -		paddr = (resource_size_t)vma->vm_pgoff << PAGE_SHIFT;
> -		return reserve_pfn_range(paddr, vma_size, prot, 0);
> -	}
> +	/* reserve the whole chunk starting from pfn */
> +	if (is_linear_pfn_mapping(vma))
> +		return reserve_pfn_range(pfn, size, prot, 0);

You are mixing up pfn and paddr here: the old code passes paddr as the first argument of reserve_pfn_range().

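The pfn/paddr distinction behind this remark: a pfn is a page-frame index, while a physical address is a byte offset, so passing a raw pfn where a paddr is expected is off by a factor of PAGE_SIZE. A minimal sketch of the conversion the old code performed, assuming the common 4 KiB page size:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* 4 KiB pages, the common x86 case */

typedef uint64_t resource_size_t;

/* Convert a page frame number to the physical byte address it names,
 * as the old code did with vma->vm_pgoff before calling
 * reserve_pfn_range(). */
static resource_size_t pfn_to_paddr(unsigned long pfn)
{
	return (resource_size_t)pfn << PAGE_SHIFT;
}
```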
>
>   	if (!pat_enabled)
>   		return 0;
> @@ -716,20 +712,28 @@ int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
>   /*
>    * untrack_pfn_vma is called while unmapping a pfnmap for a region.
>    * untrack can be called for a specific region indicated by pfn and size or
> - * can be for the entire vma (in which case size can be zero).
> + * can be for the entire vma (in which case pfn, size are zero).
>    */
>   void untrack_pfn_vma(struct vm_area_struct *vma, unsigned long pfn,
>   			unsigned long size)
>   {
>   	resource_size_t paddr;
> -	unsigned long vma_size = vma->vm_end - vma->vm_start;
> +	unsigned long prot;
>
> -	if (is_linear_pfn_mapping(vma)) {
> -		/* free the whole chunk starting from vm_pgoff */
> -		paddr = (resource_size_t)vma->vm_pgoff<<  PAGE_SHIFT;
> -		free_pfn_range(paddr, vma_size);
> +	if (!is_linear_pfn_mapping(vma))
>   		return;
> +
> +	/* free the chunk starting from pfn or the whole chunk */
> +	paddr = (resource_size_t)pfn;
> +	if (!paddr && !size) {
> +		if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr)) {
> +			WARN_ON_ONCE(1);
> +			return;
> +		}
> +
> +		size = vma->vm_end - vma->vm_start;
>   	}
> +	free_pfn_range(paddr, size);
>   }
>
>   pgprot_t pgprot_writecombine(pgprot_t prot)


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [x86 PAT PATCH 2/2] mm, x86, PAT: rework linear pfn-mmap tracking
  2012-04-03  0:46       ` [x86 PAT PATCH 2/2] " Suresh Siddha
@ 2012-04-03  5:48         ` Konstantin Khlebnikov
  2012-04-03  5:55           ` Konstantin Khlebnikov
  0 siblings, 1 reply; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-03  5:48 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel,
	Andi Kleen, Pallipadi Venkatesh, Ingo Molnar, H. Peter Anvin,
	Linus Torvalds, Nick Piggin, Nick Piggin

Suresh Siddha wrote:
> From: Konstantin Khlebnikov<khlebnikov@openvz.org>
>
> This patch replaces generic vma-flag VM_PFN_AT_MMAP with x86-only VM_PAT.
>
> We can toss mapping address from remap_pfn_range() into track_pfn_vma_new(),
> and collect all PAT-related logic together in arch/x86/.
>
> This patch also restores the original frustration-free is_cow_mapping() check in
> remap_pfn_range(), as it was before commit v2.6.28-rc8-88-g3c8bb73
> ("x86: PAT: store vm_pgoff for all linear_over_vma_region mappings - v3").
>
> is_linear_pfn_mapping() checks can be removed from mm/huge_memory.c,
> because they are already handled by VM_PFNMAP in the VM_NO_THP bit-mask.
>
> Signed-off-by: Konstantin Khlebnikov<khlebnikov@openvz.org>
> Signed-off-by: Suresh Siddha<suresh.b.siddha@intel.com>
> Cc: Venkatesh Pallipadi<venki@google.com>
> Cc: H. Peter Anvin<hpa@zytor.com>
> Cc: Nick Piggin<npiggin@suse.de>
> Cc: Ingo Molnar<mingo@redhat.com>
> ---
>   arch/x86/mm/pat.c             |   16 +++++++++++-----
>   include/asm-generic/pgtable.h |    4 ++--
>   include/linux/mm.h            |   15 +--------------
>   mm/huge_memory.c              |    7 +++----
>   mm/memory.c                   |   15 ++++++++-------
>   5 files changed, 25 insertions(+), 32 deletions(-)
>
> diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
> index 617f42b..cde7e19 100644
> --- a/arch/x86/mm/pat.c
> +++ b/arch/x86/mm/pat.c
> @@ -665,7 +665,7 @@ int track_pfn_vma_copy(struct vm_area_struct *vma)
>   	unsigned long vma_size = vma->vm_end - vma->vm_start;
>   	pgprot_t pgprot;
>
> -	if (is_linear_pfn_mapping(vma)) {
> +	if (vma->vm_flags & VM_PAT) {
>   		/*
>   		 * reserve the whole chunk covered by vma. We need the
>   		 * starting address and protection from pte.
> @@ -690,13 +690,19 @@ int track_pfn_vma_copy(struct vm_area_struct *vma)
>    * single reserve_pfn_range call.
>    */
>   int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
> -			unsigned long pfn, unsigned long size)
> +		      unsigned long addr, unsigned long pfn, unsigned long size)
>   {
>   	unsigned long flags;
>
>   	/* reserve the whole chunk starting from pfn */
> -	if (is_linear_pfn_mapping(vma))
> -		return reserve_pfn_range(pfn, size, prot, 0);
> +	if (addr == vma->vm_start && size == (vma->vm_end - vma->vm_start)) {
> +		int ret;
> +
> +		ret = reserve_pfn_range(pfn, size, prot, 0);
> +		if (!ret)
> +			vma->vm_flags |= VM_PAT;
> +		return ret;
> +	}
>
>   	if (!pat_enabled)
>   		return 0;
> @@ -720,7 +726,7 @@ void untrack_pfn_vma(struct vm_area_struct *vma, unsigned long pfn,
>   	resource_size_t paddr;
>   	unsigned long prot;
>
> -	if (!is_linear_pfn_mapping(vma))
> +	if (!(vma->vm_flags & VM_PAT))
>   		return;
>
>   	/* free the chunk starting from pfn or the whole chunk */
> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
> index 125c54e..688a2a5 100644
> --- a/include/asm-generic/pgtable.h
> +++ b/include/asm-generic/pgtable.h
> @@ -389,7 +389,7 @@ static inline void ptep_modify_prot_commit(struct mm_struct *mm,
>    * for physical range indicated by pfn and size.
>    */
>   static inline int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
> -					unsigned long pfn, unsigned long size)
> +		unsigned long pfn, unsigned long addr, unsigned long size)
>   {
>   	return 0;
>   }
> @@ -420,7 +420,7 @@ static inline void untrack_pfn_vma(struct vm_area_struct *vma,
>   }
>   #else
>   extern int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
> -				unsigned long pfn, unsigned long size);
> +		unsigned long pfn, unsigned long addr, unsigned long size);
>   extern int track_pfn_vma_copy(struct vm_area_struct *vma);
>   extern void untrack_pfn_vma(struct vm_area_struct *vma, unsigned long pfn,
>   				unsigned long size);
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index d8738a4..b8e5fe5 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -117,7 +117,7 @@ extern unsigned int kobjsize(const void *objp);
> #define VM_CAN_NONLINEAR 0x08000000	/* Has ->fault & does nonlinear pages */
>   #define VM_MIXEDMAP	0x10000000	/* Can contain "struct page" and pure PFN pages */
>   #define VM_SAO		0x20000000	/* Strong Access Ordering (powerpc) */
> -#define VM_PFN_AT_MMAP	0x40000000	/* PFNMAP vma that is fully mapped at mmap time */
> +#define VM_PAT		0x40000000	/* PAT reserves whole VMA at once (x86) */
>   #define VM_MERGEABLE	0x80000000	/* KSM may merge identical pages */
>
>   /* Bits set in the VMA until the stack is in its final location */
> @@ -158,19 +158,6 @@ extern pgprot_t protection_map[16];
>   #define FAULT_FLAG_RETRY_NOWAIT	0x10	/* Don't drop mmap_sem and wait when retrying */
>   #define FAULT_FLAG_KILLABLE	0x20	/* The fault task is in SIGKILL killable region */
>
> -/*
> - * This interface is used by x86 PAT code to identify a pfn mapping that is
> - * linear over entire vma. This is to optimize PAT code that deals with
> - * marking the physical region with a particular prot. This is not for generic
> - * mm use. Note also that this check will not work if the pfn mapping is
> - * linear for a vma starting at physical address 0. In which case PAT code
> - * falls back to slow path of reserving physical range page by page.
> - */
> -static inline int is_linear_pfn_mapping(struct vm_area_struct *vma)
> -{
> -	return !!(vma->vm_flags & VM_PFN_AT_MMAP);
> -}
> -
>   static inline int is_pfn_mapping(struct vm_area_struct *vma)
>   {
>   	return !!(vma->vm_flags & VM_PFNMAP);
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index f0e5306..cf827da 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1650,7 +1650,7 @@ int khugepaged_enter_vma_merge(struct vm_area_struct *vma)
>   	 * If is_pfn_mapping() is true is_learn_pfn_mapping() must be
>   	 * true too, verify it here.
>   	 */
> -	VM_BUG_ON(is_linear_pfn_mapping(vma) || vma->vm_flags & VM_NO_THP);
> +	VM_BUG_ON(vma->vm_flags & VM_NO_THP);
>   	hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
>   	hend = vma->vm_end & HPAGE_PMD_MASK;
>   	if (hstart < hend)
> @@ -1908,7 +1908,7 @@ static void collapse_huge_page(struct mm_struct *mm,
>   	 * If is_pfn_mapping() is true is_learn_pfn_mapping() must be
>   	 * true too, verify it here.
>   	 */
> -	VM_BUG_ON(is_linear_pfn_mapping(vma) || vma->vm_flags & VM_NO_THP);
> +	VM_BUG_ON(vma->vm_flags & VM_NO_THP);
>
>   	pgd = pgd_offset(mm, address);
>   	if (!pgd_present(*pgd))
> @@ -2150,8 +2150,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
>   		 * If is_pfn_mapping() is true is_learn_pfn_mapping()
>   		 * must be true too, verify it here.
>   		 */
> -		VM_BUG_ON(is_linear_pfn_mapping(vma) ||
> -			  vma->vm_flags & VM_NO_THP);
> +		VM_BUG_ON(vma->vm_flags & VM_NO_THP);
>
>   		hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
>   		hend = vma->vm_end & HPAGE_PMD_MASK;
> diff --git a/mm/memory.c b/mm/memory.c
> index 6105f47..e6e4dfd 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2145,7 +2145,7 @@ int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
>
>   	if (addr < vma->vm_start || addr >= vma->vm_end)
>   		return -EFAULT;
> -	if (track_pfn_vma_new(vma, &pgprot, pfn, PAGE_SIZE))
> +	if (track_pfn_vma_new(vma, &pgprot, pfn, addr, PAGE_SIZE))
>   		return -EINVAL;

The old code does not use PAT for vm_insert_pfn(), but now it can use it for a
single-page vma. And I see glitches on my notebook if the kernel does this
(see the comment in v2 of my patch).

Probably we shouldn't touch this; besides, using the PAT engine for a single page
doesn't seem optimal: it allocates special control structures for it.

>
>   	ret = insert_pfn(vma, addr, pfn, pgprot);
> @@ -2285,23 +2285,24 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
>   	 * There's a horrible special case to handle copy-on-write
>   	 * behaviour that some programs depend on. We mark the "original"
>   	 * un-COW'ed pages by matching them up with "vma->vm_pgoff".
> +	 * See vm_normal_page() for details.
>   	 */
> -	if (addr == vma->vm_start && end == vma->vm_end) {
> +
> +	if (is_cow_mapping(vma->vm_flags)) {
> +		if (addr != vma->vm_start || end != vma->vm_end)
> +			return -EINVAL;
>   		vma->vm_pgoff = pfn;
> -		vma->vm_flags |= VM_PFN_AT_MMAP;
> -	} else if (is_cow_mapping(vma->vm_flags))
> -		return -EINVAL;
> +	}
>
>   	vma->vm_flags |= VM_IO | VM_RESERVED | VM_PFNMAP;
>
> -	err = track_pfn_vma_new(vma, &prot, pfn, PAGE_ALIGN(size));
> +	err = track_pfn_vma_new(vma, &prot, pfn, addr, PAGE_ALIGN(size));
>   	if (err) {
>   		/*
>   		 * To indicate that track_pfn related cleanup is not
>   		 * needed from higher level routine calling unmap_vmas
>   		 */
>   		vma->vm_flags &= ~(VM_IO | VM_RESERVED | VM_PFNMAP);
> -		vma->vm_flags &= ~VM_PFN_AT_MMAP;
>   		return -EINVAL;
>   	}
>


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [x86 PAT PATCH 2/2] mm, x86, PAT: rework linear pfn-mmap tracking
  2012-04-03  5:48         ` Konstantin Khlebnikov
@ 2012-04-03  5:55           ` Konstantin Khlebnikov
  0 siblings, 0 replies; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-03  5:55 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel,
	Andi Kleen, Pallipadi Venkatesh, Ingo Molnar, H. Peter Anvin,
	Linus Torvalds, Nick Piggin, Nick Piggin

Konstantin Khlebnikov wrote:
> Suresh Siddha wrote:
>> From: Konstantin Khlebnikov<khlebnikov@openvz.org>
>>
>> This patch replaces generic vma-flag VM_PFN_AT_MMAP with x86-only VM_PAT.
>>
>> We can toss mapping address from remap_pfn_range() into track_pfn_vma_new(),
>> and collect all PAT-related logic together in arch/x86/.
>>
>> This patch also restores the original frustration-free is_cow_mapping() check in
>> remap_pfn_range(), as it was before commit v2.6.28-rc8-88-g3c8bb73
>> ("x86: PAT: store vm_pgoff for all linear_over_vma_region mappings - v3").
>>
>> is_linear_pfn_mapping() checks can be removed from mm/huge_memory.c,
>> because they are already handled by VM_PFNMAP in the VM_NO_THP bit-mask.
>>
>> Signed-off-by: Konstantin Khlebnikov<khlebnikov@openvz.org>
>> Signed-off-by: Suresh Siddha<suresh.b.siddha@intel.com>
>> Cc: Venkatesh Pallipadi<venki@google.com>
>> Cc: H. Peter Anvin<hpa@zytor.com>
>> Cc: Nick Piggin<npiggin@suse.de>
>> Cc: Ingo Molnar<mingo@redhat.com>
>> ---
>>    arch/x86/mm/pat.c             |   16 +++++++++++-----
>>    include/asm-generic/pgtable.h |    4 ++--
>>    include/linux/mm.h            |   15 +--------------
>>    mm/huge_memory.c              |    7 +++----
>>    mm/memory.c                   |   15 ++++++++-------
>>    5 files changed, 25 insertions(+), 32 deletions(-)
>>
>> diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
>> index 617f42b..cde7e19 100644
>> --- a/arch/x86/mm/pat.c
>> +++ b/arch/x86/mm/pat.c
>> @@ -665,7 +665,7 @@ int track_pfn_vma_copy(struct vm_area_struct *vma)
>>        unsigned long vma_size = vma->vm_end - vma->vm_start;
>>        pgprot_t pgprot;
>>
>> -     if (is_linear_pfn_mapping(vma)) {
>> +     if (vma->vm_flags & VM_PAT) {
>>                /*
>>                 * reserve the whole chunk covered by vma. We need the
>>                 * starting address and protection from pte.
>> @@ -690,13 +690,19 @@ int track_pfn_vma_copy(struct vm_area_struct *vma)
>>     * single reserve_pfn_range call.
>>     */
>>    int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
>> -                     unsigned long pfn, unsigned long size)
>> +                   unsigned long addr, unsigned long pfn, unsigned long size)
>>    {
>>        unsigned long flags;
>>
>>        /* reserve the whole chunk starting from pfn */
>> -     if (is_linear_pfn_mapping(vma))
>> -             return reserve_pfn_range(pfn, size, prot, 0);
>> +     if (addr == vma->vm_start && size == (vma->vm_end - vma->vm_start)) {
>> +             int ret;
>> +
>> +             ret = reserve_pfn_range(pfn, size, prot, 0);
>> +             if (!ret)
>> +                     vma->vm_flags |= VM_PAT;
>> +             return ret;
>> +     }
>>
>>        if (!pat_enabled)
>>                return 0;
>> @@ -720,7 +726,7 @@ void untrack_pfn_vma(struct vm_area_struct *vma, unsigned long pfn,
>>        resource_size_t paddr;
>>        unsigned long prot;
>>
>> -     if (!is_linear_pfn_mapping(vma))
>> +     if (!(vma->vm_flags & VM_PAT))
>>                return;
>>
>>        /* free the chunk starting from pfn or the whole chunk */
>> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
>> index 125c54e..688a2a5 100644
>> --- a/include/asm-generic/pgtable.h
>> +++ b/include/asm-generic/pgtable.h
>> @@ -389,7 +389,7 @@ static inline void ptep_modify_prot_commit(struct mm_struct *mm,
>>     * for physical range indicated by pfn and size.
>>     */
>>    static inline int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
>> -                                     unsigned long pfn, unsigned long size)
>> +             unsigned long pfn, unsigned long addr, unsigned long size)
>>    {
>>        return 0;
>>    }
>> @@ -420,7 +420,7 @@ static inline void untrack_pfn_vma(struct vm_area_struct *vma,
>>    }
>>    #else
>>    extern int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
>> -                             unsigned long pfn, unsigned long size);
>> +             unsigned long pfn, unsigned long addr, unsigned long size);
>>    extern int track_pfn_vma_copy(struct vm_area_struct *vma);
>>    extern void untrack_pfn_vma(struct vm_area_struct *vma, unsigned long pfn,
>>                                unsigned long size);
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index d8738a4..b8e5fe5 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -117,7 +117,7 @@ extern unsigned int kobjsize(const void *objp);
>>    #define VM_CAN_NONLINEAR 0x08000000 /* Has ->fault & does nonlinear pages */
>>    #define VM_MIXEDMAP 0x10000000      /* Can contain "struct page" and pure PFN pages */
>>    #define VM_SAO              0x20000000      /* Strong Access Ordering (powerpc) */
>> -#define VM_PFN_AT_MMAP       0x40000000      /* PFNMAP vma that is fully mapped at mmap time */
>> +#define VM_PAT               0x40000000      /* PAT reserves whole VMA at once (x86) */
>>    #define VM_MERGEABLE        0x80000000      /* KSM may merge identical pages */
>>
>>    /* Bits set in the VMA until the stack is in its final location */
>> @@ -158,19 +158,6 @@ extern pgprot_t protection_map[16];
>>    #define FAULT_FLAG_RETRY_NOWAIT     0x10    /* Don't drop mmap_sem and wait when retrying */
>>    #define FAULT_FLAG_KILLABLE 0x20    /* The fault task is in SIGKILL killable region */
>>
>> -/*
>> - * This interface is used by x86 PAT code to identify a pfn mapping that is
>> - * linear over entire vma. This is to optimize PAT code that deals with
>> - * marking the physical region with a particular prot. This is not for generic
>> - * mm use. Note also that this check will not work if the pfn mapping is
>> - * linear for a vma starting at physical address 0. In which case PAT code
>> - * falls back to slow path of reserving physical range page by page.
>> - */
>> -static inline int is_linear_pfn_mapping(struct vm_area_struct *vma)
>> -{
>> -     return !!(vma->vm_flags & VM_PFN_AT_MMAP);
>> -}
>> -
>>    static inline int is_pfn_mapping(struct vm_area_struct *vma)
>>    {
>>        return !!(vma->vm_flags & VM_PFNMAP);
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index f0e5306..cf827da 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -1650,7 +1650,7 @@ int khugepaged_enter_vma_merge(struct vm_area_struct *vma)
>>         * If is_pfn_mapping() is true is_learn_pfn_mapping() must be
>>         * true too, verify it here.
>>         */
>> -     VM_BUG_ON(is_linear_pfn_mapping(vma) || vma->vm_flags & VM_NO_THP);
>> +     VM_BUG_ON(vma->vm_flags & VM_NO_THP);
>>        hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
>>        hend = vma->vm_end & HPAGE_PMD_MASK;
>>        if (hstart < hend)
>> @@ -1908,7 +1908,7 @@ static void collapse_huge_page(struct mm_struct *mm,
>>         * If is_pfn_mapping() is true is_learn_pfn_mapping() must be
>>         * true too, verify it here.
>>         */
>> -     VM_BUG_ON(is_linear_pfn_mapping(vma) || vma->vm_flags & VM_NO_THP);
>> +     VM_BUG_ON(vma->vm_flags & VM_NO_THP);
>>
>>        pgd = pgd_offset(mm, address);
>>        if (!pgd_present(*pgd))
>> @@ -2150,8 +2150,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
>>                 * If is_pfn_mapping() is true is_learn_pfn_mapping()
>>                 * must be true too, verify it here.
>>                 */
>> -             VM_BUG_ON(is_linear_pfn_mapping(vma) ||
>> -                       vma->vm_flags & VM_NO_THP);
>> +             VM_BUG_ON(vma->vm_flags & VM_NO_THP);
>>
>>                hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
>>                hend = vma->vm_end & HPAGE_PMD_MASK;
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 6105f47..e6e4dfd 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -2145,7 +2145,7 @@ int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
>>
>>        if (addr < vma->vm_start || addr >= vma->vm_end)
>>                return -EFAULT;
>> -     if (track_pfn_vma_new(vma, &pgprot, pfn, PAGE_SIZE))
>> +     if (track_pfn_vma_new(vma, &pgprot, pfn, addr, PAGE_SIZE))
>>                return -EINVAL;
>
> The old code does not use PAT for vm_insert_pfn(), but now it can use it for a
> single-page vma. And I see glitches on my notebook if the kernel does this
> (see the comment in v2 of my patch).
>
> Probably we shouldn't touch this; besides, using the PAT engine for a single page
> doesn't seem optimal: it allocates special control structures for it.

Ignore this, I didn't see the comment in the 0/2 mail.

>
>>
>>        ret = insert_pfn(vma, addr, pfn, pgprot);
>> @@ -2285,23 +2285,24 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
>>         * There's a horrible special case to handle copy-on-write
>>         * behaviour that some programs depend on. We mark the "original"
>>         * un-COW'ed pages by matching them up with "vma->vm_pgoff".
>> +      * See vm_normal_page() for details.
>>         */
>> -     if (addr == vma->vm_start && end == vma->vm_end) {
>> +
>> +     if (is_cow_mapping(vma->vm_flags)) {
>> +             if (addr != vma->vm_start || end != vma->vm_end)
>> +                     return -EINVAL;
>>                vma->vm_pgoff = pfn;
>> -             vma->vm_flags |= VM_PFN_AT_MMAP;
>> -     } else if (is_cow_mapping(vma->vm_flags))
>> -             return -EINVAL;
>> +     }
>>
>>        vma->vm_flags |= VM_IO | VM_RESERVED | VM_PFNMAP;
>>
>> -     err = track_pfn_vma_new(vma, &prot, pfn, PAGE_ALIGN(size));
>> +     err = track_pfn_vma_new(vma, &prot, pfn, addr, PAGE_ALIGN(size));
>>        if (err) {
>>                /*
>>                 * To indicate that track_pfn related cleanup is not
>>                 * needed from higher level routine calling unmap_vmas
>>                 */
>>                vma->vm_flags &= ~(VM_IO | VM_RESERVED | VM_PFNMAP);
>> -             vma->vm_flags &= ~VM_PFN_AT_MMAP;
>>                return -EINVAL;
>>        }
>>
>


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [x86 PAT PATCH 0/2] x86 PAT vm_flag code refactoring
  2012-04-03  0:46     ` [x86 PAT PATCH 0/2] x86 PAT vm_flag code refactoring Suresh Siddha
  2012-04-03  0:46       ` [x86 PAT PATCH 1/2] x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn vma routines Suresh Siddha
  2012-04-03  0:46       ` [x86 PAT PATCH 2/2] " Suresh Siddha
@ 2012-04-03  6:03       ` Konstantin Khlebnikov
  2012-04-03 23:14         ` Suresh Siddha
  2 siblings, 1 reply; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-03  6:03 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel,
	Andi Kleen, Pallipadi Venkatesh, Ingo Molnar, H. Peter Anvin,
	Linus Torvalds, Nick Piggin

Suresh Siddha wrote:
> Konstantin,
>
> On Sat, 2012-03-31 at 21:09 +0400, Konstantin Khlebnikov wrote:
>> v2: Do not use batched pfn reserving for a single-page VMA. It is not optimal
>> and breaks something: I see glitches on the screen with the i915/drm driver.
>> With this version the glitches are gone, and I see the same regions in
>> /sys/kernel/debug/x86/pat_memtype_list as before the patch. So, please review this
>> carefully; probably I'm wrong somewhere, or I have triggered some hidden bug.
>
> Actually it is not a hidden bug. In the original code, we were setting
> VM_PFN_AT_MMAP only for remap_pfn_range() but not for the vm_insert_pfn().
> Also the value of 'vm_pgoff' depends on the driver/mmap_region() in the case of
> vm_insert_pfn(). But with your proposed code, you were setting
> the VM_PAT flag for the single-page VMA as well, and ended up using the wrong vm_pgoff in
> untrack_pfn_vma().

But I do set the correct vma->vm_pgoff together with VM_PAT. Though it wouldn't work if the vma is expandable...

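A sketch of the failure mode Suresh describes, with hypothetical helper naming: reconstructing the start of the reserved range from vm_pgoff only works when the mmap path actually stored the first pfn there, as remap_pfn_range() arranges. vm_insert_pfn() users keep an ordinary file offset in vm_pgoff, so the derived address points at unrelated memory. Assuming 4 KiB pages:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/*
 * Hypothetical illustration: untrack code that derives the start of the
 * reserved physical range from vm_pgoff implicitly assumes vm_pgoff
 * holds the first pfn, not a driver-chosen file offset.
 */
static uint64_t range_start_from_pgoff(unsigned long vm_pgoff)
{
	return (uint64_t)vm_pgoff << PAGE_SHIFT;
}
```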
>
> We can simplify the track/untrack pfn routines and can remove the
> dependency on vm_pgoff completely. Am appending a patch which does this
> and also modified your x86 PAT patch based on this. Can you please
> check and if you are ok, merge these bits with the rest of your patches.

Ok, I'll check this.

>
> thanks,
> suresh
> ---
>
> Konstantin Khlebnikov (1):
>    mm, x86, PAT: rework linear pfn-mmap tracking
>
> Suresh Siddha (1):
>    x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn
>      vma routines
>
>   arch/x86/mm/pat.c             |   38 ++++++++++++++++++++++++--------------
>   include/asm-generic/pgtable.h |    4 ++--
>   include/linux/mm.h            |   15 +--------------
>   mm/huge_memory.c              |    7 +++----
>   mm/memory.c                   |   15 ++++++++-------
>   5 files changed, 38 insertions(+), 41 deletions(-)
>


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-04-03  5:10       ` Konstantin Khlebnikov
@ 2012-04-03 18:16         ` Matt Helsley
  2012-04-03 19:32           ` Cyrill Gorcunov
  0 siblings, 1 reply; 52+ messages in thread
From: Matt Helsley @ 2012-04-03 18:16 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Matt Helsley, Oleg Nesterov, linux-mm, Andrew Morton,
	linux-kernel, Eric Paris, linux-security-module, oprofile-list,
	Linus Torvalds, Al Viro, Cyrill Gorcunov

On Tue, Apr 03, 2012 at 09:10:20AM +0400, Konstantin Khlebnikov wrote:
> Matt Helsley wrote:
> >On Sat, Mar 31, 2012 at 10:13:24PM +0200, Oleg Nesterov wrote:
> >>On 03/31, Konstantin Khlebnikov wrote:
> >>>
> >>>comment from v2.6.25-6245-g925d1c4 ("procfs task exe symlink"),
> >>>where all this stuff was introduced:
> >>>
> >>>>...
> >>>>This avoids pinning the mounted filesystem.
> >>>
> >>>So, this logic is hooked into every file mmap/munmap and vma split/merge just to
> >>>prevent a hypothetical pinning of a filesystem against umounting by an mm which has
> >>>already unmapped all its executable files but is still alive. Does anyone know any
> >>>real-world example?
> >>
> >>This is the question to Matt.
> >
> >This is where I got the scenario:
> >
> >https://lkml.org/lkml/2007/7/12/398
> 
> Cyrill Gorcunov's patch "c/r: prctl: add ability to set new mm_struct::exe_file"
> gives userspace the ability to unpin the vfsmount explicitly.

Doesn't that break the semantics of the kernel ABI?

Cheers,
	-Matt Helsley



* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-04-03 18:16         ` Matt Helsley
@ 2012-04-03 19:32           ` Cyrill Gorcunov
  2012-04-05 20:29             ` Matt Helsley
  0 siblings, 1 reply; 52+ messages in thread
From: Cyrill Gorcunov @ 2012-04-03 19:32 UTC (permalink / raw)
  To: Matt Helsley
  Cc: Konstantin Khlebnikov, Oleg Nesterov, linux-mm, Andrew Morton,
	linux-kernel, Eric Paris, linux-security-module, oprofile-list,
	Linus Torvalds, Al Viro

On Tue, Apr 03, 2012 at 11:16:31AM -0700, Matt Helsley wrote:
> On Tue, Apr 03, 2012 at 09:10:20AM +0400, Konstantin Khlebnikov wrote:
> > Matt Helsley wrote:
> > >On Sat, Mar 31, 2012 at 10:13:24PM +0200, Oleg Nesterov wrote:
> > >>On 03/31, Konstantin Khlebnikov wrote:
> > >>>
> > >>>comment from v2.6.25-6245-g925d1c4 ("procfs task exe symlink"),
> > >>>where all this stuff was introduced:
> > >>>
> > >>>>...
> > >>>>This avoids pinning the mounted filesystem.
> > >>>
> > >>>So, this logic is hooked into every file mmap/munmap and vma split/merge just to
> > >>>prevent a hypothetical case: an mm that has already unmapped all of its executable
> > >>>files, but is still alive, pinning the filesystem against unmounting. Does anyone
> > >>>know any real-world example?
> > >>
> > >>This is the question to Matt.
> > >
> > >This is where I got the scenario:
> > >
> > >https://lkml.org/lkml/2007/7/12/398
> > 
> > Cyrill Gorcunov's patch "c/r: prctl: add ability to set new mm_struct::exe_file"
> > gives userspace the ability to unpin the vfsmount explicitly.
> 
> Doesn't that break the semantics of the kernel ABI?

Which one? exe_file can be changed iff there are no MAP_EXECUTABLE mappings left.
Still, once assigned (via this prctl), mm_struct::exe_file can't be changed
again until program exit.

	Cyrill


* Re: [x86 PAT PATCH 0/2] x86 PAT vm_flag code refactoring
  2012-04-03  6:03       ` [x86 PAT PATCH 0/2] x86 PAT vm_flag code refactoring Konstantin Khlebnikov
@ 2012-04-03 23:14         ` Suresh Siddha
  2012-04-04  4:40           ` Konstantin Khlebnikov
  0 siblings, 1 reply; 52+ messages in thread
From: Suresh Siddha @ 2012-04-03 23:14 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, Andrew Morton, linux-kernel, Andi Kleen,
	Pallipadi Venkatesh, Ingo Molnar, H. Peter Anvin, Linus Torvalds,
	Nick Piggin

On Tue, 2012-04-03 at 10:03 +0400, Konstantin Khlebnikov wrote:
> Suresh Siddha wrote:
> > Konstantin,
> >
> > On Sat, 2012-03-31 at 21:09 +0400, Konstantin Khlebnikov wrote:
> >> v2: Do not use batched pfn reserving for single-page VMA. This is not optimal
> >> and breaks something, because I see glitches on the screen with i915/drm driver.
> >> With this version glitches are gone, and I see the same regions in
> >> /sys/kernel/debug/x86/pat_memtype_list as before patch. So, please review this
> >> carefully, probably I'm wrong somewhere, or I have triggered some hidden bug.
> >
> > Actually it is not a hidden bug. In the original code, we were setting
> > VM_PFN_AT_MMAP only for remap_pfn_range() but not for the vm_insert_pfn().
> > Also the value of 'vm_pgoff' depends on the driver/mmap_region() in the case of
> > vm_insert_pfn(). But with your proposed code, you were setting
> > VM_PAT for single-page VMAs too, and ended up using the wrong vm_pgoff in
> > untrack_pfn_vma().
> 
> But I set the correct vma->vm_pgoff together with VM_PAT. Still, it shouldn't work if the vma is expandable...
> 

Also, I am not sure if we can override vm_pgoff in the fault handling
path. For example, looking at unmap_mapping_range_tree() it does depend
on the vm_pgoff value and it might break if we change the vm_pgoff in
track_pfn_vma_new() (which gets called from vm_insert_pfn() as part of
the i915_gem_fault()).

thanks,
suresh


* Re: [x86 PAT PATCH 1/2] x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn vma routines
  2012-04-03  5:37         ` Konstantin Khlebnikov
@ 2012-04-03 23:31           ` Suresh Siddha
  2012-04-04  4:43             ` Konstantin Khlebnikov
  2012-04-05 11:56             ` Konstantin Khlebnikov
  0 siblings, 2 replies; 52+ messages in thread
From: Suresh Siddha @ 2012-04-03 23:31 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel,
	Andi Kleen, Pallipadi Venkatesh, Ingo Molnar, H. Peter Anvin,
	Linus Torvalds, Nick Piggin

[-- Attachment #1: Type: text/plain, Size: 2266 bytes --]

On Tue, 2012-04-03 at 09:37 +0400, Konstantin Khlebnikov wrote:
> Suresh Siddha wrote:
> > 'pfn' argument for track_pfn_vma_new() can be used for reserving the attribute
> > for the pfn range. No need to depend on 'vm_pgoff'
> >
> > Similarly, untrack_pfn_vma() can depend on the 'pfn' argument if it
> > is non-zero or can use follow_phys() to get the starting value of the pfn
> > range.
> >
> > Also the non zero 'size' argument can be used instead of recomputing
> > it from vma.
> >
> > This cleanup also prepares the ground for the track/untrack pfn vma routines
> > to take over the ownership of setting PAT specific vm_flag in the 'vma'.
> >
> > Signed-off-by: Suresh Siddha<suresh.b.siddha@intel.com>
> > Cc: Venkatesh Pallipadi<venki@google.com>
> > Cc: Konstantin Khlebnikov<khlebnikov@openvz.org>
> > ---
> >   arch/x86/mm/pat.c |   30 +++++++++++++++++-------------
> >   1 files changed, 17 insertions(+), 13 deletions(-)
> >
> > diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
> > index f6ff57b..617f42b 100644
> > --- a/arch/x86/mm/pat.c
> > +++ b/arch/x86/mm/pat.c
> > @@ -693,14 +693,10 @@ int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
> >   			unsigned long pfn, unsigned long size)
> >   {
> >   	unsigned long flags;
> > -	resource_size_t paddr;
> > -	unsigned long vma_size = vma->vm_end - vma->vm_start;
> >
> > -	if (is_linear_pfn_mapping(vma)) {
> > -		/* reserve the whole chunk starting from vm_pgoff */
> > -		paddr = (resource_size_t)vma->vm_pgoff<<  PAGE_SHIFT;
> > -		return reserve_pfn_range(paddr, vma_size, prot, 0);
> > -	}
> > +	/* reserve the whole chunk starting from pfn */
> > +	if (is_linear_pfn_mapping(vma))
> > +		return reserve_pfn_range(pfn, size, prot, 0);
> 
> you mix up pfn and paddr here: the old code passes paddr as the first argument of reserve_pfn_range().

oops. That was my oversight. I updated the two patches to address this.
Also I cleared the VM_PAT flag as part of untrack_pfn_vma(), so that the
use cases (like the i915 case) which just evict the pfns (by using
unmap_mapping_range) without actually removing the vma will do the
free_pfn_range() only when it is required.

Attached (to this e-mail) are the -v2 versions of the PAT patches. I
tested these on my SNB laptop.

thanks,
suresh

[-- Attachment #2: 0001-x86-pat-remove-the-dependency-on-vm_pgoff-in-track-u.patch --]
[-- Type: text/x-patch, Size: 2890 bytes --]

From: Suresh Siddha <suresh.b.siddha@intel.com>
Subject: x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn vma routines

The 'pfn' argument of track_pfn_vma_new() can be used for reserving the attribute
for the pfn range; there is no need to depend on 'vm_pgoff'.

Similarly, untrack_pfn_vma() can depend on the 'pfn' argument if it
is non-zero or can use follow_phys() to get the starting value of the pfn
range.

Also, the non-zero 'size' argument can be used instead of recomputing
it from the vma.

This cleanup also prepares the ground for the track/untrack pfn vma routines
to take over the ownership of setting PAT specific vm_flag in the 'vma'.

-v2: fixed the first argument for reserve_pfn_range()

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Venkatesh Pallipadi <venki@google.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 arch/x86/mm/pat.c |   30 +++++++++++++++++-------------
 1 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index f6ff57b..617f42b 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -693,14 +693,10 @@ int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
 			unsigned long pfn, unsigned long size)
 {
 	unsigned long flags;
-	resource_size_t paddr;
-	unsigned long vma_size = vma->vm_end - vma->vm_start;
 
-	if (is_linear_pfn_mapping(vma)) {
-		/* reserve the whole chunk starting from vm_pgoff */
-		paddr = (resource_size_t)vma->vm_pgoff << PAGE_SHIFT;
-		return reserve_pfn_range(paddr, vma_size, prot, 0);
-	}
+	/* reserve the whole chunk starting from pfn */
+	if (is_linear_pfn_mapping(vma))
+		return reserve_pfn_range(pfn << PAGE_SHIFT, size, prot, 0);
 
 	if (!pat_enabled)
 		return 0;
@@ -716,20 +712,28 @@ int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
 /*
  * untrack_pfn_vma is called while unmapping a pfnmap for a region.
  * untrack can be called for a specific region indicated by pfn and size or
- * can be for the entire vma (in which case size can be zero).
+ * can be for the entire vma (in which case pfn, size are zero).
  */
 void untrack_pfn_vma(struct vm_area_struct *vma, unsigned long pfn,
 			unsigned long size)
 {
 	resource_size_t paddr;
-	unsigned long vma_size = vma->vm_end - vma->vm_start;
+	unsigned long prot;
 
-	if (is_linear_pfn_mapping(vma)) {
-		/* free the whole chunk starting from vm_pgoff */
-		paddr = (resource_size_t)vma->vm_pgoff << PAGE_SHIFT;
-		free_pfn_range(paddr, vma_size);
+	if (!is_linear_pfn_mapping(vma))
 		return;
+
+	/* free the chunk starting from pfn or the whole chunk */
+	paddr = (resource_size_t)pfn << PAGE_SHIFT;
+	if (!paddr && !size) {
+		if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr)) {
+			WARN_ON_ONCE(1);
+			return;
+		}
+
+		size = vma->vm_end - vma->vm_start;
 	}
+	free_pfn_range(paddr, size);
 }
 
 pgprot_t pgprot_writecombine(pgprot_t prot)

[-- Attachment #3: 0002-mm-x86-PAT-rework-linear-pfn-mmap-tracking.patch --]
[-- Type: text/x-patch, Size: 8403 bytes --]

From: Konstantin Khlebnikov <khlebnikov@openvz.org>
Subject: mm, x86, PAT: rework linear pfn-mmap tracking

This patch replaces generic vma-flag VM_PFN_AT_MMAP with x86-only VM_PAT.

We can toss mapping address from remap_pfn_range() into track_pfn_vma_new(),
and collect all PAT-related logic together in arch/x86/.

This patch also restores the original frustration-free is_cow_mapping() check in
remap_pfn_range(), as it was before commit v2.6.28-rc8-88-g3c8bb73
("x86: PAT: store vm_pgoff for all linear_over_vma_region mappings - v3").

The is_linear_pfn_mapping() checks can be removed from mm/huge_memory.c,
because they are already handled by the VM_PFNMAP bit in the VM_NO_THP mask.

-v2: Reset the VM_PAT flag as part of untrack_pfn_vma()

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Venkatesh Pallipadi <venki@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Ingo Molnar <mingo@redhat.com>
---
 arch/x86/mm/pat.c             |   17 ++++++++++++-----
 include/asm-generic/pgtable.h |    4 ++--
 include/linux/mm.h            |   15 +--------------
 mm/huge_memory.c              |    7 +++----
 mm/memory.c                   |   15 ++++++++-------
 5 files changed, 26 insertions(+), 32 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 24c3f95..516404c 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -665,7 +665,7 @@ int track_pfn_vma_copy(struct vm_area_struct *vma)
 	unsigned long vma_size = vma->vm_end - vma->vm_start;
 	pgprot_t pgprot;
 
-	if (is_linear_pfn_mapping(vma)) {
+	if (vma->vm_flags & VM_PAT) {
 		/*
 		 * reserve the whole chunk covered by vma. We need the
 		 * starting address and protection from pte.
@@ -690,13 +690,19 @@ int track_pfn_vma_copy(struct vm_area_struct *vma)
  * single reserve_pfn_range call.
  */
 int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
-			unsigned long pfn, unsigned long size)
+		      unsigned long pfn, unsigned long addr, unsigned long size)
 {
 	unsigned long flags;
 
 	/* reserve the whole chunk starting from pfn */
-	if (is_linear_pfn_mapping(vma))
-		return reserve_pfn_range(pfn << PAGE_SHIFT, size, prot, 0);
+	if (addr == vma->vm_start && size == (vma->vm_end - vma->vm_start)) {
+		int ret;
+
+		ret = reserve_pfn_range(pfn << PAGE_SHIFT, size, prot, 0);
+		if (!ret)
+			vma->vm_flags |= VM_PAT;
+		return ret;
+	}
 
 	if (!pat_enabled)
 		return 0;
@@ -720,7 +726,7 @@ void untrack_pfn_vma(struct vm_area_struct *vma, unsigned long pfn,
 	resource_size_t paddr;
 	unsigned long prot;
 
-	if (!is_linear_pfn_mapping(vma))
+	if (!(vma->vm_flags & VM_PAT))
 		return;
 
 	/* free the chunk starting from pfn or the whole chunk */
@@ -734,6 +740,7 @@ void untrack_pfn_vma(struct vm_area_struct *vma, unsigned long pfn,
 		size = vma->vm_end - vma->vm_start;
 	}
 	free_pfn_range(paddr, size);
+	vma->vm_flags &= ~VM_PAT;
 }
 
 pgprot_t pgprot_writecombine(pgprot_t prot)
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 125c54e..688a2a5 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -389,7 +389,7 @@ static inline void ptep_modify_prot_commit(struct mm_struct *mm,
  * for physical range indicated by pfn and size.
  */
 static inline int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
-					unsigned long pfn, unsigned long size)
+		unsigned long pfn, unsigned long addr, unsigned long size)
 {
 	return 0;
 }
@@ -420,7 +420,7 @@ static inline void untrack_pfn_vma(struct vm_area_struct *vma,
 }
 #else
 extern int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
-				unsigned long pfn, unsigned long size);
+		unsigned long pfn, unsigned long addr, unsigned long size);
 extern int track_pfn_vma_copy(struct vm_area_struct *vma);
 extern void untrack_pfn_vma(struct vm_area_struct *vma, unsigned long pfn,
 				unsigned long size);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index d8738a4..b8e5fe5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -117,7 +117,7 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_CAN_NONLINEAR 0x08000000	/* Has ->fault & does nonlinear pages */
 #define VM_MIXEDMAP	0x10000000	/* Can contain "struct page" and pure PFN pages */
 #define VM_SAO		0x20000000	/* Strong Access Ordering (powerpc) */
-#define VM_PFN_AT_MMAP	0x40000000	/* PFNMAP vma that is fully mapped at mmap time */
+#define VM_PAT		0x40000000	/* PAT reserves whole VMA at once (x86) */
 #define VM_MERGEABLE	0x80000000	/* KSM may merge identical pages */
 
 /* Bits set in the VMA until the stack is in its final location */
@@ -158,19 +158,6 @@ extern pgprot_t protection_map[16];
 #define FAULT_FLAG_RETRY_NOWAIT	0x10	/* Don't drop mmap_sem and wait when retrying */
 #define FAULT_FLAG_KILLABLE	0x20	/* The fault task is in SIGKILL killable region */
 
-/*
- * This interface is used by x86 PAT code to identify a pfn mapping that is
- * linear over entire vma. This is to optimize PAT code that deals with
- * marking the physical region with a particular prot. This is not for generic
- * mm use. Note also that this check will not work if the pfn mapping is
- * linear for a vma starting at physical address 0. In which case PAT code
- * falls back to slow path of reserving physical range page by page.
- */
-static inline int is_linear_pfn_mapping(struct vm_area_struct *vma)
-{
-	return !!(vma->vm_flags & VM_PFN_AT_MMAP);
-}
-
 static inline int is_pfn_mapping(struct vm_area_struct *vma)
 {
 	return !!(vma->vm_flags & VM_PFNMAP);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f0e5306..cf827da 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1650,7 +1650,7 @@ int khugepaged_enter_vma_merge(struct vm_area_struct *vma)
 	 * If is_pfn_mapping() is true is_learn_pfn_mapping() must be
 	 * true too, verify it here.
 	 */
-	VM_BUG_ON(is_linear_pfn_mapping(vma) || vma->vm_flags & VM_NO_THP);
+	VM_BUG_ON(vma->vm_flags & VM_NO_THP);
 	hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
 	hend = vma->vm_end & HPAGE_PMD_MASK;
 	if (hstart < hend)
@@ -1908,7 +1908,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 	 * If is_pfn_mapping() is true is_learn_pfn_mapping() must be
 	 * true too, verify it here.
 	 */
-	VM_BUG_ON(is_linear_pfn_mapping(vma) || vma->vm_flags & VM_NO_THP);
+	VM_BUG_ON(vma->vm_flags & VM_NO_THP);
 
 	pgd = pgd_offset(mm, address);
 	if (!pgd_present(*pgd))
@@ -2150,8 +2150,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
 		 * If is_pfn_mapping() is true is_learn_pfn_mapping()
 		 * must be true too, verify it here.
 		 */
-		VM_BUG_ON(is_linear_pfn_mapping(vma) ||
-			  vma->vm_flags & VM_NO_THP);
+		VM_BUG_ON(vma->vm_flags & VM_NO_THP);
 
 		hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
 		hend = vma->vm_end & HPAGE_PMD_MASK;
diff --git a/mm/memory.c b/mm/memory.c
index 6105f47..e6e4dfd 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2145,7 +2145,7 @@ int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 
 	if (addr < vma->vm_start || addr >= vma->vm_end)
 		return -EFAULT;
-	if (track_pfn_vma_new(vma, &pgprot, pfn, PAGE_SIZE))
+	if (track_pfn_vma_new(vma, &pgprot, pfn, addr, PAGE_SIZE))
 		return -EINVAL;
 
 	ret = insert_pfn(vma, addr, pfn, pgprot);
@@ -2285,23 +2285,24 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
 	 * There's a horrible special case to handle copy-on-write
 	 * behaviour that some programs depend on. We mark the "original"
 	 * un-COW'ed pages by matching them up with "vma->vm_pgoff".
+	 * See vm_normal_page() for details.
 	 */
-	if (addr == vma->vm_start && end == vma->vm_end) {
+
+	if (is_cow_mapping(vma->vm_flags)) {
+		if (addr != vma->vm_start || end != vma->vm_end)
+			return -EINVAL;
 		vma->vm_pgoff = pfn;
-		vma->vm_flags |= VM_PFN_AT_MMAP;
-	} else if (is_cow_mapping(vma->vm_flags))
-		return -EINVAL;
+	}
 
 	vma->vm_flags |= VM_IO | VM_RESERVED | VM_PFNMAP;
 
-	err = track_pfn_vma_new(vma, &prot, pfn, PAGE_ALIGN(size));
+	err = track_pfn_vma_new(vma, &prot, pfn, addr, PAGE_ALIGN(size));
 	if (err) {
 		/*
 		 * To indicate that track_pfn related cleanup is not
 		 * needed from higher level routine calling unmap_vmas
 		 */
 		vma->vm_flags &= ~(VM_IO | VM_RESERVED | VM_PFNMAP);
-		vma->vm_flags &= ~VM_PFN_AT_MMAP;
 		return -EINVAL;
 	}
 


* Re: [x86 PAT PATCH 0/2] x86 PAT vm_flag code refactoring
  2012-04-03 23:14         ` Suresh Siddha
@ 2012-04-04  4:40           ` Konstantin Khlebnikov
  0 siblings, 0 replies; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-04  4:40 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: linux-mm, Andrew Morton, linux-kernel, Andi Kleen,
	Pallipadi Venkatesh, Ingo Molnar, H. Peter Anvin, Linus Torvalds,
	Nick Piggin

Suresh Siddha wrote:
> On Tue, 2012-04-03 at 10:03 +0400, Konstantin Khlebnikov wrote:
>> Suresh Siddha wrote:
>>> Konstantin,
>>>
>>> On Sat, 2012-03-31 at 21:09 +0400, Konstantin Khlebnikov wrote:
>>>> v2: Do not use batched pfn reserving for single-page VMA. This is not optimal
>>>> and breaks something, because I see glitches on the screen with i915/drm driver.
>>>> With this version glitches are gone, and I see the same regions in
>>>> /sys/kernel/debug/x86/pat_memtype_list as before patch. So, please review this
>>>> carefully, probably I'm wrong somewhere, or I have triggered some hidden bug.
>>>
>>> Actually it is not a hidden bug. In the original code, we were setting
>>> VM_PFN_AT_MMAP only for remap_pfn_range() but not for the vm_insert_pfn().
>>> Also the value of 'vm_pgoff' depends on the driver/mmap_region() in the case of
>>> vm_insert_pfn(). But with your proposed code, you were setting
>>> VM_PAT for single-page VMAs too, and ended up using the wrong vm_pgoff in
>>> untrack_pfn_vma().
>>
>> But I set the correct vma->vm_pgoff together with VM_PAT. Still, it shouldn't work if the vma is expandable...
>>
>
> Also, I am not sure if we can override vm_pgoff in the fault handling
> path. For example, looking at unmap_mapping_range_tree() it does depend
> on the vm_pgoff value and it might break if we change the vm_pgoff in
> track_pfn_vma_new() (which gets called from vm_insert_pfn() as part of
> the i915_gem_fault()).

Yes, and we shouldn't change the vma under the mm->mmap_sem read-lock.

>
> thanks,
> suresh
>
>
>



* Re: [x86 PAT PATCH 1/2] x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn vma routines
  2012-04-03 23:31           ` Suresh Siddha
@ 2012-04-04  4:43             ` Konstantin Khlebnikov
  2012-04-05 11:56             ` Konstantin Khlebnikov
  1 sibling, 0 replies; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-04  4:43 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel,
	Andi Kleen, Pallipadi Venkatesh, Ingo Molnar, H. Peter Anvin,
	Linus Torvalds, Nick Piggin

Suresh Siddha wrote:
> On Tue, 2012-04-03 at 09:37 +0400, Konstantin Khlebnikov wrote:
>> Suresh Siddha wrote:
>>> 'pfn' argument for track_pfn_vma_new() can be used for reserving the attribute
>>> for the pfn range. No need to depend on 'vm_pgoff'
>>>
>>> Similarly, untrack_pfn_vma() can depend on the 'pfn' argument if it
>>> is non-zero or can use follow_phys() to get the starting value of the pfn
>>> range.
>>>
>>> Also the non zero 'size' argument can be used instead of recomputing
>>> it from vma.
>>>
>>> This cleanup also prepares the ground for the track/untrack pfn vma routines
>>> to take over the ownership of setting PAT specific vm_flag in the 'vma'.
>>>
>>> Signed-off-by: Suresh Siddha<suresh.b.siddha@intel.com>
>>> Cc: Venkatesh Pallipadi<venki@google.com>
>>> Cc: Konstantin Khlebnikov<khlebnikov@openvz.org>
>>> ---
>>>    arch/x86/mm/pat.c |   30 +++++++++++++++++-------------
>>>    1 files changed, 17 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
>>> index f6ff57b..617f42b 100644
>>> --- a/arch/x86/mm/pat.c
>>> +++ b/arch/x86/mm/pat.c
>>> @@ -693,14 +693,10 @@ int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
>>>    			unsigned long pfn, unsigned long size)
>>>    {
>>>    	unsigned long flags;
>>> -	resource_size_t paddr;
>>> -	unsigned long vma_size = vma->vm_end - vma->vm_start;
>>>
>>> -	if (is_linear_pfn_mapping(vma)) {
>>> -		/* reserve the whole chunk starting from vm_pgoff */
>>> -		paddr = (resource_size_t)vma->vm_pgoff<<   PAGE_SHIFT;
>>> -		return reserve_pfn_range(paddr, vma_size, prot, 0);
>>> -	}
>>> +	/* reserve the whole chunk starting from pfn */
>>> +	if (is_linear_pfn_mapping(vma))
>>> +		return reserve_pfn_range(pfn, size, prot, 0);
>>
>> you mix up pfn and paddr here: the old code passes paddr as the first argument of reserve_pfn_range().
>
> oops. That was my oversight. I updated the two patches to address this.
> Also I cleared VM_PAT flag as part of the untrack_pfn_vma(), so that the
> use cases (like the i915 case) which just evict the pfn's (by using
> unmap_mapping_range) with out actually removing the vma will do the
> free_pfn_range() only when it is required.
>
> Attached (to this e-mail) are the -v2 versions of the PAT patches. I
> tested these on my SNB laptop.

Ok, I'll send them as part of updated patchset.

>
> thanks,
> suresh



* Re: [x86 PAT PATCH 1/2] x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn vma routines
  2012-04-03 23:31           ` Suresh Siddha
  2012-04-04  4:43             ` Konstantin Khlebnikov
@ 2012-04-05 11:56             ` Konstantin Khlebnikov
  2012-04-06  0:01               ` [v3 VM_PAT PATCH 0/3] x86 VM_PAT series Suresh Siddha
  1 sibling, 1 reply; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-05 11:56 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel,
	Andi Kleen, Pallipadi Venkatesh, Ingo Molnar, H. Peter Anvin,
	Linus Torvalds, Nick Piggin

Suresh Siddha wrote:
> On Tue, 2012-04-03 at 09:37 +0400, Konstantin Khlebnikov wrote:
>> Suresh Siddha wrote:
>>> 'pfn' argument for track_pfn_vma_new() can be used for reserving the attribute
>>> for the pfn range. No need to depend on 'vm_pgoff'
>>>
>>> Similarly, untrack_pfn_vma() can depend on the 'pfn' argument if it
>>> is non-zero or can use follow_phys() to get the starting value of the pfn
>>> range.
>>>
>>> Also the non zero 'size' argument can be used instead of recomputing
>>> it from vma.
>>>
>>> This cleanup also prepares the ground for the track/untrack pfn vma routines
>>> to take over the ownership of setting PAT specific vm_flag in the 'vma'.
>>>
>>> Signed-off-by: Suresh Siddha<suresh.b.siddha@intel.com>
>>> Cc: Venkatesh Pallipadi<venki@google.com>
>>> Cc: Konstantin Khlebnikov<khlebnikov@openvz.org>
>>> ---
>>>    arch/x86/mm/pat.c |   30 +++++++++++++++++-------------
>>>    1 files changed, 17 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
>>> index f6ff57b..617f42b 100644
>>> --- a/arch/x86/mm/pat.c
>>> +++ b/arch/x86/mm/pat.c
>>> @@ -693,14 +693,10 @@ int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
>>>    			unsigned long pfn, unsigned long size)
>>>    {
>>>    	unsigned long flags;
>>> -	resource_size_t paddr;
>>> -	unsigned long vma_size = vma->vm_end - vma->vm_start;
>>>
>>> -	if (is_linear_pfn_mapping(vma)) {
>>> -		/* reserve the whole chunk starting from vm_pgoff */
>>> -		paddr = (resource_size_t)vma->vm_pgoff<<   PAGE_SHIFT;
>>> -		return reserve_pfn_range(paddr, vma_size, prot, 0);
>>> -	}
>>> +	/* reserve the whole chunk starting from pfn */
>>> +	if (is_linear_pfn_mapping(vma))
>>> +		return reserve_pfn_range(pfn, size, prot, 0);
>>
>> you mix up pfn and paddr here: the old code passes paddr as the first argument of reserve_pfn_range().
>
> oops. That was my oversight. I updated the two patches to address this.
> Also I cleared VM_PAT flag as part of the untrack_pfn_vma(), so that the
> use cases (like the i915 case) which just evict the pfn's (by using
> unmap_mapping_range) with out actually removing the vma will do the
> free_pfn_range() only when it is required.
>
> Attached (to this e-mail) are the -v2 versions of the PAT patches. I
> tested these on my SNB laptop.

With these patches I see new ranges in /sys/kernel/debug/x86/pat_memtype_list.
These are 4k single-page vmas mapped by X11; the kernel fills them via vm_insert_pfn().
Is this ok? Maybe we shouldn't use PAT tracking for such small VMAs?

before patch
# wc ~/pat_memtype_list
52  156 1936 /root/pat_memtype_list

after patch
# wc /sys/kernel/debug/x86/pat_memtype_list
257 771 10136 /sys/kernel/debug/x86/pat_memtype_list

# diff -u  ~/pat_memtype_list /sys/kernel/debug/x86/pat_memtype_list
--- /root/pat_memtype_list	2012-03-31 14:22:30.439956357 +0400
+++ /sys/kernel/debug/x86/pat_memtype_list	2012-04-05 19:43:28.380983643 +0400
@@ -27,6 +27,201 @@
  write-combining @ 0xe0023000-0xe0043000
  write-combining @ 0xe0044000-0xe0064000
  write-combining @ 0xe0064000-0xe046c000
+write-combining @ 0xe0970000-0xe0971000
+write-combining @ 0xe097e000-0xe097f000
+write-combining @ 0xe0982000-0xe0983000
+write-combining @ 0xe0983000-0xe0984000
+write-combining @ 0xe0984000-0xe0985000
+write-combining @ 0xe0985000-0xe0986000
+write-combining @ 0xe0986000-0xe0987000
+write-combining @ 0xe0987000-0xe0988000
+write-combining @ 0xe0988000-0xe0989000
+write-combining @ 0xe0989000-0xe098a000
+write-combining @ 0xe098a000-0xe098b000
+write-combining @ 0xe098b000-0xe098c000
+write-combining @ 0xe098c000-0xe098d000
+write-combining @ 0xe098d000-0xe098e000
+write-combining @ 0xe098e000-0xe098f000
+write-combining @ 0xe098f000-0xe0990000
+write-combining @ 0xe0990000-0xe0991000
+write-combining @ 0xe0991000-0xe0992000
+write-combining @ 0xe0992000-0xe0993000
+write-combining @ 0xe0993000-0xe0994000
+write-combining @ 0xe0994000-0xe0995000
+write-combining @ 0xe0995000-0xe0996000
+write-combining @ 0xe0996000-0xe0997000
+write-combining @ 0xe0997000-0xe0998000
+write-combining @ 0xe0998000-0xe0999000
+write-combining @ 0xe0999000-0xe099a000
+write-combining @ 0xe099a000-0xe099b000
+write-combining @ 0xe099b000-0xe099c000
+write-combining @ 0xe099c000-0xe099d000
+write-combining @ 0xe099d000-0xe099e000
+write-combining @ 0xe099e000-0xe099f000
+write-combining @ 0xe099f000-0xe09a0000
+write-combining @ 0xe09a0000-0xe09a1000
+write-combining @ 0xe09a1000-0xe09a2000
+write-combining @ 0xe09a2000-0xe09a3000
+write-combining @ 0xe09a3000-0xe09a4000
+write-combining @ 0xe09a4000-0xe09a5000
+write-combining @ 0xe09a5000-0xe09a6000
+write-combining @ 0xe09a6000-0xe09a7000
+write-combining @ 0xe09a7000-0xe09a8000
+write-combining @ 0xe138a000-0xe138b000
+write-combining @ 0xe13f3000-0xe13f4000
+write-combining @ 0xe17f4000-0xe17f5000
+write-combining @ 0xe1804000-0xe1805000
+write-combining @ 0xe1805000-0xe1806000
+write-combining @ 0xe1806000-0xe1807000
+write-combining @ 0xe1807000-0xe1808000
+write-combining @ 0xe1808000-0xe1809000
+write-combining @ 0xe1809000-0xe180a000
+write-combining @ 0xe180c000-0xe180d000
+write-combining @ 0xe180d000-0xe180e000
+write-combining @ 0xe180f000-0xe1810000
+write-combining @ 0xe181a000-0xe181b000
+write-combining @ 0xe181b000-0xe181c000
+write-combining @ 0xe181c000-0xe181d000
+write-combining @ 0xe1d51000-0xe1d52000
+write-combining @ 0xe1d52000-0xe1d53000
+write-combining @ 0xe1d53000-0xe1d54000
+write-combining @ 0xe1d54000-0xe1d55000
+write-combining @ 0xe1d86000-0xe1d87000
+write-combining @ 0xe1d88000-0xe1d89000
+write-combining @ 0xe1d89000-0xe1d8a000
+write-combining @ 0xe1d8b000-0xe1d8c000
+write-combining @ 0xe1d8c000-0xe1d8d000
+write-combining @ 0xe1d8e000-0xe1d8f000
+write-combining @ 0xe1d8f000-0xe1d90000
+write-combining @ 0xe1dc0000-0xe1dc1000
+write-combining @ 0xe1dc1000-0xe1dc2000
+write-combining @ 0xe1dc2000-0xe1dc3000
+write-combining @ 0xe1dc4000-0xe1dc5000
+write-combining @ 0xe1dc5000-0xe1dc6000
+write-combining @ 0xe1dc7000-0xe1dc8000
+write-combining @ 0xe1dc8000-0xe1dc9000
+write-combining @ 0xe1e11000-0xe1e12000
+write-combining @ 0xe1e87000-0xe1e88000
+write-combining @ 0xe1e88000-0xe1e89000
+write-combining @ 0xe1e89000-0xe1e8a000
+write-combining @ 0xe1e8a000-0xe1e8b000
+write-combining @ 0xe1f3b000-0xe1f3c000
+write-combining @ 0xe20a8000-0xe20a9000
+write-combining @ 0xe2158000-0xe2159000
+write-combining @ 0xe2159000-0xe215a000
+write-combining @ 0xe215a000-0xe215b000
+write-combining @ 0xe2204000-0xe2205000
+write-combining @ 0xe2314000-0xe2315000
+write-combining @ 0xe2315000-0xe2316000
+write-combining @ 0xe2317000-0xe2318000
+write-combining @ 0xe2318000-0xe2319000
+write-combining @ 0xe2319000-0xe231a000
+write-combining @ 0xe233a000-0xe233b000
+write-combining @ 0xe233c000-0xe233d000
+write-combining @ 0xe233d000-0xe233e000
+write-combining @ 0xe2355000-0xe2356000
+write-combining @ 0xe2357000-0xe2358000
+write-combining @ 0xe2358000-0xe2359000
+write-combining @ 0xe235e000-0xe235f000
+write-combining @ 0xe2361000-0xe2362000
+write-combining @ 0xe2362000-0xe2363000
+write-combining @ 0xe2363000-0xe2364000
+write-combining @ 0xe2366000-0xe2367000
+write-combining @ 0xe2367000-0xe2368000
+write-combining @ 0xe2368000-0xe2369000
+write-combining @ 0xe2369000-0xe236a000
+write-combining @ 0xe236f000-0xe2370000
+write-combining @ 0xe2371000-0xe2372000
+write-combining @ 0xe237d000-0xe237e000
+write-combining @ 0xe2382000-0xe2383000
+write-combining @ 0xe2383000-0xe2384000
+write-combining @ 0xe2386000-0xe2387000
+write-combining @ 0xe2387000-0xe2388000
+write-combining @ 0xe2389000-0xe238a000
+write-combining @ 0xe23c7000-0xe23c8000
+write-combining @ 0xe23ca000-0xe23cb000
+write-combining @ 0xe23cb000-0xe23cc000
+write-combining @ 0xe23cc000-0xe23cd000
+write-combining @ 0xe23cd000-0xe23ce000
+write-combining @ 0xe23ce000-0xe23cf000
+write-combining @ 0xe23d3000-0xe23d4000
+write-combining @ 0xe23d4000-0xe23d5000
+write-combining @ 0xe23d5000-0xe23d6000
+write-combining @ 0xe2453000-0xe2454000
+write-combining @ 0xe2a5e000-0xe2a5f000
+write-combining @ 0xe2a5f000-0xe2a60000
+write-combining @ 0xe2a60000-0xe2a61000
+write-combining @ 0xe2a61000-0xe2a62000
+write-combining @ 0xe2a62000-0xe2a63000
+write-combining @ 0xe2a63000-0xe2a64000
+write-combining @ 0xe2a64000-0xe2a65000
+write-combining @ 0xe2a65000-0xe2a66000
+write-combining @ 0xe2a66000-0xe2a67000
+write-combining @ 0xe2a67000-0xe2a68000
+write-combining @ 0xe2a68000-0xe2a69000
+write-combining @ 0xe2a69000-0xe2a6a000
+write-combining @ 0xe2a6a000-0xe2a6b000
+write-combining @ 0xe2a6b000-0xe2a6c000
+write-combining @ 0xe2a6c000-0xe2a6d000
+write-combining @ 0xe2a6d000-0xe2a6e000
+write-combining @ 0xe2a74000-0xe2a75000
+write-combining @ 0xe2a75000-0xe2a76000
+write-combining @ 0xe2a7b000-0xe2a7c000
+write-combining @ 0xe2a81000-0xe2a82000
+write-combining @ 0xe2a82000-0xe2a83000
+write-combining @ 0xe2a83000-0xe2a84000
+write-combining @ 0xe2a84000-0xe2a85000
+write-combining @ 0xe2a85000-0xe2a86000
+write-combining @ 0xe2af6000-0xe2af7000
+write-combining @ 0xe2af7000-0xe2af8000
+write-combining @ 0xe2af8000-0xe2af9000
+write-combining @ 0xe2b27000-0xe2b28000
+write-combining @ 0xe2bd4000-0xe2bd5000
+write-combining @ 0xe2bd5000-0xe2bd6000
+write-combining @ 0xe2bd6000-0xe2bd7000
+write-combining @ 0xe2bd7000-0xe2bd8000
+write-combining @ 0xe2bd8000-0xe2bd9000
+write-combining @ 0xe2bd9000-0xe2bda000
+write-combining @ 0xe2f19000-0xe2f1a000
+write-combining @ 0xe372c000-0xe372d000
+write-combining @ 0xe372d000-0xe372e000
+write-combining @ 0xe372e000-0xe372f000
+write-combining @ 0xe384a000-0xe384b000
+write-combining @ 0xe384b000-0xe384c000
+write-combining @ 0xe384d000-0xe384e000
+write-combining @ 0xe384e000-0xe384f000
+write-combining @ 0xe384f000-0xe3850000
+write-combining @ 0xe3851000-0xe3852000
+write-combining @ 0xe3852000-0xe3853000
+write-combining @ 0xe3853000-0xe3854000
+write-combining @ 0xe3854000-0xe3855000
+write-combining @ 0xe3855000-0xe3856000
+write-combining @ 0xe3856000-0xe3857000
+write-combining @ 0xe385e000-0xe385f000
+write-combining @ 0xe385f000-0xe3860000
+write-combining @ 0xe3860000-0xe3861000
+write-combining @ 0xe3861000-0xe3862000
+write-combining @ 0xe3862000-0xe3863000
+write-combining @ 0xe3863000-0xe3864000
+write-combining @ 0xe39e8000-0xe39e9000
+write-combining @ 0xe39e9000-0xe39ea000
+write-combining @ 0xe39ed000-0xe39ee000
+write-combining @ 0xe39ee000-0xe39ef000
+write-combining @ 0xe39ef000-0xe39f0000
+write-combining @ 0xe39f1000-0xe39f2000
+write-combining @ 0xe39f3000-0xe39f4000
+write-combining @ 0xe3bf4000-0xe3bf5000
+write-combining @ 0xe4040000-0xe4041000
+write-combining @ 0xe4381000-0xe4382000
+write-combining @ 0xe4382000-0xe4383000
+write-combining @ 0xe4383000-0xe4384000
+write-combining @ 0xe4e91000-0xe4e92000
+write-combining @ 0xe4e94000-0xe4e95000
+write-combining @ 0xe52db000-0xe52dc000
+write-combining @ 0xe555e000-0xe555f000
+write-combining @ 0xe57df000-0xe57e0000
+write-combining @ 0xe57e0000-0xe57e1000
+write-combining @ 0xe57e1000-0xe57e2000
  uncached-minus @ 0xf0000000-0xf0400000
  uncached-minus @ 0xf0000000-0xf0080000
  uncached-minus @ 0xf0200000-0xf0400000
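The `+` prefix on the write-combining lines above suggests they come from a diff of the PAT memtype list (e.g. `/sys/kernel/debug/x86/pat_memtype_list`) taken before and after the test. A quick way to tally how many ranges leaked, and how many 4 KiB pages they cover, is a small POSIX-shell loop; this is a sketch over two sample lines copied from the dump, not part of the original report:

```shell
#!/bin/sh
# Tally leaked write-combining PAT ranges from a pat_memtype_list diff.
# Each matching line looks like: "+write-combining @ 0xe1d8c000-0xe1d8d000".
# Sample input is inlined; in practice, feed the captured diff instead.
total_entries=0
total_pages=0
while read -r line; do
    case $line in
    '+write-combining @ '*)
        range=${line##* }                # "0xe1d8c000-0xe1d8d000"
        start=${range%-*}                # "0xe1d8c000"
        end=${range#*-}                  # "0xe1d8d000"
        total_entries=$((total_entries + 1))
        total_pages=$((total_pages + (end - start) / 4096))
        ;;
    esac
done <<'EOF'
+write-combining @ 0xe1d8c000-0xe1d8d000
+write-combining @ 0xe1dc0000-0xe1dc2000
EOF
printf '%d entries, %d pages\n' "$total_entries" "$total_pages"
```

Shell arithmetic accepts the `0x` hex constants directly, so no external tool is needed to do the subtraction.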

# pmap $(pidof X)
4539:   /usr/bin/X :0 -auth /var/run/lightdm/root/:0 -nolisten tcp vt7 -novtswitch
00007f8c059ee000      4K rw-s-  /dev/dri/card0
00007f8c059ef000      4K rw-s-  /dev/dri/card0
00007f8c059f0000      4K rw-s-  /dev/dri/card0
00007f8c059f5000      4K rw-s-  /dev/dri/card0
00007f8c059f6000      4K rw-s-  /dev/dri/card0
00007f8c059f7000      4K rw-s-  /dev/dri/card0
00007f8c059f8000      4K rw-s-  /dev/dri/card0
00007f8c059f9000      4K rw-s-  /dev/dri/card0
00007f8c059fc000      4K rw-s-  /dev/dri/card0
00007f8c059ff000      4K rw-s-  /dev/dri/card0
00007f8c05a00000    384K rw-s-    [ shmid=0xd000b ]
00007f8c05a60000    384K rw-s-    [ shmid=0xc800a ]
00007f8c05afb000      4K rw-s-  /dev/dri/card0
00007f8c05afd000      4K rw-s-  /dev/dri/card0
00007f8c05afe000      4K rw-s-  /dev/dri/card0
00007f8c05b01000      4K rw-s-  /dev/dri/card0
00007f8c05b02000      4K rw-s-  /dev/dri/card0
00007f8c05b07000      4K rw-s-  /dev/dri/card0
00007f8c05b13000      4K rw-s-  /dev/dri/card0
00007f8c05b15000      4K rw-s-  /dev/dri/card0
00007f8c05b1b000      4K rw-s-  /dev/dri/card0
00007f8c05b1c000      4K rw-s-  /dev/dri/card0
00007f8c05b1d000      4K rw-s-  /dev/dri/card0
00007f8c05b1e000      4K rw-s-  /dev/dri/card0
00007f8c05b21000      4K rw-s-  /dev/dri/card0
00007f8c05b22000      4K rw-s-  /dev/dri/card0
00007f8c05b23000      4K rw-s-  /dev/dri/card0
00007f8c05b26000      4K rw-s-  /dev/dri/card0
00007f8c05b2c000      4K rw-s-  /dev/dri/card0
00007f8c05b2d000      4K rw-s-  /dev/dri/card0
00007f8c05b2e000      4K rw-s-  /dev/dri/card0
00007f8c05b2f000      4K rw-s-  /dev/dri/card0
00007f8c05b30000      4K rw-s-  /dev/dri/card0
00007f8c05b31000      4K rw-s-  /dev/dri/card0
00007f8c05b32000      4K rw-s-  /dev/dri/card0
00007f8c05b33000      4K rw-s-  /dev/dri/card0
00007f8c05b34000      4K rw-s-  /dev/dri/card0
00007f8c05b35000      4K rw-s-  /dev/dri/card0
00007f8c05de2000    384K rw-s-    [ shmid=0xe000f ]
00007f8c05e5e000     64K rw-s-  /dev/dri/card0
00007f8c05e6e000     16K rw-s-  /dev/dri/card0
00007f8c05e72000     28K rw-s-  /dev/dri/card0
00007f8c05e7e000      8K rw-s-  /dev/dri/card0
00007f8c05e80000      8K rw-s-  /dev/dri/card0
00007f8c05e82000      8K rw-s-  /dev/dri/card0
00007f8c05ebd000      8K rw-s-  /dev/dri/card0
00007f8c05ee2000     16K rw-s-  /dev/dri/card0
00007f8c05ee6000     28K rw-s-  /dev/dri/card0
00007f8c05f0c000    256K rw-s-  /dev/dri/card0
00007f8c05f4c000    384K rw-s-    [ shmid=0xd800c ]
00007f8c05fac000    284K rw---    [ anon ]
00007f8c05ff6000     24K rw-s-  /dev/dri/card0
00007f8c05ffc000     24K rw-s-  /dev/dri/card0
00007f8c06002000     24K rw-s-  /dev/dri/card0
00007f8c06008000     24K rw-s-  /dev/dri/card0
00007f8c06032000     16K rw-s-  /dev/dri/card0
00007f8c06036000    384K rw-s-    [ shmid=0xc0009 ]
00007f8c06096000    160K rw-s-  /dev/dri/card0
00007f8c060be000    384K rw-s-    [ shmid=0xb8008 ]
00007f8c0611e000      4K rw-s-  /dev/dri/card0
00007f8c0611f000      4K rw-s-  /dev/dri/card0
00007f8c06120000      4K rw-s-  /dev/dri/card0
00007f8c06121000      4K rw-s-  /dev/dri/card0
00007f8c06122000      4K rw-s-  /dev/dri/card0
00007f8c06123000      4K rw-s-  /dev/dri/card0
00007f8c06124000      4K rw-s-  /dev/dri/card0
00007f8c06125000      4K rw-s-  /dev/dri/card0
00007f8c06126000      4K rw-s-  /dev/dri/card0
00007f8c06127000      4K rw-s-  /dev/dri/card0
00007f8c06128000      4K rw-s-  /dev/dri/card0
00007f8c06129000      4K rw-s-  /dev/dri/card0
00007f8c0612a000      4K rw-s-  /dev/dri/card0
00007f8c0612b000      4K rw-s-  /dev/dri/card0
00007f8c0612c000   5120K rw-s-  /dev/dri/card0
00007f8c0662c000     44K r-x--  /lib/x86_64-linux-gnu/libnss_files-2.13.so
00007f8c06637000   2044K -----  /lib/x86_64-linux-gnu/libnss_files-2.13.so
00007f8c06836000      4K r----  /lib/x86_64-linux-gnu/libnss_files-2.13.so
00007f8c06837000      4K rw---  /lib/x86_64-linux-gnu/libnss_files-2.13.so
00007f8c06838000     40K r-x--  /lib/x86_64-linux-gnu/libnss_nis-2.13.so
00007f8c06842000   2044K -----  /lib/x86_64-linux-gnu/libnss_nis-2.13.so
00007f8c06a41000      4K r----  /lib/x86_64-linux-gnu/libnss_nis-2.13.so
00007f8c06a42000      4K rw---  /lib/x86_64-linux-gnu/libnss_nis-2.13.so
00007f8c06a43000     84K r-x--  /lib/x86_64-linux-gnu/libnsl-2.13.so
00007f8c06a58000   2044K -----  /lib/x86_64-linux-gnu/libnsl-2.13.so
00007f8c06c57000      4K r----  /lib/x86_64-linux-gnu/libnsl-2.13.so
00007f8c06c58000      4K rw---  /lib/x86_64-linux-gnu/libnsl-2.13.so
00007f8c06c59000      8K rw---    [ anon ]
00007f8c06c5b000     28K r-x--  /lib/x86_64-linux-gnu/libnss_compat-2.13.so
00007f8c06c62000   2044K -----  /lib/x86_64-linux-gnu/libnss_compat-2.13.so
00007f8c06e61000      4K r----  /lib/x86_64-linux-gnu/libnss_compat-2.13.so
00007f8c06e62000      4K rw---  /lib/x86_64-linux-gnu/libnss_compat-2.13.so
00007f8c06e63000   5120K rw-s-  /dev/dri/card0
00007f8c07363000     52K r-x--  /usr/lib/xorg/modules/input/synaptics_drv.so
00007f8c07370000   2048K -----  /usr/lib/xorg/modules/input/synaptics_drv.so
00007f8c07570000      4K rw---  /usr/lib/xorg/modules/input/synaptics_drv.so
00007f8c07571000     48K r-x--  /usr/lib/xorg/modules/input/evdev_drv.so
00007f8c0757d000   2044K -----  /usr/lib/xorg/modules/input/evdev_drv.so
00007f8c0777c000      4K rw---  /usr/lib/xorg/modules/input/evdev_drv.so
00007f8c0777d000   5120K rw-s-  /dev/dri/card0
00007f8c07c7d000    928K r-x--  /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.17
00007f8c07d65000   2048K -----  /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.17
00007f8c07f65000     32K r----  /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.17
00007f8c07f6d000      8K rw---  /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.17
00007f8c07f6f000     84K rw---    [ anon ]
00007f8c07f84000    156K r-x--  /lib/x86_64-linux-gnu/libexpat.so.1.6.0
00007f8c07fab000   2048K -----  /lib/x86_64-linux-gnu/libexpat.so.1.6.0
00007f8c081ab000      8K r----  /lib/x86_64-linux-gnu/libexpat.so.1.6.0
00007f8c081ad000      4K rw---  /lib/x86_64-linux-gnu/libexpat.so.1.6.0
00007f8c081ae000   3752K r-x--  /usr/lib/x86_64-linux-gnu/dri/i965_dri.so
00007f8c08558000   2044K -----  /usr/lib/x86_64-linux-gnu/dri/i965_dri.so
00007f8c08757000    108K rw---  /usr/lib/x86_64-linux-gnu/dri/i965_dri.so
00007f8c08772000     72K rw---    [ anon ]
00007f8c08784000    136K r-x--  /usr/lib/xorg/modules/libfb.so
00007f8c087a6000   2044K -----  /usr/lib/xorg/modules/libfb.so
00007f8c089a5000      4K r----  /usr/lib/xorg/modules/libfb.so
00007f8c089a6000      4K rw---  /usr/lib/xorg/modules/libfb.so
00007f8c089a7000    112K r-x--  /usr/lib/x86_64-linux-gnu/libdrm_intel.so.1.0.0
00007f8c089c3000   2048K -----  /usr/lib/x86_64-linux-gnu/libdrm_intel.so.1.0.0
00007f8c08bc3000      4K r----  /usr/lib/x86_64-linux-gnu/libdrm_intel.so.1.0.0
00007f8c08bc4000      4K rw---  /usr/lib/x86_64-linux-gnu/libdrm_intel.so.1.0.0
00007f8c08bc5000    312K r-x--  /usr/lib/xorg/modules/drivers/intel_drv.so
00007f8c08c13000   2048K -----  /usr/lib/xorg/modules/drivers/intel_drv.so
00007f8c08e13000     16K rw---  /usr/lib/xorg/modules/drivers/intel_drv.so
00007f8c08e17000     20K r-x--  /usr/lib/xorg/modules/extensions/libdri2.so
00007f8c08e1c000   2044K -----  /usr/lib/xorg/modules/extensions/libdri2.so
00007f8c0901b000      4K r----  /usr/lib/xorg/modules/extensions/libdri2.so
00007f8c0901c000      4K rw---  /usr/lib/xorg/modules/extensions/libdri2.so
00007f8c0901d000     44K r-x--  /usr/lib/x86_64-linux-gnu/libdrm.so.2.4.0
00007f8c09028000   2044K -----  /usr/lib/x86_64-linux-gnu/libdrm.so.2.4.0
00007f8c09227000      4K r----  /usr/lib/x86_64-linux-gnu/libdrm.so.2.4.0
00007f8c09228000      4K rw---  /usr/lib/x86_64-linux-gnu/libdrm.so.2.4.0
00007f8c09229000     40K r-x--  /usr/lib/xorg/modules/extensions/libdri.so
00007f8c09233000   2048K -----  /usr/lib/xorg/modules/extensions/libdri.so
00007f8c09433000      4K r----  /usr/lib/xorg/modules/extensions/libdri.so
00007f8c09434000      4K rw---  /usr/lib/xorg/modules/extensions/libdri.so
00007f8c09435000     28K r-x--  /usr/lib/xorg/modules/extensions/librecord.so
00007f8c0943c000   2044K -----  /usr/lib/xorg/modules/extensions/librecord.so
00007f8c0963b000      4K r----  /usr/lib/xorg/modules/extensions/librecord.so
00007f8c0963c000      4K rw---  /usr/lib/xorg/modules/extensions/librecord.so
00007f8c0963d000    376K r-x--  /usr/lib/xorg/modules/extensions/libglx.so
00007f8c0969b000   2048K -----  /usr/lib/xorg/modules/extensions/libglx.so
00007f8c0989b000      4K r----  /usr/lib/xorg/modules/extensions/libglx.so
00007f8c0989c000     12K rw---  /usr/lib/xorg/modules/extensions/libglx.so
00007f8c0989f000      8K rw---    [ anon ]
00007f8c098a1000     20K r-x--  /usr/lib/xorg/modules/extensions/libdbe.so
00007f8c098a6000   2044K -----  /usr/lib/xorg/modules/extensions/libdbe.so
00007f8c09aa5000      4K r----  /usr/lib/xorg/modules/extensions/libdbe.so
00007f8c09aa6000      4K rw---  /usr/lib/xorg/modules/extensions/libdbe.so
00007f8c09aa7000    120K r-x--  /lib/x86_64-linux-gnu/libselinux.so.1
00007f8c09ac5000   2044K -----  /lib/x86_64-linux-gnu/libselinux.so.1
00007f8c09cc4000      4K r----  /lib/x86_64-linux-gnu/libselinux.so.1
00007f8c09cc5000      4K rw---  /lib/x86_64-linux-gnu/libselinux.so.1
00007f8c09cc6000      4K rw---    [ anon ]
00007f8c09cc7000    140K r-x--  /usr/lib/xorg/modules/extensions/libextmod.so
00007f8c09cea000   2044K -----  /usr/lib/xorg/modules/extensions/libextmod.so
00007f8c09ee9000      4K r----  /usr/lib/xorg/modules/extensions/libextmod.so
00007f8c09eea000      8K rw---  /usr/lib/xorg/modules/extensions/libextmod.so
00007f8c09eec000     84K r-x--  /lib/x86_64-linux-gnu/libgcc_s.so.1
00007f8c09f01000   2048K -----  /lib/x86_64-linux-gnu/libgcc_s.so.1
00007f8c0a101000      4K rw---  /lib/x86_64-linux-gnu/libgcc_s.so.1
00007f8c0a102000     12K rw---    [ anon ]
00007f8c0a105000     24K r-x--  /usr/lib/x86_64-linux-gnu/libfontenc.so.1.0.0
00007f8c0a10b000   2044K -----  /usr/lib/x86_64-linux-gnu/libfontenc.so.1.0.0
00007f8c0a30a000      8K rw---  /usr/lib/x86_64-linux-gnu/libfontenc.so.1.0.0
00007f8c0a30c000     60K r-x--  /lib/x86_64-linux-gnu/libbz2.so.1.0.4
00007f8c0a31b000   2044K -----  /lib/x86_64-linux-gnu/libbz2.so.1.0.4
00007f8c0a51a000      8K rw---  /lib/x86_64-linux-gnu/libbz2.so.1.0.4
00007f8c0a51c000    612K r-x--  /usr/lib/x86_64-linux-gnu/libfreetype.so.6.8.1
00007f8c0a5b5000   2044K -----  /usr/lib/x86_64-linux-gnu/libfreetype.so.6.8.1
00007f8c0a7b4000     24K r----  /usr/lib/x86_64-linux-gnu/libfreetype.so.6.8.1
00007f8c0a7ba000      4K rw---  /usr/lib/x86_64-linux-gnu/libfreetype.so.6.8.1
00007f8c0a7bb000     88K r-x--  /usr/lib/x86_64-linux-gnu/libz.so.1.2.6
00007f8c0a7d1000   2044K -----  /usr/lib/x86_64-linux-gnu/libz.so.1.2.6
00007f8c0a9d0000      4K rw---  /usr/lib/x86_64-linux-gnu/libz.so.1.2.6
00007f8c0a9d1000     12K r-x--  /lib/x86_64-linux-gnu/libgpg-error.so.0.8.0
00007f8c0a9d4000   2044K -----  /lib/x86_64-linux-gnu/libgpg-error.so.0.8.0
00007f8c0abd3000      4K rw---  /lib/x86_64-linux-gnu/libgpg-error.so.0.8.0
00007f8c0abd4000   1524K r-x--  /lib/x86_64-linux-gnu/libc-2.13.so
00007f8c0ad51000   2048K -----  /lib/x86_64-linux-gnu/libc-2.13.so
00007f8c0af51000     16K r----  /lib/x86_64-linux-gnu/libc-2.13.so
00007f8c0af55000      4K rw---  /lib/x86_64-linux-gnu/libc-2.13.so
00007f8c0af56000     20K rw---    [ anon ]
00007f8c0af5b000     28K r-x--  /lib/x86_64-linux-gnu/librt-2.13.so
00007f8c0af62000   2044K -----  /lib/x86_64-linux-gnu/librt-2.13.so
00007f8c0b161000      4K r----  /lib/x86_64-linux-gnu/librt-2.13.so
00007f8c0b162000      4K rw---  /lib/x86_64-linux-gnu/librt-2.13.so
00007f8c0b163000    516K r-x--  /lib/x86_64-linux-gnu/libm-2.13.so
00007f8c0b1e4000   2044K -----  /lib/x86_64-linux-gnu/libm-2.13.so
00007f8c0b3e3000      4K r----  /lib/x86_64-linux-gnu/libm-2.13.so
00007f8c0b3e4000      4K rw---  /lib/x86_64-linux-gnu/libm-2.13.so
00007f8c0b3e5000     92K r-x--  /lib/libaudit.so.0.0.0
00007f8c0b3fc000   2044K -----  /lib/libaudit.so.0.0.0
00007f8c0b5fb000      4K r----  /lib/libaudit.so.0.0.0
00007f8c0b5fc000      4K rw---  /lib/libaudit.so.0.0.0
00007f8c0b5fd000     20K r-x--  /usr/lib/x86_64-linux-gnu/libXdmcp.so.6.0.0
00007f8c0b602000   2044K -----  /usr/lib/x86_64-linux-gnu/libXdmcp.so.6.0.0
00007f8c0b801000      4K r----  /usr/lib/x86_64-linux-gnu/libXdmcp.so.6.0.0
00007f8c0b802000      4K rw---  /usr/lib/x86_64-linux-gnu/libXdmcp.so.6.0.0
00007f8c0b803000      8K r-x--  /usr/lib/x86_64-linux-gnu/libXau.so.6.0.0
00007f8c0b805000   2044K -----  /usr/lib/x86_64-linux-gnu/libXau.so.6.0.0
00007f8c0ba04000      4K r----  /usr/lib/x86_64-linux-gnu/libXau.so.6.0.0
00007f8c0ba05000      4K rw---  /usr/lib/x86_64-linux-gnu/libXau.so.6.0.0
00007f8c0ba06000    236K r-x--  /usr/lib/libXfont.so.1.4.1
00007f8c0ba41000   2044K -----  /usr/lib/libXfont.so.1.4.1
00007f8c0bc40000      4K r----  /usr/lib/libXfont.so.1.4.1
00007f8c0bc41000      8K rw---  /usr/lib/libXfont.so.1.4.1
00007f8c0bc43000    520K r-x--  /usr/lib/x86_64-linux-gnu/libpixman-1.so.0.24.4
00007f8c0bcc5000   2044K -----  /usr/lib/x86_64-linux-gnu/libpixman-1.so.0.24.4
00007f8c0bec4000     24K rw---  /usr/lib/x86_64-linux-gnu/libpixman-1.so.0.24.4
00007f8c0beca000     92K r-x--  /lib/x86_64-linux-gnu/libpthread-2.13.so
00007f8c0bee1000   2044K -----  /lib/x86_64-linux-gnu/libpthread-2.13.so
00007f8c0c0e0000      4K r----  /lib/x86_64-linux-gnu/libpthread-2.13.so
00007f8c0c0e1000      4K rw---  /lib/x86_64-linux-gnu/libpthread-2.13.so
00007f8c0c0e2000     16K rw---    [ anon ]
00007f8c0c0e6000     32K r-x--  /usr/lib/x86_64-linux-gnu/libpciaccess.so.0.11.0
00007f8c0c0ee000   2044K -----  /usr/lib/x86_64-linux-gnu/libpciaccess.so.0.11.0
00007f8c0c2ed000      4K r----  /usr/lib/x86_64-linux-gnu/libpciaccess.so.0.11.0
00007f8c0c2ee000      4K rw---  /usr/lib/x86_64-linux-gnu/libpciaccess.so.0.11.0
00007f8c0c2ef000      8K r-x--  /lib/x86_64-linux-gnu/libdl-2.13.so
00007f8c0c2f1000   2048K -----  /lib/x86_64-linux-gnu/libdl-2.13.so
00007f8c0c4f1000      4K r----  /lib/x86_64-linux-gnu/libdl-2.13.so
00007f8c0c4f2000      4K rw---  /lib/x86_64-linux-gnu/libdl-2.13.so
00007f8c0c4f3000    488K r-x--  /lib/x86_64-linux-gnu/libgcrypt.so.11.7.0
00007f8c0c56d000   2048K -----  /lib/x86_64-linux-gnu/libgcrypt.so.11.7.0
00007f8c0c76d000     16K rw---  /lib/x86_64-linux-gnu/libgcrypt.so.11.7.0
00007f8c0c771000     56K r-x--  /lib/x86_64-linux-gnu/libudev.so.0.13.0
00007f8c0c77f000   2044K -----  /lib/x86_64-linux-gnu/libudev.so.0.13.0
00007f8c0c97e000      4K r----  /lib/x86_64-linux-gnu/libudev.so.0.13.0
00007f8c0c97f000      4K rw---  /lib/x86_64-linux-gnu/libudev.so.0.13.0
00007f8c0c980000    124K r-x--  /lib/x86_64-linux-gnu/ld-2.13.so
00007f8c0c99f000      4K rw-s-  /dev/dri/card0
00007f8c0c9a0000      4K rw-s-  /dev/dri/card0
00007f8c0c9a1000      4K rw-s-  /dev/dri/card0
00007f8c0c9a2000      4K rw-s-  /dev/dri/card0
00007f8c0c9a3000      4K rw-s-  /dev/dri/card0
00007f8c0c9a4000    384K rw-s-    [ shmid=0xb0007 ]
00007f8c0ca04000    384K rw-s-    [ shmid=0xa8006 ]
00007f8c0ca64000      4K rw-s-  /dev/dri/card0
00007f8c0ca65000      4K rw-s-  /dev/dri/card0
00007f8c0ca66000      4K rw-s-  /dev/dri/card0
00007f8c0ca67000      4K rw-s-  /dev/dri/card0
00007f8c0ca68000      4K rw-s-  /dev/dri/card0
00007f8c0ca69000      4K rw-s-  /dev/dri/card0
00007f8c0ca6a000      4K rw-s-  /dev/dri/card0
00007f8c0ca6b000      4K rw-s-  /dev/dri/card0
00007f8c0ca6c000      4K rw-s-  /dev/dri/card0
00007f8c0ca6d000      4K rw-s-  /dev/dri/card0
00007f8c0ca6e000      4K rw-s-  /dev/dri/card0
00007f8c0ca6f000      4K rw-s-  /dev/dri/card0
00007f8c0ca70000      4K rw-s-  /dev/dri/card0
00007f8c0ca71000      4K rw-s-  /dev/dri/card0
00007f8c0ca72000      4K rw-s-  /dev/dri/card0
00007f8c0ca73000      4K rw-s-  /dev/dri/card0
00007f8c0ca74000      4K rw-s-  /dev/dri/card0
00007f8c0ca75000      4K rw-s-  /dev/dri/card0
00007f8c0ca76000      4K rw-s-  /dev/dri/card0
00007f8c0ca77000      4K rw-s-  /dev/dri/card0
00007f8c0ca78000      4K rw-s-  /dev/dri/card0
00007f8c0ca79000      4K rw-s-  /dev/dri/card0
00007f8c0ca7a000      4K rw-s-  /dev/dri/card0
00007f8c0ca7b000      4K rw-s-  /dev/dri/card0
00007f8c0ca7c000      4K rw-s-  /dev/dri/card0
00007f8c0ca7d000      4K rw-s-  /dev/dri/card0
00007f8c0ca7e000      4K rw-s-  /dev/dri/card0
00007f8c0ca7f000      4K rw-s-  /dev/dri/card0
00007f8c0ca80000      4K rw-s-  /dev/dri/card0
00007f8c0ca81000      4K rw-s-  /dev/dri/card0
00007f8c0ca82000      4K rw-s-  /dev/dri/card0
00007f8c0ca83000      4K rw-s-  /dev/dri/card0
00007f8c0ca84000      4K rw-s-  /dev/dri/card0
00007f8c0ca85000      4K rw-s-  /dev/dri/card0
00007f8c0ca86000      4K rw-s-  /dev/dri/card0
00007f8c0ca87000      4K rw-s-  /dev/dri/card0
00007f8c0ca88000      4K rw-s-  /dev/dri/card0
00007f8c0ca89000    384K rw-s-    [ shmid=0xa0005 ]
00007f8c0cae9000      4K rw-s-  /dev/dri/card0
00007f8c0caea000      4K rw-s-  /dev/dri/card0
00007f8c0caeb000      4K rw-s-  /dev/dri/card0
00007f8c0caec000      4K rw-s-  /dev/dri/card0
00007f8c0caf0000      4K rw-s-  /dev/dri/card0
00007f8c0caf1000      4K rw-s-  /dev/dri/card0
00007f8c0caf2000      4K rw-s-  /dev/dri/card0
00007f8c0caf3000      4K rw-s-  /dev/dri/card0
00007f8c0caf4000      4K rw-s-  /dev/dri/card0
00007f8c0caf5000      4K rw-s-  /dev/dri/card0
00007f8c0caf6000      4K rw-s-  /dev/dri/card0
00007f8c0caf7000      4K rw-s-  /dev/dri/card0
00007f8c0caf8000      4K rw-s-  /dev/dri/card0
00007f8c0caf9000      4K rw-s-  /dev/dri/card0
00007f8c0cafa000      4K rw-s-  /dev/dri/card0
00007f8c0cafb000      4K rw-s-  /dev/dri/card0
00007f8c0cafc000      4K rw-s-  /dev/dri/card0
00007f8c0cafd000      4K rw-s-  /dev/dri/card0
00007f8c0cafe000      4K rw-s-  /dev/dri/card0
00007f8c0caff000      4K rw-s-  /dev/dri/card0
00007f8c0cb00000      4K rw-s-  /dev/dri/card0
00007f8c0cb01000      4K rw-s-  /dev/dri/card0
00007f8c0cb02000      4K rw-s-  /dev/dri/card0
00007f8c0cb03000      4K rw-s-  /dev/dri/card0
00007f8c0cb04000      4K rw-s-  /dev/dri/card0
00007f8c0cb05000      4K rw-s-  /dev/dri/card0
00007f8c0cb06000      4K rw-s-  /dev/dri/card0
00007f8c0cb07000      4K rw-s-  /dev/dri/card0
00007f8c0cb08000      4K rw-s-  /dev/dri/card0
00007f8c0cb09000      4K rw-s-  /dev/dri/card0
00007f8c0cb0a000      4K rw-s-  /dev/dri/card0
00007f8c0cb0b000    264K rw---    [ anon ]
00007f8c0cb4d000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb4e000     28K rw-s-  /drm mm object (deleted)
00007f8c0cb55000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb56000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb57000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb58000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb59000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb5a000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb5b000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb5c000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb5d000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb5e000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb5f000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb60000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb61000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb62000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb63000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb64000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb65000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb66000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb67000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb68000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb69000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb6a000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb6b000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb6c000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb6d000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb6e000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb6f000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb70000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb71000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb72000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb73000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb74000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb75000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb76000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb77000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb78000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb79000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb7a000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb7b000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb7c000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb7d000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb7e000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb7f000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb80000     24K rw---    [ anon ]
00007f8c0cb86000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb87000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb88000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb89000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb8a000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb8b000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb8c000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb8d000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb8e000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb8f000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb90000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb91000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb92000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb93000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb94000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb95000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb96000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb97000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb98000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb99000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb9a000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb9b000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb9c000     12K rw---    [ anon ]
00007f8c0cb9f000      4K r----  /lib/x86_64-linux-gnu/ld-2.13.so
00007f8c0cba0000      4K rw---  /lib/x86_64-linux-gnu/ld-2.13.so
00007f8c0cba1000      4K rw---    [ anon ]
00007f8c0cba2000   1956K r-x--  /usr/bin/Xorg
00007f8c0cf8a000     12K r----  /usr/bin/Xorg
00007f8c0cf8d000     44K rw---  /usr/bin/Xorg
00007f8c0cf98000     76K rw---    [ anon ]
00007f8c0e9c6000   6600K rw---    [ anon ]
00007fffadf48000    132K rw---    [ stack ]
00007fffadffc000      4K r-x--    [ anon ]
ffffffffff600000      4K r-x--    [ anon ]
  total           121332K
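Most of the small `rw-s-` mappings above are GEM objects from `/dev/dri/card0`; their total footprint can be summed straight out of the `pmap` output. A minimal sketch, assuming pmap's default format (address, `K`-suffixed size, permissions, path) and using three sample lines from the dump above:

```shell
#!/bin/sh
# Sum the sizes of all /dev/dri/card0 mappings in pmap output.
# Lines whose last field is the card0 path contribute their "NNNK" size.
sum_card0() {
    awk '$NF == "/dev/dri/card0" { sub(/K$/, "", $2); kb += $2 }
         END { printf "%d\n", kb }'
}

# In practice: pmap "$(pidof X)" | sum_card0
card0_kb=$(sum_card0 <<'EOF'
00007f8c059ee000      4K rw-s-  /dev/dri/card0
00007f8c0612c000   5120K rw-s-  /dev/dri/card0
00007f8c05a00000    384K rw-s-    [ shmid=0xd000b ]
EOF
)
echo "$card0_kb KiB mapped from /dev/dri/card0"
```

The `[ shmid=… ]` segments fall through the filter because their last field is `]`, so only the DRM device mappings are counted.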
root@zurg:/home/blind# pmap $(pidof X)
4539:   /usr/bin/X :0 -auth /var/run/lightdm/root/:0 -nolisten tcp vt7 -novtswitch
00007f8c059ee000      4K rw-s-  /dev/dri/card0
00007f8c059ef000      4K rw-s-  /dev/dri/card0
00007f8c059f0000      4K rw-s-  /dev/dri/card0
00007f8c059f6000      4K rw-s-  /dev/dri/card0
00007f8c059f7000      4K rw-s-  /dev/dri/card0
00007f8c059f8000      4K rw-s-  /dev/dri/card0
00007f8c059f9000      4K rw-s-  /dev/dri/card0
00007f8c059fc000      4K rw-s-  /dev/dri/card0
00007f8c059ff000      4K rw-s-  /dev/dri/card0
00007f8c05a00000    384K rw-s-    [ shmid=0xd000b ]
00007f8c05a60000    384K rw-s-    [ shmid=0xc800a ]
00007f8c05afb000      4K rw-s-  /dev/dri/card0
00007f8c05afd000      4K rw-s-  /dev/dri/card0
00007f8c05afe000      4K rw-s-  /dev/dri/card0
00007f8c05b01000      4K rw-s-  /dev/dri/card0
00007f8c05b02000      4K rw-s-  /dev/dri/card0
00007f8c05b07000      4K rw-s-  /dev/dri/card0
00007f8c05b13000      4K rw-s-  /dev/dri/card0
00007f8c05b1b000      4K rw-s-  /dev/dri/card0
00007f8c05b1c000      4K rw-s-  /dev/dri/card0
00007f8c05b1d000      4K rw-s-  /dev/dri/card0
00007f8c05b1e000      4K rw-s-  /dev/dri/card0
00007f8c05b21000      4K rw-s-  /dev/dri/card0
00007f8c05b22000      4K rw-s-  /dev/dri/card0
00007f8c05b23000      4K rw-s-  /dev/dri/card0
00007f8c05b26000      4K rw-s-  /dev/dri/card0
00007f8c05b2c000      4K rw-s-  /dev/dri/card0
00007f8c05b2d000      4K rw-s-  /dev/dri/card0
00007f8c05b2e000      4K rw-s-  /dev/dri/card0
00007f8c05b2f000      4K rw-s-  /dev/dri/card0
00007f8c05b30000      4K rw-s-  /dev/dri/card0
00007f8c05b31000      4K rw-s-  /dev/dri/card0
00007f8c05b32000      4K rw-s-  /dev/dri/card0
00007f8c05b33000      4K rw-s-  /dev/dri/card0
00007f8c05b34000      4K rw-s-  /dev/dri/card0
00007f8c05b35000      4K rw-s-  /dev/dri/card0
00007f8c05b42000     28K rw-s-  /dev/dri/card0
00007f8c05b8d000    256K rw-s-  /dev/dri/card0
00007f8c05bcd000      8K rw---    [ anon ]
00007f8c05bde000    160K rw-s-  /dev/dri/card0
00007f8c05c2a000    224K rw-s-  /dev/dri/card0
00007f8c05c68000    160K rw-s-  /dev/dri/card0
00007f8c05c90000      4K rw-s-  /dev/dri/card0
00007f8c05c91000     16K rw-s-  /dev/dri/card0
00007f8c05c9c000    160K rw-s-  /dev/dri/card0
00007f8c05cc6000    160K rw-s-  /dev/dri/card0
00007f8c05d06000     48K rw-s-  /dev/dri/card0
00007f8c05d4a000    224K rw-s-  /dev/dri/card0
00007f8c05d82000    384K rw-s-    [ shmid=0xf8010 ]
00007f8c05de2000    384K rw-s-    [ shmid=0xe000f ]
00007f8c05e52000      4K rw-s-  /dev/dri/card0
00007f8c05e56000      4K rw-s-  /dev/dri/card0
00007f8c05e57000      4K rw-s-  /dev/dri/card0
00007f8c05e58000      4K rw-s-  /dev/dri/card0
00007f8c05e59000      4K rw-s-  /dev/dri/card0
00007f8c05e5c000      4K rw-s-  /dev/dri/card0
00007f8c05e5d000      4K rw-s-  /dev/dri/card0
00007f8c05e5e000     64K rw-s-  /dev/dri/card0
00007f8c05e6e000     16K rw-s-  /dev/dri/card0
00007f8c05e72000     28K rw-s-  /dev/dri/card0
00007f8c05e79000      4K rw-s-  /dev/dri/card0
00007f8c05e7e000      8K rw-s-  /dev/dri/card0
00007f8c05e80000      8K rw-s-  /dev/dri/card0
00007f8c05e82000      8K rw-s-  /dev/dri/card0
00007f8c05e84000      4K rw-s-  /dev/dri/card0
00007f8c05e8b000      4K rw-s-  /dev/dri/card0
00007f8c05e8c000      4K rw-s-  /dev/dri/card0
00007f8c05e8e000      4K rw-s-  /dev/dri/card0
00007f8c05e91000      4K rw-s-  /dev/dri/card0
00007f8c05e92000      4K rw-s-  /dev/dri/card0
00007f8c05e93000     16K rw-s-  /dev/dri/card0
00007f8c05e9a000      4K rw-s-  /dev/dri/card0
00007f8c05e9b000      4K rw-s-  /dev/dri/card0
00007f8c05ea1000     32K rw-s-  /dev/dri/card0
00007f8c05eb3000      4K rw-s-  /dev/dri/card0
00007f8c05eb4000      4K rw-s-  /dev/dri/card0
00007f8c05eb5000     16K rw-s-  /dev/dri/card0
00007f8c05eb9000     16K rw-s-  /dev/dri/card0
00007f8c05ebe000      8K rw---    [ anon ]
00007f8c05ec2000      4K rw-s-  /dev/dri/card0
00007f8c05ec3000      4K rw-s-  /dev/dri/card0
00007f8c05ec6000     80K rw-s-  /dev/dri/card0
00007f8c05eda000     16K rw-s-  /dev/dri/card0
00007f8c05ede000     16K rw-s-  /dev/dri/card0
00007f8c05ee2000     16K rw-s-  /dev/dri/card0
00007f8c05ee6000     28K rw-s-  /dev/dri/card0
00007f8c05eed000      4K rw-s-  /dev/dri/card0
00007f8c05eee000      4K rw-s-  /dev/dri/card0
00007f8c05eef000      4K rw-s-  /dev/dri/card0
00007f8c05ef0000     20K rw-s-  /dev/dri/card0
00007f8c05ef5000     20K rw-s-  /dev/dri/card0
00007f8c05efb000      4K rw-s-  /dev/dri/card0
00007f8c05efc000     16K rw-s-  /dev/dri/card0
00007f8c05f00000      4K rw-s-  /dev/dri/card0
00007f8c05f01000      4K rw-s-  /dev/dri/card0
00007f8c05f02000      4K rw-s-  /dev/dri/card0
00007f8c05f04000      4K rw-s-  /dev/dri/card0
00007f8c05f05000      4K rw-s-  /dev/dri/card0
00007f8c05f08000      4K rw-s-  /dev/dri/card0
00007f8c05f0c000    256K rw-s-  /dev/dri/card0
00007f8c05f4c000    384K rw-s-    [ shmid=0xd800c ]
00007f8c05fac000    276K rw---    [ anon ]
00007f8c05ff1000      8K rw-s-  /dev/dri/card0
00007f8c05ff3000      4K rw-s-  /dev/dri/card0
00007f8c05ff4000      4K rw-s-  /dev/dri/card0
00007f8c05ff5000      4K rw-s-  /dev/dri/card0
00007f8c05ff6000     24K rw-s-  /dev/dri/card0
00007f8c05ffc000     24K rw-s-  /dev/dri/card0
00007f8c06002000     24K rw-s-  /dev/dri/card0
00007f8c06008000     24K rw-s-  /dev/dri/card0
00007f8c0600e000      4K rw-s-  /dev/dri/card0
00007f8c06010000      8K rw-s-  /dev/dri/card0
00007f8c06012000      8K rw-s-  /dev/dri/card0
00007f8c06014000      8K rw-s-  /dev/dri/card0
00007f8c06016000      4K rw-s-  /dev/dri/card0
00007f8c06017000      4K rw-s-  /dev/dri/card0
00007f8c06018000      4K rw-s-  /dev/dri/card0
00007f8c06019000      4K rw-s-  /dev/dri/card0
00007f8c0601a000      4K rw-s-  /dev/dri/card0
00007f8c0601b000      4K rw-s-  /dev/dri/card0
00007f8c0601c000      4K rw-s-  /dev/dri/card0
00007f8c0601d000      4K rw-s-  /dev/dri/card0
00007f8c0601e000      4K rw-s-  /dev/dri/card0
00007f8c0601f000      4K rw-s-  /dev/dri/card0
00007f8c06020000     20K rw-s-  /dev/dri/card0
00007f8c06025000      4K rw-s-  /dev/dri/card0
00007f8c06026000      4K rw-s-  /dev/dri/card0
00007f8c06027000     12K rw-s-  /dev/dri/card0
00007f8c0602a000      4K rw-s-  /dev/dri/card0
00007f8c0602b000      4K rw-s-  /dev/dri/card0
00007f8c0602c000      4K rw-s-  /dev/dri/card0
00007f8c0602d000      4K rw-s-  /dev/dri/card0
00007f8c0602e000      4K rw-s-  /dev/dri/card0
00007f8c0602f000      4K rw-s-  /dev/dri/card0
00007f8c06030000      4K rw-s-  /dev/dri/card0
00007f8c06031000      4K rw-s-  /dev/dri/card0
00007f8c06032000      4K rw-s-  /dev/dri/card0
00007f8c06033000      4K rw-s-  /dev/dri/card0
00007f8c06034000      4K rw-s-  /dev/dri/card0
00007f8c06035000      4K rw-s-  /dev/dri/card0
00007f8c06036000    384K rw-s-    [ shmid=0xc0009 ]
00007f8c06096000      4K rw-s-  /dev/dri/card0
00007f8c06097000      4K rw-s-  /dev/dri/card0
00007f8c0609b000      4K rw-s-  /dev/dri/card0
00007f8c060a4000      4K rw-s-  /dev/dri/card0
00007f8c060a5000      4K rw-s-  /dev/dri/card0
00007f8c060a9000     16K rw-s-  /dev/dri/card0
00007f8c060ad000      4K rw-s-  /dev/dri/card0
00007f8c060ae000      4K rw-s-  /dev/dri/card0
00007f8c060af000      4K rw-s-  /dev/dri/card0
00007f8c060b0000      4K rw-s-  /dev/dri/card0
00007f8c060b1000      4K rw-s-  /dev/dri/card0
00007f8c060b2000     16K rw-s-  /dev/dri/card0
00007f8c060b6000     16K rw-s-  /dev/dri/card0
00007f8c060ba000      4K rw-s-  /dev/dri/card0
00007f8c060bb000      4K rw-s-  /dev/dri/card0
00007f8c060be000    384K rw-s-    [ shmid=0xb8008 ]
00007f8c0611e000      4K rw-s-  /dev/dri/card0
00007f8c0611f000      4K rw-s-  /dev/dri/card0
00007f8c06120000      4K rw-s-  /dev/dri/card0
00007f8c06121000      4K rw-s-  /dev/dri/card0
00007f8c06122000      4K rw-s-  /dev/dri/card0
00007f8c06123000      4K rw-s-  /dev/dri/card0
00007f8c06124000      4K rw-s-  /dev/dri/card0
00007f8c06125000      4K rw-s-  /dev/dri/card0
00007f8c06126000      4K rw-s-  /dev/dri/card0
00007f8c06127000      4K rw-s-  /dev/dri/card0
00007f8c06128000      4K rw-s-  /dev/dri/card0
00007f8c06129000      4K rw-s-  /dev/dri/card0
00007f8c0612a000      4K rw-s-  /dev/dri/card0
00007f8c0612b000      4K rw-s-  /dev/dri/card0
00007f8c0612c000   5120K rw-s-  /dev/dri/card0
00007f8c0662c000     44K r-x--  /lib/x86_64-linux-gnu/libnss_files-2.13.so
00007f8c06637000   2044K -----  /lib/x86_64-linux-gnu/libnss_files-2.13.so
00007f8c06836000      4K r----  /lib/x86_64-linux-gnu/libnss_files-2.13.so
00007f8c06837000      4K rw---  /lib/x86_64-linux-gnu/libnss_files-2.13.so
00007f8c06838000     40K r-x--  /lib/x86_64-linux-gnu/libnss_nis-2.13.so
00007f8c06842000   2044K -----  /lib/x86_64-linux-gnu/libnss_nis-2.13.so
00007f8c06a41000      4K r----  /lib/x86_64-linux-gnu/libnss_nis-2.13.so
00007f8c06a42000      4K rw---  /lib/x86_64-linux-gnu/libnss_nis-2.13.so
00007f8c06a43000     84K r-x--  /lib/x86_64-linux-gnu/libnsl-2.13.so
00007f8c06a58000   2044K -----  /lib/x86_64-linux-gnu/libnsl-2.13.so
00007f8c06c57000      4K r----  /lib/x86_64-linux-gnu/libnsl-2.13.so
00007f8c06c58000      4K rw---  /lib/x86_64-linux-gnu/libnsl-2.13.so
00007f8c06c59000      8K rw---    [ anon ]
00007f8c06c5b000     28K r-x--  /lib/x86_64-linux-gnu/libnss_compat-2.13.so
00007f8c06c62000   2044K -----  /lib/x86_64-linux-gnu/libnss_compat-2.13.so
00007f8c06e61000      4K r----  /lib/x86_64-linux-gnu/libnss_compat-2.13.so
00007f8c06e62000      4K rw---  /lib/x86_64-linux-gnu/libnss_compat-2.13.so
00007f8c06e63000   5120K rw-s-  /dev/dri/card0
00007f8c07363000     52K r-x--  /usr/lib/xorg/modules/input/synaptics_drv.so
00007f8c07370000   2048K -----  /usr/lib/xorg/modules/input/synaptics_drv.so
00007f8c07570000      4K rw---  /usr/lib/xorg/modules/input/synaptics_drv.so
00007f8c07571000     48K r-x--  /usr/lib/xorg/modules/input/evdev_drv.so
00007f8c0757d000   2044K -----  /usr/lib/xorg/modules/input/evdev_drv.so
00007f8c0777c000      4K rw---  /usr/lib/xorg/modules/input/evdev_drv.so
00007f8c0777d000   5120K rw-s-  /dev/dri/card0
00007f8c07c7d000    928K r-x--  /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.17
00007f8c07d65000   2048K -----  /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.17
00007f8c07f65000     32K r----  /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.17
00007f8c07f6d000      8K rw---  /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.17
00007f8c07f6f000     84K rw---    [ anon ]
00007f8c07f84000    156K r-x--  /lib/x86_64-linux-gnu/libexpat.so.1.6.0
00007f8c07fab000   2048K -----  /lib/x86_64-linux-gnu/libexpat.so.1.6.0
00007f8c081ab000      8K r----  /lib/x86_64-linux-gnu/libexpat.so.1.6.0
00007f8c081ad000      4K rw---  /lib/x86_64-linux-gnu/libexpat.so.1.6.0
00007f8c081ae000   3752K r-x--  /usr/lib/x86_64-linux-gnu/dri/i965_dri.so
00007f8c08558000   2044K -----  /usr/lib/x86_64-linux-gnu/dri/i965_dri.so
00007f8c08757000    108K rw---  /usr/lib/x86_64-linux-gnu/dri/i965_dri.so
00007f8c08772000     72K rw---    [ anon ]
00007f8c08784000    136K r-x--  /usr/lib/xorg/modules/libfb.so
00007f8c087a6000   2044K -----  /usr/lib/xorg/modules/libfb.so
00007f8c089a5000      4K r----  /usr/lib/xorg/modules/libfb.so
00007f8c089a6000      4K rw---  /usr/lib/xorg/modules/libfb.so
00007f8c089a7000    112K r-x--  /usr/lib/x86_64-linux-gnu/libdrm_intel.so.1.0.0
00007f8c089c3000   2048K -----  /usr/lib/x86_64-linux-gnu/libdrm_intel.so.1.0.0
00007f8c08bc3000      4K r----  /usr/lib/x86_64-linux-gnu/libdrm_intel.so.1.0.0
00007f8c08bc4000      4K rw---  /usr/lib/x86_64-linux-gnu/libdrm_intel.so.1.0.0
00007f8c08bc5000    312K r-x--  /usr/lib/xorg/modules/drivers/intel_drv.so
00007f8c08c13000   2048K -----  /usr/lib/xorg/modules/drivers/intel_drv.so
00007f8c08e13000     16K rw---  /usr/lib/xorg/modules/drivers/intel_drv.so
00007f8c08e17000     20K r-x--  /usr/lib/xorg/modules/extensions/libdri2.so
00007f8c08e1c000   2044K -----  /usr/lib/xorg/modules/extensions/libdri2.so
00007f8c0901b000      4K r----  /usr/lib/xorg/modules/extensions/libdri2.so
00007f8c0901c000      4K rw---  /usr/lib/xorg/modules/extensions/libdri2.so
00007f8c0901d000     44K r-x--  /usr/lib/x86_64-linux-gnu/libdrm.so.2.4.0
00007f8c09028000   2044K -----  /usr/lib/x86_64-linux-gnu/libdrm.so.2.4.0
00007f8c09227000      4K r----  /usr/lib/x86_64-linux-gnu/libdrm.so.2.4.0
00007f8c09228000      4K rw---  /usr/lib/x86_64-linux-gnu/libdrm.so.2.4.0
00007f8c09229000     40K r-x--  /usr/lib/xorg/modules/extensions/libdri.so
00007f8c09233000   2048K -----  /usr/lib/xorg/modules/extensions/libdri.so
00007f8c09433000      4K r----  /usr/lib/xorg/modules/extensions/libdri.so
00007f8c09434000      4K rw---  /usr/lib/xorg/modules/extensions/libdri.so
00007f8c09435000     28K r-x--  /usr/lib/xorg/modules/extensions/librecord.so
00007f8c0943c000   2044K -----  /usr/lib/xorg/modules/extensions/librecord.so
00007f8c0963b000      4K r----  /usr/lib/xorg/modules/extensions/librecord.so
00007f8c0963c000      4K rw---  /usr/lib/xorg/modules/extensions/librecord.so
00007f8c0963d000    376K r-x--  /usr/lib/xorg/modules/extensions/libglx.so
00007f8c0969b000   2048K -----  /usr/lib/xorg/modules/extensions/libglx.so
00007f8c0989b000      4K r----  /usr/lib/xorg/modules/extensions/libglx.so
00007f8c0989c000     12K rw---  /usr/lib/xorg/modules/extensions/libglx.so
00007f8c0989f000      8K rw---    [ anon ]
00007f8c098a1000     20K r-x--  /usr/lib/xorg/modules/extensions/libdbe.so
00007f8c098a6000   2044K -----  /usr/lib/xorg/modules/extensions/libdbe.so
00007f8c09aa5000      4K r----  /usr/lib/xorg/modules/extensions/libdbe.so
00007f8c09aa6000      4K rw---  /usr/lib/xorg/modules/extensions/libdbe.so
00007f8c09aa7000    120K r-x--  /lib/x86_64-linux-gnu/libselinux.so.1
00007f8c09ac5000   2044K -----  /lib/x86_64-linux-gnu/libselinux.so.1
00007f8c09cc4000      4K r----  /lib/x86_64-linux-gnu/libselinux.so.1
00007f8c09cc5000      4K rw---  /lib/x86_64-linux-gnu/libselinux.so.1
00007f8c09cc6000      4K rw---    [ anon ]
00007f8c09cc7000    140K r-x--  /usr/lib/xorg/modules/extensions/libextmod.so
00007f8c09cea000   2044K -----  /usr/lib/xorg/modules/extensions/libextmod.so
00007f8c09ee9000      4K r----  /usr/lib/xorg/modules/extensions/libextmod.so
00007f8c09eea000      8K rw---  /usr/lib/xorg/modules/extensions/libextmod.so
00007f8c09eec000     84K r-x--  /lib/x86_64-linux-gnu/libgcc_s.so.1
00007f8c09f01000   2048K -----  /lib/x86_64-linux-gnu/libgcc_s.so.1
00007f8c0a101000      4K rw---  /lib/x86_64-linux-gnu/libgcc_s.so.1
00007f8c0a102000     12K rw---    [ anon ]
00007f8c0a105000     24K r-x--  /usr/lib/x86_64-linux-gnu/libfontenc.so.1.0.0
00007f8c0a10b000   2044K -----  /usr/lib/x86_64-linux-gnu/libfontenc.so.1.0.0
00007f8c0a30a000      8K rw---  /usr/lib/x86_64-linux-gnu/libfontenc.so.1.0.0
00007f8c0a30c000     60K r-x--  /lib/x86_64-linux-gnu/libbz2.so.1.0.4
00007f8c0a31b000   2044K -----  /lib/x86_64-linux-gnu/libbz2.so.1.0.4
00007f8c0a51a000      8K rw---  /lib/x86_64-linux-gnu/libbz2.so.1.0.4
00007f8c0a51c000    612K r-x--  /usr/lib/x86_64-linux-gnu/libfreetype.so.6.8.1
00007f8c0a5b5000   2044K -----  /usr/lib/x86_64-linux-gnu/libfreetype.so.6.8.1
00007f8c0a7b4000     24K r----  /usr/lib/x86_64-linux-gnu/libfreetype.so.6.8.1
00007f8c0a7ba000      4K rw---  /usr/lib/x86_64-linux-gnu/libfreetype.so.6.8.1
00007f8c0a7bb000     88K r-x--  /usr/lib/x86_64-linux-gnu/libz.so.1.2.6
00007f8c0a7d1000   2044K -----  /usr/lib/x86_64-linux-gnu/libz.so.1.2.6
00007f8c0a9d0000      4K rw---  /usr/lib/x86_64-linux-gnu/libz.so.1.2.6
00007f8c0a9d1000     12K r-x--  /lib/x86_64-linux-gnu/libgpg-error.so.0.8.0
00007f8c0a9d4000   2044K -----  /lib/x86_64-linux-gnu/libgpg-error.so.0.8.0
00007f8c0abd3000      4K rw---  /lib/x86_64-linux-gnu/libgpg-error.so.0.8.0
00007f8c0abd4000   1524K r-x--  /lib/x86_64-linux-gnu/libc-2.13.so
00007f8c0ad51000   2048K -----  /lib/x86_64-linux-gnu/libc-2.13.so
00007f8c0af51000     16K r----  /lib/x86_64-linux-gnu/libc-2.13.so
00007f8c0af55000      4K rw---  /lib/x86_64-linux-gnu/libc-2.13.so
00007f8c0af56000     20K rw---    [ anon ]
00007f8c0af5b000     28K r-x--  /lib/x86_64-linux-gnu/librt-2.13.so
00007f8c0af62000   2044K -----  /lib/x86_64-linux-gnu/librt-2.13.so
00007f8c0b161000      4K r----  /lib/x86_64-linux-gnu/librt-2.13.so
00007f8c0b162000      4K rw---  /lib/x86_64-linux-gnu/librt-2.13.so
00007f8c0b163000    516K r-x--  /lib/x86_64-linux-gnu/libm-2.13.so
00007f8c0b1e4000   2044K -----  /lib/x86_64-linux-gnu/libm-2.13.so
00007f8c0b3e3000      4K r----  /lib/x86_64-linux-gnu/libm-2.13.so
00007f8c0b3e4000      4K rw---  /lib/x86_64-linux-gnu/libm-2.13.so
00007f8c0b3e5000     92K r-x--  /lib/libaudit.so.0.0.0
00007f8c0b3fc000   2044K -----  /lib/libaudit.so.0.0.0
00007f8c0b5fb000      4K r----  /lib/libaudit.so.0.0.0
00007f8c0b5fc000      4K rw---  /lib/libaudit.so.0.0.0
00007f8c0b5fd000     20K r-x--  /usr/lib/x86_64-linux-gnu/libXdmcp.so.6.0.0
00007f8c0b602000   2044K -----  /usr/lib/x86_64-linux-gnu/libXdmcp.so.6.0.0
00007f8c0b801000      4K r----  /usr/lib/x86_64-linux-gnu/libXdmcp.so.6.0.0
00007f8c0b802000      4K rw---  /usr/lib/x86_64-linux-gnu/libXdmcp.so.6.0.0
00007f8c0b803000      8K r-x--  /usr/lib/x86_64-linux-gnu/libXau.so.6.0.0
00007f8c0b805000   2044K -----  /usr/lib/x86_64-linux-gnu/libXau.so.6.0.0
00007f8c0ba04000      4K r----  /usr/lib/x86_64-linux-gnu/libXau.so.6.0.0
00007f8c0ba05000      4K rw---  /usr/lib/x86_64-linux-gnu/libXau.so.6.0.0
00007f8c0ba06000    236K r-x--  /usr/lib/libXfont.so.1.4.1
00007f8c0ba41000   2044K -----  /usr/lib/libXfont.so.1.4.1
00007f8c0bc40000      4K r----  /usr/lib/libXfont.so.1.4.1
00007f8c0bc41000      8K rw---  /usr/lib/libXfont.so.1.4.1
00007f8c0bc43000    520K r-x--  /usr/lib/x86_64-linux-gnu/libpixman-1.so.0.24.4
00007f8c0bcc5000   2044K -----  /usr/lib/x86_64-linux-gnu/libpixman-1.so.0.24.4
00007f8c0bec4000     24K rw---  /usr/lib/x86_64-linux-gnu/libpixman-1.so.0.24.4
00007f8c0beca000     92K r-x--  /lib/x86_64-linux-gnu/libpthread-2.13.so
00007f8c0bee1000   2044K -----  /lib/x86_64-linux-gnu/libpthread-2.13.so
00007f8c0c0e0000      4K r----  /lib/x86_64-linux-gnu/libpthread-2.13.so
00007f8c0c0e1000      4K rw---  /lib/x86_64-linux-gnu/libpthread-2.13.so
00007f8c0c0e2000     16K rw---    [ anon ]
00007f8c0c0e6000     32K r-x--  /usr/lib/x86_64-linux-gnu/libpciaccess.so.0.11.0
00007f8c0c0ee000   2044K -----  /usr/lib/x86_64-linux-gnu/libpciaccess.so.0.11.0
00007f8c0c2ed000      4K r----  /usr/lib/x86_64-linux-gnu/libpciaccess.so.0.11.0
00007f8c0c2ee000      4K rw---  /usr/lib/x86_64-linux-gnu/libpciaccess.so.0.11.0
00007f8c0c2ef000      8K r-x--  /lib/x86_64-linux-gnu/libdl-2.13.so
00007f8c0c2f1000   2048K -----  /lib/x86_64-linux-gnu/libdl-2.13.so
00007f8c0c4f1000      4K r----  /lib/x86_64-linux-gnu/libdl-2.13.so
00007f8c0c4f2000      4K rw---  /lib/x86_64-linux-gnu/libdl-2.13.so
00007f8c0c4f3000    488K r-x--  /lib/x86_64-linux-gnu/libgcrypt.so.11.7.0
00007f8c0c56d000   2048K -----  /lib/x86_64-linux-gnu/libgcrypt.so.11.7.0
00007f8c0c76d000     16K rw---  /lib/x86_64-linux-gnu/libgcrypt.so.11.7.0
00007f8c0c771000     56K r-x--  /lib/x86_64-linux-gnu/libudev.so.0.13.0
00007f8c0c77f000   2044K -----  /lib/x86_64-linux-gnu/libudev.so.0.13.0
00007f8c0c97e000      4K r----  /lib/x86_64-linux-gnu/libudev.so.0.13.0
00007f8c0c97f000      4K rw---  /lib/x86_64-linux-gnu/libudev.so.0.13.0
00007f8c0c980000    124K r-x--  /lib/x86_64-linux-gnu/ld-2.13.so
00007f8c0c99f000      4K rw-s-  /dev/dri/card0
00007f8c0c9a0000      4K rw-s-  /dev/dri/card0
00007f8c0c9a1000      4K rw-s-  /dev/dri/card0
00007f8c0c9a2000      4K rw-s-  /dev/dri/card0
00007f8c0c9a3000      4K rw-s-  /dev/dri/card0
00007f8c0c9a4000    384K rw-s-    [ shmid=0xb0007 ]
00007f8c0ca04000    384K rw-s-    [ shmid=0xa8006 ]
00007f8c0ca64000      4K rw-s-  /dev/dri/card0
00007f8c0ca65000      4K rw-s-  /dev/dri/card0
00007f8c0ca66000      4K rw-s-  /dev/dri/card0
00007f8c0ca67000      4K rw-s-  /dev/dri/card0
00007f8c0ca68000      4K rw-s-  /dev/dri/card0
00007f8c0ca69000      4K rw-s-  /dev/dri/card0
00007f8c0ca6a000      4K rw-s-  /dev/dri/card0
00007f8c0ca6b000      4K rw-s-  /dev/dri/card0
00007f8c0ca6c000      4K rw-s-  /dev/dri/card0
00007f8c0ca6d000      4K rw-s-  /dev/dri/card0
00007f8c0ca6e000      4K rw-s-  /dev/dri/card0
00007f8c0ca6f000      4K rw-s-  /dev/dri/card0
00007f8c0ca70000      4K rw-s-  /dev/dri/card0
00007f8c0ca71000      4K rw-s-  /dev/dri/card0
00007f8c0ca72000      4K rw-s-  /dev/dri/card0
00007f8c0ca73000      4K rw-s-  /dev/dri/card0
00007f8c0ca74000      4K rw-s-  /dev/dri/card0
00007f8c0ca75000      4K rw-s-  /dev/dri/card0
00007f8c0ca76000      4K rw-s-  /dev/dri/card0
00007f8c0ca77000      4K rw-s-  /dev/dri/card0
00007f8c0ca78000      4K rw-s-  /dev/dri/card0
00007f8c0ca79000      4K rw-s-  /dev/dri/card0
00007f8c0ca7a000      4K rw-s-  /dev/dri/card0
00007f8c0ca7b000      4K rw-s-  /dev/dri/card0
00007f8c0ca7c000      4K rw-s-  /dev/dri/card0
00007f8c0ca7d000      4K rw-s-  /dev/dri/card0
00007f8c0ca7e000      4K rw-s-  /dev/dri/card0
00007f8c0ca7f000      4K rw-s-  /dev/dri/card0
00007f8c0ca80000      4K rw-s-  /dev/dri/card0
00007f8c0ca81000      4K rw-s-  /dev/dri/card0
00007f8c0ca82000      4K rw-s-  /dev/dri/card0
00007f8c0ca83000      4K rw-s-  /dev/dri/card0
00007f8c0ca84000      4K rw-s-  /dev/dri/card0
00007f8c0ca85000      4K rw-s-  /dev/dri/card0
00007f8c0ca86000      4K rw-s-  /dev/dri/card0
00007f8c0ca87000      4K rw-s-  /dev/dri/card0
00007f8c0ca88000      4K rw-s-  /dev/dri/card0
00007f8c0ca89000    384K rw-s-    [ shmid=0xa0005 ]
00007f8c0cae9000      4K rw-s-  /dev/dri/card0
00007f8c0caea000      4K rw-s-  /dev/dri/card0
00007f8c0caeb000      4K rw-s-  /dev/dri/card0
00007f8c0caec000      4K rw-s-  /dev/dri/card0
00007f8c0caed000      4K rw-s-  /dev/dri/card0
00007f8c0caee000      4K rw-s-  /dev/dri/card0
00007f8c0caef000      4K rw-s-  /dev/dri/card0
00007f8c0caf0000      4K rw-s-  /dev/dri/card0
00007f8c0caf1000      4K rw-s-  /dev/dri/card0
00007f8c0caf2000      4K rw-s-  /dev/dri/card0
00007f8c0caf3000      4K rw-s-  /dev/dri/card0
00007f8c0caf4000      4K rw-s-  /dev/dri/card0
00007f8c0caf5000      4K rw-s-  /dev/dri/card0
00007f8c0caf6000      4K rw-s-  /dev/dri/card0
00007f8c0caf7000      4K rw-s-  /dev/dri/card0
00007f8c0caf8000      4K rw-s-  /dev/dri/card0
00007f8c0caf9000      4K rw-s-  /dev/dri/card0
00007f8c0cafa000      4K rw-s-  /dev/dri/card0
00007f8c0cafb000      4K rw-s-  /dev/dri/card0
00007f8c0cafc000      4K rw-s-  /dev/dri/card0
00007f8c0cafd000      4K rw-s-  /dev/dri/card0
00007f8c0cafe000      4K rw-s-  /dev/dri/card0
00007f8c0caff000      4K rw-s-  /dev/dri/card0
00007f8c0cb00000      4K rw-s-  /dev/dri/card0
00007f8c0cb01000      4K rw-s-  /dev/dri/card0
00007f8c0cb02000      4K rw-s-  /dev/dri/card0
00007f8c0cb03000      4K rw-s-  /dev/dri/card0
00007f8c0cb04000      4K rw-s-  /dev/dri/card0
00007f8c0cb05000      4K rw-s-  /dev/dri/card0
00007f8c0cb06000      4K rw-s-  /dev/dri/card0
00007f8c0cb07000      4K rw-s-  /dev/dri/card0
00007f8c0cb08000      4K rw-s-  /dev/dri/card0
00007f8c0cb09000      4K rw-s-  /dev/dri/card0
00007f8c0cb0a000      4K rw-s-  /dev/dri/card0
00007f8c0cb0b000    264K rw---    [ anon ]
00007f8c0cb4d000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb4e000     28K rw-s-  /drm mm object (deleted)
00007f8c0cb55000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb56000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb57000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb58000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb59000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb5a000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb5b000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb5c000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb5d000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb5e000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb5f000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb60000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb61000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb62000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb63000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb64000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb65000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb66000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb67000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb68000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb69000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb6a000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb6b000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb6c000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb6d000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb6e000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb6f000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb70000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb71000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb72000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb73000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb74000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb75000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb76000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb77000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb78000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb79000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb7a000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb7b000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb7c000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb7d000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb7e000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb7f000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb80000     24K rw---    [ anon ]
00007f8c0cb86000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb87000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb88000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb89000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb8a000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb8b000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb8c000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb8d000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb8e000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb8f000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb90000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb91000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb92000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb93000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb94000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb95000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb96000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb97000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb98000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb99000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb9a000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb9b000      4K rw-s-  /drm mm object (deleted)
00007f8c0cb9c000     12K rw---    [ anon ]
00007f8c0cb9f000      4K r----  /lib/x86_64-linux-gnu/ld-2.13.so
00007f8c0cba0000      4K rw---  /lib/x86_64-linux-gnu/ld-2.13.so
00007f8c0cba1000      4K rw---    [ anon ]
00007f8c0cba2000   1956K r-x--  /usr/bin/Xorg
00007f8c0cf8a000     12K r----  /usr/bin/Xorg
00007f8c0cf8d000     44K rw---  /usr/bin/Xorg
00007f8c0cf98000     76K rw---    [ anon ]
00007f8c0e9c6000   7108K rw---    [ anon ]
00007fffadf48000    132K rw---    [ stack ]
00007fffadffc000      4K r-x--    [ anon ]
ffffffffff600000      4K r-x--    [ anon ]
  total           124132K

>
> thanks,
> suresh


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-04-03 19:32           ` Cyrill Gorcunov
@ 2012-04-05 20:29             ` Matt Helsley
  2012-04-05 20:53               ` Cyrill Gorcunov
  2012-04-05 21:04               ` Konstantin Khlebnikov
  0 siblings, 2 replies; 52+ messages in thread
From: Matt Helsley @ 2012-04-05 20:29 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Matt Helsley, Konstantin Khlebnikov, Oleg Nesterov, linux-mm,
	Andrew Morton, linux-kernel, Eric Paris, linux-security-module,
	oprofile-list, Linus Torvalds, Al Viro

On Tue, Apr 03, 2012 at 11:32:04PM +0400, Cyrill Gorcunov wrote:
> On Tue, Apr 03, 2012 at 11:16:31AM -0700, Matt Helsley wrote:
> > On Tue, Apr 03, 2012 at 09:10:20AM +0400, Konstantin Khlebnikov wrote:
> > > Matt Helsley wrote:
> > > >On Sat, Mar 31, 2012 at 10:13:24PM +0200, Oleg Nesterov wrote:
> > > >>On 03/31, Konstantin Khlebnikov wrote:
> > > >>>
> > > >>>comment from v2.6.25-6245-g925d1c4 ("procfs task exe symlink"),
> > > >>>where all this stuff was introduced:
> > > >>>
> > > >>>>...
> > > >>>>This avoids pinning the mounted filesystem.
> > > >>>
> > > >>>So, this logic is hooked into every file mmap/unmap and vma split/merge just to
> > > >>>keep a hypothetical fs from being pinned against umount by an mm which has
> > > >>>already unmapped all its executable files but is still alive. Does anyone
> > > >>>know any real-world example?
> > > >>
> > > >>This is the question to Matt.
> > > >
> > > >This is where I got the scenario:
> > > >
> > > >https://lkml.org/lkml/2007/7/12/398
> > > 
> > > Cyrill Gorcunov's patch "c/r: prctl: add ability to set new mm_struct::exe_file"
> > > gives userspace the ability to unpin the vfsmount explicitly.
> > 
> > Doesn't that break the semantics of the kernel ABI?
> 
> Which one? exe_file can be changed iff there are no MAP_EXECUTABLE mappings left.
> Still, once assigned (via this prctl), mm_struct::exe_file can't be changed
> again until program exit.

The prctl() interface itself is fine as it stands now.

As far as I can tell Konstantin is proposing that we remove the unusual
counter that tracks the number of mappings of the exe_file and require
userspace to use the prctl() to drop the last reference. That's what I think
will break the ABI, because after that change you *must* change userspace
code to use the prctl(). It's an ABI change because the same sequence of
system calls with the same input bits produces different behavior.

Cheers,
	-Matt


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-04-05 20:29             ` Matt Helsley
@ 2012-04-05 20:53               ` Cyrill Gorcunov
  2012-04-05 21:04               ` Konstantin Khlebnikov
  1 sibling, 0 replies; 52+ messages in thread
From: Cyrill Gorcunov @ 2012-04-05 20:53 UTC (permalink / raw)
  To: Matt Helsley
  Cc: Konstantin Khlebnikov, Oleg Nesterov, linux-mm, Andrew Morton,
	linux-kernel, Eric Paris, linux-security-module, oprofile-list,
	Linus Torvalds, Al Viro

On Thu, Apr 05, 2012 at 01:29:04PM -0700, Matt Helsley wrote:
...
> > > Doesn't that break the semantics of the kernel ABI?
> > 
> > Which one? exe_file can be changed iff there are no MAP_EXECUTABLE mappings left.
> > Still, once assigned (via this prctl), mm_struct::exe_file can't be changed
> > again until program exit.
> 
> The prctl() interface itself is fine as it stands now.
> 
> As far as I can tell Konstantin is proposing that we remove the unusual
> counter that tracks the number of mappings of the exe_file and require
> userspace to use the prctl() to drop the last reference. That's what I think
> will break the ABI, because after that change you *must* change userspace
> code to use the prctl(). It's an ABI change because the same sequence of
> system calls with the same input bits produces different behavior.

Hi Matt, I see what you mean (I misread your email at first, sorry).
Sure, it's impossible to patch already existing programs (and btw, this
prctl code actually won't help a program drop the symlink completely
and live without it afterwards, because the old one will be gone but a
new one will be assigned), so personally I can't answer here on
Konstantin's behalf, but I guess the main question is -- which programs
use this 'drop-all-MAP_EXECUTABLE' feature?

	Cyrill

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-04-05 20:29             ` Matt Helsley
  2012-04-05 20:53               ` Cyrill Gorcunov
@ 2012-04-05 21:04               ` Konstantin Khlebnikov
  2012-04-05 21:44                 ` Matt Helsley
  1 sibling, 1 reply; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-05 21:04 UTC (permalink / raw)
  To: Matt Helsley
  Cc: Cyrill Gorcunov, Oleg Nesterov, linux-mm, Andrew Morton,
	linux-kernel, Eric Paris, linux-security-module, oprofile-list,
	Linus Torvalds, Al Viro

Matt Helsley wrote:
> On Tue, Apr 03, 2012 at 11:32:04PM +0400, Cyrill Gorcunov wrote:
>> On Tue, Apr 03, 2012 at 11:16:31AM -0700, Matt Helsley wrote:
>>> On Tue, Apr 03, 2012 at 09:10:20AM +0400, Konstantin Khlebnikov wrote:
>>>> Matt Helsley wrote:
>>>>> On Sat, Mar 31, 2012 at 10:13:24PM +0200, Oleg Nesterov wrote:
>>>>>> On 03/31, Konstantin Khlebnikov wrote:
>>>>>>>
>>>>>>> comment from v2.6.25-6245-g925d1c4 ("procfs task exe symlink"),
>>>>>>> where all this stuff was introduced:
>>>>>>>
>>>>>>>> ...
>>>>>>>> This avoids pinning the mounted filesystem.
>>>>>>>
>>>>>>> So, this logic is hooked into every file mmap/unmap and vma split/merge just to
>>>>>>> keep a hypothetical fs from being pinned against umount by an mm which has
>>>>>>> already unmapped all its executable files but is still alive. Does anyone
>>>>>>> know any real-world example?
>>>>>>
>>>>>> This is the question to Matt.
>>>>>
>>>>> This is where I got the scenario:
>>>>>
>>>>> https://lkml.org/lkml/2007/7/12/398
>>>>
>>>> Cyrill Gorcunov's patch "c/r: prctl: add ability to set new mm_struct::exe_file"
>>>> gives userspace the ability to unpin the vfsmount explicitly.
>>>
>>> Doesn't that break the semantics of the kernel ABI?
>>
>> Which one? exe_file can be changed iff there are no MAP_EXECUTABLE mappings left.
>> Still, once assigned (via this prctl), mm_struct::exe_file can't be changed
>> again until program exit.
>
> The prctl() interface itself is fine as it stands now.
>
> As far as I can tell Konstantin is proposing that we remove the unusual
> counter that tracks the number of mappings of the exe_file and require
> userspace to use the prctl() to drop the last reference. That's what I think
> will break the ABI, because after that change you *must* change userspace
> code to use the prctl(). It's an ABI change because the same sequence of
> system calls with the same input bits produces different behavior.

But common software does not require this at all. I did not find any real examples,
only a hypothesis by Al Viro: https://lkml.org/lkml/2007/7/12/398
libhugetlbfs isn't a good example either: man proc says /proc/[pid]/exe stays
alive as long as the main thread is alive, but with libhugetlbfs /proc/[pid]/exe
disappears too early.
Also I would not call it ABI; this corner case isn't documented, and I'm afraid
only a few people in the world know about it =)

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-04-05 21:04               ` Konstantin Khlebnikov
@ 2012-04-05 21:44                 ` Matt Helsley
  2012-04-05 21:55                   ` Linus Torvalds
  0 siblings, 1 reply; 52+ messages in thread
From: Matt Helsley @ 2012-04-05 21:44 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Matt Helsley, Cyrill Gorcunov, Oleg Nesterov, linux-mm,
	Andrew Morton, linux-kernel, Eric Paris, linux-security-module,
	oprofile-list, Linus Torvalds, Al Viro

On Fri, Apr 06, 2012 at 01:04:43AM +0400, Konstantin Khlebnikov wrote:
> Matt Helsley wrote:
> >On Tue, Apr 03, 2012 at 11:32:04PM +0400, Cyrill Gorcunov wrote:
> >>On Tue, Apr 03, 2012 at 11:16:31AM -0700, Matt Helsley wrote:
> >>>On Tue, Apr 03, 2012 at 09:10:20AM +0400, Konstantin Khlebnikov wrote:
> >>>>Matt Helsley wrote:
> >>>>>On Sat, Mar 31, 2012 at 10:13:24PM +0200, Oleg Nesterov wrote:
> >>>>>>On 03/31, Konstantin Khlebnikov wrote:
> >>>>>>>
> >>>>>>>comment from v2.6.25-6245-g925d1c4 ("procfs task exe symlink"),
> >>>>>>>where all this stuff was introduced:
> >>>>>>>
> >>>>>>>>...
> >>>>>>>>This avoids pinning the mounted filesystem.
> >>>>>>>
> >>>>>>>So, this logic is hooked into every file mmap/unmap and vma split/merge just to
> >>>>>>>keep a hypothetical fs from being pinned against umount by an mm which has
> >>>>>>>already unmapped all its executable files but is still alive. Does anyone
> >>>>>>>know any real-world example?
> >>>>>>
> >>>>>>This is the question to Matt.
> >>>>>
> >>>>>This is where I got the scenario:
> >>>>>
> >>>>>https://lkml.org/lkml/2007/7/12/398
> >>>>
> >>>>Cyrill Gorcunov's patch "c/r: prctl: add ability to set new mm_struct::exe_file"
> >>>>gives userspace the ability to unpin the vfsmount explicitly.
> >>>
> >>>Doesn't that break the semantics of the kernel ABI?
> >>
> >>Which one? exe_file can be changed iff there are no MAP_EXECUTABLE mappings left.
> >>Still, once assigned (via this prctl), mm_struct::exe_file can't be changed
> >>again until program exit.
> >
> >The prctl() interface itself is fine as it stands now.
> >
> >As far as I can tell Konstantin is proposing that we remove the unusual
> >counter that tracks the number of mappings of the exe_file and require
> >userspace to use the prctl() to drop the last reference. That's what I think
> >will break the ABI, because after that change you *must* change userspace
> >code to use the prctl(). It's an ABI change because the same sequence of
> >system calls with the same input bits produces different behavior.
> 
> But common software does not require this at all. I did not find any real examples,
> only a hypothesis by Al Viro: https://lkml.org/lkml/2007/7/12/398
> libhugetlbfs isn't a good example either: man proc says /proc/[pid]/exe stays alive
> as long as the main thread is alive, but with libhugetlbfs /proc/[pid]/exe disappears too early.

*shrug*

Where did you look for real examples? chroot? pivot_root? various initrd
systems? Which versions?

This sort of argument brings up classic questions. How do we know when
to stop looking given the incredible amount of obscure code that's out
there -- most of which we're unlikely to even be aware of? Even if we
only look at "popular" distros how far back do we go? etc.

Perhaps before going through all that effort it would be better to
verify that removing that code impacts performance enough to care. Do
you have numbers? If the numbers aren't there then why bother with
exhaustive and exhausting code searches?

>
> Also I would not call it an ABI; this corner case isn't documented, and I'm afraid only a few
> people in the world know about it =)

I don't think the definition of an ABI is whether there's documentation
for it. It's whether the interface is used or not. At least that's the
impression I've gotten from reading Linus' rants over the years.

I think of the ABI as bits input versus behavior (including bits) out. If
the input bits remain the same the qualitative behavior should remain the
same unless there is a bug. Here, roughly speaking, the input bits are the
arguments passed to a sequence of one or more munmap() calls followed by a
umount(). The output is a 0 return value from the umount. Your proposal
would change that output value to -1 -- different bits and different
behavior.

Cheers,
	-Matt Helsley


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-04-05 21:44                 ` Matt Helsley
@ 2012-04-05 21:55                   ` Linus Torvalds
  2012-04-06  4:36                     ` Konstantin Khlebnikov
  0 siblings, 1 reply; 52+ messages in thread
From: Linus Torvalds @ 2012-04-05 21:55 UTC (permalink / raw)
  To: Matt Helsley
  Cc: Konstantin Khlebnikov, Cyrill Gorcunov, Oleg Nesterov, linux-mm,
	Andrew Morton, linux-kernel, Eric Paris, linux-security-module,
	oprofile-list, Al Viro

On Thu, Apr 5, 2012 at 2:44 PM, Matt Helsley <matthltc@us.ibm.com> wrote:
>
> I don't think the definition of an ABI is whether there's documentation
> for it. It's whether the interface is used or not. At least that's the
> impression I've gotten from reading Linus' rants over the years.

Yes.

That said, I *do* have some very dim memory of us having had real
issues with the /proc/<pid>/exe thing and having regressions due to
holding refcounts to executables that were a.out binaries and not
demand-loaded. And people wanting to unmount filesystems despite the
binaries being live.

That said, I suspect that whatever issues we used to have with that
are pretty long gone. I don't think people use non-mmap'ed binaries
any more. So I think we can try it and see. And revert if somebody
actually notices and has problems.

                    Linus


* [v3 VM_PAT PATCH 0/3] x86 VM_PAT series
  2012-04-05 11:56             ` Konstantin Khlebnikov
@ 2012-04-06  0:01               ` Suresh Siddha
  2012-04-06  0:01                 ` [v3 VM_PAT PATCH 1/3] x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn vma routines Suresh Siddha
                                   ` (2 more replies)
  0 siblings, 3 replies; 52+ messages in thread
From: Suresh Siddha @ 2012-04-06  0:01 UTC (permalink / raw)
  To: Konstantin Khlebnikov, Konstantin Khlebnikov, linux-mm,
	Andrew Morton, linux-kernel
  Cc: Suresh Siddha, Andi Kleen, Pallipadi Venkatesh, Ingo Molnar,
	H. Peter Anvin, Linus Torvalds, Nick Piggin

On Thu, 2012-04-05 at 15:56 +0400, Konstantin Khlebnikov wrote:
> With these patches I see new ranges in /sys/kernel/debug/x86/pat_memtype_list.
> These are 4k single-page vmas mapped by X11; the kernel fills them via vm_insert_pfn().
> Is this ok?

This is expected, and I saw these new entries too (though not as many as you did), as the
patch now also tracks single-page vmas coming from the vm_insert_pfn() interface.

Thinking a bit more about this in the context of your numbers, the new entries that
are getting tracked do not add any value: since the driver has already reserved the
whole aperture with the write-combining attribute, tracking these single-page vmas
doesn't help anymore.

> Maybe we shouldn't use PAT for small VMAs?

For vm_insert_pfn(), the expectation is that we just look up the memory attribute.
For remap_pfn_range(), if the whole VMA is remapped, we reserve the new
attribute for the specified pfn range, as drivers typically
call remap_pfn_range() for the whole VMA (which can be a single page) with the
desired attribute, without a prior reservation of the memory attribute for the
pfn range. So exposing two different APIs for these two behaviors is probably
the cleaner way to address this. Revised patches follow.

Konstantin Khlebnikov (1):
  mm, x86, PAT: rework linear pfn-mmap tracking

Suresh Siddha (2):
  x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn
    vma routines
  x86, pat: separate the pfn attribute tracking for remap_pfn_range and
    vm_insert_pfn

 arch/x86/mm/pat.c             |   80 ++++++++++++++++++++++++++++------------
 include/asm-generic/pgtable.h |   57 +++++++++++++++++------------
 include/linux/mm.h            |   15 +-------
 mm/huge_memory.c              |    7 ++--
 mm/memory.c                   |   23 +++++-------
 5 files changed, 104 insertions(+), 78 deletions(-)

-- 
1.7.6.5


* [v3 VM_PAT PATCH 1/3] x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn vma routines
  2012-04-06  0:01               ` [v3 VM_PAT PATCH 0/3] x86 VM_PAT series Suresh Siddha
@ 2012-04-06  0:01                 ` Suresh Siddha
  2012-04-06  0:01                 ` [v3 VM_PAT PATCH 2/3] x86, pat: separate the pfn attribute tracking for remap_pfn_range and vm_insert_pfn Suresh Siddha
  2012-04-06  0:01                 ` [v3 VM_PAT PATCH 3/3] mm, x86, PAT: rework linear pfn-mmap tracking Suresh Siddha
  2 siblings, 0 replies; 52+ messages in thread
From: Suresh Siddha @ 2012-04-06  0:01 UTC (permalink / raw)
  To: Konstantin Khlebnikov, Konstantin Khlebnikov, linux-mm,
	Andrew Morton, linux-kernel
  Cc: Suresh Siddha, Andi Kleen, Pallipadi Venkatesh, Ingo Molnar,
	H. Peter Anvin, Linus Torvalds, Nick Piggin

The 'pfn' argument of track_pfn_vma_new() can be used for reserving the attribute
for the pfn range; there is no need to depend on 'vm_pgoff'.

Similarly, untrack_pfn_vma() can rely on the 'pfn' argument if it
is non-zero, or use follow_phys() to get the starting value of the pfn
range.

Also, the non-zero 'size' argument can be used instead of recomputing
it from the vma.

This cleanup also prepares the ground for the track/untrack pfn vma routines
to take over the ownership of setting PAT specific vm_flag in the 'vma'.

-v2: fixed the first argument for reserve_pfn_range()

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Venkatesh Pallipadi <venki@google.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 arch/x86/mm/pat.c |   30 +++++++++++++++++-------------
 1 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index f6ff57b..24c3f95 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -693,14 +693,10 @@ int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
 			unsigned long pfn, unsigned long size)
 {
 	unsigned long flags;
-	resource_size_t paddr;
-	unsigned long vma_size = vma->vm_end - vma->vm_start;
 
-	if (is_linear_pfn_mapping(vma)) {
-		/* reserve the whole chunk starting from vm_pgoff */
-		paddr = (resource_size_t)vma->vm_pgoff << PAGE_SHIFT;
-		return reserve_pfn_range(paddr, vma_size, prot, 0);
-	}
+	/* reserve the whole chunk starting from pfn */
+	if (is_linear_pfn_mapping(vma))
+		return reserve_pfn_range(pfn << PAGE_SHIFT, size, prot, 0);
 
 	if (!pat_enabled)
 		return 0;
@@ -716,20 +712,28 @@ int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
 /*
  * untrack_pfn_vma is called while unmapping a pfnmap for a region.
  * untrack can be called for a specific region indicated by pfn and size or
- * can be for the entire vma (in which case size can be zero).
+ * can be for the entire vma (in which case pfn, size are zero).
  */
 void untrack_pfn_vma(struct vm_area_struct *vma, unsigned long pfn,
 			unsigned long size)
 {
 	resource_size_t paddr;
-	unsigned long vma_size = vma->vm_end - vma->vm_start;
+	unsigned long prot;
 
-	if (is_linear_pfn_mapping(vma)) {
-		/* free the whole chunk starting from vm_pgoff */
-		paddr = (resource_size_t)vma->vm_pgoff << PAGE_SHIFT;
-		free_pfn_range(paddr, vma_size);
+	if (!is_linear_pfn_mapping(vma))
 		return;
+
+	/* free the chunk starting from pfn or the whole chunk */
+	paddr = (resource_size_t)pfn << PAGE_SHIFT;
+	if (!paddr && !size) {
+		if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr)) {
+			WARN_ON_ONCE(1);
+			return;
+		}
+
+		size = vma->vm_end - vma->vm_start;
 	}
+	free_pfn_range(paddr, size);
 }
 
 pgprot_t pgprot_writecombine(pgprot_t prot)
-- 
1.7.6.5



* [v3 VM_PAT PATCH 2/3] x86, pat: separate the pfn attribute tracking for remap_pfn_range and vm_insert_pfn
  2012-04-06  0:01               ` [v3 VM_PAT PATCH 0/3] x86 VM_PAT series Suresh Siddha
  2012-04-06  0:01                 ` [v3 VM_PAT PATCH 1/3] x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn vma routines Suresh Siddha
@ 2012-04-06  0:01                 ` Suresh Siddha
  2012-04-06  0:01                 ` [v3 VM_PAT PATCH 3/3] mm, x86, PAT: rework linear pfn-mmap tracking Suresh Siddha
  2 siblings, 0 replies; 52+ messages in thread
From: Suresh Siddha @ 2012-04-06  0:01 UTC (permalink / raw)
  To: Konstantin Khlebnikov, Konstantin Khlebnikov, linux-mm,
	Andrew Morton, linux-kernel
  Cc: Suresh Siddha, Andi Kleen, Pallipadi Venkatesh, Ingo Molnar,
	H. Peter Anvin, Linus Torvalds, Nick Piggin

With PAT enabled, vm_insert_pfn() looks up the existing pfn memory attribute
and uses it. The expectation is that the driver reserves the memory attributes
for the pfn before calling vm_insert_pfn().

remap_pfn_range() (when called for the whole vma) will set up a
new attribute (based on the prot argument) for the specified pfn range. This
addresses the legacy usage, which typically calls remap_pfn_range() with
a desired memory attribute. For ranges smaller than the vma size (which
is typically not the case), remap_pfn_range() will use the
existing memory attribute for the pfn range.

Expose two different APIs for these two behaviors:
track_pfn_insert() for tracking the pfn attribute set by
vm_insert_pfn(), and track_pfn_remap() for remap_pfn_range().

This cleanup also prepares the ground for the track/untrack pfn vma routines
to take over the ownership of setting PAT specific vm_flag in the 'vma'.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Venkatesh Pallipadi <venki@google.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 arch/x86/mm/pat.c             |   43 +++++++++++++++++++++++--------
 include/asm-generic/pgtable.h |   55 +++++++++++++++++++++++-----------------
 mm/memory.c                   |   13 +++------
 3 files changed, 69 insertions(+), 42 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 24c3f95..d0553bf 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -652,13 +652,13 @@ static void free_pfn_range(u64 paddr, unsigned long size)
 }
 
 /*
- * track_pfn_vma_copy is called when vma that is covering the pfnmap gets
+ * track_pfn_copy is called when vma that is covering the pfnmap gets
  * copied through copy_page_range().
  *
  * If the vma has a linear pfn mapping for the entire range, we get the prot
  * from pte and reserve the entire vma range with single reserve_pfn_range call.
  */
-int track_pfn_vma_copy(struct vm_area_struct *vma)
+int track_pfn_copy(struct vm_area_struct *vma)
 {
 	resource_size_t paddr;
 	unsigned long prot;
@@ -682,17 +682,15 @@ int track_pfn_vma_copy(struct vm_area_struct *vma)
 }
 
 /*
- * track_pfn_vma_new is called when a _new_ pfn mapping is being established
- * for physical range indicated by pfn and size.
- *
  * prot is passed in as a parameter for the new mapping. If the vma has a
  * linear pfn mapping for the entire range reserve the entire vma range with
  * single reserve_pfn_range call.
  */
-int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
-			unsigned long pfn, unsigned long size)
+int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
+		    unsigned long pfn, unsigned long size)
 {
 	unsigned long flags;
+	int i;
 
 	/* reserve the whole chunk starting from pfn */
 	if (is_linear_pfn_mapping(vma))
@@ -701,7 +699,30 @@ int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
 	if (!pat_enabled)
 		return 0;
 
-	/* for vm_insert_pfn and friends, we set prot based on lookup */
+	/*
+	 * for anything smaller than the vma size, we set prot based
+	 * on the lookup.
+	 */
+	flags = lookup_memtype(pfn << PAGE_SHIFT);
+	for (i = 1; i < size / PAGE_SIZE; i++)
+		if (flags != lookup_memtype((pfn + i) << PAGE_SHIFT))
+			return -EINVAL;
+	
+	*prot = __pgprot((pgprot_val(vma->vm_page_prot) & (~_PAGE_CACHE_MASK)) |
+			 flags);
+
+	return 0;
+}
+
+int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
+		     unsigned long pfn)
+{
+	unsigned long flags;
+
+	if (!pat_enabled)
+		return 0;
+
+	/* we set prot based on lookup */
 	flags = lookup_memtype(pfn << PAGE_SHIFT);
 	*prot = __pgprot((pgprot_val(vma->vm_page_prot) & (~_PAGE_CACHE_MASK)) |
 			 flags);
@@ -710,12 +731,12 @@ int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
 }
 
 /*
- * untrack_pfn_vma is called while unmapping a pfnmap for a region.
+ * untrack_pfn is called while unmapping a pfnmap for a region.
  * untrack can be called for a specific region indicated by pfn and size or
  * can be for the entire vma (in which case pfn, size are zero).
  */
-void untrack_pfn_vma(struct vm_area_struct *vma, unsigned long pfn,
-			unsigned long size)
+void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
+		 unsigned long size)
 {
 	resource_size_t paddr;
 	unsigned long prot;
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 125c54e..a877649 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -382,48 +382,57 @@ static inline void ptep_modify_prot_commit(struct mm_struct *mm,
 
 #ifndef __HAVE_PFNMAP_TRACKING
 /*
- * Interface that can be used by architecture code to keep track of
- * memory type of pfn mappings (remap_pfn_range, vm_insert_pfn)
- *
- * track_pfn_vma_new is called when a _new_ pfn mapping is being established
- * for physical range indicated by pfn and size.
+ * Interfaces that can be used by architecture code to keep track of
+ * memory type of pfn mappings specified by the remap_pfn_range,
+ * vm_insert_pfn.
+ */
+
+/*
+ * track_pfn_remap is called when a _new_ pfn mapping is being established
+ * by remap_pfn_range() for physical range indicated by pfn and size.
  */
-static inline int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
-					unsigned long pfn, unsigned long size)
+static inline int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
+				  unsigned long pfn, unsigned long size)
 {
 	return 0;
 }
 
 /*
- * Interface that can be used by architecture code to keep track of
- * memory type of pfn mappings (remap_pfn_range, vm_insert_pfn)
- *
- * track_pfn_vma_copy is called when vma that is covering the pfnmap gets
+ * track_pfn_insert is called when a _new_ single pfn is established
+ * by vm_insert_pfn().
+ */
+static inline int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
+				   unsigned long pfn)
+{
+	return 0;
+}
+
+/*
+ * track_pfn_copy is called when vma that is covering the pfnmap gets
  * copied through copy_page_range().
  */
-static inline int track_pfn_vma_copy(struct vm_area_struct *vma)
+static inline int track_pfn_copy(struct vm_area_struct *vma)
 {
 	return 0;
 }
 
 /*
- * Interface that can be used by architecture code to keep track of
- * memory type of pfn mappings (remap_pfn_range, vm_insert_pfn)
- *
  * untrack_pfn_vma is called while unmapping a pfnmap for a region.
  * untrack can be called for a specific region indicated by pfn and size or
- * can be for the entire vma (in which case size can be zero).
+ * can be for the entire vma (in which case pfn, size are zero).
  */
-static inline void untrack_pfn_vma(struct vm_area_struct *vma,
-					unsigned long pfn, unsigned long size)
+static inline void untrack_pfn(struct vm_area_struct *vma,
+			       unsigned long pfn, unsigned long size)
 {
 }
 #else
-extern int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t *prot,
-				unsigned long pfn, unsigned long size);
-extern int track_pfn_vma_copy(struct vm_area_struct *vma);
-extern void untrack_pfn_vma(struct vm_area_struct *vma, unsigned long pfn,
-				unsigned long size);
+extern int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
+			   unsigned long pfn, unsigned long size);
+extern int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
+			    unsigned long pfn);
+extern int track_pfn_copy(struct vm_area_struct *vma);
+extern void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
+			unsigned long size);
 #endif
 
 #ifdef CONFIG_MMU
diff --git a/mm/memory.c b/mm/memory.c
index 6105f47..4cdcf53 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1056,7 +1056,7 @@ int copy_page_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		 * We do not free on error cases below as remove_vma
 		 * gets called on error from higher level routine
 		 */
-		ret = track_pfn_vma_copy(vma);
+		ret = track_pfn_copy(vma);
 		if (ret)
 			return ret;
 	}
@@ -1311,7 +1311,7 @@ static void unmap_single_vma(struct mmu_gather *tlb,
 		*nr_accounted += (end - start) >> PAGE_SHIFT;
 
 	if (unlikely(is_pfn_mapping(vma)))
-		untrack_pfn_vma(vma, 0, 0);
+		untrack_pfn(vma, 0, 0);
 
 	if (start != end) {
 		if (unlikely(is_vm_hugetlb_page(vma))) {
@@ -2145,14 +2145,11 @@ int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 
 	if (addr < vma->vm_start || addr >= vma->vm_end)
 		return -EFAULT;
-	if (track_pfn_vma_new(vma, &pgprot, pfn, PAGE_SIZE))
+	if (track_pfn_insert(vma, &pgprot, pfn))
 		return -EINVAL;
 
 	ret = insert_pfn(vma, addr, pfn, pgprot);
 
-	if (ret)
-		untrack_pfn_vma(vma, pfn, PAGE_SIZE);
-
 	return ret;
 }
 EXPORT_SYMBOL(vm_insert_pfn);
@@ -2294,7 +2291,7 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
 
 	vma->vm_flags |= VM_IO | VM_RESERVED | VM_PFNMAP;
 
-	err = track_pfn_vma_new(vma, &prot, pfn, PAGE_ALIGN(size));
+	err = track_pfn_remap(vma, &prot, pfn, PAGE_ALIGN(size));
 	if (err) {
 		/*
 		 * To indicate that track_pfn related cleanup is not
@@ -2318,7 +2315,7 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
 	} while (pgd++, addr = next, addr != end);
 
 	if (err)
-		untrack_pfn_vma(vma, pfn, PAGE_ALIGN(size));
+		untrack_pfn(vma, pfn, PAGE_ALIGN(size));
 
 	return err;
 }
-- 
1.7.6.5



* [v3 VM_PAT PATCH 3/3] mm, x86, PAT: rework linear pfn-mmap tracking
  2012-04-06  0:01               ` [v3 VM_PAT PATCH 0/3] x86 VM_PAT series Suresh Siddha
  2012-04-06  0:01                 ` [v3 VM_PAT PATCH 1/3] x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn vma routines Suresh Siddha
  2012-04-06  0:01                 ` [v3 VM_PAT PATCH 2/3] x86, pat: separate the pfn attribute tracking for remap_pfn_range and vm_insert_pfn Suresh Siddha
@ 2012-04-06  0:01                 ` Suresh Siddha
  2 siblings, 0 replies; 52+ messages in thread
From: Suresh Siddha @ 2012-04-06  0:01 UTC (permalink / raw)
  To: Konstantin Khlebnikov, Konstantin Khlebnikov, linux-mm,
	Andrew Morton, linux-kernel
  Cc: Andi Kleen, Suresh Siddha, Pallipadi Venkatesh, Ingo Molnar,
	H. Peter Anvin, Linus Torvalds, Nick Piggin

From: Konstantin Khlebnikov <khlebnikov@openvz.org>

This patch replaces generic vma-flag VM_PFN_AT_MMAP with x86-only VM_PAT.

We can toss mapping address from remap_pfn_range() into track_pfn_vma_new(),
and collect all PAT-related logic together in arch/x86/.

This patch also restores the original frustration-free is_cow_mapping() check in
remap_pfn_range(), as it was before commit v2.6.28-rc8-88-g3c8bb73
("x86: PAT: store vm_pgoff for all linear_over_vma_region mappings - v3").

The is_linear_pfn_mapping() checks can be removed from mm/huge_memory.c,
because that case is already handled by VM_PFNMAP in the VM_NO_THP bit-mask.

-v2: Reset the VM_PAT flag as part of untrack_pfn_vma()
-v3: Adapt to the track_pfn_insert/track_pfn_remap API

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Venkatesh Pallipadi <venki@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Ingo Molnar <mingo@redhat.com>
---
 arch/x86/mm/pat.c             |   17 ++++++++++++-----
 include/asm-generic/pgtable.h |    6 ++++--
 include/linux/mm.h            |   15 +--------------
 mm/huge_memory.c              |    7 +++----
 mm/memory.c                   |   12 ++++++------
 5 files changed, 26 insertions(+), 31 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index d0553bf..bef33df 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -665,7 +665,7 @@ int track_pfn_copy(struct vm_area_struct *vma)
 	unsigned long vma_size = vma->vm_end - vma->vm_start;
 	pgprot_t pgprot;
 
-	if (is_linear_pfn_mapping(vma)) {
+	if (vma->vm_flags & VM_PAT) {
 		/*
 		 * reserve the whole chunk covered by vma. We need the
 		 * starting address and protection from pte.
@@ -687,14 +687,20 @@ int track_pfn_copy(struct vm_area_struct *vma)
  * single reserve_pfn_range call.
  */
 int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
-		    unsigned long pfn, unsigned long size)
+		    unsigned long pfn, unsigned long addr, unsigned long size)
 {
 	unsigned long flags;
 	int i;
 
 	/* reserve the whole chunk starting from pfn */
-	if (is_linear_pfn_mapping(vma))
-		return reserve_pfn_range(pfn << PAGE_SHIFT, size, prot, 0);
+	if (addr == vma->vm_start && size == (vma->vm_end - vma->vm_start)) {
+		int ret;
+
+		ret = reserve_pfn_range(pfn << PAGE_SHIFT, size, prot, 0);
+		if (!ret)
+			vma->vm_flags |= VM_PAT;
+		return ret;
+	}
 
 	if (!pat_enabled)
 		return 0;
@@ -741,7 +747,7 @@ void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
 	resource_size_t paddr;
 	unsigned long prot;
 
-	if (!is_linear_pfn_mapping(vma))
+	if (!(vma->vm_flags & VM_PAT))
 		return;
 
 	/* free the chunk starting from pfn or the whole chunk */
@@ -755,6 +761,7 @@ void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
 		size = vma->vm_end - vma->vm_start;
 	}
 	free_pfn_range(paddr, size);
+	vma->vm_flags &= ~VM_PAT;
 }
 
 pgprot_t pgprot_writecombine(pgprot_t prot)
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index a877649..ddd613e 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -392,7 +392,8 @@ static inline void ptep_modify_prot_commit(struct mm_struct *mm,
  * by remap_pfn_range() for physical range indicated by pfn and size.
  */
 static inline int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
-				  unsigned long pfn, unsigned long size)
+				  unsigned long pfn, unsigned long addr,
+				  unsigned long size)
 {
 	return 0;
 }
@@ -427,7 +428,8 @@ static inline void untrack_pfn(struct vm_area_struct *vma,
 }
 #else
 extern int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
-			   unsigned long pfn, unsigned long size);
+			   unsigned long pfn, unsigned long addr,
+			   unsigned long size);
 extern int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
 			    unsigned long pfn);
 extern int track_pfn_copy(struct vm_area_struct *vma);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index d8738a4..b8e5fe5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -117,7 +117,7 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_CAN_NONLINEAR 0x08000000	/* Has ->fault & does nonlinear pages */
 #define VM_MIXEDMAP	0x10000000	/* Can contain "struct page" and pure PFN pages */
 #define VM_SAO		0x20000000	/* Strong Access Ordering (powerpc) */
-#define VM_PFN_AT_MMAP	0x40000000	/* PFNMAP vma that is fully mapped at mmap time */
+#define VM_PAT		0x40000000	/* PAT reserves whole VMA at once (x86) */
 #define VM_MERGEABLE	0x80000000	/* KSM may merge identical pages */
 
 /* Bits set in the VMA until the stack is in its final location */
@@ -158,19 +158,6 @@ extern pgprot_t protection_map[16];
 #define FAULT_FLAG_RETRY_NOWAIT	0x10	/* Don't drop mmap_sem and wait when retrying */
 #define FAULT_FLAG_KILLABLE	0x20	/* The fault task is in SIGKILL killable region */
 
-/*
- * This interface is used by x86 PAT code to identify a pfn mapping that is
- * linear over entire vma. This is to optimize PAT code that deals with
- * marking the physical region with a particular prot. This is not for generic
- * mm use. Note also that this check will not work if the pfn mapping is
- * linear for a vma starting at physical address 0. In which case PAT code
- * falls back to slow path of reserving physical range page by page.
- */
-static inline int is_linear_pfn_mapping(struct vm_area_struct *vma)
-{
-	return !!(vma->vm_flags & VM_PFN_AT_MMAP);
-}
-
 static inline int is_pfn_mapping(struct vm_area_struct *vma)
 {
 	return !!(vma->vm_flags & VM_PFNMAP);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f0e5306..cf827da 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1650,7 +1650,7 @@ int khugepaged_enter_vma_merge(struct vm_area_struct *vma)
 	 * If is_pfn_mapping() is true is_learn_pfn_mapping() must be
 	 * true too, verify it here.
 	 */
-	VM_BUG_ON(is_linear_pfn_mapping(vma) || vma->vm_flags & VM_NO_THP);
+	VM_BUG_ON(vma->vm_flags & VM_NO_THP);
 	hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
 	hend = vma->vm_end & HPAGE_PMD_MASK;
 	if (hstart < hend)
@@ -1908,7 +1908,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 	 * If is_pfn_mapping() is true is_learn_pfn_mapping() must be
 	 * true too, verify it here.
 	 */
-	VM_BUG_ON(is_linear_pfn_mapping(vma) || vma->vm_flags & VM_NO_THP);
+	VM_BUG_ON(vma->vm_flags & VM_NO_THP);
 
 	pgd = pgd_offset(mm, address);
 	if (!pgd_present(*pgd))
@@ -2150,8 +2150,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
 		 * If is_pfn_mapping() is true is_learn_pfn_mapping()
 		 * must be true too, verify it here.
 		 */
-		VM_BUG_ON(is_linear_pfn_mapping(vma) ||
-			  vma->vm_flags & VM_NO_THP);
+		VM_BUG_ON(vma->vm_flags & VM_NO_THP);
 
 		hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
 		hend = vma->vm_end & HPAGE_PMD_MASK;
diff --git a/mm/memory.c b/mm/memory.c
index 4cdcf53..2ade15b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2282,23 +2282,23 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
 	 * There's a horrible special case to handle copy-on-write
 	 * behaviour that some programs depend on. We mark the "original"
 	 * un-COW'ed pages by matching them up with "vma->vm_pgoff".
+	 * See vm_normal_page() for details.
 	 */
-	if (addr == vma->vm_start && end == vma->vm_end) {
+	if (is_cow_mapping(vma->vm_flags)) {
+		if (addr != vma->vm_start || end != vma->vm_end)
+			return -EINVAL;
 		vma->vm_pgoff = pfn;
-		vma->vm_flags |= VM_PFN_AT_MMAP;
-	} else if (is_cow_mapping(vma->vm_flags))
-		return -EINVAL;
+	}
 
 	vma->vm_flags |= VM_IO | VM_RESERVED | VM_PFNMAP;
 
-	err = track_pfn_remap(vma, &prot, pfn, PAGE_ALIGN(size));
+	err = track_pfn_remap(vma, &prot, pfn, addr, PAGE_ALIGN(size));
 	if (err) {
 		/*
 		 * To indicate that track_pfn related cleanup is not
 		 * needed from higher level routine calling unmap_vmas
 		 */
 		vma->vm_flags &= ~(VM_IO | VM_RESERVED | VM_PFNMAP);
-		vma->vm_flags &= ~VM_PFN_AT_MMAP;
 		return -EINVAL;
 	}
 
-- 
1.7.6.5



* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-04-05 21:55                   ` Linus Torvalds
@ 2012-04-06  4:36                     ` Konstantin Khlebnikov
  0 siblings, 0 replies; 52+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-06  4:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Matt Helsley, Cyrill Gorcunov, Oleg Nesterov, linux-mm,
	Andrew Morton, linux-kernel, Eric Paris, linux-security-module,
	oprofile-list, Al Viro

Linus Torvalds wrote:
> On Thu, Apr 5, 2012 at 2:44 PM, Matt Helsley<matthltc@us.ibm.com>  wrote:
>>
>> I don't think the definition of an ABI is whether there's documentation
>> for it. It's whether the interface is used or not. At least that's the
>> impression I've gotten from reading Linus' rants over the years.
>
> Yes.
>
> That said, I *do* have some very dim memory of us having had real
> issues with the /proc/<pid>/exe thing and having regressions due to
> holding refcounts to executables that were a.out binaries and not
> demand-loaded. And people wanting to unmount filesystems despite the
> binaries being live.
>
> That said, I suspect that whatever issues we used to have with that
> are pretty long gone. I don't think people use non-mmap'ed binaries
> any more. So I think we can try it and see. And revert if somebody
> actually notices and has problems.

Instead of tracking the count of vmas with the VM_EXECUTABLE bit, we can track
the count of vmas with vma->vm_file == mm->exe_file; this gives nearly
the same behaviour. This was in an early version of my patch, but I preferred
to go deeper. So we can revert without reintroducing VM_EXECUTABLE.

>
>                      Linus



* Re: [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE
  2012-04-03  5:06     ` Konstantin Khlebnikov
@ 2012-04-06 22:48       ` Andrew Morton
  0 siblings, 0 replies; 52+ messages in thread
From: Andrew Morton @ 2012-04-06 22:48 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Matt Helsley, linux-mm, linux-kernel, Oleg Nesterov, Eric Paris,
	linux-security-module, oprofile-list, Linus Torvalds, Al Viro

On Tue, 03 Apr 2012 09:06:12 +0400
Konstantin Khlebnikov <khlebnikov@openvz.org> wrote:

> Ok, I'll resend this patch as an independent patch-set;
> in any case I need to bring the mm->mmap_sem locking back.

We need to work out what to do with "c/r: prctl: add ability to set new
mm_struct::exe_file".  I'm still sitting on the 3.4 c/r patch queue for
various reasons, one of which is that I need to go back and re-review
all the discussion, which was lengthy.  Early next week, hopefully.



end of thread, other threads:[~2012-04-06 22:48 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-31  9:25 [PATCH 0/7] mm: vma->vm_flags diet Konstantin Khlebnikov
2012-03-31  9:29 ` [PATCH 1/7] mm, x86, PAT: rework linear pfn-mmap tracking Konstantin Khlebnikov
2012-03-31 17:09   ` [PATCH 1/7 v2] " Konstantin Khlebnikov
2012-04-03  0:46     ` [x86 PAT PATCH 0/2] x86 PAT vm_flag code refactoring Suresh Siddha
2012-04-03  0:46       ` [x86 PAT PATCH 1/2] x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn vma routines Suresh Siddha
2012-04-03  5:37         ` Konstantin Khlebnikov
2012-04-03 23:31           ` Suresh Siddha
2012-04-04  4:43             ` Konstantin Khlebnikov
2012-04-05 11:56             ` Konstantin Khlebnikov
2012-04-06  0:01               ` [v3 VM_PAT PATCH 0/3] x86 VM_PAT series Suresh Siddha
2012-04-06  0:01                 ` [v3 VM_PAT PATCH 1/3] x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn vma routines Suresh Siddha
2012-04-06  0:01                 ` [v3 VM_PAT PATCH 2/3] x86, pat: separate the pfn attribute tracking for remap_pfn_range and vm_insert_pfn Suresh Siddha
2012-04-06  0:01                 ` [v3 VM_PAT PATCH 3/3] mm, x86, PAT: rework linear pfn-mmap tracking Suresh Siddha
2012-04-03  0:46       ` [x86 PAT PATCH 2/2] " Suresh Siddha
2012-04-03  5:48         ` Konstantin Khlebnikov
2012-04-03  5:55           ` Konstantin Khlebnikov
2012-04-03  6:03       ` [x86 PAT PATCH 0/2] x86 PAT vm_flag code refactoring Konstantin Khlebnikov
2012-04-03 23:14         ` Suresh Siddha
2012-04-04  4:40           ` Konstantin Khlebnikov
2012-03-31  9:29 ` [PATCH 2/7] mm: introduce vma flag VM_ARCH_1 Konstantin Khlebnikov
2012-03-31 22:25   ` Benjamin Herrenschmidt
2012-03-31  9:29 ` [PATCH 3/7] mm: kill vma flag VM_CAN_NONLINEAR Konstantin Khlebnikov
2012-03-31 17:01   ` Linus Torvalds
2012-03-31  9:29 ` [PATCH 4/7] mm: kill vma flag VM_INSERTPAGE Konstantin Khlebnikov
2012-03-31  9:29 ` [PATCH 5/7] mm, drm/udl: fixup vma flags on mmap Konstantin Khlebnikov
2012-03-31  9:29 ` [PATCH 6/7] mm: kill vma flag VM_EXECUTABLE Konstantin Khlebnikov
2012-03-31 20:13   ` Oleg Nesterov
2012-03-31 20:39     ` Cyrill Gorcunov
2012-04-02  9:46       ` Konstantin Khlebnikov
2012-04-02  9:54         ` Cyrill Gorcunov
2012-04-02 10:13           ` Konstantin Khlebnikov
2012-04-02 14:48         ` Oleg Nesterov
2012-04-02 16:02           ` Cyrill Gorcunov
2012-04-02 16:19           ` Konstantin Khlebnikov
2012-04-02 16:27             ` Cyrill Gorcunov
2012-04-02 17:14               ` Konstantin Khlebnikov
2012-04-02 18:05                 ` Cyrill Gorcunov
2012-04-02 23:04     ` Matt Helsley
2012-04-03  5:10       ` Konstantin Khlebnikov
2012-04-03 18:16         ` Matt Helsley
2012-04-03 19:32           ` Cyrill Gorcunov
2012-04-05 20:29             ` Matt Helsley
2012-04-05 20:53               ` Cyrill Gorcunov
2012-04-05 21:04               ` Konstantin Khlebnikov
2012-04-05 21:44                 ` Matt Helsley
2012-04-05 21:55                   ` Linus Torvalds
2012-04-06  4:36                     ` Konstantin Khlebnikov
2012-04-02 23:18   ` Matt Helsley
2012-04-03  5:06     ` Konstantin Khlebnikov
2012-04-06 22:48       ` Andrew Morton
2012-03-31  9:29 ` [PATCH 7/7] mm: move madvise vma flags to the end Konstantin Khlebnikov
2012-03-31 14:06 ` [PATCH 0/7] mm: vma->vm_flags diet Andi Kleen
