From: Xiao Guangrong <guangrong.xiao@linux.intel.com>
To: Dave Hansen <dave.hansen@intel.com>,
	pbonzini@redhat.com, akpm@linux-foundation.org, mhocko@suse.com,
	dan.j.williams@intel.com
Cc: gleb@kernel.org, mtosatti@redhat.com, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, stefanha@redhat.com,
	yuhuang@redhat.com, linux-mm@kvack.org,
	ross.zwisler@linux.intel.com
Subject: Re: [PATCH] Fix region lost in /proc/self/smaps
Date: Fri, 9 Sep 2016 16:19:15 +0800
Message-ID: <01bcbbe2-5560-ea42-4d75-6ab50c3060d4@linux.intel.com>
In-Reply-To: <57D1703E.4070504@intel.com>



On 09/08/2016 10:05 PM, Dave Hansen wrote:
> On 09/07/2016 08:36 PM, Xiao Guangrong wrote:
>>> The user will see two VMAs in their output:
>>>
>>>     A: 0x1000->0x2000
>>>     C: 0x1000->0x3000
>>>
>>> Will it confuse them to see the same virtual address range twice?  Or is
>>> there something preventing that happening that I'm missing?
>>>
>>
>> You are right. Nothing can prevent it.
>>
>> However, it is not easy to handle the case that the new VMA overlays
>> with the old VMA already got by userspace. I think we have some choices:
>> 1: One way is completely skipping the new VMA region as current kernel
>>    code does but i do not think this is good as the later VMAs will be
>>    dropped.
>>
>> 2: show the un-overlayed portion of new VMA. In your case, we just show
>>    the region (0x2000 -> 0x3000), however, it can not work well if the
>>    VMA is a new created region with different attributions.
>>
>> 3: completely show the new VMA as this patch does.
>>
>> Which one do you prefer?
>
> I'd be willing to bet that #3 will break *somebody's* tooling.
> Addresses going backwards is certainly screwy.  Imagine somebody using
> smaps to search for address holes and doing hole_size=0x1000-0x2000.
>
> #1 can lie about there being no mapping in place where there may
> have _always_ been a mapping and is very similar to the bug you were
> originally fixing.  I think that throws it out.
>
> #2 is our best bet, I think.  It's unfortunately also the most code.
> It's also a bit of a fib because it'll show a mapping that never
> actually existed, but I think this is OK.  I'm not sure what the
> downside is that you're referring to, though.  Can you explain?

Yes. I was talking about the following case:
    1: read() #1: prints vma-A(0x1000 -> 0x2000)
    2: unmap vma-A(0x1000 -> 0x2000)
    3: create vma-B(0x80 -> 0x3000) on another file with different permissions
       (w, r, x)
    4: read() #2: prints vma-B(0x2000 -> 0x3000)

Then userspace will get just a portion of vma-B. Well, maybe that is not too bad. :)

How about these changes:

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 187d84e..10ca648 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -147,7 +147,7 @@ m_next_vma(struct proc_maps_private *priv, struct vm_area_struct *vma)
  static void m_cache_vma(struct seq_file *m, struct vm_area_struct *vma)
  {
         if (m->count < m->size) /* vma is copied successfully */
-               m->version = m_next_vma(m->private, vma) ? vma->vm_start : -1UL;
+               m->version = m_next_vma(m->private, vma) ? vma->vm_end : -1UL;
  }

  static void *m_start(struct seq_file *m, loff_t *ppos)
@@ -176,14 +176,14 @@ static void *m_start(struct seq_file *m, loff_t *ppos)

         if (last_addr) {
                 vma = find_vma(mm, last_addr);
-               if (vma && (vma = m_next_vma(priv, vma)))
+               if (vma)
                         return vma;
         }

         m->version = 0;
         if (pos < mm->map_count) {
                 for (vma = mm->mmap; pos; pos--) {
-                       m->version = vma->vm_start;
+                       m->version = vma->vm_end;
                         vma = vma->vm_next;
                 }
                 return vma;
@@ -293,7 +293,7 @@ show_map_vma(struct seq_file *m, struct vm_area_struct *vma, int is_pid)
         vm_flags_t flags = vma->vm_flags;
         unsigned long ino = 0;
         unsigned long long pgoff = 0;
-       unsigned long start, end;
+       unsigned long end, start = m->version;
         dev_t dev = 0;
         const char *name = NULL;

@@ -304,8 +304,13 @@ show_map_vma(struct seq_file *m, struct vm_area_struct *vma, int is_pid)
                 pgoff = ((loff_t)vma->vm_pgoff) << PAGE_SHIFT;
         }

+       /*
+        * the region [0, m->version) has already been handled, do not
+        * handle it doubly.
+        */
+       start = max(vma->vm_start, start);
+
         /* We don't show the stack guard page in /proc/maps */
-       start = vma->vm_start;
         if (stack_guard_page_start(vma, start))
                 start += PAGE_SIZE;
         end = vma->vm_end;

