All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: akpm@linux-foundation.org, andrea@kernel.org,
	kirill@shutemov.name, oleg@redhat.com,
	wenwei.tww@alibaba-inc.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: Re: [PATCH 2/2] mm, oom: fix potential data corruption when oom_reaper races with writer
Date: Tue, 15 Aug 2017 08:55:44 +0200	[thread overview]
Message-ID: <20170815065544.GA29067@dhcp22.suse.cz> (raw)
In-Reply-To: <201708142251.v7EMp3j9081456@www262.sakura.ne.jp>

On Tue 15-08-17 07:51:02, Tetsuo Handa wrote:
> Michal Hocko wrote:
[...]
> > Were you able to reproduce with other filesystems?
> 
> Yes, I can reproduce this problem using both xfs and ext4 on 4.11.11-200.fc25.x86_64
> on Oracle VM VirtualBox on Windows.
> 
> I believe that this is not old data from disk, for I can reproduce this problem
> using newly attached /dev/sdb which has never written any data (other than data
> written by mkfs.xfs and mkfs.ext4).
> 
>   /dev/sdb /tmp ext4 rw,seclabel,relatime,data=ordered 0 0
>   
> The garbage pattern (the last 4096 bytes) is identical for both xfs and ext4.

Thanks a lot for retesting. It is now obvious that FS doesn't have
anything to do with this issue which is in line with my investigation
from yesterday and Friday. I simply cannot see any way the file position
would be updated with a zero length write. So this must be something
else. I have double checked the MM side of the page fault I couldn't
find anything there either so this smells like a stray pte while the
underlying page got reused or something TLB related.
 
> >                                                    I wonder what is
> > different in my testing because I cannot reproduce this at all. Well, I
> > had to reduce the number of competing writer threads to 128 because I
> > quickly hit the trashing behavior with more of them (and 4 CPUs). I will
> > try on a larger machine.
> 
> I don't think a larger machine is necessary.
> I can reproduce this problem with 8 competing writer threads on 4 CPUs.

OK, I will try with fewer writers which should make it easier to have it
run for long time without any trashing.
 
> I don't have native Linux environment. Maybe that is the difference.
> Can you try VMware Workstation Player or Oracle VM VirtualBox environment?

Hmm, I do not have any of those handy for use, unfortunately. I will
keep focusing on the native HW and KVM for today.

Thanks!
-- 
Michal Hocko
SUSE Labs

WARNING: multiple messages have this Message-ID (diff)
From: Michal Hocko <mhocko@kernel.org>
To: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: akpm@linux-foundation.org, andrea@kernel.org,
	kirill@shutemov.name, oleg@redhat.com,
	wenwei.tww@alibaba-inc.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: Re: [PATCH 2/2] mm, oom: fix potential data corruption when oom_reaper races with writer
Date: Tue, 15 Aug 2017 08:55:44 +0200	[thread overview]
Message-ID: <20170815065544.GA29067@dhcp22.suse.cz> (raw)
In-Reply-To: <201708142251.v7EMp3j9081456@www262.sakura.ne.jp>

On Tue 15-08-17 07:51:02, Tetsuo Handa wrote:
> Michal Hocko wrote:
[...]
> > Were you able to reproduce with other filesystems?
> 
> Yes, I can reproduce this problem using both xfs and ext4 on 4.11.11-200.fc25.x86_64
> on Oracle VM VirtualBox on Windows.
> 
> I believe that this is not old data from disk, for I can reproduce this problem
> using newly attached /dev/sdb which has never written any data (other than data
> written by mkfs.xfs and mkfs.ext4).
> 
>   /dev/sdb /tmp ext4 rw,seclabel,relatime,data=ordered 0 0
>   
> The garbage pattern (the last 4096 bytes) is identical for both xfs and ext4.

Thanks a lot for retesting. It is now obvious that FS doesn't have
anything to do with this issue which is in line with my investigation
from yesterday and Friday. I simply cannot see any way the file position
would be updated with a zero length write. So this must be something
else. I have double checked the MM side of the page fault I couldn't
find anything there either so this smells like a stray pte while the
underlying page got reused or something TLB related.
 
> >                                                    I wonder what is
> > different in my testing because I cannot reproduce this at all. Well, I
> > had to reduce the number of competing writer threads to 128 because I
> > quickly hit the trashing behavior with more of them (and 4 CPUs). I will
> > try on a larger machine.
> 
> I don't think a larger machine is necessary.
> I can reproduce this problem with 8 competing writer threads on 4 CPUs.

OK, I will try with fewer writers which should make it easier to have it
run for long time without any trashing.
 
> I don't have native Linux environment. Maybe that is the difference.
> Can you try VMware Workstation Player or Oracle VM VirtualBox environment?

Hmm, I do not have any of those handy for use, unfortunately. I will
keep focusing on the native HW and KVM for today.

Thanks!
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-08-15  6:55 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-07 11:38 [PATCH 0/2] mm, oom: fix oom_reaper fallouts Michal Hocko
2017-08-07 11:38 ` Michal Hocko
2017-08-07 11:38 ` [PATCH 1/2] mm: fix double mmap_sem unlock on MMF_UNSTABLE enforced SIGBUS Michal Hocko
2017-08-07 11:38   ` Michal Hocko
2017-08-15  0:49   ` David Rientjes
2017-08-15  0:49     ` David Rientjes
2017-08-07 11:38 ` [PATCH 2/2] mm, oom: fix potential data corruption when oom_reaper races with writer Michal Hocko
2017-08-07 11:38   ` Michal Hocko
2017-08-08 17:48   ` Andrea Arcangeli
2017-08-08 17:48     ` Andrea Arcangeli
2017-08-08 23:35     ` Tetsuo Handa
2017-08-08 23:35       ` Tetsuo Handa
2017-08-09 18:36       ` Andrea Arcangeli
2017-08-09 18:36         ` Andrea Arcangeli
2017-08-10  8:21     ` Michal Hocko
2017-08-10  8:21       ` Michal Hocko
2017-08-10 13:33       ` Michal Hocko
2017-08-10 13:33         ` Michal Hocko
2017-08-11  2:28   ` Tetsuo Handa
2017-08-11  2:28     ` Tetsuo Handa
2017-08-11  7:09     ` Michal Hocko
2017-08-11  7:09       ` Michal Hocko
2017-08-11  7:54       ` Tetsuo Handa
2017-08-11  7:54         ` Tetsuo Handa
2017-08-11 10:22         ` Andrea Arcangeli
2017-08-11 10:22           ` Andrea Arcangeli
2017-08-11 10:42           ` Andrea Arcangeli
2017-08-11 10:42             ` Andrea Arcangeli
2017-08-11 11:53             ` Tetsuo Handa
2017-08-11 11:53               ` Tetsuo Handa
2017-08-11 12:08         ` Michal Hocko
2017-08-11 12:08           ` Michal Hocko
2017-08-11 15:46           ` Tetsuo Handa
2017-08-11 15:46             ` Tetsuo Handa
2017-08-14 13:59             ` Michal Hocko
2017-08-14 13:59               ` Michal Hocko
2017-08-14 22:51               ` Tetsuo Handa
2017-08-15  6:55                 ` Michal Hocko [this message]
2017-08-15  6:55                   ` Michal Hocko
2017-08-15  8:41                 ` Michal Hocko
2017-08-15  8:41                   ` Michal Hocko
2017-08-15 10:06                   ` Tetsuo Handa
2017-08-15 12:26                     ` Michal Hocko
2017-08-15 12:26                       ` Michal Hocko
2017-08-15 12:58                       ` Tetsuo Handa
2017-08-17 13:58                         ` Michal Hocko
2017-08-17 13:58                           ` Michal Hocko
2017-08-15  5:30               ` Tetsuo Handa
2017-08-07 13:28 ` [PATCH 0/2] mm, oom: fix oom_reaper fallouts Tetsuo Handa
2017-08-07 13:28   ` Tetsuo Handa
2017-08-07 14:04   ` Michal Hocko
2017-08-07 14:04     ` Michal Hocko
2017-08-07 15:23     ` Tetsuo Handa
2017-08-07 15:23       ` Tetsuo Handa
2017-08-15 12:29 ` Michal Hocko
2017-08-15 12:29   ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170815065544.GA29067@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=andrea@kernel.org \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=oleg@redhat.com \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    --cc=wenwei.tww@alibaba-inc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.