All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dave Hansen <dave.hansen@intel.com>,
	Christoph Hellwig <hch@infradead.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Eric Biggers <ebiggers@kernel.org>,
	Joonas Lahtinen <joonas.lahtinen@linux.intel.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH V2 2/2] mm/highmem: Lift memcpy_[to|from]_page to core
Date: Wed, 9 Dec 2020 20:14:15 +0000	[thread overview]
Message-ID: <20201209201415.GT7338@casper.infradead.org> (raw)
In-Reply-To: <CAPcyv4iD0eprWC_kMOdYdX-GvT-72OjZB-CKA9b5qV8BwNQ+6A@mail.gmail.com>

On Wed, Dec 09, 2020 at 11:47:56AM -0800, Dan Williams wrote:
> On Tue, Dec 8, 2020 at 8:03 PM Matthew Wilcox <willy@infradead.org> wrote:
> > On Tue, Dec 08, 2020 at 06:22:50PM -0800, Ira Weiny wrote:
> > > Therefore, I tend to agree with Dan that if anything is to be done it should be
> > > a WARN_ON() which is only going to throw an error that something has probably
> > > been wrong all along and should be fixed but continue running as before.
> >
> > Silent data corruption is for ever.  Are you absolutely sure nobody has
> > done:
> >
> >         page = alloc_pages(GFP_HIGHUSER_MOVABLE, 3);
> >         memcpy_to_page(page, PAGE_SIZE * 2, p, PAGE_SIZE * 2);
> >
> > because that will work fine if the pages come from ZONE_NORMAL and fail
> > miserably if they came from ZONE_HIGHMEM.
> 
> ...and violently regress with the BUG_ON.

... which is what we want, no?

> The question to me is: which is more likely that any bad usages have
> been covered up by being limited to ZONE_NORMAL / 64-bit only, or that
> silent data corruption has been occurring with no ill effects?

I wouldn't be at all surprised to learn that there is silent data
corruption on 32-bit systems with HIGHMEM.  Would you?  How much testing
do you do on 32-bit HIGHMEM systems?

Actually, I wouldn't be at all surprised if we can hit this problem today.
Look at this:

size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
{
        char *to = addr;
        if (unlikely(iov_iter_is_pipe(i))) {
                WARN_ON(1);
                return 0;
        }
        if (iter_is_iovec(i))
                might_fault();
        iterate_and_advance(i, bytes, v,
                copyin((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len),
                memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page,
                                 v.bv_offset, v.bv_len),
                memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len)
        )

        return bytes;
}
EXPORT_SYMBOL(_copy_from_iter);

There's a lot of macrology in there, so for those following along who
aren't familiar with the iov_iter code, if the iter is operating on a
bvec, then iterate_and_advance() will call memcpy_from_page(), passing
it the bv_page, bv_offset and bv_len stored in the bvec.  Since 2019,
Linux has supported multipage bvecs (commit 07173c3ec276).  So bv_len
absolutely *can* be > PAGE_SIZE.

Does this ever happen in practice?  I have no idea; I don't know whether
any multipage BIOs are currently handed to copy_from_iter().  But I
have no confidence in your audit if you didn't catch this one.

> > > FWIW I think this is a 'bad BUG_ON' use because we are "checking something that
> > > we know we might be getting wrong".[1]  And because, "BUG() is only good for
> > > something that never happens and that we really have no other option for".[2]
> >
> > BUG() is our only option here.  Both limiting how much we copy or
> > copying the requested amount result in data corruption or leaking
> > information to a process that isn't supposed to see it.
> 
> At a minimum I think this should be debated in a follow on patch to
> add assertion checking where there was none before. There is no
> evidence of a page being overrun in the audit Ira performed.

If we put in into a separate patch, someone will suggest backing out the
patch which tells us that there's a problem.  You know, like this guy ...
https://lore.kernel.org/linux-mm/CAPcyv4jNVroYmirzKw_=CsEixOEACdL3M1Wc4xjd_TFv3h+o8Q@mail.gmail.com/

  reply	other threads:[~2020-12-09 20:15 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-07 22:57 [PATCH V2 0/2] Lift memcpy_[to|from]_page to core ira.weiny
2020-12-07 22:57 ` [PATCH V2 1/2] mm/highmem: Remove deprecated kmap_atomic ira.weiny
2020-12-08  0:22   ` kernel test robot
2020-12-08  0:22     ` kernel test robot
2020-12-07 22:57 ` [PATCH V2 2/2] mm/highmem: Lift memcpy_[to|from]_page to core ira.weiny
2020-12-07 23:26   ` Matthew Wilcox
2020-12-07 23:34     ` Dan Williams
2020-12-07 23:40       ` Matthew Wilcox
2020-12-07 23:49         ` Dan Williams
2020-12-08 21:32           ` Ira Weiny
2020-12-08 21:50             ` Matthew Wilcox
2020-12-08 22:23               ` Dan Williams
2020-12-08 22:32                 ` Matthew Wilcox
2020-12-08 22:45                   ` Darrick J. Wong
2020-12-08 22:54                     ` Matthew Wilcox
2020-12-08 23:40                     ` Dan Williams
2020-12-09  2:22                       ` Ira Weiny
2020-12-09  4:03                         ` Matthew Wilcox
2020-12-09 19:47                           ` Dan Williams
2020-12-09 20:14                             ` Matthew Wilcox [this message]
2020-12-09 20:19                               ` Dan Williams
2020-12-10  5:35                               ` Ira Weiny
2020-12-08 22:21             ` Dan Williams
2020-12-08  0:40   ` kernel test robot
2020-12-08  0:40     ` kernel test robot
2020-12-08  1:09   ` kernel test robot
2020-12-08  1:09     ` kernel test robot
2020-12-08 12:23   ` Matthew Wilcox
2020-12-08 16:38     ` Ira Weiny
2020-12-08 16:40       ` Matthew Wilcox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201209201415.GT7338@casper.infradead.org \
    --to=willy@infradead.org \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=darrick.wong@oracle.com \
    --cc=dave.hansen@intel.com \
    --cc=ebiggers@kernel.org \
    --cc=hch@infradead.org \
    --cc=ira.weiny@intel.com \
    --cc=joonas.lahtinen@linux.intel.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.