linux-kernel.vger.kernel.org archive mirror
From: Al Viro <viro@ZenIV.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Network Development <netdev@vger.kernel.org>
Subject: Re: [RFC] iov_iter_get_pages() semantics
Date: Wed, 1 Apr 2015 20:50:22 +0100
Message-ID: <20150401195022.GC889@ZenIV.linux.org.uk>
In-Reply-To: <CA+55aFyOZ5LD-mAOb_OafJe-VMy0hxrgOCB5hJ=GXJAFP7e8dQ@mail.gmail.com>

On Wed, Apr 01, 2015 at 11:26:38AM -0700, Linus Torvalds wrote:
> On Wed, Apr 1, 2015 at 11:08 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > IOW, do you have a problem with obtaining a pointer to kernel page and
> > immediately shoving it into scatterlist?
> 
> And just to clarify, yes I do. Why the f*ck wasn't it a struct page to
> begin with? And why do you think that a scatter-list is somehow "safe"
> and guarantees people won't be playing (invalid and completely broken)
> games with page counters etc that you cannot play for those things?

Point taken.  What do you think about sg_set_buf() and sg_init_one()?

> If this is just about finit_module(), then dammit, why the f*ck does
> it even try to do zero-copy in the first place?

Mostly because there's no way to tell the filesystem that we don't want
zero-copy deep in the bowels of the underlying driver...

> But if that's the only
> use, maybe we can improve on kernel_read() to do some aio-read on the
> raw pages instead. And change the "info->hdr" thing to not just do a
> blind vmalloc, but actually do the page allocations and then do
> vmap_page_range() to map in the end result after IO etc.

Can do, but that would depend on 9p getting converted to read_iter/write_iter
in a sane way ;-) (and that's worth doing for a lot of other reasons, which
is what had brought me to net/9p in the first place).

That might actually be a good idea - for ITER_BVEC we know that page is
a normal one (not many originators of such - __swap_writepage() and
pipe_buffer ones), so making iov_iter_get_pages() work for those wouldn't
be a problem...

kernel_read() is the wrong helper, though - it should just use vfs_read_iter().
We are -><- that close to making it work on all "normal files" - the only
exceptions right now are ncpfs (fixed in local tree), coda (ditto) and 9p.

> IOW, it's fine to do IO on 'struct page', but it should be
> *controlled* and you damn well need to _own_ that struct page and its
> lifetime, no just "look up random struct page from some kernel
> address".

I certainly agree that throwing pointers to weird pages around is generally
a bad idea, but lifetime is not an issue, AFAICS - if somebody manages to
vfree() the destination of your read right under you, you are already very
deep in trouble.

Speaking of weird pages, some of the vmalloc_to_page() users look very strange -
netlink_mmap(), in particular...

