linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG] mm: direct I/O (using GUP) can write to COW anonymous pages
@ 2018-09-17 16:12 Jann Horn
  2018-09-18  0:05 ` Hugh Dickins
  0 siblings, 1 reply; 7+ messages in thread
From: Jann Horn @ 2018-09-17 16:12 UTC (permalink / raw)
  To: Linux-MM, Dan Williams, Andrew Morton, Michal Hocko,
	Hugh Dickins, Rik van Riel
  Cc: kernel list

[I'm not sure who the best people to ask about this are, I hope the
recipient list resembles something reasonable...]

I have noticed that the dup_mmap() logic on fork() doesn't handle
pages with active direct I/O properly: dup_mmap() seems to assume that
making the PTE referencing a page readonly will always prevent future
writes to the page, but if the kernel has acquired a direct reference
to the page before (e.g. via get_user_pages_fast()), writes can still
happen that way.

The worst-case effect of this - as far as I can tell - is that when a
multithreaded process forks while one thread is in the middle of
sys_read() on a file that uses direct I/O with get_user_pages_fast(),
the read data can become visible in the child while the parent's
buffer stays uninitialized if the parent writes to a relevant page
post-fork before either the I/O completes or the child writes to it.

Reproducer code:

====== START hello.c ======
#define FUSE_USE_VERSION 26

#include <fuse.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <err.h>
#include <sys/uio.h>

static const char *hello_path = "/hello";

static int hello_getattr(const char *path, struct stat *stbuf)
{
        int res = 0;
        memset(stbuf, 0, sizeof(struct stat));
        if (strcmp(path, "/") == 0) {
                stbuf->st_mode = S_IFDIR | 0755;
                stbuf->st_nlink = 2;
        } else if (strcmp(path, hello_path) == 0) {
                stbuf->st_mode = S_IFREG | 0666;
                stbuf->st_nlink = 1;
                stbuf->st_size = 0x1000;
                stbuf->st_blocks = 0;
        } else
                res = -ENOENT;
        return res;
}

static int hello_readdir(const char *path, void *buf, fuse_fill_dir_t
filler, off_t offset, struct fuse_file_info *fi) {
        filler(buf, ".", NULL, 0);
        filler(buf, "..", NULL, 0);
        filler(buf, hello_path + 1, NULL, 0);
        return 0;
}

static int hello_open(const char *path, struct fuse_file_info *fi) {
        return 0;
}

static int hello_read(const char *path, char *buf, size_t size, off_t
offset, struct fuse_file_info *fi) {
        sleep(3);
        size_t len = 0x1000;
        if (offset < len) {
                if (offset + size > len)
                        size = len - offset;
                memset(buf, 0, size);
        } else
                size = 0;
        return size;
}

static int hello_write(const char *path, const char *buf, size_t size,
off_t offset, struct fuse_file_info *fi) {
        while(1) pause();
}

static struct fuse_operations hello_oper = {
        .getattr        = hello_getattr,
        .readdir        = hello_readdir,
        .open           = hello_open,
        .read           = hello_read,
        .write          = hello_write,
};

int main(int argc, char *argv[]) {
        return fuse_main(argc, argv, &hello_oper, NULL);
}
====== END hello.c ======

====== START simple_mmap.c ======
#define _GNU_SOURCE
#include <pthread.h>
#include <sys/mman.h>
#include <err.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <signal.h>
#include <sys/prctl.h>
#include <sys/wait.h>

__attribute__((aligned(0x1000))) char data_buffer_[0x10000];
#define data_buffer (data_buffer_ + 0x8000)

void *fuse_thread(void *dummy) {
        /* step 2: start direct I/O on data_buffer */
        int fuse_fd = open("mount/hello", O_RDWR);
        if (fuse_fd == -1)
                err(1, "unable to open FUSE fd");
        printf("char in parent (before): %hhd\n", data_buffer[0]);
        int res = read(fuse_fd, data_buffer, 0x1000);
        /* step 6: read completes, show post-read state */
        printf("fuse read result: %d\n", res);
        printf("char in parent (after): %hhd\n", data_buffer[0]);
}

int main(void) {
        /* step 1: make data_buffer dirty */
        data_buffer[0] = 1;

        pthread_t thread;
        if (pthread_create(&thread, NULL, fuse_thread, NULL))
                errx(1, "pthread_create");

        sleep(1);
        /* step 3: fork a child */
        pid_t child = fork();
        if (child == -1)
                err(1, "fork");
        if (child == 0) {
                prctl(PR_SET_PDEATHSIG, SIGKILL);
                sleep(1);

                /* step 5: show pre-read state in the child */
                printf("char in child (before): %hhd\n", data_buffer[0]);
                sleep(3);
                /* step 7: read is complete, show post-read state in child */
                printf("char in child (after): %hhd\n", data_buffer[0]);
                return 0;
        }

        /* step 4: de-CoW data_buffer in the parent */
        data_buffer[0x800] = 2;

        int status;
        if (wait(&status) != child)
                err(1, "wait");
}
====== END simple_mmap.c ======

Repro steps:

In one terminal:
$ mkdir mount
$ gcc -o hello hello.c -Wall -std=gnu99 `pkg-config fuse --cflags --libs`
hello.c: In function ‘hello_write’:
hello.c:67:1: warning: no return statement in function returning
non-void [-Wreturn-type]
 }
 ^
$ ./hello -d -o direct_io mount
FUSE library version: 2.9.7
[...]

In a second terminal:
$ gcc -pthread -o simple_mmap simple_mmap.c
$ ./simple_mmap
char in parent (before): 1
char in child (before): 1
fuse read result: 4096
char in parent (after): 1
char in child (after): 0

I have tested that this still works on 4.19.0-rc3+.

As far as I can tell, the fix would be to immediately copy pages with
`refcount - mapcount > N` in dup_mmap(), or something like that?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] mm: direct I/O (using GUP) can write to COW anonymous pages
  2018-09-17 16:12 [BUG] mm: direct I/O (using GUP) can write to COW anonymous pages Jann Horn
@ 2018-09-18  0:05 ` Hugh Dickins
  2018-09-18  0:19   ` Salman Qazi
  2018-09-18  0:35   ` Jann Horn
  0 siblings, 2 replies; 7+ messages in thread
From: Hugh Dickins @ 2018-09-18  0:05 UTC (permalink / raw)
  To: Jann Horn
  Cc: Dan Williams, Andrew Morton, Michal Hocko, Hugh Dickins,
	Rik van Riel, Andrea Arcangeli, Konstantin Khlebnikov,
	Salman Qazi, Michael S. Tsirkin, Jan Kara, linux-kernel,
	linux-mm

[-- Attachment #1: Type: TEXT/PLAIN, Size: 6917 bytes --]

Hi Jann,

On Mon, 17 Sep 2018, Jann Horn wrote:

> [I'm not sure who the best people to ask about this are, I hope the
> recipient list resembles something reasonable...]
> 
> I have noticed that the dup_mmap() logic on fork() doesn't handle
> pages with active direct I/O properly: dup_mmap() seems to assume that
> making the PTE referencing a page readonly will always prevent future
> writes to the page, but if the kernel has acquired a direct reference
> to the page before (e.g. via get_user_pages_fast()), writes can still
> happen that way.
> 
> The worst-case effect of this - as far as I can tell - is that when a
> multithreaded process forks while one thread is in the middle of
> sys_read() on a file that uses direct I/O with get_user_pages_fast(),
> the read data can become visible in the child while the parent's
> buffer stays uninitialized if the parent writes to a relevant page
> post-fork before either the I/O completes or the child writes to it.

Yes: you're understandably more worried by the one seeing the other's
data; we've tended in the past to be more worried about the one getting
corruption, and the other not seeing the data it asked for (and usually
in the context of RDMA, rather than filesystem direct I/O).

I've added some Cc's: I might be misremembering, but I think both
Andrea and Konstantin have offered approaches to this in the past,
and I believe Salman is taking a look at it currently.

But my own interest ended when Michael added MADV_DONTFORK: beyond
that, we've rated it a "Patient: It hurts when I do this. Doctor:
Don't do that then" - more complexity and overhead to solve, than
we have had appetite to get into.  But not a shiningly satisfactory
situation, I'll agree.

Hugh

> 
> Reproducer code:
> 
> ====== START hello.c ======
> #define FUSE_USE_VERSION 26
> 
> #include <fuse.h>
> #include <stdio.h>
> #include <string.h>
> #include <errno.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <err.h>
> #include <sys/uio.h>
> 
> static const char *hello_path = "/hello";
> 
> static int hello_getattr(const char *path, struct stat *stbuf)
> {
>         int res = 0;
>         memset(stbuf, 0, sizeof(struct stat));
>         if (strcmp(path, "/") == 0) {
>                 stbuf->st_mode = S_IFDIR | 0755;
>                 stbuf->st_nlink = 2;
>         } else if (strcmp(path, hello_path) == 0) {
>                 stbuf->st_mode = S_IFREG | 0666;
>                 stbuf->st_nlink = 1;
>                 stbuf->st_size = 0x1000;
>                 stbuf->st_blocks = 0;
>         } else
>                 res = -ENOENT;
>         return res;
> }
> 
> static int hello_readdir(const char *path, void *buf, fuse_fill_dir_t
> filler, off_t offset, struct fuse_file_info *fi) {
>         filler(buf, ".", NULL, 0);
>         filler(buf, "..", NULL, 0);
>         filler(buf, hello_path + 1, NULL, 0);
>         return 0;
> }
> 
> static int hello_open(const char *path, struct fuse_file_info *fi) {
>         return 0;
> }
> 
> static int hello_read(const char *path, char *buf, size_t size, off_t
> offset, struct fuse_file_info *fi) {
>         sleep(3);
>         size_t len = 0x1000;
>         if (offset < len) {
>                 if (offset + size > len)
>                         size = len - offset;
>                 memset(buf, 0, size);
>         } else
>                 size = 0;
>         return size;
> }
> 
> static int hello_write(const char *path, const char *buf, size_t size,
> off_t offset, struct fuse_file_info *fi) {
>         while(1) pause();
> }
> 
> static struct fuse_operations hello_oper = {
>         .getattr        = hello_getattr,
>         .readdir        = hello_readdir,
>         .open           = hello_open,
>         .read           = hello_read,
>         .write          = hello_write,
> };
> 
> int main(int argc, char *argv[]) {
>         return fuse_main(argc, argv, &hello_oper, NULL);
> }
> ====== END hello.c ======
> 
> ====== START simple_mmap.c ======
> #define _GNU_SOURCE
> #include <pthread.h>
> #include <sys/mman.h>
> #include <err.h>
> #include <unistd.h>
> #include <fcntl.h>
> #include <stdio.h>
> #include <signal.h>
> #include <sys/prctl.h>
> #include <sys/wait.h>
> 
> __attribute__((aligned(0x1000))) char data_buffer_[0x10000];
> #define data_buffer (data_buffer_ + 0x8000)
> 
> void *fuse_thread(void *dummy) {
>         /* step 2: start direct I/O on data_buffer */
>         int fuse_fd = open("mount/hello", O_RDWR);
>         if (fuse_fd == -1)
>                 err(1, "unable to open FUSE fd");
>         printf("char in parent (before): %hhd\n", data_buffer[0]);
>         int res = read(fuse_fd, data_buffer, 0x1000);
>         /* step 6: read completes, show post-read state */
>         printf("fuse read result: %d\n", res);
>         printf("char in parent (after): %hhd\n", data_buffer[0]);
> }
> 
> int main(void) {
>         /* step 1: make data_buffer dirty */
>         data_buffer[0] = 1;
> 
>         pthread_t thread;
>         if (pthread_create(&thread, NULL, fuse_thread, NULL))
>                 errx(1, "pthread_create");
> 
>         sleep(1);
>         /* step 3: fork a child */
>         pid_t child = fork();
>         if (child == -1)
>                 err(1, "fork");
>         if (child == 0) {
>                 prctl(PR_SET_PDEATHSIG, SIGKILL);
>                 sleep(1);
> 
>                 /* step 5: show pre-read state in the child */
>                 printf("char in child (before): %hhd\n", data_buffer[0]);
>                 sleep(3);
>                 /* step 7: read is complete, show post-read state in child */
>                 printf("char in child (after): %hhd\n", data_buffer[0]);
>                 return 0;
>         }
> 
>         /* step 4: de-CoW data_buffer in the parent */
>         data_buffer[0x800] = 2;
> 
>         int status;
>         if (wait(&status) != child)
>                 err(1, "wait");
> }
> ====== END simple_mmap.c ======
> 
> Repro steps:
> 
> In one terminal:
> $ mkdir mount
> $ gcc -o hello hello.c -Wall -std=gnu99 `pkg-config fuse --cflags --libs`
> hello.c: In function ‘hello_write’:
> hello.c:67:1: warning: no return statement in function returning
> non-void [-Wreturn-type]
>  }
>  ^
> $ ./hello -d -o direct_io mount
> FUSE library version: 2.9.7
> [...]
> 
> In a second terminal:
> $ gcc -pthread -o simple_mmap simple_mmap.c
> $ ./simple_mmap
> char in parent (before): 1
> char in child (before): 1
> fuse read result: 4096
> char in parent (after): 1
> char in child (after): 0
> 
> I have tested that this still works on 4.19.0-rc3+.
> 
> As far as I can tell, the fix would be to immediately copy pages with
> `refcount - mapcount > N` in dup_mmap(), or something like that?
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] mm: direct I/O (using GUP) can write to COW anonymous pages
  2018-09-18  0:05 ` Hugh Dickins
@ 2018-09-18  0:19   ` Salman Qazi
  2018-09-18  0:35   ` Jann Horn
  1 sibling, 0 replies; 7+ messages in thread
From: Salman Qazi @ 2018-09-18  0:19 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: jannh, Dan Williams, Andrew Morton, mhocko, riel, aarcange,
	khlebnikov, mst, jack, Linux Kernel Mailing List, linux-mm

On Mon, Sep 17, 2018 at 5:05 PM Hugh Dickins <hughd@google.com> wrote:
>
> Hi Jann,
>
> On Mon, 17 Sep 2018, Jann Horn wrote:
>
> > [I'm not sure who the best people to ask about this are, I hope the
> > recipient list resembles something reasonable...]
> >
> > I have noticed that the dup_mmap() logic on fork() doesn't handle
> > pages with active direct I/O properly: dup_mmap() seems to assume that
> > making the PTE referencing a page readonly will always prevent future
> > writes to the page, but if the kernel has acquired a direct reference
> > to the page before (e.g. via get_user_pages_fast()), writes can still
> > happen that way.
> >
> > The worst-case effect of this - as far as I can tell - is that when a
> > multithreaded process forks while one thread is in the middle of
> > sys_read() on a file that uses direct I/O with get_user_pages_fast(),
> > the read data can become visible in the child while the parent's
> > buffer stays uninitialized if the parent writes to a relevant page
> > post-fork before either the I/O completes or the child writes to it.
>
> Yes: you're understandably more worried by the one seeing the other's
> data; we've tended in the past to be more worried about the one getting
> corruption, and the other not seeing the data it asked for (and usually
> in the context of RDMA, rather than filesystem direct I/O).
>
> I've added some Cc's: I might be misremembering, but I think both
> Andrea and Konstantin have offered approaches to this in the past,
> and I believe Salman is taking a look at it currently.
>
> But my own interest ended when Michael added MADV_DONTFORK: beyond
> that, we've rated it a "Patient: It hurts when I do this. Doctor:
> Don't do that then" - more complexity and overhead to solve, than
> we have had appetite to get into.  But not a shiningly satisfactory
> situation, I'll agree.

The approach that I am currently investigating (at least to use
internally at Google) is to be able to detect (via a heuristic with
some error rate) when the patient might be doing it, and tell them not
to.  Of course, we would have
to tell them nicely rather than killing their process, especially if
we expect substantial number of false positives.
I am motivated by the belief that a transparent fix is not likely to
be feasible.  Others, who have taken that route earlier, can correct
me, if I am mistaken.

>
> Hugh
>
> >
> > Reproducer code:
> >
> > ====== START hello.c ======
> > #define FUSE_USE_VERSION 26
> >
> > #include <fuse.h>
> > #include <stdio.h>
> > #include <string.h>
> > #include <errno.h>
> > #include <fcntl.h>
> > #include <unistd.h>
> > #include <err.h>
> > #include <sys/uio.h>
> >
> > static const char *hello_path = "/hello";
> >
> > static int hello_getattr(const char *path, struct stat *stbuf)
> > {
> >         int res = 0;
> >         memset(stbuf, 0, sizeof(struct stat));
> >         if (strcmp(path, "/") == 0) {
> >                 stbuf->st_mode = S_IFDIR | 0755;
> >                 stbuf->st_nlink = 2;
> >         } else if (strcmp(path, hello_path) == 0) {
> >                 stbuf->st_mode = S_IFREG | 0666;
> >                 stbuf->st_nlink = 1;
> >                 stbuf->st_size = 0x1000;
> >                 stbuf->st_blocks = 0;
> >         } else
> >                 res = -ENOENT;
> >         return res;
> > }
> >
> > static int hello_readdir(const char *path, void *buf, fuse_fill_dir_t
> > filler, off_t offset, struct fuse_file_info *fi) {
> >         filler(buf, ".", NULL, 0);
> >         filler(buf, "..", NULL, 0);
> >         filler(buf, hello_path + 1, NULL, 0);
> >         return 0;
> > }
> >
> > static int hello_open(const char *path, struct fuse_file_info *fi) {
> >         return 0;
> > }
> >
> > static int hello_read(const char *path, char *buf, size_t size, off_t
> > offset, struct fuse_file_info *fi) {
> >         sleep(3);
> >         size_t len = 0x1000;
> >         if (offset < len) {
> >                 if (offset + size > len)
> >                         size = len - offset;
> >                 memset(buf, 0, size);
> >         } else
> >                 size = 0;
> >         return size;
> > }
> >
> > static int hello_write(const char *path, const char *buf, size_t size,
> > off_t offset, struct fuse_file_info *fi) {
> >         while(1) pause();
> > }
> >
> > static struct fuse_operations hello_oper = {
> >         .getattr        = hello_getattr,
> >         .readdir        = hello_readdir,
> >         .open           = hello_open,
> >         .read           = hello_read,
> >         .write          = hello_write,
> > };
> >
> > int main(int argc, char *argv[]) {
> >         return fuse_main(argc, argv, &hello_oper, NULL);
> > }
> > ====== END hello.c ======
> >
> > ====== START simple_mmap.c ======
> > #define _GNU_SOURCE
> > #include <pthread.h>
> > #include <sys/mman.h>
> > #include <err.h>
> > #include <unistd.h>
> > #include <fcntl.h>
> > #include <stdio.h>
> > #include <signal.h>
> > #include <sys/prctl.h>
> > #include <sys/wait.h>
> >
> > __attribute__((aligned(0x1000))) char data_buffer_[0x10000];
> > #define data_buffer (data_buffer_ + 0x8000)
> >
> > void *fuse_thread(void *dummy) {
> >         /* step 2: start direct I/O on data_buffer */
> >         int fuse_fd = open("mount/hello", O_RDWR);
> >         if (fuse_fd == -1)
> >                 err(1, "unable to open FUSE fd");
> >         printf("char in parent (before): %hhd\n", data_buffer[0]);
> >         int res = read(fuse_fd, data_buffer, 0x1000);
> >         /* step 6: read completes, show post-read state */
> >         printf("fuse read result: %d\n", res);
> >         printf("char in parent (after): %hhd\n", data_buffer[0]);
> > }
> >
> > int main(void) {
> >         /* step 1: make data_buffer dirty */
> >         data_buffer[0] = 1;
> >
> >         pthread_t thread;
> >         if (pthread_create(&thread, NULL, fuse_thread, NULL))
> >                 errx(1, "pthread_create");
> >
> >         sleep(1);
> >         /* step 3: fork a child */
> >         pid_t child = fork();
> >         if (child == -1)
> >                 err(1, "fork");
> >         if (child == 0) {
> >                 prctl(PR_SET_PDEATHSIG, SIGKILL);
> >                 sleep(1);
> >
> >                 /* step 5: show pre-read state in the child */
> >                 printf("char in child (before): %hhd\n", data_buffer[0]);
> >                 sleep(3);
> >                 /* step 7: read is complete, show post-read state in child */
> >                 printf("char in child (after): %hhd\n", data_buffer[0]);
> >                 return 0;
> >         }
> >
> >         /* step 4: de-CoW data_buffer in the parent */
> >         data_buffer[0x800] = 2;
> >
> >         int status;
> >         if (wait(&status) != child)
> >                 err(1, "wait");
> > }
> > ====== END simple_mmap.c ======
> >
> > Repro steps:
> >
> > In one terminal:
> > $ mkdir mount
> > $ gcc -o hello hello.c -Wall -std=gnu99 `pkg-config fuse --cflags --libs`
> > hello.c: In function ‘hello_write’:
> > hello.c:67:1: warning: no return statement in function returning
> > non-void [-Wreturn-type]
> >  }
> >  ^
> > $ ./hello -d -o direct_io mount
> > FUSE library version: 2.9.7
> > [...]
> >
> > In a second terminal:
> > $ gcc -pthread -o simple_mmap simple_mmap.c
> > $ ./simple_mmap
> > char in parent (before): 1
> > char in child (before): 1
> > fuse read result: 4096
> > char in parent (after): 1
> > char in child (after): 0
> >
> > I have tested that this still works on 4.19.0-rc3+.
> >
> > As far as I can tell, the fix would be to immediately copy pages with
> > `refcount - mapcount > N` in dup_mmap(), or something like that?
> >

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] mm: direct I/O (using GUP) can write to COW anonymous pages
  2018-09-18  0:05 ` Hugh Dickins
  2018-09-18  0:19   ` Salman Qazi
@ 2018-09-18  0:35   ` Jann Horn
  2018-09-18  9:13     ` Konstantin Khlebnikov
  2018-09-18  9:58     ` Jan Kara
  1 sibling, 2 replies; 7+ messages in thread
From: Jann Horn @ 2018-09-18  0:35 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Dan Williams, Andrew Morton, Michal Hocko, Rik van Riel,
	Andrea Arcangeli, Konstantin Khlebnikov, sqazi,
	Michael S. Tsirkin, jack, kernel list, Linux-MM

On Tue, Sep 18, 2018 at 2:05 AM Hugh Dickins <hughd@google.com> wrote:
>
> Hi Jann,
>
> On Mon, 17 Sep 2018, Jann Horn wrote:
>
> > [I'm not sure who the best people to ask about this are, I hope the
> > recipient list resembles something reasonable...]
> >
> > I have noticed that the dup_mmap() logic on fork() doesn't handle
> > pages with active direct I/O properly: dup_mmap() seems to assume that
> > making the PTE referencing a page readonly will always prevent future
> > writes to the page, but if the kernel has acquired a direct reference
> > to the page before (e.g. via get_user_pages_fast()), writes can still
> > happen that way.
> >
> > The worst-case effect of this - as far as I can tell - is that when a
> > multithreaded process forks while one thread is in the middle of
> > sys_read() on a file that uses direct I/O with get_user_pages_fast(),
> > the read data can become visible in the child while the parent's
> > buffer stays uninitialized if the parent writes to a relevant page
> > post-fork before either the I/O completes or the child writes to it.
>
> Yes: you're understandably more worried by the one seeing the other's
> data;

Actually, I was mostly just trying to find a scenario in which the
parent doesn't get the data it's asking for, and this is the simplest
I could come up with. :)

I was also vaguely worried about whether some other part of the mm
subsystem might assume that COW pages are immutable, but I haven't
found anything like that so far, so that might've been unwarranted
paranoia.

> we've tended in the past to be more worried about the one getting
> corruption, and the other not seeing the data it asked for (and usually
> in the context of RDMA, rather than filesystem direct I/O).
>
> I've added some Cc's: I might be misremembering, but I think both
> Andrea and Konstantin have offered approaches to this in the past,
> and I believe Salman is taking a look at it currently.
>
> But my own interest ended when Michael added MADV_DONTFORK: beyond
> that, we've rated it a "Patient: It hurts when I do this. Doctor:
> Don't do that then" - more complexity and overhead to solve, than
> we have had appetite to get into.

Makes sense, I guess.

I wonder whether there's a concise way to express this in the fork.2
manpage, or something like that. Maybe I'll take a stab at writing
something. The biggest issue I see with documenting this edgecase is
that, as an application developer, if you don't know whether some file
might be coming from a FUSE filesystem that has opted out of using the
disk cache, the "don't do that" essentially becomes "don't read() into
heap buffers while fork()ing in another thread", since with FUSE,
direct I/O can happen even if you don't open files as O_DIRECT as long
as the filesystem requests direct I/O, and get_user_pages_fast() will
AFAIU be used for non-page-aligned buffers, meaning that an adjacent
heap memory access could trigger CoW page duplication. But then, FUSE
filesystems that opt out of the disk cache are probably so rare that
it's not a concern in practice...

> But not a shiningly satisfactory
> situation, I'll agree.
>
> Hugh
>
> >
> > Reproducer code:
> >
> > ====== START hello.c ======
> > #define FUSE_USE_VERSION 26
> >
> > #include <fuse.h>
> > #include <stdio.h>
> > #include <string.h>
> > #include <errno.h>
> > #include <fcntl.h>
> > #include <unistd.h>
> > #include <err.h>
> > #include <sys/uio.h>
> >
> > static const char *hello_path = "/hello";
> >
> > static int hello_getattr(const char *path, struct stat *stbuf)
> > {
> >         int res = 0;
> >         memset(stbuf, 0, sizeof(struct stat));
> >         if (strcmp(path, "/") == 0) {
> >                 stbuf->st_mode = S_IFDIR | 0755;
> >                 stbuf->st_nlink = 2;
> >         } else if (strcmp(path, hello_path) == 0) {
> >                 stbuf->st_mode = S_IFREG | 0666;
> >                 stbuf->st_nlink = 1;
> >                 stbuf->st_size = 0x1000;
> >                 stbuf->st_blocks = 0;
> >         } else
> >                 res = -ENOENT;
> >         return res;
> > }
> >
> > static int hello_readdir(const char *path, void *buf, fuse_fill_dir_t
> > filler, off_t offset, struct fuse_file_info *fi) {
> >         filler(buf, ".", NULL, 0);
> >         filler(buf, "..", NULL, 0);
> >         filler(buf, hello_path + 1, NULL, 0);
> >         return 0;
> > }
> >
> > static int hello_open(const char *path, struct fuse_file_info *fi) {
> >         return 0;
> > }
> >
> > static int hello_read(const char *path, char *buf, size_t size, off_t
> > offset, struct fuse_file_info *fi) {
> >         sleep(3);
> >         size_t len = 0x1000;
> >         if (offset < len) {
> >                 if (offset + size > len)
> >                         size = len - offset;
> >                 memset(buf, 0, size);
> >         } else
> >                 size = 0;
> >         return size;
> > }
> >
> > static int hello_write(const char *path, const char *buf, size_t size,
> > off_t offset, struct fuse_file_info *fi) {
> >         while(1) pause();
> > }
> >
> > static struct fuse_operations hello_oper = {
> >         .getattr        = hello_getattr,
> >         .readdir        = hello_readdir,
> >         .open           = hello_open,
> >         .read           = hello_read,
> >         .write          = hello_write,
> > };
> >
> > int main(int argc, char *argv[]) {
> >         return fuse_main(argc, argv, &hello_oper, NULL);
> > }
> > ====== END hello.c ======
> >
> > ====== START simple_mmap.c ======
> > #define _GNU_SOURCE
> > #include <pthread.h>
> > #include <sys/mman.h>
> > #include <err.h>
> > #include <unistd.h>
> > #include <fcntl.h>
> > #include <stdio.h>
> > #include <signal.h>
> > #include <sys/prctl.h>
> > #include <sys/wait.h>
> >
> > __attribute__((aligned(0x1000))) char data_buffer_[0x10000];
> > #define data_buffer (data_buffer_ + 0x8000)
> >
> > void *fuse_thread(void *dummy) {
> >         /* step 2: start direct I/O on data_buffer */
> >         int fuse_fd = open("mount/hello", O_RDWR);
> >         if (fuse_fd == -1)
> >                 err(1, "unable to open FUSE fd");
> >         printf("char in parent (before): %hhd\n", data_buffer[0]);
> >         int res = read(fuse_fd, data_buffer, 0x1000);
> >         /* step 6: read completes, show post-read state */
> >         printf("fuse read result: %d\n", res);
> >         printf("char in parent (after): %hhd\n", data_buffer[0]);
> > }
> >
> > int main(void) {
> >         /* step 1: make data_buffer dirty */
> >         data_buffer[0] = 1;
> >
> >         pthread_t thread;
> >         if (pthread_create(&thread, NULL, fuse_thread, NULL))
> >                 errx(1, "pthread_create");
> >
> >         sleep(1);
> >         /* step 3: fork a child */
> >         pid_t child = fork();
> >         if (child == -1)
> >                 err(1, "fork");
> >         if (child == 0) {
> >                 prctl(PR_SET_PDEATHSIG, SIGKILL);
> >                 sleep(1);
> >
> >                 /* step 5: show pre-read state in the child */
> >                 printf("char in child (before): %hhd\n", data_buffer[0]);
> >                 sleep(3);
> >                 /* step 7: read is complete, show post-read state in child */
> >                 printf("char in child (after): %hhd\n", data_buffer[0]);
> >                 return 0;
> >         }
> >
> >         /* step 4: de-CoW data_buffer in the parent */
> >         data_buffer[0x800] = 2;
> >
> >         int status;
> >         if (wait(&status) != child)
> >                 err(1, "wait");
> > }
> > ====== END simple_mmap.c ======
> >
> > Repro steps:
> >
> > In one terminal:
> > $ mkdir mount
> > $ gcc -o hello hello.c -Wall -std=gnu99 `pkg-config fuse --cflags --libs`
> > hello.c: In function ‘hello_write’:
> > hello.c:67:1: warning: no return statement in function returning
> > non-void [-Wreturn-type]
> >  }
> >  ^
> > $ ./hello -d -o direct_io mount
> > FUSE library version: 2.9.7
> > [...]
> >
> > In a second terminal:
> > $ gcc -pthread -o simple_mmap simple_mmap.c
> > $ ./simple_mmap
> > char in parent (before): 1
> > char in child (before): 1
> > fuse read result: 4096
> > char in parent (after): 1
> > char in child (after): 0
> >
> > I have tested that this still works on 4.19.0-rc3+.
> >
> > As far as I can tell, the fix would be to immediately copy pages with
> > `refcount - mapcount > N` in dup_mmap(), or something like that?
> >

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] mm: direct I/O (using GUP) can write to COW anonymous pages
  2018-09-18  0:35   ` Jann Horn
@ 2018-09-18  9:13     ` Konstantin Khlebnikov
  2018-09-18  9:58     ` Jan Kara
  1 sibling, 0 replies; 7+ messages in thread
From: Konstantin Khlebnikov @ 2018-09-18  9:13 UTC (permalink / raw)
  To: Jann Horn, Hugh Dickins
  Cc: Dan Williams, Andrew Morton, Michal Hocko, Rik van Riel,
	Andrea Arcangeli, sqazi, Michael S. Tsirkin, jack, kernel list,
	Linux-MM

On 18.09.2018 03:35, Jann Horn wrote:
> On Tue, Sep 18, 2018 at 2:05 AM Hugh Dickins <hughd@google.com> wrote:
>>
>> Hi Jann,
>>
>> On Mon, 17 Sep 2018, Jann Horn wrote:
>>
>>> [I'm not sure who the best people to ask about this are, I hope the
>>> recipient list resembles something reasonable...]
>>>
>>> I have noticed that the dup_mmap() logic on fork() doesn't handle
>>> pages with active direct I/O properly: dup_mmap() seems to assume that
>>> making the PTE referencing a page readonly will always prevent future
>>> writes to the page, but if the kernel has acquired a direct reference
>>> to the page before (e.g. via get_user_pages_fast()), writes can still
>>> happen that way.
>>>
>>> The worst-case effect of this - as far as I can tell - is that when a
>>> multithreaded process forks while one thread is in the middle of
>>> sys_read() on a file that uses direct I/O with get_user_pages_fast(),
>>> the read data can become visible in the child while the parent's
>>> buffer stays uninitialized if the parent writes to a relevant page
>>> post-fork before either the I/O completes or the child writes to it.
>>
>> Yes: you're understandably more worried by the one seeing the other's
>> data;
> 
> Actually, I was mostly just trying to find a scenario in which the
> parent doesn't get the data it's asking for, and this is the simplest
> I could come up with. :)

This might happens even in single-threaded process:

io_submit()
fork()
<CoW same page>
io_getevents()
<no new data in page>

For example if buffer is only 512 bytes and share page with something else.

THP opens wider race window.

> 
> I was also vaguely worried about whether some other part of the mm
> subsystem might assume that COW pages are immutable, but I haven't
> found anything like that so far, so that might've been unwarranted
> paranoia.
> 
>> we've tended in the past to be more worried about the one getting
>> corruption, and the other not seeing the data it asked for (and usually
>> in the context of RDMA, rather than filesystem direct I/O).
>>
>> I've added some Cc's: I might be misremembering, but I think both
>> Andrea and Konstantin have offered approaches to this in the past,
>> and I believe Salman is taking a look at it currently.
>>
>> But my own interest ended when Michael added MADV_DONTFORK: beyond
>> that, we've rated it a "Patient: It hurts when I do this. Doctor:
>> Don't do that then" - more complexity and overhead to solve, than
>> we have had appetite to get into.
> 
> Makes sense, I guess.
> 
> I wonder whether there's a concise way to express this in the fork.2
> manpage, or something like that. Maybe I'll take a stab at writing
> something. The biggest issue I see with documenting this edgecase is
> that, as an application developer, if you don't know whether some file
> might be coming from a FUSE filesystem that has opted out of using the
> disk cache, the "don't do that" essentially becomes "don't read() into
> heap buffers while fork()ing in another thread", since with FUSE,
> direct I/O can happen even if you don't open files as O_DIRECT as long
> as the filesystem requests direct I/O, and get_user_pages_fast() will
> AFAIU be used for non-page-aligned buffers, meaning that an adjacent
> heap memory access could trigger CoW page duplication. But then, FUSE
> filesystems that opt out of the disk cache are probably so rare that
> it's not a concern in practice...

fork() from multithreaded process is almost always broken
because child gets locks held by other threads.
For example glibc localtime_r() is unsafe after such fork().

Unless child calls execve() right away it cannot do much.
But in this case vfork() works way much better and faster.

> 
>> But not a shiningly satisfactory
>> situation, I'll agree.
>>
>> Hugh
>>
>>>
>>> Reproducer code:
>>>
>>> ====== START hello.c ======
>>> #define FUSE_USE_VERSION 26
>>>
>>> #include <fuse.h>
>>> #include <stdio.h>
>>> #include <string.h>
>>> #include <errno.h>
>>> #include <fcntl.h>
>>> #include <unistd.h>
>>> #include <err.h>
>>> #include <sys/uio.h>
>>>
>>> static const char *hello_path = "/hello";
>>>
>>> static int hello_getattr(const char *path, struct stat *stbuf)
>>> {
>>>          int res = 0;
>>>          memset(stbuf, 0, sizeof(struct stat));
>>>          if (strcmp(path, "/") == 0) {
>>>                  stbuf->st_mode = S_IFDIR | 0755;
>>>                  stbuf->st_nlink = 2;
>>>          } else if (strcmp(path, hello_path) == 0) {
>>>                  stbuf->st_mode = S_IFREG | 0666;
>>>                  stbuf->st_nlink = 1;
>>>                  stbuf->st_size = 0x1000;
>>>                  stbuf->st_blocks = 0;
>>>          } else
>>>                  res = -ENOENT;
>>>          return res;
>>> }
>>>
>>> static int hello_readdir(const char *path, void *buf, fuse_fill_dir_t
>>> filler, off_t offset, struct fuse_file_info *fi) {
>>>          filler(buf, ".", NULL, 0);
>>>          filler(buf, "..", NULL, 0);
>>>          filler(buf, hello_path + 1, NULL, 0);
>>>          return 0;
>>> }
>>>
>>> static int hello_open(const char *path, struct fuse_file_info *fi) {
>>>          return 0;
>>> }
>>>
>>> static int hello_read(const char *path, char *buf, size_t size, off_t
>>> offset, struct fuse_file_info *fi) {
>>>          sleep(3);
>>>          size_t len = 0x1000;
>>>          if (offset < len) {
>>>                  if (offset + size > len)
>>>                          size = len - offset;
>>>                  memset(buf, 0, size);
>>>          } else
>>>                  size = 0;
>>>          return size;
>>> }
>>>
>>> static int hello_write(const char *path, const char *buf, size_t size,
>>> off_t offset, struct fuse_file_info *fi) {
>>>          while(1) pause();
>>> }
>>>
>>> static struct fuse_operations hello_oper = {
>>>          .getattr        = hello_getattr,
>>>          .readdir        = hello_readdir,
>>>          .open           = hello_open,
>>>          .read           = hello_read,
>>>          .write          = hello_write,
>>> };
>>>
>>> int main(int argc, char *argv[]) {
>>>          return fuse_main(argc, argv, &hello_oper, NULL);
>>> }
>>> ====== END hello.c ======
>>>
>>> ====== START simple_mmap.c ======
>>> #define _GNU_SOURCE
>>> #include <pthread.h>
>>> #include <sys/mman.h>
>>> #include <err.h>
>>> #include <unistd.h>
>>> #include <fcntl.h>
>>> #include <stdio.h>
>>> #include <signal.h>
>>> #include <sys/prctl.h>
>>> #include <sys/wait.h>
>>>
>>> __attribute__((aligned(0x1000))) char data_buffer_[0x10000];
>>> #define data_buffer (data_buffer_ + 0x8000)
>>>
>>> void *fuse_thread(void *dummy) {
>>>          /* step 2: start direct I/O on data_buffer */
>>>          int fuse_fd = open("mount/hello", O_RDWR);
>>>          if (fuse_fd == -1)
>>>                  err(1, "unable to open FUSE fd");
>>>          printf("char in parent (before): %hhd\n", data_buffer[0]);
>>>          int res = read(fuse_fd, data_buffer, 0x1000);
>>>          /* step 6: read completes, show post-read state */
>>>          printf("fuse read result: %d\n", res);
>>>          printf("char in parent (after): %hhd\n", data_buffer[0]);
>>> }
>>>
>>> int main(void) {
>>>          /* step 1: make data_buffer dirty */
>>>          data_buffer[0] = 1;
>>>
>>>          pthread_t thread;
>>>          if (pthread_create(&thread, NULL, fuse_thread, NULL))
>>>                  errx(1, "pthread_create");
>>>
>>>          sleep(1);
>>>          /* step 3: fork a child */
>>>          pid_t child = fork();
>>>          if (child == -1)
>>>                  err(1, "fork");
>>>          if (child == 0) {
>>>                  prctl(PR_SET_PDEATHSIG, SIGKILL);
>>>                  sleep(1);
>>>
>>>                  /* step 5: show pre-read state in the child */
>>>                  printf("char in child (before): %hhd\n", data_buffer[0]);
>>>                  sleep(3);
>>>                  /* step 7: read is complete, show post-read state in child */
>>>                  printf("char in child (after): %hhd\n", data_buffer[0]);
>>>                  return 0;
>>>          }
>>>
>>>          /* step 4: de-CoW data_buffer in the parent */
>>>          data_buffer[0x800] = 2;
>>>
>>>          int status;
>>>          if (wait(&status) != child)
>>>                  err(1, "wait");
>>> }
>>> ====== END simple_mmap.c ======
>>>
>>> Repro steps:
>>>
>>> In one terminal:
>>> $ mkdir mount
>>> $ gcc -o hello hello.c -Wall -std=gnu99 `pkg-config fuse --cflags --libs`
>>> hello.c: In function ‘hello_write’:
>>> hello.c:67:1: warning: no return statement in function returning
>>> non-void [-Wreturn-type]
>>>   }
>>>   ^
>>> $ ./hello -d -o direct_io mount
>>> FUSE library version: 2.9.7
>>> [...]
>>>
>>> In a second terminal:
>>> $ gcc -pthread -o simple_mmap simple_mmap.c
>>> $ ./simple_mmap
>>> char in parent (before): 1
>>> char in child (before): 1
>>> fuse read result: 4096
>>> char in parent (after): 1
>>> char in child (after): 0
>>>
>>> I have tested that this still works on 4.19.0-rc3+.
>>>
>>> As far as I can tell, the fix would be to immediately copy pages with
>>> `refcount - mapcount > N` in dup_mmap(), or something like that?
>>>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] mm: direct I/O (using GUP) can write to COW anonymous pages
  2018-09-18  0:35   ` Jann Horn
  2018-09-18  9:13     ` Konstantin Khlebnikov
@ 2018-09-18  9:58     ` Jan Kara
  2018-09-26  5:00       ` John Hubbard
  1 sibling, 1 reply; 7+ messages in thread
From: Jan Kara @ 2018-09-18  9:58 UTC (permalink / raw)
  To: Jann Horn
  Cc: Hugh Dickins, Dan Williams, Andrew Morton, Michal Hocko,
	Rik van Riel, Andrea Arcangeli, Konstantin Khlebnikov, sqazi,
	Michael S. Tsirkin, jack, kernel list, Linux-MM, Miklos Szeredi,
	john.hubbard

On Tue 18-09-18 02:35:43, Jann Horn wrote:
> On Tue, Sep 18, 2018 at 2:05 AM Hugh Dickins <hughd@google.com> wrote:

Thanks for CC Hugh.

> > On Mon, 17 Sep 2018, Jann Horn wrote:
> >
> > > [I'm not sure who the best people to ask about this are, I hope the
> > > recipient list resembles something reasonable...]
> > >
> > > I have noticed that the dup_mmap() logic on fork() doesn't handle
> > > pages with active direct I/O properly: dup_mmap() seems to assume that
> > > making the PTE referencing a page readonly will always prevent future
> > > writes to the page, but if the kernel has acquired a direct reference
> > > to the page before (e.g. via get_user_pages_fast()), writes can still
> > > happen that way.
> > >
> > > The worst-case effect of this - as far as I can tell - is that when a
> > > multithreaded process forks while one thread is in the middle of
> > > sys_read() on a file that uses direct I/O with get_user_pages_fast(),
> > > the read data can become visible in the child while the parent's
> > > buffer stays uninitialized if the parent writes to a relevant page
> > > post-fork before either the I/O completes or the child writes to it.
> >
> > Yes: you're understandably more worried by the one seeing the other's
> > data;
> 
> Actually, I was mostly just trying to find a scenario in which the
> parent doesn't get the data it's asking for, and this is the simplest
> I could come up with. :)
> 
> I was also vaguely worried about whether some other part of the mm
> subsystem might assume that COW pages are immutable, but I haven't
> found anything like that so far, so that might've been unwarranted
> paranoia.

It's actually warranted paranoia. There are situations where filesystems
don't expect *shared file* page to be written when all pages tables are
write-protected - you can have a look at https://lwn.net/Articles/753027/
for a discussion from LSF/MM on this.  And as I've learned from Nick Piggin
people were aware of this problem over 10 years ago -
https://lkml.org/lkml/2018/7/9/217. Just nobody put enough effort into
fixing this.

> > we've tended in the past to be more worried about the one getting
> > corruption, and the other not seeing the data it asked for (and usually
> > in the context of RDMA, rather than filesystem direct I/O).
> >
> > I've added some Cc's: I might be misremembering, but I think both
> > Andrea and Konstantin have offered approaches to this in the past,
> > and I believe Salman is taking a look at it currently.
> >
> > But my own interest ended when Michael added MADV_DONTFORK: beyond
> > that, we've rated it a "Patient: It hurts when I do this. Doctor:
> > Don't do that then" - more complexity and overhead to solve, than
> > we have had appetite to get into.
> 
> Makes sense, I guess.
> 
> I wonder whether there's a concise way to express this in the fork.2
> manpage, or something like that. Maybe I'll take a stab at writing
> something. The biggest issue I see with documenting this edgecase is
> that, as an application developer, if you don't know whether some file
> might be coming from a FUSE filesystem that has opted out of using the
> disk cache, the "don't do that" essentially becomes "don't read() into
> heap buffers while fork()ing in another thread", since with FUSE,
> direct I/O can happen even if you don't open files as O_DIRECT as long
> as the filesystem requests direct I/O, and get_user_pages_fast() will
> AFAIU be used for non-page-aligned buffers, meaning that an adjacent
> heap memory access could trigger CoW page duplication. But then, FUSE
> filesystems that opt out of the disk cache are probably so rare that
> it's not a concern in practice...

So at least for shared file mappings we do need to fix this issue as it's
currently userspace triggerable Oops if you try hard enough. And with RDMA
you don't even have to try that hard. Properly dealing with private
mappings should not be that hard once the infrastructure is there I hope
but I didn't seriously look into that. I've added Miklos and John to CC as
they are interested as well. John was working on fixing this problem -
https://lkml.org/lkml/2018/7/9/158 - but I didn't hear from him for quite a
while so I'm not sure whether it died off or what's the current situation.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] mm: direct I/O (using GUP) can write to COW anonymous pages
  2018-09-18  9:58     ` Jan Kara
@ 2018-09-26  5:00       ` John Hubbard
  0 siblings, 0 replies; 7+ messages in thread
From: John Hubbard @ 2018-09-26  5:00 UTC (permalink / raw)
  To: Jan Kara, Jann Horn
  Cc: Hugh Dickins, Dan Williams, Andrew Morton, Michal Hocko,
	Rik van Riel, Andrea Arcangeli, Konstantin Khlebnikov, sqazi,
	Michael S. Tsirkin, kernel list, Linux-MM, Miklos Szeredi,
	john.hubbard

On 9/18/18 2:58 AM, Jan Kara wrote:
> On Tue 18-09-18 02:35:43, Jann Horn wrote:
>> On Tue, Sep 18, 2018 at 2:05 AM Hugh Dickins <hughd@google.com> wrote:
> 
> Thanks for CC Hugh.
> 
>>> On Mon, 17 Sep 2018, Jann Horn wrote:
>>>
>>
>> Makes sense, I guess.
>>
>> I wonder whether there's a concise way to express this in the fork.2
>> manpage, or something like that. Maybe I'll take a stab at writing
>> something. The biggest issue I see with documenting this edgecase is
>> that, as an application developer, if you don't know whether some file
>> might be coming from a FUSE filesystem that has opted out of using the
>> disk cache, the "don't do that" essentially becomes "don't read() into
>> heap buffers while fork()ing in another thread", since with FUSE,
>> direct I/O can happen even if you don't open files as O_DIRECT as long
>> as the filesystem requests direct I/O, and get_user_pages_fast() will
>> AFAIU be used for non-page-aligned buffers, meaning that an adjacent
>> heap memory access could trigger CoW page duplication. But then, FUSE
>> filesystems that opt out of the disk cache are probably so rare that
>> it's not a concern in practice...
> 
> So at least for shared file mappings we do need to fix this issue as it's
> currently userspace triggerable Oops if you try hard enough. And with RDMA
> you don't even have to try that hard. Properly dealing with private
> mappings should not be that hard once the infrastructure is there I hope
> but I didn't seriously look into that. I've added Miklos and John to CC as
> they are interested as well. John was working on fixing this problem -
> https://lkml.org/lkml/2018/7/9/158 - but I didn't hear from him for quite a
> while so I'm not sure whether it died off or what's the current situation.
> 

Hi,

Sorry for missing this even though I was CC'd, I only just now noticed it, while
trying to get caught up again.

Anyway, I've been sidetracked for a...while (since July!), but am jumping back 
in and working on this now. And I've got time allocated for it. So here goes.

thanks,
-- 
John Hubbard
NVIDIA

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2018-09-26  5:00 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-17 16:12 [BUG] mm: direct I/O (using GUP) can write to COW anonymous pages Jann Horn
2018-09-18  0:05 ` Hugh Dickins
2018-09-18  0:19   ` Salman Qazi
2018-09-18  0:35   ` Jann Horn
2018-09-18  9:13     ` Konstantin Khlebnikov
2018-09-18  9:58     ` Jan Kara
2018-09-26  5:00       ` John Hubbard

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).