* splice() on two pipes
@ 2009-04-29 10:33 Max Kellermann
  2009-04-29 15:23 ` Andi Kleen
  2009-04-30  6:21 ` Jens Axboe
  0 siblings, 2 replies; 7+ messages in thread
From: Max Kellermann @ 2009-04-29 10:33 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 667 bytes --]

Hi,

when I read about the splice() system call, I thought it was obvious
that it could copy data between two pipes.  I was surprised that this
assumption is wrong, it's not possible on 2.6.29, I get EINVAL.  Can
anybody please explain this limitation?

Background: I want to forward data between two subprocesses, which are
connected to me with a pipe().

I have attached a small test program which prints a table of supported
splice operations.  Here's the output on 2.6.29.1:

 in\out  pipe    sock    reg     chr
 pipe    no      yes     yes     yes
 sock    no      no      no      no
 reg     yes     no      no      no
 chr     no      no      no      no

Max

[-- Attachment #2: test_splice.c --]
[-- Type: text/x-csrc, Size: 2971 bytes --]

/*
 * Copyright (C) 2009 Max Kellermann <max@duempel.org>
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * - Redistributions of source code must retain the above copyright
 * notice, this list of conditions and the following disclaimer.
 *
 * - Redistributions in binary form must reproduce the above copyright
 * notice, this list of conditions and the following disclaimer in the
 * documentation and/or other materials provided with the
 * distribution.
 */

/*
 * This tiny program prints a matrix: which file descriptor
 * combinations are supported by splice()?
 */

#define _GNU_SOURCE
#include <fcntl.h>
#include <errno.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>

static struct {
    const char *const name;
    int in, out;
} fds[] = {
    { .name = "pipe", },
    { .name = "sock", },
    { .name = "reg", },
    { .name = "chr", },
};

enum {
    NUM_FDS = sizeof(fds) / sizeof(fds[0]),
};

int main(int argc, char **argv)
{
    int f[2], ret;
    unsigned x, y;
    char template1[] = "/tmp/test_splice.XXXXXX";
    char template2[] = "/tmp/test_splice.XXXXXX";

    (void)argc;
    (void)argv;

    /* open two file descriptors of each kind */

    fds[0].in = pipe(f) >= 0 ? f[0] : -1;
    fds[0].out = pipe(f) >= 0 ? f[1] : -1;
    fds[1].in = socketpair(AF_UNIX, SOCK_STREAM, 0, f) >= 0 ? f[0] : -1;
    fds[1].out = socketpair(AF_UNIX, SOCK_STREAM, 0, f) >= 0 ? f[0] : -1;
    fds[2].in = mkstemp(template1);
    fds[2].out = mkstemp(template2);
    fds[3].in = open("/dev/zero", O_RDONLY);
    fds[3].out = open("/dev/null", O_WRONLY);

    /* print table header */

    printf("in\\out");
    for (x = 0; x < NUM_FDS; ++x)
        printf("\t%s", fds[x].name);
    putchar('\n');

    for (y = 0; y < NUM_FDS; ++y) {
        fputs(fds[y].name, stdout);

        for (x = 0; x < NUM_FDS; ++x) {
            putchar('\t');

            if (fds[x].out < 0 || fds[y].in < 0) {
                fputs("n/a", stdout);
                continue;
            }

            ret = splice(fds[y].in, NULL, fds[x].out, NULL, 1,
                         SPLICE_F_NONBLOCK);
            if (ret >= 0 || errno == EAGAIN || errno == EWOULDBLOCK)
                /* EAGAIN or EWOULDBLOCK means that the kernel has
                   accepted this combination, but can't move pages
                   right now */
                fputs("yes", stdout);
            else if (errno == EINVAL)
                /* the kernel doesn't support this combination */
                fputs("no", stdout);
            else if (errno == ENOSYS)
                /* splice() isn't supported at all */
                fputs("ENOSYS", stdout);
            else
                /* an unexpected error code */
                fputs("err", stdout);
        }

        putchar('\n');
    }

    unlink(template1);
    unlink(template2);
}


* Re: splice() on two pipes
  2009-04-29 10:33 splice() on two pipes Max Kellermann
@ 2009-04-29 15:23 ` Andi Kleen
  2009-04-29 19:42   ` Max Kellermann
  2009-04-30  6:21 ` Jens Axboe
  1 sibling, 1 reply; 7+ messages in thread
From: Andi Kleen @ 2009-04-29 15:23 UTC (permalink / raw)
  To: linux-kernel

Max Kellermann <max@duempel.org> writes:

I don't think splice is about handling all possible cases,
but just cases where the kernel can do better than user space.
I don't think that's the case here.

> when I read about the splice() system call, I thought it was obvious
> that it could copy data between two pipes. 

It would be more efficient if you used fd passing to pass the fd
around to the other process and let it read directly.

-Andi
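
A minimal sketch of the fd passing suggested above, assuming the two
processes already share a connected AF_UNIX socket. The helper names
send_fd() and recv_fd() are hypothetical and not from this thread; the
SCM_RIGHTS mechanism itself is the one documented in unix(7) and cmsg(3).

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send file descriptor `fd` over the AF_UNIX
 * socket `sock` as SCM_RIGHTS ancillary data.  Returns 0 or -1. */
static int send_fd(int sock, int fd)
{
    char dummy = 0;   /* at least one byte of real data must be sent */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {           /* union forces correct alignment for cmsghdr */
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(fd >= 0);
    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor sent by send_fd().
 * Returns the new descriptor, or -1 on failure. */
static int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

With this approach the receiving process reads from the pipe itself, so
the forwarding process no longer sees (or controls) the data once the
descriptor has been handed over.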


-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: splice() on two pipes
  2009-04-29 15:23 ` Andi Kleen
@ 2009-04-29 19:42   ` Max Kellermann
  2009-04-30  4:56     ` Willy Tarreau
  0 siblings, 1 reply; 7+ messages in thread
From: Max Kellermann @ 2009-04-29 19:42 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel

On 2009/04/29 17:23, Andi Kleen <andi@firstfloor.org> wrote:
> I don't think splice is about handling all possible cases,
> but just cases where the kernel can do better than user space.
> I don't think that's the case here.

If splice() is about passing pointers of a pipe buffer, what's more
trivial (and natural) than passing that pointer between two pipes?

> > when I read about the splice() system call, I thought it was obvious
> > that it could copy data between two pipes. 
> 
> It would be more efficient if you used fd passing to pass the fd
> around to the other process and let it read directly.

That's not so easy in my case.  The header output of the one process
has to be parsed before the rest of it (or part of the rest) is going
to be forwarded to the second one.  My master process would lose
control over the transfer.  splice() looks like the perfect solution.

Max
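
A rough sketch of the use case described above: parse a header in the
master process, then forward a known number of body bytes with
splice(). It assumes a kernel where splice() accepts two pipes; the
splice(2) man page lists this as supported since Linux 2.6.31, whereas
the 2.6.29.1 table earlier in this thread shows "no". The
newline-terminated header format and the names read_header_line() and
forward_body() are assumptions made for illustration only.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: read one newline-terminated header line from
 * `fd`, one byte at a time so that nothing beyond the header is
 * consumed.  Stores a NUL-terminated string (without the newline) in
 * `buf` and returns its length, or -1 on a read error. */
static ssize_t read_header_line(int fd, char *buf, size_t size)
{
    size_t n = 0;

    assert(size > 0);
    while (n + 1 < size) {
        char c;
        ssize_t r = read(fd, &c, 1);

        if (r < 0)
            return -1;
        if (r == 0 || c == '\n')
            break;
        buf[n++] = c;
    }
    buf[n] = '\0';
    return (ssize_t)n;
}

/* Hypothetical helper: forward up to `len` body bytes from one pipe to
 * another without copying them through user space.  Returns the number
 * of bytes moved, 0 on end of input, or -1 with errno set (EINVAL on
 * kernels that reject pipe-to-pipe splicing). */
static ssize_t forward_body(int pipe_in, int pipe_out, size_t len)
{
    return splice(pipe_in, NULL, pipe_out, NULL, len, 0);
}
```

Because splice() may move fewer bytes than requested, a real forwarder
would call forward_body() in a loop until the whole body has been
transferred.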


* Re: splice() on two pipes
  2009-04-29 19:42   ` Max Kellermann
@ 2009-04-30  4:56     ` Willy Tarreau
  2009-04-30 14:26       ` Mark Hills
  0 siblings, 1 reply; 7+ messages in thread
From: Willy Tarreau @ 2009-04-30  4:56 UTC (permalink / raw)
  To: Andi Kleen, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1487 bytes --]

On Wed, Apr 29, 2009 at 09:42:55PM +0200, Max Kellermann wrote:
> On 2009/04/29 17:23, Andi Kleen <andi@firstfloor.org> wrote:
> > I don't think splice is about handling all possible cases,
> > but just cases where the kernel can do better than user space.
> > I don't think that's the case here.
> 
> If splice() is about passing pointers of a pipe buffer, what's more
> trivial (and natural) than passing that pointer between two pipes?
> 
> > > when I read about the splice() system call, I thought it was obvious
> > > that it could copy data between two pipes. 
> > 
> > It would be more efficient if you used fd passing to pass the fd
> > around to the other process and let it read directly.
> 
> That's not so easy in my case.  The header output of the one process
> has to be parsed before the rest of it (or part of the rest) is going
> to be forwarded to the second one.  My master process would lose
> control over the transfer.  splice() looks like the perfect solution.

Indeed, that could make sense. From what I have seen in the splicing
code, I think that implementing pipe to pipe should not be *that* hard,
starting from existing code (e.g. net to pipe). Maybe you could try to
implement it, since you have the code which makes use of it? I think
it is the kind of feature which can only improve step by step based
on application needs.

BTW, I like your test program. Simple and easy. I have completed it to
test tcp and udp, you can find it attached.

Regards,
Willy


[-- Attachment #2: test_splice.c --]
[-- Type: text/plain, Size: 3240 bytes --]

/*
 * Copyright (C) 2009 Max Kellermann <max@duempel.org>
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * - Redistributions of source code must retain the above copyright
 * notice, this list of conditions and the following disclaimer.
 *
 * - Redistributions in binary form must reproduce the above copyright
 * notice, this list of conditions and the following disclaimer in the
 * documentation and/or other materials provided with the
 * distribution.
 */

/*
 * This tiny program prints a matrix: which file descriptor
 * combinations are supported by splice()?
 */

#define _GNU_SOURCE
#include <fcntl.h>
#include <errno.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>

static struct {
    const char *const name;
    int in, out;
} fds[] = {
    { .name = "pipe", },
    { .name = "reg", },
    { .name = "chr", },
    { .name = "unix", },
    { .name = "tcp", },
    { .name = "udp", },
};

enum {
    NUM_FDS = sizeof(fds) / sizeof(fds[0]),
};

int main(int argc, char **argv)
{
    int f[2], ret;
    unsigned x, y;
    char template1[] = "/tmp/test_splice.XXXXXX";
    char template2[] = "/tmp/test_splice.XXXXXX";

    (void)argc;
    (void)argv;

    /* open two file descriptors of each kind */

    fds[0].in = pipe(f) >= 0 ? f[0] : -1;
    fds[0].out = pipe(f) >= 0 ? f[1] : -1;
    fds[1].in = mkstemp(template1);
    fds[1].out = mkstemp(template2);
    fds[2].in = open("/dev/zero", O_RDONLY);
    fds[2].out = open("/dev/null", O_WRONLY);
    fds[3].in = socketpair(AF_UNIX, SOCK_STREAM, 0, f) >= 0 ? f[0] : -1;
    fds[3].out = socketpair(AF_UNIX, SOCK_STREAM, 0, f) >= 0 ? f[0] : -1;
    fds[4].in  = socket(AF_INET, SOCK_STREAM, 0);
    fds[4].out = socket(AF_INET, SOCK_STREAM, 0);
    fds[5].in  = socket(AF_INET, SOCK_DGRAM, 0);
    fds[5].out = socket(AF_INET, SOCK_DGRAM, 0);

    /* print table header */

    printf("in\\out");
    for (x = 0; x < NUM_FDS; ++x)
        printf("\t%s", fds[x].name);
    putchar('\n');

    for (y = 0; y < NUM_FDS; ++y) {
        fputs(fds[y].name, stdout);

        for (x = 0; x < NUM_FDS; ++x) {
            putchar('\t');

            if (fds[x].out < 0 || fds[y].in < 0) {
                fputs("n/a", stdout);
                continue;
            }

            ret = splice(fds[y].in, NULL, fds[x].out, NULL, 1,
                         SPLICE_F_NONBLOCK);
            if (ret >= 0 || errno == EAGAIN || errno == EWOULDBLOCK
                || errno == ENOTCONN)
                /* EAGAIN or EWOULDBLOCK means that the kernel has
                   accepted this combination, but can't move pages
                   right now; ENOTCONN (returned for the unconnected
                   sockets created above) is treated the same way */
                fputs("yes", stdout);
            else if (errno == EINVAL)
                /* the kernel doesn't support this combination */
                fputs("no", stdout);
            else if (errno == ENOSYS)
                /* splice() isn't supported at all */
                fputs("ENOSYS", stdout);
            else
                /* an unexpected error code */
                fputs("err", stdout);
        }

        putchar('\n');
    }

    unlink(template1);
    unlink(template2);
}


* Re: splice() on two pipes
  2009-04-29 10:33 splice() on two pipes Max Kellermann
  2009-04-29 15:23 ` Andi Kleen
@ 2009-04-30  6:21 ` Jens Axboe
  2009-04-30  6:42   ` Max Kellermann
  1 sibling, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2009-04-30  6:21 UTC (permalink / raw)
  To: Max Kellermann; +Cc: linux-kernel

On Wed, Apr 29 2009, Max Kellermann wrote:
> Hi,
> 
> when I read about the splice() system call, I thought it was obvious
> that it could copy data between two pipes.  I was surprised that this
> assumption is wrong, it's not possible on 2.6.29, I get EINVAL.  Can
> anybody please explain this limitation?
> 
> Background: I want to forward data between two subprocesses, which are
> connected to me with a pipe().
> 
> I have attached a small test program which prints a table of supported
> splice operations.  Here's the output on 2.6.29.1:
> 
>  in\out  pipe    sock    reg     chr
>  pipe    no      yes     yes     yes
>  sock    no      no      no      no
>  reg     yes     no      no      no
>  chr     no      no      no      no

See sys_tee().

-- 
Jens Axboe



* Re: splice() on two pipes
  2009-04-30  6:21 ` Jens Axboe
@ 2009-04-30  6:42   ` Max Kellermann
  0 siblings, 0 replies; 7+ messages in thread
From: Max Kellermann @ 2009-04-30  6:42 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel

On 2009/04/30 08:21, Jens Axboe <jens.axboe@oracle.com> wrote:
> See sys_tee().

Hm, so I could tee() from pipe1 to pipe2, then delete data from pipe1
by splicing it to /dev/null...

Max
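
A minimal sketch of the workaround described above, assuming a single
reader on pipe1: tee() duplicates the data into pipe2 without consuming
it, and a follow-up splice() to /dev/null discards the same number of
bytes from pipe1. The function name forward_via_tee() is hypothetical;
tee() and splice() are used with the signatures from their man pages.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: forward up to `len` bytes from `pipe_in` to
 * `pipe_out`.  `devnull_fd` must be /dev/null opened for writing.
 * Returns the number of bytes forwarded, 0 on end of input, or -1 on
 * error with errno set. */
static ssize_t forward_via_tee(int pipe_in, int pipe_out,
                               int devnull_fd, size_t len)
{
    ssize_t copied, drained;
    size_t left;

    assert(len > 0);

    /* step 1: duplicate the data; pipe_in still holds it afterwards */
    copied = tee(pipe_in, pipe_out, len, 0);
    if (copied <= 0)
        return copied;

    /* step 2: consume exactly the duplicated bytes from pipe_in */
    for (left = (size_t)copied; left > 0; left -= (size_t)drained) {
        drained = splice(pipe_in, NULL, devnull_fd, NULL, left, 0);
        if (drained < 0)
            return -1;
        if (drained == 0) {
            /* the data vanished, e.g. another reader took it */
            errno = EIO;
            return -1;
        }
    }

    return copied;
}
```

This costs two system calls per chunk instead of one, and it is only
safe while no other process reads from pipe1 between the two steps.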


* Re: splice() on two pipes
  2009-04-30  4:56     ` Willy Tarreau
@ 2009-04-30 14:26       ` Mark Hills
  0 siblings, 0 replies; 7+ messages in thread
From: Mark Hills @ 2009-04-30 14:26 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Andi Kleen, linux-kernel

On Thu, 30 Apr 2009, Willy Tarreau wrote:

> BTW, I like your test program. Simple and easy. I have completed it to 
> test tcp and udp, you can find it attached.

I think the ability to splice to a unix socket shown here may be
misleading. Can someone offer some insight into my previous post
(22nd April, "splice() on unix sockets")? Does splice to a unix socket
actually result in zero copy?

   $ ./test_splice
   in\out  pipe    reg     chr     unix    tcp     udp
   pipe    no      yes     yes     yes     yes     yes
   reg     yes     no      no      no      no      no
   chr     no      no      no      no      no      no
   unix    no      no      no      no      no      no
   tcp     yes     no      no      no      no      no
   udp     no      no      no      no      no      no

Thanks

-- 
Mark
