io-uring.vger.kernel.org archive mirror
* [PATCHSET v1 RFC liburing 0/6] Implement the kernel style return value
@ 2021-09-29 10:16 Ammar Faizi
  2021-09-29 10:16 ` [PATCHSET v1 RFC liburing 1/6] src/syscall: " Ammar Faizi
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Ammar Faizi @ 2021-09-29 10:16 UTC (permalink / raw)
  To: Jens Axboe, Pavel Begunkov
  Cc: io-uring Mailing List, Louvian Lyndal, Ammar Faizi

Hi Jens,
Hi Pavel,

This is v1 of the RFC to implement the kernel style return value.

Motivation:
Currently liburing depends on libc. We want to make it possible to
build liburing without libc.

This idea was first posted as an issue on the liburing GitHub
repository: https://github.com/axboe/liburing/issues/443

The subject of the issue is: "An option to use liburing without libc?".

On Mon, Sep 27, 2021 at 4:18 PM Mahdi Rakhshandehroo <notifications@github.com> wrote:
> There are a couple of issues with liburing's libc dependency:
> 
>  1) libc implementations of errno, malloc, pthread etc. tend to
>     pollute the binary with unwanted global/thread-local state.
>     This makes reentrancy impossible and relocations expensive.
>  2) libc doesn't play nice with non-POSIX threading models, like
>     green threads with small stack sizes, or direct use of the
>     clone() system call. This makes interop with other
>     languages/runtimes difficult.
> 
> One could use the raw syscall interface to io_uring to address these
> concerns, but that would be somewhat painful, so it would be nice
> for liburing to support this use case out of the box. Perhaps
> something like a NOLIBC macro could be added which, if defined,
> would patch out libc constructs and replace them with non-libc
> wrappers where applicable. A few API changes might be necessary for
> the non-libc case (e.g. io_uring_get_probe/io_uring_free_probe), but
> it shouldn't break existing applications as long as it's opt-in.

----------------------------------------------------------------

### 1) Introduction

We want to make the changes incrementally, starting with making it
possible to remove the dependency on the `errno` variable.

So this RFC aims to make it possible to remove the `errno` variable
dependency from the liburing sources by implementing the kernel style
return value.

What we mean by "kernel style return value" is that we wrap the
syscall API to make it return a negative error code when an error
happens, like we usually do in kernel space code. So the caller doesn't
have to check the `errno` variable.
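
To illustrate the convention, here is a minimal sketch for a libc
environment. The names `kstyle_close()` and `close_and_report()` are
hypothetical and used only for this example; they are not part of the
series.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only: wrap close(2) so that it
 * returns 0 on success or a negative error code on failure, instead of
 * returning -1 and setting errno.
 */
static inline int kstyle_close(int fd)
{
	int ret;

	ret = close(fd);
	return (ret < 0) ? -errno : ret;
}

/*
 * Caller side: the error code is taken from the return value, so the
 * caller never reads errno.
 */
static inline int close_and_report(int fd)
{
	int ret;

	ret = kstyle_close(fd);
	if (ret < 0)
		return ret;	/* e.g. -EBADF for an invalid fd */
	return 0;
}
```

Only the wrapper touches `errno`; every caller above it works purely
with the returned value.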

If we can land this "kernel style return value" in liburing, we will
start working on a series to support building without libc. These
changes will not break userland, and no functional changes will be
visible to users (they only affect liburing internal sources).


### 2) How to deal with __sys_io_uring_{register,setup,enter2,enter}

Currently we expose these functions (**AAA**) to userland:
**AAA**:
  1) `__sys_io_uring_register`
  2) `__sys_io_uring_setup`
  3) `__sys_io_uring_enter2`
  4) `__sys_io_uring_enter`

These functions are used by several tests. As userland needs to
check the `errno` value to use them properly, those functions always
depend on libc. So we cannot change their behavior.

As such, we remove those functions (**AAA**) only in the **no libc**
environment case.

Then we introduce new functions (**BBB**) with the same names, but with
two extra underscores as a prefix (4 underscores in total). These
functions do not rely on the `errno` variable on the caller side (they
use the kernel style return value) and always exist regardless of
whether libc exists.

**BBB**:
  1) `____sys_io_uring_register`
  2) `____sys_io_uring_setup`
  3) `____sys_io_uring_enter2`
  4) `____sys_io_uring_enter`
    
Summary
  1) **AAA** will only exist for the libc environment.

  2) **BBB** always exists.

  3) Do not use **AAA** in the liburing internals (it exists only for
     userland backward compatibility).

  4) For the libc environment, **BBB** may use `syscall(2)` and the
     `errno` variable, only to emulate the kernel style return value.

  5) For the no libc environment, **BBB** will use an Assembly
     interface to perform the syscall (arch dependent).

  6) Tests should not be affected, because (1) and (4) keep
     compatibility.
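
The split between **AAA** and **BBB** could be laid out roughly as
sketched below for one of the four functions. This is a sketch under
assumptions: the `LIBURING_NOLIBC` macro name is borrowed from the
example in section 4 and may not be final, the fallback syscall number
is only there so the sketch builds with older headers, and the no libc
branch is left as a placeholder because the Assembly part is not
written yet.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for syscall(2) in <unistd.h> */
#endif
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_io_uring_setup
#define __NR_io_uring_setup 425	/* generic number; alpha uses a different one */
#endif

struct io_uring_params;	/* opaque in this sketch, only passed through */

#ifndef LIBURING_NOLIBC
/* AAA: exists only when libc is available; reports errors via errno. */
int __sys_io_uring_setup(unsigned entries, struct io_uring_params *p)
{
	return syscall(__NR_io_uring_setup, entries, p);
}
#endif

/* BBB: always exists; returns a negative error code on failure. */
int ____sys_io_uring_setup(unsigned entries, struct io_uring_params *p)
{
#ifdef LIBURING_NOLIBC
#error "sketch only: an arch-specific raw syscall would go here"
#else
	int ret;

	ret = syscall(__NR_io_uring_setup, entries, p);
	return (ret < 0) ? -errno : ret;
#endif
}
```

Existing users of **AAA** keep the `-1` plus `errno` behavior, while
liburing internals only ever call **BBB**.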


### 3) How to deal with syscalls

We have 3 patches in this series to wrap the syscalls:
  - Add `liburing_mmap()` and `liburing_munmap()`
  - Add `liburing_madvise()`
  - Add `liburing_getrlimit()` and `liburing_setrlimit()`

`liburing_{munmap,madvise,getrlimit,setrlimit}` return a negative
error code on error. They just return an int, so there is nothing
special to handle.
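
As a rough illustration of such an int-returning wrapper in the libc
environment, consider the sketch below. The name `sketch_madvise()` is
hypothetical; the actual `liburing_madvise()` is in the corresponding
patch of this series.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for madvise(2) and MAP_ANONYMOUS */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical wrapper, for illustration only: same arguments as
 * madvise(2), but a failure is reported as a negative error code in
 * the return value rather than through errno.
 */
static inline int sketch_madvise(void *addr, size_t length, int advice)
{
	int ret;

	ret = madvise(addr, length, advice);
	return (ret < 0) ? -errno : ret;
}
```

A caller can then test `ret < 0` and propagate `ret` as is.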

The special case is a pointer return value, as in `liburing_mmap()`. In
this case we take the `include/linux/err.h` file from the Linux kernel
source tree and use `IS_ERR()`, `PTR_ERR()`, `ERR_PTR()` to deal with
it.

It is implemented in patch:
  - Add kernel error header `src/kernel_err.h`
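
The pointer encoding can be sketched in isolation as follows. The three
helpers are simplified local copies of the ones in the header (without
the `__must_check` and likely/unlikely annotations), and `get_slot()`
with its `slots` array is a hypothetical function invented for this
example.

```c
#include <errno.h>
#include <stdbool.h>

#define MAX_ERRNO 4095

/* Encode a negative error code as a pointer value. */
static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

/* Decode the error code from an error-valued pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

/* True if the pointer value falls in the last 4095 addresses. */
static inline bool IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

static int slots[4];

/*
 * Hypothetical function: returns a valid pointer on success, or an
 * error code encoded in the pointer on failure.
 */
static inline int *get_slot(unsigned idx)
{
	if (idx >= 4)
		return ERR_PTR(-EINVAL);
	return &slots[idx];
}
```

A caller checks `IS_ERR(p)` first and only then either dereferences `p`
or returns `PTR_ERR(p)`. Note that `NULL` is not an error value in this
scheme.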


### 4) How can this help to support no libc environment?

When this kernel style return value gets adopted in liburing, we will
start working on raw syscalls written directly in Assembly (arch
dependent).

I (Ammar Faizi) will start kicking the tires on the x86-64 arch.
Hopefully we will get support for other architectures as well.

An example liburing syscall wrapper may look like this:

```c
void *liburing_mmap(void *addr, size_t length, int prot, int flags,
		    int fd, off_t offset)
{
#ifdef LIBURING_NOLIBC
	/*
	 * This is when we build without libc.
	 *
	 * Assume __raw_mmap is the syscall written in ASM.
	 *
	 * The return value is directly taken from the syscall
	 * return value.
	 */
	return __raw_mmap(addr, length, prot, flags, fd, offset);
#else
	/*
	 * This is when libc exists.
	 */
	void *ret;

	ret = mmap(addr, length, prot, flags, fd, offset);
	if (ret == MAP_FAILED)
		ret = ERR_PTR(-errno);

	return ret;
#endif
}
```
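
For the `__raw_mmap` part assumed above, a raw syscall on x86-64 might
look like the sketch below. This is not the planned implementation,
only an illustration using GCC/Clang inline Assembly; the names
`sketch_raw_syscall6()` and `sketch_raw_mmap()` are hypothetical. On
other architectures the sketch falls back to libc so that it still
builds.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for MAP_ANONYMOUS */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>

#if defined(__x86_64__)
/*
 * x86-64 syscall ABI: number in rax, arguments in rdi, rsi, rdx, r10,
 * r8, r9. The `syscall` instruction clobbers rcx and r11. The kernel
 * returns the result (or a negative error code) in rax.
 */
static inline long sketch_raw_syscall6(long nr, long a1, long a2, long a3,
				       long a4, long a5, long a6)
{
	long ret;
	register long r10 __asm__("r10") = a4;
	register long r8 __asm__("r8") = a5;
	register long r9 __asm__("r9") = a6;

	__asm__ volatile(
		"syscall"
		: "=a"(ret)
		: "0"(nr), "D"(a1), "S"(a2), "d"(a3),
		  "r"(r10), "r"(r8), "r"(r9)
		: "rcx", "r11", "memory"
	);
	return ret;
}

static inline void *sketch_raw_mmap(void *addr, size_t length, int prot,
				    int flags, int fd, off_t offset)
{
	/* The raw return value already is a pointer or a negative error. */
	return (void *)sketch_raw_syscall6(__NR_mmap, (long)addr,
					   (long)length, prot, flags, fd,
					   (long)offset);
}
#else
/* Fallback for this sketch only: emulate the same contract via libc. */
static inline void *sketch_raw_mmap(void *addr, size_t length, int prot,
				    int flags, int fd, off_t offset)
{
	void *ret;

	ret = mmap(addr, length, prot, flags, fd, offset);
	if (ret == MAP_FAILED)
		ret = (void *)(long)-errno;
	return ret;
}
#endif
```

Because the kernel already returns a value in [-4095, -1] on failure,
the raw variant needs no conversion step; the caller can apply
`IS_ERR()` and `PTR_ERR()` to the result directly.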

----------------------------------------------------------------
The following changes since commit ce10538688b93dafd257ebfed7faf18844e0052d:

  test: Fix endianess issue on `bind()` and `connect()` (2021-09-27 07:45:03 -0600)

based on:

  git://git.kernel.dk/liburing.git master

are available as 6 patches in this series; all will be posted as
responses to this one.

If you want to fetch the git tag, it is available in the Git repository at:

  git://github.com/ammarfaizi2/liburing.git tags/nolibc-support-rfc-v1

Please review!

----------------------------------------------------------------
Ammar Faizi (6):
      src/syscall: Implement the kernel style return value
      Add kernel error header `src/kernel_err.h`
      Add `liburing_mmap()` and `liburing_munmap()`
      Add `liburing_madvise()`
      Add `liburing_getrlimit()` and `liburing_setrlimit()`
      src/{queue,register,setup}: Remove `#include <errno.h>`

 src/kernel_err.h |  75 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 src/queue.c      |  28 +++++++++----------------
 src/register.c   | 189 +++++++++++++++++++++++++++++++++++++++++++++++++++++------------------------------------------------------------------------------------------------------------------
 src/setup.c      |  60 ++++++++++++++++++++++++++++-------------------------
 src/syscall.c    |  92 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 src/syscall.h    |  18 ++++++++++++++++
 6 files changed, 284 insertions(+), 178 deletions(-)
 create mode 100644 src/kernel_err.h

--
Ammar Faizi




* [PATCHSET v1 RFC liburing 1/6] src/syscall: Implement the kernel style return value
  2021-09-29 10:16 [PATCHSET v1 RFC liburing 0/6] Implement the kernel style return value Ammar Faizi
@ 2021-09-29 10:16 ` Ammar Faizi
  2021-09-29 10:16 ` [PATCHSET v1 RFC liburing 2/6] Add kernel error header `src/kernel_err.h` Ammar Faizi
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Ammar Faizi @ 2021-09-29 10:16 UTC (permalink / raw)
  To: Jens Axboe, Pavel Begunkov
  Cc: io-uring Mailing List, Louvian Lyndal, Ammar Faizi, Ammar Faizi

This is the starting point for making liburing work without libc.
Make it possible to remove the dependency on the `errno` variable (which
comes from libc). Do it incrementally, starting with the
`__sys_io_uring_*` functions.

@@ Notes:
  1) We do not plan to remove the libc support.
  2) The return value for each syscall API to indicate error is a
     negative value within range [-4095, -1].
  3) In the liburing sources, the `errno` variable is only allowed to
     be used in file `src/syscall.c` (just to emulate the kernel
     style return value).
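
The range in note 2 can be checked as sketched below. The names
`SKETCH_MAX_ERRNO` and `sketch_is_err_ret()` are hypothetical and are
not added by this patch. For syscalls that return an `int`, a plain
`ret < 0` test is enough; the explicit range matters when the return
value is a `long` or a pointer, where large values can be valid.

```c
#include <stdbool.h>

#define SKETCH_MAX_ERRNO 4095

/*
 * Hypothetical helper, for illustration only: true if a raw syscall
 * return value is an error code, i.e. lies within [-4095, -1].
 */
static inline bool sketch_is_err_ret(long ret)
{
	return ret < 0 && ret >= -SKETCH_MAX_ERRNO;
}
```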

@@ Extra notes for backward compatibility:
Currently, we expose these functions (**AAA**) to userland:
**AAA**:
  1) `__sys_io_uring_register`
  2) `__sys_io_uring_setup`
  3) `__sys_io_uring_enter2`
  4) `__sys_io_uring_enter`

As userland needs to check the `errno` value to use them properly,
those functions always depend on libc. So we cannot change their
behavior.

As such, we remove those functions (**AAA**) only in the **no libc**
environment case.

Then we introduce new functions (**BBB**) with the same names, but with
two extra underscores as a prefix (4 underscores in total). These
functions do not rely on the `errno` variable on the caller side (they
use the kernel style return value) and always exist regardless of
whether libc exists.

**BBB**:
  1) `____sys_io_uring_register`
  2) `____sys_io_uring_setup`
  3) `____sys_io_uring_enter2`
  4) `____sys_io_uring_enter`

@@ Summary
 1) **AAA** will only exist for the libc environment.
 2) **BBB** always exists.
 3) Do not use **AAA** in the liburing internals (it exists only for
    userland backward compatibility).
 4) For the libc environment, **BBB** may use `syscall(2)` and the
    `errno` variable, only to emulate the kernel style return value.
 5) For the no libc environment, **BBB** will use an Assembly
    interface to perform the syscall (arch dependent).
 6) Tests should not be affected, because (1) and (4) keep
    compatibility.

Link: https://github.com/axboe/liburing/issues/443
Signed-off-by: Ammar Faizi <ammar.faizi@students.amikom.ac.id>
---
 src/queue.c    |  27 +++-----
 src/register.c | 184 ++++++++++++++++---------------------------------
 src/setup.c    |   4 +-
 src/syscall.c  |  45 +++++++++++-
 src/syscall.h  |   8 +++
 5 files changed, 120 insertions(+), 148 deletions(-)

diff --git a/src/queue.c b/src/queue.c
index 10ef31c..e85ea1d 100644
--- a/src/queue.c
+++ b/src/queue.c
@@ -117,11 +117,11 @@ static int _io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_pt
 		if (!need_enter)
 			break;
 
-		ret = __sys_io_uring_enter2(ring->ring_fd, data->submit,
-				data->wait_nr, flags, data->arg,
-				data->sz);
+		ret = ____sys_io_uring_enter2(ring->ring_fd, data->submit,
+					      data->wait_nr, flags, data->arg,
+					      data->sz);
 		if (ret < 0) {
-			err = -errno;
+			err = ret;
 			break;
 		}
 
@@ -178,8 +178,8 @@ again:
 		goto done;
 
 	if (cq_ring_needs_flush(ring)) {
-		__sys_io_uring_enter(ring->ring_fd, 0, 0,
-				     IORING_ENTER_GETEVENTS, NULL);
+		____sys_io_uring_enter(ring->ring_fd, 0, 0,
+				       IORING_ENTER_GETEVENTS, NULL);
 		overflow_checked = true;
 		goto again;
 	}
@@ -333,10 +333,8 @@ static int __io_uring_submit(struct io_uring *ring, unsigned submitted,
 		if (wait_nr || (ring->flags & IORING_SETUP_IOPOLL))
 			flags |= IORING_ENTER_GETEVENTS;
 
-		ret = __sys_io_uring_enter(ring->ring_fd, submitted, wait_nr,
-						flags, NULL);
-		if (ret < 0)
-			return -errno;
+		ret = ____sys_io_uring_enter(ring->ring_fd, submitted, wait_nr,
+					     flags, NULL);
 	} else
 		ret = submitted;
 
@@ -391,11 +389,6 @@ struct io_uring_sqe *io_uring_get_sqe(struct io_uring *ring)
 
 int __io_uring_sqring_wait(struct io_uring *ring)
 {
-	int ret;
-
-	ret = __sys_io_uring_enter(ring->ring_fd, 0, 0, IORING_ENTER_SQ_WAIT,
-					NULL);
-	if (ret < 0)
-		ret = -errno;
-	return ret;
+	return ____sys_io_uring_enter(ring->ring_fd, 0, 0, IORING_ENTER_SQ_WAIT,
+				      NULL);
 }
diff --git a/src/register.c b/src/register.c
index 5ea4331..944852e 100644
--- a/src/register.c
+++ b/src/register.c
@@ -26,12 +26,10 @@ int io_uring_register_buffers_update_tag(struct io_uring *ring, unsigned off,
 		.tags = (unsigned long)tags,
 		.nr = nr,
 	};
-	int ret;
 
-	ret = __sys_io_uring_register(ring->ring_fd,
-				      IORING_REGISTER_BUFFERS_UPDATE,
-				      &up, sizeof(up));
-	return ret < 0 ? -errno : ret;
+	return ____sys_io_uring_register(ring->ring_fd,
+					 IORING_REGISTER_BUFFERS_UPDATE, &up,
+					 sizeof(up));
 }
 
 int io_uring_register_buffers_tags(struct io_uring *ring,
@@ -44,11 +42,10 @@ int io_uring_register_buffers_tags(struct io_uring *ring,
 		.data = (unsigned long)iovecs,
 		.tags = (unsigned long)tags,
 	};
-	int ret;
 
-	ret = __sys_io_uring_register(ring->ring_fd, IORING_REGISTER_BUFFERS2,
-				      &reg, sizeof(reg));
-	return ret < 0 ? -errno : ret;
+	return ____sys_io_uring_register(ring->ring_fd,
+					 IORING_REGISTER_BUFFERS2, &reg,
+					 sizeof(reg));
 }
 
 int io_uring_register_buffers(struct io_uring *ring, const struct iovec *iovecs,
@@ -56,24 +53,18 @@ int io_uring_register_buffers(struct io_uring *ring, const struct iovec *iovecs,
 {
 	int ret;
 
-	ret = __sys_io_uring_register(ring->ring_fd, IORING_REGISTER_BUFFERS,
+	ret = ____sys_io_uring_register(ring->ring_fd, IORING_REGISTER_BUFFERS,
 					iovecs, nr_iovecs);
-	if (ret < 0)
-		return -errno;
-
-	return 0;
+	return (ret < 0) ? ret : 0;
 }
 
 int io_uring_unregister_buffers(struct io_uring *ring)
 {
 	int ret;
 
-	ret = __sys_io_uring_register(ring->ring_fd, IORING_UNREGISTER_BUFFERS,
+	ret = ____sys_io_uring_register(ring->ring_fd, IORING_UNREGISTER_BUFFERS,
 					NULL, 0);
-	if (ret < 0)
-		return -errno;
-
-	return 0;
+	return (ret < 0) ? ret : 0;
 }
 
 int io_uring_register_files_update_tag(struct io_uring *ring, unsigned off,
@@ -86,12 +77,10 @@ int io_uring_register_files_update_tag(struct io_uring *ring, unsigned off,
 		.tags = (unsigned long)tags,
 		.nr = nr_files,
 	};
-	int ret;
 
-	ret = __sys_io_uring_register(ring->ring_fd,
-					IORING_REGISTER_FILES_UPDATE2,
-					&up, sizeof(up));
-	return ret < 0 ? -errno : ret;
+	return ____sys_io_uring_register(ring->ring_fd,
+					 IORING_REGISTER_FILES_UPDATE2, &up,
+					 sizeof(up));
 }
 
 /*
@@ -108,15 +97,10 @@ int io_uring_register_files_update(struct io_uring *ring, unsigned off,
 		.offset	= off,
 		.fds	= (unsigned long) files,
 	};
-	int ret;
-
-	ret = __sys_io_uring_register(ring->ring_fd,
-					IORING_REGISTER_FILES_UPDATE, &up,
-					nr_files);
-	if (ret < 0)
-		return -errno;
 
-	return ret;
+	return ____sys_io_uring_register(ring->ring_fd,
+					 IORING_REGISTER_FILES_UPDATE, &up,
+					 nr_files);
 }
 
 static int increase_rlimit_nofile(unsigned nr)
@@ -145,12 +129,12 @@ int io_uring_register_files_tags(struct io_uring *ring,
 	int ret, did_increase = 0;
 
 	do {
-		ret = __sys_io_uring_register(ring->ring_fd,
-					      IORING_REGISTER_FILES2, &reg,
-					      sizeof(reg));
+		ret = ____sys_io_uring_register(ring->ring_fd,
+						IORING_REGISTER_FILES2, &reg,
+						sizeof(reg));
 		if (ret >= 0)
 			break;
-		if (errno == EMFILE && !did_increase) {
+		if (ret == -EMFILE && !did_increase) {
 			did_increase = 1;
 			increase_rlimit_nofile(nr);
 			continue;
@@ -158,7 +142,7 @@ int io_uring_register_files_tags(struct io_uring *ring,
 		break;
 	} while (1);
 
-	return ret < 0 ? -errno : ret;
+	return (ret < 0) ? ret : 0;
 }
 
 int io_uring_register_files(struct io_uring *ring, const int *files,
@@ -167,12 +151,12 @@ int io_uring_register_files(struct io_uring *ring, const int *files,
 	int ret, did_increase = 0;
 
 	do {
-		ret = __sys_io_uring_register(ring->ring_fd,
-					      IORING_REGISTER_FILES, files,
-					      nr_files);
+		ret = ____sys_io_uring_register(ring->ring_fd,
+						IORING_REGISTER_FILES, files,
+						nr_files);
 		if (ret >= 0)
 			break;
-		if (errno == EMFILE && !did_increase) {
+		if (ret == -EMFILE && !did_increase) {
 			did_increase = 1;
 			increase_rlimit_nofile(nr_files);
 			continue;
@@ -180,55 +164,44 @@ int io_uring_register_files(struct io_uring *ring, const int *files,
 		break;
 	} while (1);
 
-	return ret < 0 ? -errno : ret;
+	return (ret < 0) ? ret : 0;
 }
 
 int io_uring_unregister_files(struct io_uring *ring)
 {
 	int ret;
 
-	ret = __sys_io_uring_register(ring->ring_fd, IORING_UNREGISTER_FILES,
+	ret = ____sys_io_uring_register(ring->ring_fd, IORING_UNREGISTER_FILES,
 					NULL, 0);
-	if (ret < 0)
-		return -errno;
-
-	return 0;
+	return (ret < 0) ? ret : 0;
 }
 
 int io_uring_register_eventfd(struct io_uring *ring, int event_fd)
 {
 	int ret;
 
-	ret = __sys_io_uring_register(ring->ring_fd, IORING_REGISTER_EVENTFD,
+	ret = ____sys_io_uring_register(ring->ring_fd, IORING_REGISTER_EVENTFD,
 					&event_fd, 1);
-	if (ret < 0)
-		return -errno;
-
-	return 0;
+	return (ret < 0) ? ret : 0;
 }
 
 int io_uring_unregister_eventfd(struct io_uring *ring)
 {
 	int ret;
 
-	ret = __sys_io_uring_register(ring->ring_fd, IORING_UNREGISTER_EVENTFD,
-					NULL, 0);
-	if (ret < 0)
-		return -errno;
-
-	return 0;
+	ret = ____sys_io_uring_register(ring->ring_fd,
+					IORING_UNREGISTER_EVENTFD, NULL, 0);
+	return (ret < 0) ? ret : 0;
 }
 
 int io_uring_register_eventfd_async(struct io_uring *ring, int event_fd)
 {
 	int ret;
 
-	ret = __sys_io_uring_register(ring->ring_fd, IORING_REGISTER_EVENTFD_ASYNC,
-			&event_fd, 1);
-	if (ret < 0)
-		return -errno;
-
-	return 0;
+	ret = ____sys_io_uring_register(ring->ring_fd,
+					IORING_REGISTER_EVENTFD_ASYNC,
+					&event_fd, 1);
+	return (ret < 0) ? ret : 0;
 }
 
 int io_uring_register_probe(struct io_uring *ring, struct io_uring_probe *p,
@@ -236,36 +209,22 @@ int io_uring_register_probe(struct io_uring *ring, struct io_uring_probe *p,
 {
 	int ret;
 
-	ret = __sys_io_uring_register(ring->ring_fd, IORING_REGISTER_PROBE,
-					p, nr_ops);
-	if (ret < 0)
-		return -errno;
-
-	return 0;
+	ret = ____sys_io_uring_register(ring->ring_fd, IORING_REGISTER_PROBE, p,
+					nr_ops);
+	return (ret < 0) ? ret : 0;
 }
 
 int io_uring_register_personality(struct io_uring *ring)
 {
-	int ret;
-
-	ret = __sys_io_uring_register(ring->ring_fd, IORING_REGISTER_PERSONALITY,
-					NULL, 0);
-	if (ret < 0)
-		return -errno;
-
-	return ret;
+	return ____sys_io_uring_register(ring->ring_fd,
+					 IORING_REGISTER_PERSONALITY, NULL, 0);
 }
 
 int io_uring_unregister_personality(struct io_uring *ring, int id)
 {
-	int ret;
-
-	ret = __sys_io_uring_register(ring->ring_fd, IORING_UNREGISTER_PERSONALITY,
-					NULL, id);
-	if (ret < 0)
-		return -errno;
-
-	return ret;
+	return ____sys_io_uring_register(ring->ring_fd,
+					 IORING_UNREGISTER_PERSONALITY, NULL,
+					 id);
 }
 
 int io_uring_register_restrictions(struct io_uring *ring,
@@ -274,61 +233,34 @@ int io_uring_register_restrictions(struct io_uring *ring,
 {
 	int ret;
 
-	ret = __sys_io_uring_register(ring->ring_fd, IORING_REGISTER_RESTRICTIONS,
-				      res, nr_res);
-	if (ret < 0)
-		return -errno;
-
-	return 0;
+	ret = ____sys_io_uring_register(ring->ring_fd,
+					IORING_REGISTER_RESTRICTIONS, res,
+					nr_res);
+	return (ret < 0) ? ret : 0;
 }
 
 int io_uring_enable_rings(struct io_uring *ring)
 {
-	int ret;
-
-	ret = __sys_io_uring_register(ring->ring_fd,
-				      IORING_REGISTER_ENABLE_RINGS, NULL, 0);
-	if (ret < 0)
-		return -errno;
-
-	return ret;
+	return ____sys_io_uring_register(ring->ring_fd,
+					 IORING_REGISTER_ENABLE_RINGS, NULL, 0);
 }
 
 int io_uring_register_iowq_aff(struct io_uring *ring, size_t cpusz,
 			       const cpu_set_t *mask)
 {
-	int ret;
-
-	ret = __sys_io_uring_register(ring->ring_fd,
-					IORING_REGISTER_IOWQ_AFF, mask, cpusz);
-	if (ret < 0)
-		return -errno;
-
-	return ret;
+	return ____sys_io_uring_register(ring->ring_fd,
+					 IORING_REGISTER_IOWQ_AFF, mask, cpusz);
 }
 
 int io_uring_unregister_iowq_aff(struct io_uring *ring)
 {
-	int ret;
-
-	ret = __sys_io_uring_register(ring->ring_fd,
-					IORING_REGISTER_IOWQ_AFF, NULL, 0);
-	if (ret < 0)
-		return -errno;
-
-	return ret;
+	return ____sys_io_uring_register(ring->ring_fd,
+					 IORING_REGISTER_IOWQ_AFF, NULL, 0);
 }
 
 int io_uring_register_iowq_max_workers(struct io_uring *ring, unsigned int *val)
 {
-	int ret;
-
-	ret = __sys_io_uring_register(ring->ring_fd,
-					IORING_REGISTER_IOWQ_MAX_WORKERS,
-					val, 2);
-	if (ret < 0)
-		return -errno;
-
-	return ret;
-
+	return ____sys_io_uring_register(ring->ring_fd,
+					 IORING_REGISTER_IOWQ_MAX_WORKERS, val,
+					 2);
 }
diff --git a/src/setup.c b/src/setup.c
index 54225e8..edfe94e 100644
--- a/src/setup.c
+++ b/src/setup.c
@@ -140,9 +140,9 @@ int io_uring_queue_init_params(unsigned entries, struct io_uring *ring,
 {
 	int fd, ret;
 
-	fd = __sys_io_uring_setup(entries, p);
+	fd = ____sys_io_uring_setup(entries, p);
 	if (fd < 0)
-		return -errno;
+		return fd;
 
 	ret = io_uring_queue_mmap(fd, p, ring);
 	if (ret) {
diff --git a/src/syscall.c b/src/syscall.c
index 69027e5..0ecc17b 100644
--- a/src/syscall.c
+++ b/src/syscall.c
@@ -4,6 +4,7 @@
 /*
  * Will go away once libc support is there
  */
+#include <errno.h>
 #include <unistd.h>
 #include <sys/syscall.h>
 #include <sys/uio.h>
@@ -59,15 +60,53 @@ int __sys_io_uring_setup(unsigned entries, struct io_uring_params *p)
 }
 
 int __sys_io_uring_enter2(int fd, unsigned to_submit, unsigned min_complete,
-			 unsigned flags, sigset_t *sig, int sz)
+			  unsigned flags, sigset_t *sig, int sz)
 {
 	return syscall(__NR_io_uring_enter, fd, to_submit, min_complete,
-			flags, sig, sz);
+		       flags, sig, sz);
 }
 
 int __sys_io_uring_enter(int fd, unsigned to_submit, unsigned min_complete,
 			 unsigned flags, sigset_t *sig)
 {
 	return __sys_io_uring_enter2(fd, to_submit, min_complete, flags, sig,
-					_NSIG / 8);
+				     _NSIG / 8);
+}
+
+
+/*
+ * Syscall with kernel style return value.
+ */
+int ____sys_io_uring_register(int fd, unsigned opcode, const void *arg,
+			      unsigned nr_args)
+{
+	int ret;
+
+	ret = syscall(__NR_io_uring_register, fd, opcode, arg, nr_args);
+	return (ret < 0) ? -errno : ret;
+}
+
+int ____sys_io_uring_setup(unsigned entries, struct io_uring_params *p)
+{
+	int ret;
+
+	ret = syscall(__NR_io_uring_setup, entries, p);
+	return (ret < 0) ? -errno : ret;
+}
+
+int ____sys_io_uring_enter2(int fd, unsigned to_submit, unsigned min_complete,
+			    unsigned flags, sigset_t *sig, int sz)
+{
+	int ret;
+
+	ret = syscall(__NR_io_uring_enter, fd, to_submit, min_complete,
+		      flags, sig, sz);
+	return (ret < 0) ? -errno : ret;
+}
+
+int ____sys_io_uring_enter(int fd, unsigned to_submit, unsigned min_complete,
+			   unsigned flags, sigset_t *sig)
+{
+	return ____sys_io_uring_enter2(fd, to_submit, min_complete, flags, sig,
+				       _NSIG / 8);
 }
diff --git a/src/syscall.h b/src/syscall.h
index 2368f83..8cd2d4c 100644
--- a/src/syscall.h
+++ b/src/syscall.h
@@ -17,4 +17,12 @@ int __sys_io_uring_enter2(int fd, unsigned to_submit, unsigned min_complete,
 int __sys_io_uring_register(int fd, unsigned int opcode, const void *arg,
 			    unsigned int nr_args);
 
+int ____sys_io_uring_setup(unsigned entries, struct io_uring_params *p);
+int ____sys_io_uring_enter(int fd, unsigned to_submit, unsigned min_complete,
+			   unsigned flags, sigset_t *sig);
+int ____sys_io_uring_enter2(int fd, unsigned to_submit, unsigned min_complete,
+			    unsigned flags, sigset_t *sig, int sz);
+int ____sys_io_uring_register(int fd, unsigned int opcode, const void *arg,
+			      unsigned int nr_args);
+
 #endif
-- 
2.30.2



* [PATCHSET v1 RFC liburing 2/6] Add kernel error header `src/kernel_err.h`
  2021-09-29 10:16 [PATCHSET v1 RFC liburing 0/6] Implement the kernel style return value Ammar Faizi
  2021-09-29 10:16 ` [PATCHSET v1 RFC liburing 1/6] src/syscall: " Ammar Faizi
@ 2021-09-29 10:16 ` Ammar Faizi
  2021-09-29 10:16 ` [PATCHSET v1 RFC liburing 3/6] Add `liburing_mmap()` and `liburing_munmap()` Ammar Faizi
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Ammar Faizi @ 2021-09-29 10:16 UTC (permalink / raw)
  To: Jens Axboe, Pavel Begunkov
  Cc: io-uring Mailing List, Louvian Lyndal, Ammar Faizi, Ammar Faizi

We take the `include/linux/err.h` file from the Linux kernel with a
few modifications.

The purpose of this file is to provide `PTR_ERR()`, `ERR_PTR()`, and
similar helpers to implement the kernel style return value, which is
discussed at [1].

Summary of the modifications:
  1) Add `__must_check` attribute macro.
  2) `#include <liburing.h>` to get the `uring_likely` and
     `uring_unlikely` macros.

This file is licensed under the GPL-2.0.

Link: https://github.com/axboe/liburing/issues/443 [1]
Signed-off-by: Ammar Faizi <ammar.faizi@students.amikom.ac.id>
---
 src/kernel_err.h | 75 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 75 insertions(+)
 create mode 100644 src/kernel_err.h

diff --git a/src/kernel_err.h b/src/kernel_err.h
new file mode 100644
index 0000000..b9ea5fe
--- /dev/null
+++ b/src/kernel_err.h
@@ -0,0 +1,75 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_ERR_H
+#define _LINUX_ERR_H
+
+#include <linux/types.h>
+
+#include <asm/errno.h>
+
+#include <stdbool.h>
+#include <liburing.h>
+
+/*
+ * Kernel pointers have redundant information, so we can use a
+ * scheme where we can return either an error code or a normal
+ * pointer with the same return value.
+ *
+ * This should be a per-architecture thing, to allow different
+ * error and pointer decisions.
+ */
+#define MAX_ERRNO	4095
+
+#ifndef __ASSEMBLY__
+
+#define IS_ERR_VALUE(x) uring_unlikely((unsigned long)(void *)(x) >= (unsigned long)-MAX_ERRNO)
+
+/*
+ *   gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-warn_005funused_005fresult-function-attribute
+ * clang: https://clang.llvm.org/docs/AttributeReference.html#nodiscard-warn-unused-result
+ */
+#define __must_check __attribute__((__warn_unused_result__))
+
+static inline void * __must_check ERR_PTR(long error)
+{
+	return (void *) error;
+}
+
+static inline long __must_check PTR_ERR(const void *ptr)
+{
+	return (long) ptr;
+}
+
+static inline bool __must_check IS_ERR(const void *ptr)
+{
+	return IS_ERR_VALUE((unsigned long)ptr);
+}
+
+static inline bool __must_check IS_ERR_OR_NULL(const void *ptr)
+{
+	return uring_unlikely(!ptr) || IS_ERR_VALUE((unsigned long)ptr);
+}
+
+/**
+ * ERR_CAST - Explicitly cast an error-valued pointer to another pointer type
+ * @ptr: The pointer to cast.
+ *
+ * Explicitly cast an error-valued pointer to another pointer type in such a
+ * way as to make it clear that's what's going on.
+ */
+static inline void * __must_check ERR_CAST(const void *ptr)
+{
+	/* cast away the const */
+	return (void *) ptr;
+}
+
+static inline int __must_check PTR_ERR_OR_ZERO(const void *ptr)
+{
+	if (IS_ERR(ptr))
+		return PTR_ERR(ptr);
+	else
+		return 0;
+}
+
+#endif /* #ifndef __ASSEMBLY__ */
+
+#endif /* #ifndef _LINUX_ERR_H */
-- 
2.30.2



* [PATCHSET v1 RFC liburing 3/6] Add `liburing_mmap()` and `liburing_munmap()`
  2021-09-29 10:16 [PATCHSET v1 RFC liburing 0/6] Implement the kernel style return value Ammar Faizi
  2021-09-29 10:16 ` [PATCHSET v1 RFC liburing 1/6] src/syscall: " Ammar Faizi
  2021-09-29 10:16 ` [PATCHSET v1 RFC liburing 2/6] Add kernel error header `src/kernel_err.h` Ammar Faizi
@ 2021-09-29 10:16 ` Ammar Faizi
  2021-09-29 10:16 ` [PATCHSET v1 RFC liburing 4/6] Add `liburing_madvise()` Ammar Faizi
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Ammar Faizi @ 2021-09-29 10:16 UTC (permalink / raw)
  To: Jens Axboe, Pavel Begunkov
  Cc: io-uring Mailing List, Louvian Lyndal, Ammar Faizi, Ammar Faizi

Do not use `mmap()` and `munmap()` directly from libc in the
liburing internal sources. Wrap them in `src/syscall.c`. This is
part of implementing the kernel style return value (which is later
supposed to support the no libc environment).

`liburing_mmap()` and `liburing_munmap()` do the same thing as
`mmap()` and `munmap()` from libc. The only difference is that when an
error happens, the return value of `liburing_{mmap,munmap}()` will
be a negative error code (encoded with `ERR_PTR()` in the
`liburing_mmap()` case).

Signed-off-by: Ammar Faizi <ammar.faizi@students.amikom.ac.id>
---
 src/setup.c   | 37 +++++++++++++++++++++----------------
 src/syscall.c | 23 +++++++++++++++++++++++
 src/syscall.h |  5 +++++
 3 files changed, 49 insertions(+), 16 deletions(-)

diff --git a/src/setup.c b/src/setup.c
index edfe94e..01cb151 100644
--- a/src/setup.c
+++ b/src/setup.c
@@ -15,12 +15,13 @@
 #include "liburing.h"
 
 #include "syscall.h"
+#include "kernel_err.h"
 
 static void io_uring_unmap_rings(struct io_uring_sq *sq, struct io_uring_cq *cq)
 {
-	munmap(sq->ring_ptr, sq->ring_sz);
+	liburing_munmap(sq->ring_ptr, sq->ring_sz);
 	if (cq->ring_ptr && cq->ring_ptr != sq->ring_ptr)
-		munmap(cq->ring_ptr, cq->ring_sz);
+		liburing_munmap(cq->ring_ptr, cq->ring_sz);
 }
 
 static int io_uring_mmap(int fd, struct io_uring_params *p,
@@ -37,19 +38,22 @@ static int io_uring_mmap(int fd, struct io_uring_params *p,
 			sq->ring_sz = cq->ring_sz;
 		cq->ring_sz = sq->ring_sz;
 	}
-	sq->ring_ptr = mmap(0, sq->ring_sz, PROT_READ | PROT_WRITE,
-			MAP_SHARED | MAP_POPULATE, fd, IORING_OFF_SQ_RING);
-	if (sq->ring_ptr == MAP_FAILED)
-		return -errno;
+	sq->ring_ptr = liburing_mmap(0, sq->ring_sz, PROT_READ | PROT_WRITE,
+				     MAP_SHARED | MAP_POPULATE, fd,
+				     IORING_OFF_SQ_RING);
+	if (IS_ERR(sq->ring_ptr))
+		return PTR_ERR(sq->ring_ptr);
 
 	if (p->features & IORING_FEAT_SINGLE_MMAP) {
 		cq->ring_ptr = sq->ring_ptr;
 	} else {
-		cq->ring_ptr = mmap(0, cq->ring_sz, PROT_READ | PROT_WRITE,
-				MAP_SHARED | MAP_POPULATE, fd, IORING_OFF_CQ_RING);
-		if (cq->ring_ptr == MAP_FAILED) {
+		cq->ring_ptr = liburing_mmap(0, cq->ring_sz,
+					     PROT_READ | PROT_WRITE,
+					     MAP_SHARED | MAP_POPULATE, fd,
+					     IORING_OFF_CQ_RING);
+		if (IS_ERR(cq->ring_ptr)) {
+			ret = PTR_ERR(cq->ring_ptr);
 			cq->ring_ptr = NULL;
-			ret = -errno;
 			goto err;
 		}
 	}
@@ -63,11 +67,11 @@ static int io_uring_mmap(int fd, struct io_uring_params *p,
 	sq->array = sq->ring_ptr + p->sq_off.array;
 
 	size = p->sq_entries * sizeof(struct io_uring_sqe);
-	sq->sqes = mmap(0, size, PROT_READ | PROT_WRITE,
-				MAP_SHARED | MAP_POPULATE, fd,
-				IORING_OFF_SQES);
-	if (sq->sqes == MAP_FAILED) {
-		ret = -errno;
+	sq->sqes = liburing_mmap(0, size, PROT_READ | PROT_WRITE,
+				 MAP_SHARED | MAP_POPULATE, fd,
+				 IORING_OFF_SQES);
+	if (IS_ERR(sq->sqes)) {
+		ret = PTR_ERR(sq->sqes);
 err:
 		io_uring_unmap_rings(sq, cq);
 		return ret;
@@ -173,7 +177,8 @@ void io_uring_queue_exit(struct io_uring *ring)
 	struct io_uring_sq *sq = &ring->sq;
 	struct io_uring_cq *cq = &ring->cq;
 
-	munmap(sq->sqes, *sq->kring_entries * sizeof(struct io_uring_sqe));
+	liburing_munmap(sq->sqes,
+			*sq->kring_entries * sizeof(struct io_uring_sqe));
 	io_uring_unmap_rings(sq, cq);
 	close(ring->ring_fd);
 }
diff --git a/src/syscall.c b/src/syscall.c
index 0ecc17b..cb48a94 100644
--- a/src/syscall.c
+++ b/src/syscall.c
@@ -8,9 +8,12 @@
 #include <unistd.h>
 #include <sys/syscall.h>
 #include <sys/uio.h>
+#include <sys/mman.h>
+
 #include "liburing/compat.h"
 #include "liburing/io_uring.h"
 #include "syscall.h"
+#include "kernel_err.h"
 
 #ifdef __alpha__
 /*
@@ -110,3 +113,23 @@ int ____sys_io_uring_enter(int fd, unsigned to_submit, unsigned min_complete,
 	return ____sys_io_uring_enter2(fd, to_submit, min_complete, flags, sig,
 				       _NSIG / 8);
 }
+
+void *liburing_mmap(void *addr, size_t length, int prot, int flags, int fd,
+		    off_t offset)
+{
+	void *ret;
+
+	ret = mmap(addr, length, prot, flags, fd, offset);
+	if (ret == MAP_FAILED)
+		ret = ERR_PTR(-errno);
+
+	return ret;
+}
+
+int liburing_munmap(void *addr, size_t length)
+{
+	int ret;
+
+	ret = munmap(addr, length);
+	return (ret < 0) ? -errno : ret;
+}
diff --git a/src/syscall.h b/src/syscall.h
index 8cd2d4c..feccf67 100644
--- a/src/syscall.h
+++ b/src/syscall.h
@@ -3,6 +3,7 @@
 #define LIBURING_SYSCALL_H
 
 #include <signal.h>
+#include "kernel_err.h"
 
 struct io_uring_params;
 
@@ -25,4 +26,8 @@ int ____sys_io_uring_enter2(int fd, unsigned to_submit, unsigned min_complete,
 int ____sys_io_uring_register(int fd, unsigned int opcode, const void *arg,
 			      unsigned int nr_args);
 
+void *liburing_mmap(void *addr, size_t length, int prot, int flags, int fd,
+		    off_t offset);
+int liburing_munmap(void *addr, size_t length);
+
 #endif
-- 
2.30.2



* [PATCHSET v1 RFC liburing 4/6] Add `liburing_madvise()`
  2021-09-29 10:16 [PATCHSET v1 RFC liburing 0/6] Implement the kernel style return value Ammar Faizi
                   ` (2 preceding siblings ...)
  2021-09-29 10:16 ` [PATCHSET v1 RFC liburing 3/6] Add `liburing_mmap()` and `liburing_munmap()` Ammar Faizi
@ 2021-09-29 10:16 ` Ammar Faizi
  2021-09-29 10:16 ` [PATCHSET v1 RFC liburing 5/6] Add `liburing_getrlimit()` and `liburing_setrlimit()` Ammar Faizi
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Ammar Faizi @ 2021-09-29 10:16 UTC (permalink / raw)
  To: Jens Axboe, Pavel Begunkov
  Cc: io-uring Mailing List, Louvian Lyndal, Ammar Faizi, Ammar Faizi

Do not call the libc `madvise()` directly in the liburing internal
sources. Wrap it in `src/syscall.c`. This is part of implementing the
kernel-style return value (which is later supposed to support a
no-libc environment).

`liburing_madvise()` does the same thing as the libc `madvise()`. The
only difference is that when an error occurs, the return value of
`liburing_madvise()` is a negative error code.

Signed-off-by: Ammar Faizi <ammar.faizi@students.amikom.ac.id>
---
 src/setup.c   | 18 +++++++++---------
 src/syscall.c |  8 ++++++++
 src/syscall.h |  1 +
 3 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/src/setup.c b/src/setup.c
index 01cb151..52f3557 100644
--- a/src/setup.c
+++ b/src/setup.c
@@ -120,20 +120,20 @@ int io_uring_ring_dontfork(struct io_uring *ring)
 		return -EINVAL;
 
 	len = *ring->sq.kring_entries * sizeof(struct io_uring_sqe);
-	ret = madvise(ring->sq.sqes, len, MADV_DONTFORK);
-	if (ret == -1)
-		return -errno;
+	ret = liburing_madvise(ring->sq.sqes, len, MADV_DONTFORK);
+	if (uring_unlikely(ret))
+		return ret;
 
 	len = ring->sq.ring_sz;
-	ret = madvise(ring->sq.ring_ptr, len, MADV_DONTFORK);
-	if (ret == -1)
-		return -errno;
+	ret = liburing_madvise(ring->sq.ring_ptr, len, MADV_DONTFORK);
+	if (uring_unlikely(ret))
+		return ret;
 
 	if (ring->cq.ring_ptr != ring->sq.ring_ptr) {
 		len = ring->cq.ring_sz;
-		ret = madvise(ring->cq.ring_ptr, len, MADV_DONTFORK);
-		if (ret == -1)
-			return -errno;
+		ret = liburing_madvise(ring->cq.ring_ptr, len, MADV_DONTFORK);
+		if (uring_unlikely(ret))
+			return ret;
 	}
 
 	return 0;
diff --git a/src/syscall.c b/src/syscall.c
index cb48a94..44861f6 100644
--- a/src/syscall.c
+++ b/src/syscall.c
@@ -133,3 +133,11 @@ int liburing_munmap(void *addr, size_t length)
 	ret = munmap(addr, length);
 	return (ret < 0) ? -errno : ret;
 }
+
+int liburing_madvise(void *addr, size_t length, int advice)
+{
+	int ret;
+
+	ret = madvise(addr, length, advice);
+	return (ret < 0) ? -errno : ret;
+}
diff --git a/src/syscall.h b/src/syscall.h
index feccf67..32381ce 100644
--- a/src/syscall.h
+++ b/src/syscall.h
@@ -29,5 +29,6 @@ int ____sys_io_uring_register(int fd, unsigned int opcode, const void *arg,
 void *liburing_mmap(void *addr, size_t length, int prot, int flags, int fd,
 		    off_t offset);
 int liburing_munmap(void *addr, size_t length);
+int liburing_madvise(void *addr, size_t length, int advice);
 
 #endif
-- 
2.30.2



* [PATCHSET v1 RFC liburing 5/6] Add `liburing_getrlimit()` and `liburing_setrlimit()`
  2021-09-29 10:16 [PATCHSET v1 RFC liburing 0/6] Implement the kernel style return value Ammar Faizi
                   ` (3 preceding siblings ...)
  2021-09-29 10:16 ` [PATCHSET v1 RFC liburing 4/6] Add `liburing_madvise()` Ammar Faizi
@ 2021-09-29 10:16 ` Ammar Faizi
  2021-09-29 10:16 ` [PATCHSET v1 RFC liburing 6/6] src/{queue,register,setup}: Remove `#include <errno.h>` Ammar Faizi
  2021-09-29 10:21 ` [PATCHSET v1 RFC liburing 0/6] Implement the kernel style return value Ammar Faizi
  6 siblings, 0 replies; 10+ messages in thread
From: Ammar Faizi @ 2021-09-29 10:16 UTC (permalink / raw)
  To: Jens Axboe, Pavel Begunkov
  Cc: io-uring Mailing List, Louvian Lyndal, Ammar Faizi, Ammar Faizi

Do not call the libc `getrlimit()` and `setrlimit()` directly in the
liburing internal sources. Wrap them in `src/syscall.c`. This is part
of implementing the kernel-style return value (which is later supposed
to support a no-libc environment).

`liburing_getrlimit()` and `liburing_setrlimit()` do the same thing as
the libc `getrlimit()` and `setrlimit()`. The only difference is that
when an error occurs, the return value of `liburing_{get,set}rlimit()`
is a negative error code.

Signed-off-by: Ammar Faizi <ammar.faizi@students.amikom.ac.id>
---
 src/register.c |  4 ++--
 src/syscall.c  | 16 ++++++++++++++++
 src/syscall.h  |  4 ++++
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/src/register.c b/src/register.c
index 944852e..0908e3e 100644
--- a/src/register.c
+++ b/src/register.c
@@ -107,11 +107,11 @@ static int increase_rlimit_nofile(unsigned nr)
 {
 	struct rlimit rlim;
 
-	if (getrlimit(RLIMIT_NOFILE, &rlim) < 0)
+	if (liburing_getrlimit(RLIMIT_NOFILE, &rlim) < 0)
 		return -errno;
 	if (rlim.rlim_cur < nr) {
 		rlim.rlim_cur += nr;
-		setrlimit(RLIMIT_NOFILE, &rlim);
+		liburing_setrlimit(RLIMIT_NOFILE, &rlim);
 	}
 
 	return 0;
diff --git a/src/syscall.c b/src/syscall.c
index 44861f6..b8e7cb3 100644
--- a/src/syscall.c
+++ b/src/syscall.c
@@ -141,3 +141,19 @@ int liburing_madvise(void *addr, size_t length, int advice)
 	ret = madvise(addr, length, advice);
 	return (ret < 0) ? -errno : ret;
 }
+
+int liburing_getrlimit(int resource, struct rlimit *rlim)
+{
+	int ret;
+
+	ret = getrlimit(resource, rlim);
+	return (ret < 0) ? -errno : ret;
+}
+
+int liburing_setrlimit(int resource, const struct rlimit *rlim)
+{
+	int ret;
+
+	ret = setrlimit(resource, rlim);
+	return (ret < 0) ? -errno : ret;
+}
diff --git a/src/syscall.h b/src/syscall.h
index 32381ce..1ac56f9 100644
--- a/src/syscall.h
+++ b/src/syscall.h
@@ -3,6 +3,8 @@
 #define LIBURING_SYSCALL_H
 
 #include <signal.h>
+#include <sys/time.h>
+#include <sys/resource.h>
 #include "kernel_err.h"
 
 struct io_uring_params;
@@ -30,5 +32,7 @@ void *liburing_mmap(void *addr, size_t length, int prot, int flags, int fd,
 		    off_t offset);
 int liburing_munmap(void *addr, size_t length);
 int liburing_madvise(void *addr, size_t length, int advice);
+int liburing_getrlimit(int resource, struct rlimit *rlim);
+int liburing_setrlimit(int resource, const struct rlimit *rlim);
 
 #endif
-- 
2.30.2



* [PATCHSET v1 RFC liburing 6/6] src/{queue,register,setup}: Remove `#include <errno.h>`
  2021-09-29 10:16 [PATCHSET v1 RFC liburing 0/6] Implement the kernel style return value Ammar Faizi
                   ` (4 preceding siblings ...)
  2021-09-29 10:16 ` [PATCHSET v1 RFC liburing 5/6] Add `liburing_getrlimit()` and `liburing_setrlimit()` Ammar Faizi
@ 2021-09-29 10:16 ` Ammar Faizi
  2021-09-29 10:21 ` [PATCHSET v1 RFC liburing 0/6] Implement the kernel style return value Ammar Faizi
  6 siblings, 0 replies; 10+ messages in thread
From: Ammar Faizi @ 2021-09-29 10:16 UTC (permalink / raw)
  To: Jens Axboe, Pavel Begunkov
  Cc: io-uring Mailing List, Louvian Lyndal, Ammar Faizi, Ammar Faizi

We don't need `#include <errno.h>` in these files anymore. For now,
the `errno` variable may only be used in `src/syscall.c`, which keeps
the libc dependency separate from the other liburing sources.

Signed-off-by: Ammar Faizi <ammar.faizi@students.amikom.ac.id>
---
 src/queue.c    | 1 -
 src/register.c | 1 -
 src/setup.c    | 1 -
 3 files changed, 3 deletions(-)

diff --git a/src/queue.c b/src/queue.c
index e85ea1d..24ff8bc 100644
--- a/src/queue.c
+++ b/src/queue.c
@@ -5,7 +5,6 @@
 #include <sys/stat.h>
 #include <sys/mman.h>
 #include <unistd.h>
-#include <errno.h>
 #include <string.h>
 #include <stdbool.h>
 
diff --git a/src/register.c b/src/register.c
index 0908e3e..2b8cbac 100644
--- a/src/register.c
+++ b/src/register.c
@@ -6,7 +6,6 @@
 #include <sys/mman.h>
 #include <sys/resource.h>
 #include <unistd.h>
-#include <errno.h>
 #include <string.h>
 
 #include "liburing/compat.h"
diff --git a/src/setup.c b/src/setup.c
index 52f3557..486a3a1 100644
--- a/src/setup.c
+++ b/src/setup.c
@@ -5,7 +5,6 @@
 #include <sys/stat.h>
 #include <sys/mman.h>
 #include <unistd.h>
-#include <errno.h>
 #include <string.h>
 #include <stdlib.h>
 #include <signal.h>
-- 
2.30.2



* Re: [PATCHSET v1 RFC liburing 0/6] Implement the kernel style return value
  2021-09-29 10:16 [PATCHSET v1 RFC liburing 0/6] Implement the kernel style return value Ammar Faizi
                   ` (5 preceding siblings ...)
  2021-09-29 10:16 ` [PATCHSET v1 RFC liburing 6/6] src/{queue,register,setup}: Remove `#include <errno.h>` Ammar Faizi
@ 2021-09-29 10:21 ` Ammar Faizi
  2021-10-01  6:44   ` Louvian Lyndal
  6 siblings, 1 reply; 10+ messages in thread
From: Ammar Faizi @ 2021-09-29 10:21 UTC (permalink / raw)
  To: Jens Axboe, Pavel Begunkov
  Cc: io-uring Mailing List, Louvian Lyndal, Ammar Faizi

Sorry for the title; the patches should have had the subject prefix
[PATCH v1 RFC liburing]. I missed that. I will make sure to pay
attention to it for v2 and later.


* Re: [PATCHSET v1 RFC liburing 0/6] Implement the kernel style return value
  2021-09-29 10:21 ` [PATCHSET v1 RFC liburing 0/6] Implement the kernel style return value Ammar Faizi
@ 2021-10-01  6:44   ` Louvian Lyndal
  2021-10-01  7:36     ` Ammar Faizi
  0 siblings, 1 reply; 10+ messages in thread
From: Louvian Lyndal @ 2021-10-01  6:44 UTC (permalink / raw)
  To: Ammar Faizi, Jens Axboe, Pavel Begunkov
  Cc: io-uring Mailing List, Ammar Faizi

On 9/29/21 5:16 PM, Ammar Faizi wrote:
> ### 3) How to deal with syscalls
>
> We have 3 patches in this series to wrap the syscalls, they are:
>   - Add `liburing_mmap()` and `liburing_munmap()`
>   - Add `liburing_madvise()`
>   - Add `liburing_getrlimit()` and `liburing_setrlimit()`
>
> `liburing_{munmap,madvise,getrlimit,setrlimit}` return a negative
> error code on failure. They simply return an int, so there is
> nothing to worry about.
>
> The special case is a pointer return value, as in `liburing_mmap()`.
> In this case we take the `include/linux/err.h` file from the Linux
> kernel source tree and use `IS_ERR()`, `PTR_ERR()`, `ERR_PTR()` to
> deal with it.
>
>
> ### 4) How can this help to support no libc environment?
>
> Once this kernel-style return value is adopted in liburing, we will
> start working on raw syscalls written directly in Assembly (arch
> dependent).
>
> I (Ammar Faizi) will start kicking the tires on the x86-64 arch.
> Hopefully we will get support for other architectures as well.
>
> An example of a liburing syscall wrapper may look like this:
>
> ```c
> void *liburing_mmap(void *addr, size_t length, int prot, int flags,
>                     int fd, off_t offset)
> {
> #ifdef LIBURING_NOLIBC
>         /*
>          * This is when we build without libc.
>          *
>          * Assume __raw_mmap is the syscall written in ASM.
>          *
>          * The return value is directly taken from the syscall
>          * return value.
>          */
>         return __raw_mmap(addr, length, prot, flags, fd, offset);
> #else
>         /*
>          * This is when libc exists.
>          */
>         void *ret;
>
>         ret = mmap(addr, length, prot, flags, fd, offset);
>         if (ret == MAP_FAILED)
>                 ret = ERR_PTR(-errno);
>
>         return ret;
> #endif
> }
> ```

This will add an extra call just to wrap the libc function. Consider
making them static inline?

With libc, they just check the return value: if it is -1, they return
-errno. If they are inlined, the generated code should ideally be
identical to the previous version.

Besides, they are all internal functions. I don't see why we should
pollute the global scope with extra wrappers.
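
A minimal sketch of what the inlined form could look like for one of
the int-returning wrappers, assuming libc is still available. The name
`sketch_madvise` is hypothetical and is not part of this series:

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical inline variant of the madvise() wrapper.
 * Returns 0 on success or a negative errno value on failure, so the
 * caller never has to read errno itself.
 */
static inline int sketch_madvise(void *addr, size_t length, int advice)
{
	int ret;

	ret = madvise(addr, length, advice);
	return (ret < 0) ? -errno : ret;
}
```

Because the function is static inline, no extra symbol is exported, and
the compiler is free to expand the body at each call site.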

Regards,

--
Louvian Lyndal


* Re: [PATCHSET v1 RFC liburing 0/6] Implement the kernel style return value
  2021-10-01  6:44   ` Louvian Lyndal
@ 2021-10-01  7:36     ` Ammar Faizi
  0 siblings, 0 replies; 10+ messages in thread
From: Ammar Faizi @ 2021-10-01  7:36 UTC (permalink / raw)
  To: Louvian Lyndal
  Cc: Jens Axboe, Pavel Begunkov, io-uring Mailing List, Ammar Faizi

On Fri, Oct 1, 2021 at 1:44 PM Louvian Lyndal <louvianlyndal@gmail.com> wrote:
> This will add an extra call just to wrap the libc function. Consider
> making them static inline?
>
> With libc, they just check the return value: if it is -1, they return
> -errno. If they are inlined, the generated code should ideally be
> identical to the previous version.
>
> Besides, they are all internal functions. I don't see why we should
> pollute the global scope with extra wrappers.
>

Yeah, makes sense. I will address this in v2. We can have them as
static inline functions in `src/syscall.h`.
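
A rough sketch of how the pointer-returning case might look as a static
inline function. The `sketch_*` helpers below are hypothetical
stand-ins for `ERR_PTR()`, `PTR_ERR()` and `IS_ERR()` from
`src/kernel_err.h`; they are written out here only to keep the example
self-contained and may not match what v2 ends up with:

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/mman.h>

/* Hypothetical stand-ins for the helpers in src/kernel_err.h. */
#define SKETCH_MAX_ERRNO 4095

static inline void *sketch_err_ptr(intptr_t err)
{
	return (void *) err;
}

static inline intptr_t sketch_ptr_err(const void *ptr)
{
	return (intptr_t) ptr;
}

static inline int sketch_is_err(const void *ptr)
{
	/* Error codes are encoded in the top 4095 address values. */
	return (uintptr_t) ptr >= (uintptr_t) -SKETCH_MAX_ERRNO;
}

/*
 * Hypothetical inline mmap() wrapper: returns the mapping on success,
 * or a negative errno value encoded as a pointer on failure.
 */
static inline void *sketch_mmap(void *addr, size_t length, int prot,
				int flags, int fd, off_t offset)
{
	void *ret;

	ret = mmap(addr, length, prot, flags, fd, offset);
	if (ret == MAP_FAILED)
		ret = sketch_err_ptr(-errno);

	return ret;
}
```

A caller would then test the result with `sketch_is_err()` and extract
the code with `sketch_ptr_err()`, the same way `io_uring_mmap()` uses
`IS_ERR()` and `PTR_ERR()` in patch 3.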

> Regards,
>
> --
> Louvian Lyndal


end of thread, other threads:[~2021-10-01  7:36 UTC | newest]

Thread overview: 10+ messages
2021-09-29 10:16 [PATCHSET v1 RFC liburing 0/6] Implement the kernel style return value Ammar Faizi
2021-09-29 10:16 ` [PATCHSET v1 RFC liburing 1/6] src/syscall: " Ammar Faizi
2021-09-29 10:16 ` [PATCHSET v1 RFC liburing 2/6] Add kernel error header `src/kernel_err.h` Ammar Faizi
2021-09-29 10:16 ` [PATCHSET v1 RFC liburing 3/6] Add `liburing_mmap()` and `liburing_munmap()` Ammar Faizi
2021-09-29 10:16 ` [PATCHSET v1 RFC liburing 4/6] Add `liburing_madvise()` Ammar Faizi
2021-09-29 10:16 ` [PATCHSET v1 RFC liburing 5/6] Add `liburing_getrlimit()` and `liburing_setrlimit()` Ammar Faizi
2021-09-29 10:16 ` [PATCHSET v1 RFC liburing 6/6] src/{queue,register,setup}: Remove `#include <errno.h>` Ammar Faizi
2021-09-29 10:21 ` [PATCHSET v1 RFC liburing 0/6] Implement the kernel style return value Ammar Faizi
2021-10-01  6:44   ` Louvian Lyndal
2021-10-01  7:36     ` Ammar Faizi
