linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ali Raza <aliraza@bu.edu>
To: linux-kernel@vger.kernel.org
Cc: corbet@lwn.net, masahiroy@kernel.org, michal.lkml@markovi.net,
	ndesaulniers@google.com, tglx@linutronix.de, mingo@redhat.com,
	bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com,
	luto@kernel.org, ebiederm@xmission.com, keescook@chromium.org,
	peterz@infradead.org, viro@zeniv.linux.org.uk, arnd@arndb.de,
	juri.lelli@redhat.com, vincent.guittot@linaro.org,
	dietmar.eggemann@arm.com, rostedt@goodmis.org,
	bsegall@google.com, mgorman@suse.de, bristot@redhat.com,
	vschneid@redhat.com, pbonzini@redhat.com, jpoimboe@kernel.org,
	linux-doc@vger.kernel.org, linux-kbuild@vger.kernel.org,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-arch@vger.kernel.org, x86@kernel.org, rjones@redhat.com,
	munsoner@bu.edu, tommyu@bu.edu, drepper@redhat.com,
	lwoodman@redhat.com, mboydmcse@gmail.com, okrieg@bu.edu,
	rmancuso@bu.edu, Ali Raza <aliraza@bu.edu>
Subject: [RFC UKL 10/10] Kconfig: Add config option for enabling and sample for testing UKL
Date: Mon,  3 Oct 2022 18:21:33 -0400	[thread overview]
Message-ID: <20221003222133.20948-11-aliraza@bu.edu> (raw)
In-Reply-To: <20221003222133.20948-1-aliraza@bu.edu>

Add the KConfig file that will enable building UKL. Documentation
introduces the technical details for how UKL works and the motivations
behind why it is useful. Sample provides a simple program that still uses
the standard system call interface, but does not require a modified C
library.

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>

Co-developed-by: Eric B Munson <munsoner@bu.edu>
Signed-off-by: Eric B Munson <munsoner@bu.edu>
Co-developed-by: Ali Raza <aliraza@bu.edu>
Signed-off-by: Ali Raza <aliraza@bu.edu>
---
 Documentation/index.rst   |   1 +
 Documentation/ukl/ukl.rst | 104 ++++++++++++++++++++++++++++++++++++++
 Kconfig                   |   2 +
 kernel/Kconfig.ukl        |  41 +++++++++++++++
 samples/ukl/Makefile      |  16 ++++++
 samples/ukl/README        |  17 +++++++
 samples/ukl/syscall.S     |  28 ++++++++++
 samples/ukl/tcp_server.c  |  99 ++++++++++++++++++++++++++++++++++++
 8 files changed, 308 insertions(+)
 create mode 100644 Documentation/ukl/ukl.rst
 create mode 100644 kernel/Kconfig.ukl
 create mode 100644 samples/ukl/Makefile
 create mode 100644 samples/ukl/README
 create mode 100644 samples/ukl/syscall.S
 create mode 100644 samples/ukl/tcp_server.c

diff --git a/Documentation/index.rst b/Documentation/index.rst
index 4737c18c97ff..42f8cb7d4cae 100644
--- a/Documentation/index.rst
+++ b/Documentation/index.rst
@@ -167,6 +167,7 @@ to ReStructured Text format, or are simply too old.
 
    tools/index
    staging/index
+   ukl/ukl.rst
 
 
 Translations
diff --git a/Documentation/ukl/ukl.rst b/Documentation/ukl/ukl.rst
new file mode 100644
index 000000000000..a07ebb51169e
--- /dev/null
+++ b/Documentation/ukl/ukl.rst
@@ -0,0 +1,104 @@
+SPDX-License-Identifier: GPL-2.0
+
+Unikernel Linux (UKL)
+=====================
+
+Unikernel Linux (UKL) is a research project aimed at integrating
+application specific optimizations to the Linux kernel. This RFC aims to
+introduce this research to the community. Any feedback regarding the idea,
+goals, implementation and research is highly appreciated.
+
+Unikernels are specialized operating systems where an application is linked
+directly with the kernel and runs in supervisor mode. This allows the
+developers to implement application specific optimizations to the kernel,
+which can be directly invoked by the application (without going through the
+syscall path). An application can control scheduling and resource
+management and directly access the hardware. Application and the kernel can
+be co-optimized, e.g., through LTO, PGO, etc. All of these optimizations,
+and others, provide applications with huge performance benefits over
+general purpose operating systems.
+
+Linux is the de-facto operating system of today. Applications depend on its
+battle tested code base, large developer community, support for legacy
+code, a huge ecosystem of tools and utilities, and a wide range of
+compatible hardware and device drivers. Linux also allows some degree of
+application specific optimizations through build time config options,
+runtime configuration, and recently through eBPF. But still, there is a
+need for even more fine-grained application specific optimizations, and
+some developers resort to kernel bypass techniques.
+
+Unikernel Linux (UKL) aims to get the best of both worlds by bringing
+application specific optimizations to the Linux ecosystem. This way,
+unmodified applications can keep getting the benefits of Linux while taking
+advantage of the unikernel-style optimizations. Optionally, applications
+can be modified to invoke deeper optimizations.
+
+There are two steps to unikernel-izing Linux, i.e., first, equip Linux with
+a unikernel model, and second, actually use that model to implement
+application specific optimizations. This patch focuses on the first part.
+Through this patch, unmodified applications can be built as Linux
+unikernels, albeit with only modest performance advantages. Like
+unikernels, UKL would allow an application to be statically linked into the
+kernel and executed in supervisor mode. However, UKL preserves most of the
+invariants and design of Linux, including a separate page-able application
+portion of the address space and a pinned kernel portion, the ability to
+run multiple processes, and distinct execution modes for application and
+kernel code. Kernel execution mode and application execution mode are
+different, e.g., the application execution mode allows application threads
+to be scheduled, handle signals, etc., which do not apply to kernel
+threads. Application built as a Linux unikernel will have its text and data
+loaded with the kernel at boot time, while the rest of the address space
+would remain unchanged. These applications invoke the system call
+functionality through a function call into the kernel system call entry
+point instead of through the syscall assembly instruction. UKL would
+support a normal userspace so the UKL application can be started, managed,
+profiled, etc., using normal command line utilities.
+
+Once Linux has a unikernel model, different application specific
+optimizations are possible. We have tried a few, e.g., fast system call
+transitions, shared stacks to allow LTO, invoking kernel functions
+directly, etc. We have seen huge performance benefits, details of which are
+not relevant to this patch and can be found in our paper.
+(https://arxiv.org/pdf/2206.00789.pdf)
+
+UKL differs significantly from previous projects, e.g., UML, KML and LKL.
+User Mode Linux (UML) is a virtual machine monitor implemented on syscall
+interface, a very different goal from UKL. Kernel Mode Linux (KML) allows
+applications to run in kernel mode and replaces syscalls with function
+calls. While KML stops there, UKL goes further. UKL links applications and
+kernel together which allows further optimizations e.g., fast system call
+transitions, shared stacks to allow LTO, invoking kernel functions directly
+etc. Details can be found in the paper linked above. Linux Kernel Library
+(LKL) harvests arch independent code from Linux, takes it to userspace as a
+library to be linked with applications. A host needs to provide arch
+dependent functionality. This model is very different from UKL. A detailed
+discussion of related work is present in the paper linked above.
+
+See samples/ukl for a simple TCP echo server example which can be built as
+a normal user space application and also as a UKL application. In the Linux
+config options, a path to the compiled and partially linked application
+binary can be specified. Kernel built with UKL enabled will search this
+location for the binary and link with the kernel. Applications and required
+libraries need to be compiled with -mno-red-zone -mcmodel=kernel flags
+because kernel mode execution can trample on application red zones and in
+order to link with the kernel and be loaded in the high end of the address
+space, application should have the correct memory model. Examples of other
+applications like Redis, Memcached etc along with glibc and libgcc etc.,
+can be found at https://github.com/unikernelLinux/ukl
+
+List of authors and contributors:
+=================================
+
+Ali Raza - aliraza@bu.edu
+Thomas Unger - tommyu@bu.edu
+Matthew Boyd - mboydmcse@gmail.com
+Eric Munson - munsoner@bu.edu
+Parul Sohal - psohal@bu.edu
+Ulrich Drepper - drepper@redhat.com
+Richard Jones - rjones@redhat.com
+Daniel Bristot de Oliveira - bristot@kernel.org
+Larry Woodman - lwoodman@redhat.com
+Renato Mancuso - rmancuso@bu.edu
+Jonathan Appavoo - jappavoo@bu.edu
+Orran Krieger - okrieg@bu.edu
+
diff --git a/Kconfig b/Kconfig
index 745bc773f567..2a4594ae472c 100644
--- a/Kconfig
+++ b/Kconfig
@@ -29,4 +29,6 @@ source "lib/Kconfig"
 
 source "lib/Kconfig.debug"
 
+source "kernel/Kconfig.ukl"
+
 source "Documentation/Kconfig"
diff --git a/kernel/Kconfig.ukl b/kernel/Kconfig.ukl
new file mode 100644
index 000000000000..c2c5e1003605
--- /dev/null
+++ b/kernel/Kconfig.ukl
@@ -0,0 +1,41 @@
+menuconfig UNIKERNEL_LINUX
+	bool "Unikernel Linux"
+	depends on X86_64 && !RANDOMIZE_BASE && !PAGE_TABLE_ISOLATION
+	help
+	    Unikernel Linux allows for a single, privileged process to be
+	    linked with the kernel binary and be executed inplace of or
+	    along side a more traditional user space.
+
+	    If you don't know what this is, say N.
+
+config UKL_TLS
+	bool "Enable TLS for UKL application"
+	depends on UNIKERNEL_LINUX
+	default Y
+	help
+	    Not all applications will make use of thread local storage,
+	    but we need to account for it in the linker script if used.
+	    For the application in samples/ this should be disabled, but
+	    if you are working with glibc this should be 'Y'.
+
+	    If unsure say 'Y' here
+
+config UKL_NAME
+	string "UKL Exec target"
+	depends on UNIKERNEL_LINUX
+	default "/UKL"
+	help
+	    We need a way to trigger the start of the UKL application,
+	    either by the kernel inplace of init or userspace when setup
+	    is finished. The value given here is compared against the
+	    filename passed to exec and if they match UKL is started.
+	    For a more 'traditional' unikernel model, the value set here
+	    should be given to the init= boot parameter.
+
+config UKL_ARCHIVE_PATH
+	string "Path static application archive"
+	depends on UNIKERNEL_LINUX
+	default "../UKL.a"
+	help
+	    Where the linker should look for the statically linked application
+	    and dependency archive.
diff --git a/samples/ukl/Makefile b/samples/ukl/Makefile
new file mode 100644
index 000000000000..93beb7750d4b
--- /dev/null
+++ b/samples/ukl/Makefile
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: GPL-2.0
+
+CFLAGS += -I usr/include -fno-PIC -mno-red-zone -mcmodel=kernel
+
+UKL.a: tcp_server.o syscall.o userspace
+	$(AR) cr UKL.a tcp_server.o syscall.o
+	objcopy --prefix-symbols=ukl_ UKL.a
+
+tcp_server.o: tcp_server.c
+syscall.o: syscall.S
+
+userspace:
+	gcc -o tcp_server tcp_server.c
+
+clean:
+	rm -f UKL.a tcp_server.o syscall.o tcp_server
diff --git a/samples/ukl/README b/samples/ukl/README
new file mode 100644
index 000000000000..fbb771da033a
--- /dev/null
+++ b/samples/ukl/README
@@ -0,0 +1,17 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+UKL test program
+================
+
+tcp_server.c is a epoll based TCP echo server written in C which uses port
+no. 5555 by default. syscall.S translates syscall() function to a call
+instruction in assembly. Normally, C libraries provide syscall() function
+that translate into syscall assembly instruction. Run `make` and it will
+create a UKL.a and a tcp_server. UKL.a can then be copied to where UKL
+Linux build expects it to be present. This can be changed through the Linux
+config options (by running `make menuconfig` etc.) The resulting Linux
+kernel can be run, and once the userspace comes up, the echo server can be
+started by running the UKL exec command, again chosen through the Linux
+config options. tcp_server is a userspace binary of the same echo server
+which can be run normally. This is meant to show that UKL can run code
+which can also be run as a userspace binary without modification.
diff --git a/samples/ukl/syscall.S b/samples/ukl/syscall.S
new file mode 100644
index 000000000000..95d1c177fb05
--- /dev/null
+++ b/samples/ukl/syscall.S
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+	.global _start
+_start:
+	jmp main
+
+	.global syscall
+
+/* Usage: long syscall (syscall_number, arg1, arg2, arg3, arg4, arg5, arg6)
+   We need to do some arg shifting, the syscall_number will be in
+   rax.  */
+
+	.text
+syscall:
+	movq %rdi, %rax		/* Syscall number -> rax.  */
+	movq %rsi, %rdi		/* shift arg1 - arg5.  */
+	movq %rdx, %rsi
+	movq %rcx, %rdx
+	movq %r8, %r10
+	movq %r9, %r8
+	movq 8(%rsp),%r9	/* arg6 is on the stack.  */
+	call entry_SYSCALL_64	/* Do the system call.  */
+	cmpq $-4095, %rax	/* Check %rax for error.  */
+	jae loop	/* Jump to error handler if error.  */
+	ret			/* Return to caller.  */
+
+loop:
+	jmp loop
diff --git a/samples/ukl/tcp_server.c b/samples/ukl/tcp_server.c
new file mode 100644
index 000000000000..abf1a0e2bb79
--- /dev/null
+++ b/samples/ukl/tcp_server.c
@@ -0,0 +1,99 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <sys/epoll.h>
+#include <arpa/inet.h>
+#include <netinet/tcp.h>
+
+#define BACKLOG 512
+#define MAX_EVENTS 128
+#define MAX_MESSAGE_LEN 2048
+
+void error(char *msg);
+extern long syscall(long number, ...);
+
+int main(void)
+{
+	// some variables we need
+	struct sockaddr_in server_addr, client_addr;
+	socklen_t client_len = sizeof(client_addr);
+	int bytes_received;
+	char buffer[MAX_MESSAGE_LEN];
+	int on;
+	int result;
+	int sock_listen_fd, newsockfd;
+
+	// setup socket
+	sock_listen_fd = syscall(41, AF_INET, SOCK_STREAM, 0);
+	if (sock_listen_fd < 0)
+		error("Error creating socket..\n");
+
+	server_addr.sin_family = AF_INET;
+	server_addr.sin_port = 45845; //htons(portno);
+	server_addr.sin_addr.s_addr = INADDR_ANY;
+
+	// set TCP NODELAY
+	on = 1;
+	result = syscall(54, sock_listen_fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
+	if (result < 0)
+		error("Can't set TCP_NODELAY to on");
+
+	// bind socket and listen for connections
+	if (syscall(49, sock_listen_fd, (struct sockaddr *)&server_addr, sizeof(server_addr)) < 0)
+		error("Error binding socket..\n");
+
+	if (syscall(50, sock_listen_fd, BACKLOG) < 0)
+		error("Error listening..\n");
+
+	struct epoll_event ev, events[MAX_EVENTS];
+	int new_events, sock_conn_fd, epollfd;
+
+	epollfd = syscall(213, MAX_EVENTS);
+	if (epollfd < 0)
+		error("Error creating epoll..\n");
+
+	ev.events = EPOLLIN;
+	ev.data.fd = sock_listen_fd;
+
+	if (syscall(233, epollfd, EPOLL_CTL_ADD, sock_listen_fd, &ev) == -1)
+		error("Error adding new listeding socket to epoll..\n");
+
+	while (1) {
+		new_events = syscall(232, epollfd, events, MAX_EVENTS, -1);
+
+		if (new_events == -1)
+			error("Error in epoll_wait..\n");
+
+		for (int i = 0; i < new_events; ++i) {
+			if (events[i].data.fd == sock_listen_fd) {
+				sock_conn_fd = syscall(288, sock_listen_fd,
+						(struct sockaddr *)&client_addr,
+						&client_len, SOCK_NONBLOCK);
+				if (sock_conn_fd == -1)
+					error("Error accepting new connection..\n");
+
+				ev.events = EPOLLIN | EPOLLET;
+				ev.data.fd = sock_conn_fd;
+				if (syscall(233, epollfd, EPOLL_CTL_ADD, sock_conn_fd, &ev) == -1)
+					error("Error adding new event to epoll..\n");
+			} else {
+				newsockfd = events[i].data.fd;
+				bytes_received = syscall(45, newsockfd, buffer, MAX_MESSAGE_LEN,
+						0, NULL, NULL);
+				if (bytes_received <= 0) {
+					syscall(233, epollfd, EPOLL_CTL_DEL, newsockfd, NULL);
+					syscall(48, newsockfd, SHUT_RDWR);
+				} else {
+					syscall(44, newsockfd, buffer, bytes_received, 0, NULL, 0);
+				}
+			}
+		}
+	}
+}
+
+void error(char *msg)
+{
+	syscall(1, 1, msg, 15);
+	syscall(60, 1);
+}
-- 
2.21.3


  parent reply	other threads:[~2022-10-03 22:24 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-03 22:21 [RFC UKL 00/10] Unikernel Linux (UKL) Ali Raza
2022-10-03 22:21 ` [RFC UKL 01/10] kbuild: Add sections and symbols to linker script for UKL support Ali Raza
2022-10-03 22:21 ` [RFC UKL 02/10] x86/boot: Load the PT_TLS segment for Unikernel configs Ali Raza
2022-10-04 17:30   ` Andy Lutomirski
2022-10-06 21:00     ` Ali Raza
2022-10-03 22:21 ` [RFC UKL 03/10] sched: Add task_struct tracking of kernel or application execution Ali Raza
2022-10-03 22:21 ` [RFC UKL 04/10] x86/entry: Create alternate entry path for system calls Ali Raza
2022-10-04 17:43   ` Andy Lutomirski
2022-10-06 21:12     ` Ali Raza
2022-10-03 22:21 ` [RFC UKL 05/10] x86/uaccess: Make access_ok UKL aware Ali Raza
2022-10-04 17:36   ` Andy Lutomirski
2022-10-06 21:16     ` Ali Raza
2022-10-03 22:21 ` [RFC UKL 06/10] x86/fault: Skip checking kernel mode access to user address space for UKL Ali Raza
2022-10-03 22:21 ` [RFC UKL 07/10] x86/signal: Adjust signal handler register values and return frame Ali Raza
2022-10-04 17:34   ` Andy Lutomirski
2022-10-06 21:20     ` Ali Raza
2022-10-03 22:21 ` [RFC UKL 08/10] exec: Make exec path for starting UKL application Ali Raza
2022-10-03 22:21 ` [RFC UKL 09/10] exec: Give userspace a method for starting UKL process Ali Raza
2022-10-04 17:35   ` Andy Lutomirski
2022-10-06 21:25     ` Ali Raza
2022-10-03 22:21 ` Ali Raza [this message]
2022-10-04  2:11   ` [RFC UKL 10/10] Kconfig: Add config option for enabling and sample for testing UKL Bagas Sanjaya
2022-10-06 21:28     ` Ali Raza
2022-10-07 10:21       ` Masahiro Yamada
2022-10-13 17:08         ` Ali Raza
2022-10-06 21:27 ` [RFC UKL 00/10] Unikernel Linux (UKL) H. Peter Anvin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221003222133.20948-11-aliraza@bu.edu \
    --to=aliraza@bu.edu \
    --cc=arnd@arndb.de \
    --cc=bp@alien8.de \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=drepper@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=hpa@zytor.com \
    --cc=jpoimboe@kernel.org \
    --cc=juri.lelli@redhat.com \
    --cc=keescook@chromium.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kbuild@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=lwoodman@redhat.com \
    --cc=masahiroy@kernel.org \
    --cc=mboydmcse@gmail.com \
    --cc=mgorman@suse.de \
    --cc=michal.lkml@markovi.net \
    --cc=mingo@redhat.com \
    --cc=munsoner@bu.edu \
    --cc=ndesaulniers@google.com \
    --cc=okrieg@bu.edu \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rjones@redhat.com \
    --cc=rmancuso@bu.edu \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=tommyu@bu.edu \
    --cc=vincent.guittot@linaro.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=vschneid@redhat.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).