* [RFC PATCH 00/47] Unifying LKL into UML
@ 2019-10-23  4:37 Hajime Tazaki
From: Hajime Tazaki @ 2019-10-23  4:37 UTC
  To: linux-um; +Cc: Octavian Purdila, Hajime Tazaki, Akira Moroo

This RFC patchset asks the UML developers whether the LKL code is suitable
for integration into the UML code base.  We welcome any kind of feedback
from your reviews.  A number of these commits should also be reviewed on
other mailing lists; we will send them there once they have been discussed
on this mailing list.

# sorry for the long list of patches: we can make it smaller by only
  including the basic set of LKL (e.g., removing foreign OS support, etc.)
  if you wish.



LKL (Linux Kernel Library) aims to allow reusing the Linux kernel code
as extensively as possible with minimal effort and reduced maintenance
overhead.

Examples of how LKL can be used are: creating userspace applications
(running on Linux and other operating systems) that can read or write Linux
filesystems or can use the Linux networking stack, creating kernel drivers
for other operating systems that can read Linux filesystems, bootloader
support for reading/writing Linux filesystems, etc.

With LKL, the kernel code is compiled into an object file that can be
directly linked by applications. The API offered by LKL is based on the
Linux system call interface.

LKL was originally implemented as an architecture port in arch/lkl, but this
series of commits tries to integrate it into arch/um as one of the modes
of UML.  This was discussed in the RFC email for LKL (*1).

The latest LKL version can be found at https://github.com/lkl/linux

Milestone
=========
These patches are just a first step toward upstreaming the *library mode*
of the Linux kernel; we think several steps are needed to reach our goal,
as described below.

1. Put LKL code under arch/um (arch/um/lkl), and build it in a
separate way from UML.
2. Share common parts of implementation between UML and LKL.
3. Reimplement UML features with LKL API (if we wish)

For step 1, we add LKL as one of the SUBARCHs to reduce the integration
effort (make ARCH=um SUBARCH=lkl).  Modifications to the existing UML code
are kept to a minimum.

The RFC patches also include a bit of step 2 as a proof that the code can
be shared.  For this, we take the virtio device code of LKL and use it from
UML by enabling the virtio-mmio driver in the UML code.



Building LKL, the host library and LKL applications
===================================================

% cd tools/lkl
% make

will build LKL as an object file, install it in tools/lkl/lib together
with the header files in tools/lkl/include, and then build the host library,
tests and a few example applications:

* tests/boot - a simple application that uses LKL and exercises the basic
LKL APIs

* tests/net-test - a simple application that uses the network feature of
LKL and exercises the basic network-related APIs

* fs2tar - a tool that converts a filesystem image to a tar archive

* cptofs/cpfromfs - a tool that copies files to/from a filesystem image

% make run-tests

should run the above `tests/boot` and `tests/net-test` and report errors if
there are any.

Supported hosts
===============

Currently LKL supports POSIX and Windows userspace applications. New hosts
can be added relatively easily if the host supports gcc and GNU ld. Previous
versions of LKL supported Windows kernel and Haiku kernel hosts, and we
also have WIP patches (not included in this RFC) for the rump-hypercall
interface, used in UEFI, as well as for macOS userspace (part of POSIX?).

There is also a musl-libc port for LKL, which might be of interest to some
folks.


Further reading about LKL
=========================

- Discussion in a GitHub LKL issue
https://github.com/lkl/linux/issues/304

- LKL (an article)
https://www.researchgate.net/profile/Nicolae_Tapus2/publication/224164682_LKL_The_Linux_kernel_library/links/02bfe50fd921ab4f7c000000.pdf

*1 RFC email to LKML (back in 2015)
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1012277.html



Please review the following changes for suitability for inclusion. If you have
any objections or suggestions for improvement, please respond to the patches. If
you agree with the changes, please provide your Acked-by.

The following changes since commit 73625ed66389d4c620520058d828f43a93ab4d0c:

  um: irq: Fix LAST_IRQ usage in init_IRQ() (2019-09-16 08:38:58 +0200)

are available in the Git repository at:

  git://github.com/thehajime/linux d380ec02dd0cd97afe08706093b59329e6b09fe2
  https://github.com/thehajime/linux/tree/upstream-to-uml-5.5-rc1

Akira Moroo (2):
  Revert "vmlinux.lds.h: remove stale <linux/export.h> include"
  um lkl: use ARCH=um SUBARCH=lkl for tools/lkl

Andreas Abel (1):
  kallsyms: Add a config option to select section for kallsyms

Hajime Tazaki (9):
  lkl: Android ARM (arm/arm64) support
  Revert "export.h: remove code for prefixing symbols with underscore"
  Revert "linux/linkage.h: replace VMLINUX_SYMBOL_STR() with
    __stringify()"
  Revert "vmlinux.lds.h: remove no-op macro VMLINUX_SYMBOL()"
  Revert "kbuild: remove CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX"
  Revert "kallsyms: remove symbol prefix support"
  um lkl: add CI tests
  um: use lkl virtio_net_tap device as UML device
  um: add lkl virtio-blk device

Octavian Purdila (34):
  asm-generic: atomic64: allow using generic atomic64 on 64bit platforms
  kbuild: allow architectures to automatically define kconfig symbols
  lkl: architecture skeleton for Linux kernel library
  lkl: host interface
  lkl: memory handling
  lkl: kernel threads support
  lkl: interrupt support
  lkl: system call interface and application API
  lkl: timers, time and delay support
  lkl: memory mapped I/O support
  lkl: basic kernel console support
  lkl: initialization and cleanup
  lkl: plug in the build system
  lkl tools: skeleton for host side library, tests and tools
  lkl tools: host lib: add utilities functions
  lkl tools: host lib: memory mapped I/O helpers
  lkl tools: host lib: virtio devices
  lkl tools: host lib: virtio block device
  lkl tools: host lib: filesystem helpers
  lkl tools: host lib: posix host operations
  lkl tools: "boot" test
  lkl tools: tool that converts a filesystem image to tar
  lkl tools: tool that reads/writes to/from a filesystem image
  lkl tools: virtio: add network device support
  lkl: add support for Windows hosts
  lkl tools: add support for Windows host
  lkl tools: add lklfuse
  lkl: add initial system call hijack support (a.k.a. NUSE of libos)
  lkl: add documentation
  cpu: add cpu_yield_to_irqs
  signal: use CONFIG_X86_32 instead of __i386__
  arch: add __SYSCALL_DEFINE_ARCH
  xfs: support for non-mmu architectures
  checkpatch: avoid showing BIT_ULL warnings for tools/ files

Thomas Liebetraut (1):
  tools: Add the lkl host library to the common tools Makefile

 .circleci/config.yml                          | 248 +++++
 Documentation/lkl.txt                         | 470 +++++++++
 MAINTAINERS                                   |   8 +
 Makefile                                      |   3 +
 README.md                                     |   1 +
 arch/Kconfig                                  |   6 +
 arch/um/Kconfig                               |  56 +-
 arch/um/Makefile                              | 115 +--
 arch/um/Makefile.um                           | 121 +++
 arch/um/auto.conf                             |   0
 arch/um/configs/x86_64_defconfig              |   6 +
 arch/um/include/asm/Kbuild                    |   6 +
 arch/um/include/asm/io.h                      |   4 +
 arch/um/lkl/.gitignore                        |   2 +
 arch/um/lkl/Kconfig                           |  96 ++
 arch/um/lkl/Kconfig.debug                     |   0
 arch/um/lkl/Makefile                          |   0
 arch/um/lkl/Makefile.um                       |  70 ++
 arch/um/lkl/auto.conf                         |   1 +
 arch/um/lkl/configs/lkl_defconfig             |  95 ++
 arch/um/lkl/include/asm/Kbuild                |  80 ++
 arch/um/lkl/include/asm/bitsperlong.h         |  11 +
 arch/um/lkl/include/asm/byteorder.h           |   7 +
 arch/um/lkl/include/asm/cpu.h                 |  14 +
 arch/um/lkl/include/asm/elf.h                 |  15 +
 arch/um/lkl/include/asm/host_ops.h            |  12 +
 arch/um/lkl/include/asm/io.h                  | 104 ++
 arch/um/lkl/include/asm/irq.h                 |  15 +
 arch/um/lkl/include/asm/mutex.h               |   7 +
 arch/um/lkl/include/asm/page.h                |  14 +
 arch/um/lkl/include/asm/pgtable.h             |  62 ++
 arch/um/lkl/include/asm/processor.h           |  60 ++
 arch/um/lkl/include/asm/ptrace.h              |  25 +
 arch/um/lkl/include/asm/sched.h               |  23 +
 arch/um/lkl/include/asm/setup.h               |   7 +
 arch/um/lkl/include/asm/syscalls.h            |  18 +
 arch/um/lkl/include/asm/syscalls_32.h         |  43 +
 arch/um/lkl/include/asm/thread_info.h         |  70 ++
 arch/um/lkl/include/asm/tlb.h                 |  12 +
 arch/um/lkl/include/asm/uaccess.h             |  64 ++
 arch/um/lkl/include/asm/unistd.h              |  29 +
 arch/um/lkl/include/asm/unistd_32.h           |  31 +
 arch/um/lkl/include/asm/vmlinux.lds.h         |  14 +
 arch/um/lkl/include/asm/xor.h                 |   9 +
 arch/um/lkl/include/system/stdarg.h           |   2 +
 arch/um/lkl/include/uapi/asm/Kbuild           |   9 +
 arch/um/lkl/include/uapi/asm/bitsperlong.h    |  13 +
 arch/um/lkl/include/uapi/asm/byteorder.h      |  11 +
 arch/um/lkl/include/uapi/asm/host_ops.h       | 153 +++
 arch/um/lkl/include/uapi/asm/irq.h            |  36 +
 arch/um/lkl/include/uapi/asm/sigcontext.h     |  16 +
 arch/um/lkl/include/uapi/asm/siginfo.h        |  11 +
 arch/um/lkl/include/uapi/asm/swab.h           |  11 +
 arch/um/lkl/include/uapi/asm/syscalls.h       | 348 +++++++
 arch/um/lkl/include/uapi/asm/unistd.h         |  18 +
 arch/um/lkl/kernel/Makefile                   |   4 +
 arch/um/lkl/kernel/asm-offsets.c              |   2 +
 arch/um/lkl/kernel/console.c                  |  42 +
 arch/um/lkl/kernel/cpu.c                      | 223 +++++
 arch/um/lkl/kernel/irq.c                      | 193 ++++
 arch/um/lkl/kernel/misc.c                     |  60 ++
 arch/um/lkl/kernel/setup.c                    | 193 ++++
 arch/um/lkl/kernel/syscalls.c                 | 246 +++++
 arch/um/lkl/kernel/syscalls_32.c              | 159 +++
 arch/um/lkl/kernel/threads.c                  | 227 +++++
 arch/um/lkl/kernel/time.c                     | 145 +++
 arch/um/lkl/kernel/vmlinux.lds.S              |  51 +
 arch/um/lkl/mm/Makefile                       |   1 +
 arch/um/lkl/mm/bootmem.c                      |  66 ++
 arch/um/lkl/scripts/headers_install.py        | 195 ++++
 arch/um/lkl/um/Makefile                       |   1 +
 .../um/lkl/um/include/sysdep/kernel-offsets.h |   4 +
 arch/um/os-Linux/Makefile                     |   5 +
 arch/um/os-Linux/lkl_dev.c                    | 188 ++++
 arch/x86/um/syscalls_64.c                     |  53 +
 crypto/xor.c                                  |   2 +
 fs/xfs/xfs_buf.c                              |  26 +
 include/asm-generic/atomic64.h                |   2 +
 include/asm-generic/export.h                  |  34 +-
 include/asm-generic/vmlinux.lds.h             | 293 +++---
 include/linux/compiler_attributes.h           |   4 +
 include/linux/cpu.h                           |   1 +
 include/linux/export.h                        |  23 +-
 include/linux/linkage.h                       |  12 +-
 include/linux/syscalls.h                      |   6 +
 init/Kconfig                                  |  12 +
 kernel/cpu.c                                  |   5 +
 kernel/signal.c                               |   2 +-
 lib/.gitignore                                |   2 +
 lib/raid6/.gitignore                          |   1 +
 lib/raid6/algos.c                             |   9 +-
 scripts/.gitignore                            |   2 +
 scripts/Makefile.build                        |   7 +-
 scripts/adjust_autoksyms.sh                   |   7 +-
 scripts/basic/.gitignore                      |   1 +
 scripts/checkpatch.pl                         |   3 +-
 scripts/kallsyms.c                            |  58 +-
 scripts/kconfig/.gitignore                    |   1 +
 scripts/link-vmlinux.sh                       |  10 +
 scripts/mod/.gitignore                        |   1 +
 tools/Makefile                                |  11 +-
 tools/lkl/.gitignore                          |  14 +
 tools/lkl/Build                               |   6 +
 tools/lkl/Makefile                            | 130 +++
 tools/lkl/Makefile.autoconf                   | 114 +++
 tools/lkl/Targets                             |  27 +
 tools/lkl/bin/arm-linux-androideabi-ld        |   1 +
 tools/lkl/bin/lkl-hijack.sh                   |  23 +
 tools/lkl/cptofs.c                            | 635 ++++++++++++
 tools/lkl/fs2tar.c                            | 410 ++++++++
 tools/lkl/include/.gitignore                  |   1 +
 tools/lkl/include/lkl.h                       | 928 ++++++++++++++++++
 tools/lkl/include/lkl_config.h                |  61 ++
 tools/lkl/include/lkl_host.h                  | 160 +++
 tools/lkl/include/mingw32/sys/socket.h        |   4 +
 tools/lkl/lib/.gitignore                      |   3 +
 tools/lkl/lib/Build                           |  25 +
 tools/lkl/lib/Makefile                        |  32 +
 tools/lkl/lib/config.c                        | 793 +++++++++++++++
 tools/lkl/lib/dbg.c                           | 300 ++++++
 tools/lkl/lib/dbg_handler.c                   |  44 +
 tools/lkl/lib/endian.h                        |  31 +
 tools/lkl/lib/fs.c                            | 433 ++++++++
 tools/lkl/lib/hijack/Build                    |   4 +
 tools/lkl/lib/hijack/hijack.c                 | 618 ++++++++++++
 tools/lkl/lib/hijack/init.c                   | 252 +++++
 tools/lkl/lib/hijack/init.h                   |   8 +
 tools/lkl/lib/hijack/xlate.c                  | 613 ++++++++++++
 tools/lkl/lib/hijack/xlate.h                  |  13 +
 tools/lkl/lib/iomem.c                         |  88 ++
 tools/lkl/lib/iomem.h                         |  15 +
 tools/lkl/lib/jmp_buf.c                       |  14 +
 tools/lkl/lib/jmp_buf.h                       |   8 +
 tools/lkl/lib/net.c                           | 818 +++++++++++++++
 tools/lkl/lib/nt-host.c                       | 375 +++++++
 tools/lkl/lib/posix-host.c                    | 439 +++++++++
 tools/lkl/lib/utils.c                         | 266 +++++
 tools/lkl/lib/virtio.c                        | 644 ++++++++++++
 tools/lkl/lib/virtio.h                        | 115 +++
 tools/lkl/lib/virtio_blk.c                    | 132 +++
 tools/lkl/lib/virtio_net.c                    | 342 +++++++
 tools/lkl/lib/virtio_net_dpdk.c               | 480 +++++++++
 tools/lkl/lib/virtio_net_fd.c                 | 195 ++++
 tools/lkl/lib/virtio_net_fd.h                 |  50 +
 tools/lkl/lib/virtio_net_macvtap.c            |  32 +
 tools/lkl/lib/virtio_net_pipe.c               |  76 ++
 tools/lkl/lib/virtio_net_raw.c                |  94 ++
 tools/lkl/lib/virtio_net_tap.c                | 111 +++
 tools/lkl/lib/virtio_net_vde.c                | 168 ++++
 tools/lkl/lklfuse.c                           | 658 +++++++++++++
 tools/lkl/scripts/checkpatch.sh               |  60 ++
 tools/lkl/scripts/dpdk-sdk-build.sh           |  18 +
 tools/lkl/scripts/lkl-jenkins.sh              |  21 +
 tools/lkl/tests/Build                         |   3 +
 tools/lkl/tests/boot.c                        | 562 +++++++++++
 tools/lkl/tests/boot.sh                       |   9 +
 tools/lkl/tests/cla.c                         | 159 +++
 tools/lkl/tests/cla.h                         |  33 +
 tools/lkl/tests/disk.c                        | 189 ++++
 tools/lkl/tests/disk.sh                       |  70 ++
 tools/lkl/tests/hijack-test.sh                | 760 ++++++++++++++
 tools/lkl/tests/lklfuse.sh                    | 110 +++
 tools/lkl/tests/net-setup.sh                  | 134 +++
 tools/lkl/tests/net-test.c                    | 317 ++++++
 tools/lkl/tests/net.sh                        | 186 ++++
 tools/lkl/tests/run.py                        | 186 ++++
 tools/lkl/tests/run_netperf.sh                |  98 ++
 tools/lkl/tests/tap13.py                      | 209 ++++
 tools/lkl/tests/test.c                        | 126 +++
 tools/lkl/tests/test.h                        |  72 ++
 tools/lkl/tests/test.sh                       | 240 +++++
 tools/lkl/tests/valgrind.supp                 |  85 ++
 tools/lkl/tests/valgrind2xunit.py             |  69 ++
 173 files changed, 19464 insertions(+), 330 deletions(-)
 create mode 100644 .circleci/config.yml
 create mode 100644 Documentation/lkl.txt
 create mode 120000 README.md
 create mode 100644 arch/um/Makefile.um
 create mode 100644 arch/um/auto.conf
 create mode 100644 arch/um/lkl/.gitignore
 create mode 100644 arch/um/lkl/Kconfig
 create mode 100644 arch/um/lkl/Kconfig.debug
 create mode 100644 arch/um/lkl/Makefile
 create mode 100644 arch/um/lkl/Makefile.um
 create mode 100644 arch/um/lkl/auto.conf
 create mode 100644 arch/um/lkl/configs/lkl_defconfig
 create mode 100644 arch/um/lkl/include/asm/Kbuild
 create mode 100644 arch/um/lkl/include/asm/bitsperlong.h
 create mode 100644 arch/um/lkl/include/asm/byteorder.h
 create mode 100644 arch/um/lkl/include/asm/cpu.h
 create mode 100644 arch/um/lkl/include/asm/elf.h
 create mode 100644 arch/um/lkl/include/asm/host_ops.h
 create mode 100644 arch/um/lkl/include/asm/io.h
 create mode 100644 arch/um/lkl/include/asm/irq.h
 create mode 100644 arch/um/lkl/include/asm/mutex.h
 create mode 100644 arch/um/lkl/include/asm/page.h
 create mode 100644 arch/um/lkl/include/asm/pgtable.h
 create mode 100644 arch/um/lkl/include/asm/processor.h
 create mode 100644 arch/um/lkl/include/asm/ptrace.h
 create mode 100644 arch/um/lkl/include/asm/sched.h
 create mode 100644 arch/um/lkl/include/asm/setup.h
 create mode 100644 arch/um/lkl/include/asm/syscalls.h
 create mode 100644 arch/um/lkl/include/asm/syscalls_32.h
 create mode 100644 arch/um/lkl/include/asm/thread_info.h
 create mode 100644 arch/um/lkl/include/asm/tlb.h
 create mode 100644 arch/um/lkl/include/asm/uaccess.h
 create mode 100644 arch/um/lkl/include/asm/unistd.h
 create mode 100644 arch/um/lkl/include/asm/unistd_32.h
 create mode 100644 arch/um/lkl/include/asm/vmlinux.lds.h
 create mode 100644 arch/um/lkl/include/asm/xor.h
 create mode 100644 arch/um/lkl/include/system/stdarg.h
 create mode 100644 arch/um/lkl/include/uapi/asm/Kbuild
 create mode 100644 arch/um/lkl/include/uapi/asm/bitsperlong.h
 create mode 100644 arch/um/lkl/include/uapi/asm/byteorder.h
 create mode 100644 arch/um/lkl/include/uapi/asm/host_ops.h
 create mode 100644 arch/um/lkl/include/uapi/asm/irq.h
 create mode 100644 arch/um/lkl/include/uapi/asm/sigcontext.h
 create mode 100644 arch/um/lkl/include/uapi/asm/siginfo.h
 create mode 100644 arch/um/lkl/include/uapi/asm/swab.h
 create mode 100644 arch/um/lkl/include/uapi/asm/syscalls.h
 create mode 100644 arch/um/lkl/include/uapi/asm/unistd.h
 create mode 100644 arch/um/lkl/kernel/Makefile
 create mode 100644 arch/um/lkl/kernel/asm-offsets.c
 create mode 100644 arch/um/lkl/kernel/console.c
 create mode 100644 arch/um/lkl/kernel/cpu.c
 create mode 100644 arch/um/lkl/kernel/irq.c
 create mode 100644 arch/um/lkl/kernel/misc.c
 create mode 100644 arch/um/lkl/kernel/setup.c
 create mode 100644 arch/um/lkl/kernel/syscalls.c
 create mode 100644 arch/um/lkl/kernel/syscalls_32.c
 create mode 100644 arch/um/lkl/kernel/threads.c
 create mode 100644 arch/um/lkl/kernel/time.c
 create mode 100644 arch/um/lkl/kernel/vmlinux.lds.S
 create mode 100644 arch/um/lkl/mm/Makefile
 create mode 100644 arch/um/lkl/mm/bootmem.c
 create mode 100755 arch/um/lkl/scripts/headers_install.py
 create mode 100644 arch/um/lkl/um/Makefile
 create mode 100644 arch/um/lkl/um/include/sysdep/kernel-offsets.h
 create mode 100644 arch/um/os-Linux/lkl_dev.c
 create mode 100644 tools/lkl/.gitignore
 create mode 100644 tools/lkl/Build
 create mode 100644 tools/lkl/Makefile
 create mode 100644 tools/lkl/Makefile.autoconf
 create mode 100644 tools/lkl/Targets
 create mode 120000 tools/lkl/bin/arm-linux-androideabi-ld
 create mode 100755 tools/lkl/bin/lkl-hijack.sh
 create mode 100644 tools/lkl/cptofs.c
 create mode 100644 tools/lkl/fs2tar.c
 create mode 100644 tools/lkl/include/.gitignore
 create mode 100644 tools/lkl/include/lkl.h
 create mode 100644 tools/lkl/include/lkl_config.h
 create mode 100644 tools/lkl/include/lkl_host.h
 create mode 100644 tools/lkl/include/mingw32/sys/socket.h
 create mode 100644 tools/lkl/lib/.gitignore
 create mode 100644 tools/lkl/lib/Build
 create mode 100644 tools/lkl/lib/Makefile
 create mode 100644 tools/lkl/lib/config.c
 create mode 100644 tools/lkl/lib/dbg.c
 create mode 100644 tools/lkl/lib/dbg_handler.c
 create mode 100644 tools/lkl/lib/endian.h
 create mode 100644 tools/lkl/lib/fs.c
 create mode 100644 tools/lkl/lib/hijack/Build
 create mode 100644 tools/lkl/lib/hijack/hijack.c
 create mode 100644 tools/lkl/lib/hijack/init.c
 create mode 100644 tools/lkl/lib/hijack/init.h
 create mode 100644 tools/lkl/lib/hijack/xlate.c
 create mode 100644 tools/lkl/lib/hijack/xlate.h
 create mode 100644 tools/lkl/lib/iomem.c
 create mode 100644 tools/lkl/lib/iomem.h
 create mode 100644 tools/lkl/lib/jmp_buf.c
 create mode 100644 tools/lkl/lib/jmp_buf.h
 create mode 100644 tools/lkl/lib/net.c
 create mode 100644 tools/lkl/lib/nt-host.c
 create mode 100644 tools/lkl/lib/posix-host.c
 create mode 100644 tools/lkl/lib/utils.c
 create mode 100644 tools/lkl/lib/virtio.c
 create mode 100644 tools/lkl/lib/virtio.h
 create mode 100644 tools/lkl/lib/virtio_blk.c
 create mode 100644 tools/lkl/lib/virtio_net.c
 create mode 100644 tools/lkl/lib/virtio_net_dpdk.c
 create mode 100644 tools/lkl/lib/virtio_net_fd.c
 create mode 100644 tools/lkl/lib/virtio_net_fd.h
 create mode 100644 tools/lkl/lib/virtio_net_macvtap.c
 create mode 100644 tools/lkl/lib/virtio_net_pipe.c
 create mode 100644 tools/lkl/lib/virtio_net_raw.c
 create mode 100644 tools/lkl/lib/virtio_net_tap.c
 create mode 100644 tools/lkl/lib/virtio_net_vde.c
 create mode 100644 tools/lkl/lklfuse.c
 create mode 100755 tools/lkl/scripts/checkpatch.sh
 create mode 100755 tools/lkl/scripts/dpdk-sdk-build.sh
 create mode 100755 tools/lkl/scripts/lkl-jenkins.sh
 create mode 100644 tools/lkl/tests/Build
 create mode 100644 tools/lkl/tests/boot.c
 create mode 100755 tools/lkl/tests/boot.sh
 create mode 100644 tools/lkl/tests/cla.c
 create mode 100644 tools/lkl/tests/cla.h
 create mode 100644 tools/lkl/tests/disk.c
 create mode 100755 tools/lkl/tests/disk.sh
 create mode 100755 tools/lkl/tests/hijack-test.sh
 create mode 100755 tools/lkl/tests/lklfuse.sh
 create mode 100644 tools/lkl/tests/net-setup.sh
 create mode 100644 tools/lkl/tests/net-test.c
 create mode 100755 tools/lkl/tests/net.sh
 create mode 100755 tools/lkl/tests/run.py
 create mode 100755 tools/lkl/tests/run_netperf.sh
 create mode 100644 tools/lkl/tests/tap13.py
 create mode 100644 tools/lkl/tests/test.c
 create mode 100644 tools/lkl/tests/test.h
 create mode 100644 tools/lkl/tests/test.sh
 create mode 100644 tools/lkl/tests/valgrind.supp
 create mode 100755 tools/lkl/tests/valgrind2xunit.py

-- 
2.20.1 (Apple Git-117)


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um



* [RFC PATCH 01/47] asm-generic: atomic64: allow using generic atomic64 on 64bit platforms
@ 2019-10-23  4:37 ` Hajime Tazaki
From: Hajime Tazaki @ 2019-10-23  4:37 UTC
  To: linux-um; +Cc: Octavian Purdila, Akira Moroo

From: Octavian Purdila <tavi.purdila@gmail.com>

Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 include/asm-generic/atomic64.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/asm-generic/atomic64.h b/include/asm-generic/atomic64.h
index 370f01d4450f..9b15847baae5 100644
--- a/include/asm-generic/atomic64.h
+++ b/include/asm-generic/atomic64.h
@@ -9,9 +9,11 @@
 #define _ASM_GENERIC_ATOMIC64_H
 #include <linux/types.h>
 
+#ifndef CONFIG_64BIT
 typedef struct {
 	s64 counter;
 } atomic64_t;
+#endif
 
 #define ATOMIC64_INIT(i)	{ (i) }
 
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 02/47] kbuild: allow architectures to automatically define kconfig symbols
@ 2019-10-23  4:37 ` Hajime Tazaki
From: Hajime Tazaki @ 2019-10-23  4:37 UTC
  To: linux-um; +Cc: Octavian Purdila, Akira Moroo

From: Octavian Purdila <tavi.purdila@gmail.com>

This patch calls an architecture hook during the kernel config process
that allows the architecture to automatically define kconfig symbols.
This can be done by exporting environment variables from the
new architecture hook.
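
A minimal sketch of the idea, not taken from this patch: the included
fragment only needs to contain `export NAME=value` lines.  The helper names
and the OUTPUT_FORMAT variable below are assumptions made for illustration.
Since `export NAME=value` is valid in both GNU make and POSIX sh, the effect
can be tried from a plain shell:

```sh
#!/bin/sh
# Hypothetical stand-in for an architecture hook: print one line of the
# kind an arch/$(SRCARCH)/auto.conf fragment could contain.
# OUTPUT_FORMAT is an assumed variable name, not confirmed by the patch.
gen_arch_auto_conf() {
    printf 'export OUTPUT_FORMAT=%s\n' "$1"
}

# Evaluate the generated line in a subshell, then read the variable from a
# child process, the way a tool started by make would see it.
read_in_child() {
    (
        eval "$(gen_arch_auto_conf "$1")"
        sh -c 'printf "%s\n" "$OUTPUT_FORMAT"'
    )
}

read_in_child elf64-x86-64
```

When such a line sits in a file included by the top-level Makefile, GNU make
puts the variable into the environment of the commands it runs, which is
where the config tools could pick it up.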

Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Makefile b/Makefile
index 0cbe8717bdb3..8c1f7422d0bc 100644
--- a/Makefile
+++ b/Makefile
@@ -605,6 +605,7 @@ endif
 export KBUILD_MODULES KBUILD_BUILTIN
 
 ifeq ($(dot-config),1)
+include arch/$(SRCARCH)/auto.conf
 include include/config/auto.conf
 endif
 
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 03/47] lkl: architecture skeleton for Linux kernel library
@ 2019-10-23  4:37 ` Hajime Tazaki
From: Hajime Tazaki @ 2019-10-23  4:37 UTC
  To: linux-um
  Cc: H . K . Jerry Chu, Levente Kurusa, Matthieu Coudron,
	Conrad Meyer, Octavian Purdila, Yuan Liu, Jens Staal,
	Motomu Utsumi, Lai Jiangshan, Akira Moroo, Petros Angelatos,
	Andreas Abel, Xiao Jia, Mark Stillwell, Hajime Tazaki,
	Patrick Collins, Pierre-Hugues Husson, Michael Zimmermann,
	Luca Dariz, Edison M . Castro

From: Octavian Purdila <tavi.purdila@gmail.com>

Adds the LKL Kconfig, vmlinux linker script, basic architecture
headers and miscellaneous basic functions or stubs such as
dump_stack(), show_regs() and cpuinfo proc ops.

The headers we introduce in this patch are simple wrappers to the
asm-generic headers or stubs for things we don't support, such as
ptrace, DMA, signals, ELF handling and low level processor operations.

The kernel configuration is automatically updated to reflect the
endianness of the host, 64-bit support and the output format for
vmlinux's linker script. We do this by inspecting ld's default
output format.
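
As a rough illustration of that detection (a sketch under assumptions, not
the logic of this patch), a BFD output format name can be classified with
simple pattern matching.  The function name and the list of patterns are
hypothetical and deliberately incomplete:

```sh
#!/bin/sh
# Hypothetical helper: classify a BFD output format name such as the one
# printed by GNU ld's --print-output-format option.
lkl_classify_format() {
    fmt=$1
    bits=32
    endian=little
    # 64-bit ELF targets are named elf64-*; pe-x86-64 is the 64-bit PE target.
    case "$fmt" in
        elf64-*|pe-x86-64) bits=64 ;;
    esac
    # Big-endian targets: names containing "big" (e.g. elf32-bigarm) and the
    # classic PowerPC names; this list is illustrative only.
    case "$fmt" in
        *big*|elf32-powerpc|elf64-powerpc) endian=big ;;
    esac
    echo "format=$fmt bits=$bits endian=$endian"
}

# Usage on a host with GNU ld (kept as a comment so the sketch does not
# depend on ld being installed):
#   lkl_classify_format "$(ld --print-output-format)"
```

A build script could turn the reported bits and endianness into the
corresponding kconfig symbols; which symbols those are is left open here.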

Signed-off-by: Andreas Abel <aabel@google.com>
Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Edison M. Castro <edisonmcastro@hotmail.com>
Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Jens Staal <staal1978@gmail.com>
Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Levente Kurusa <levex@linux.com>
Signed-off-by: Luca Dariz <luca.dariz@gmail.com>
Signed-off-by: Mark Stillwell <mark@stillwell.me>
Signed-off-by: Matthieu Coudron <mattator@gmail.com>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Pierre-Hugues Husson <phh@phh.me>
Signed-off-by: Xiao Jia <xiaoj@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 MAINTAINERS                                |   8 +
 arch/um/lkl/.gitignore                     |   2 +
 arch/um/lkl/Kconfig                        |  96 ++++++
 arch/um/lkl/Kconfig.debug                  |   0
 arch/um/lkl/Makefile                       |   0
 arch/um/lkl/Makefile.um                    |  70 +++++
 arch/um/lkl/configs/lkl_defconfig          |  95 ++++++
 arch/um/lkl/include/asm/Kbuild             |  80 +++++
 arch/um/lkl/include/asm/bitsperlong.h      |  11 +
 arch/um/lkl/include/asm/byteorder.h        |   7 +
 arch/um/lkl/include/asm/cpu.h              |  14 +
 arch/um/lkl/include/asm/elf.h              |  15 +
 arch/um/lkl/include/asm/mutex.h            |   7 +
 arch/um/lkl/include/asm/processor.h        |  60 ++++
 arch/um/lkl/include/asm/ptrace.h           |  25 ++
 arch/um/lkl/include/asm/sched.h            |  23 ++
 arch/um/lkl/include/asm/syscalls.h         |  18 ++
 arch/um/lkl/include/asm/syscalls_32.h      |  43 +++
 arch/um/lkl/include/asm/tlb.h              |  12 +
 arch/um/lkl/include/asm/uaccess.h          |  64 ++++
 arch/um/lkl/include/asm/unistd_32.h        |  31 ++
 arch/um/lkl/include/asm/vmlinux.lds.h      |  14 +
 arch/um/lkl/include/asm/xor.h              |   9 +
 arch/um/lkl/include/uapi/asm/Kbuild        |   9 +
 arch/um/lkl/include/uapi/asm/bitsperlong.h |  13 +
 arch/um/lkl/include/uapi/asm/byteorder.h   |  11 +
 arch/um/lkl/include/uapi/asm/siginfo.h     |  11 +
 arch/um/lkl/include/uapi/asm/swab.h        |  11 +
 arch/um/lkl/include/uapi/asm/syscalls.h    | 348 +++++++++++++++++++++
 arch/um/lkl/kernel/asm-offsets.c           |   2 +
 arch/um/lkl/kernel/misc.c                  |  60 ++++
 arch/um/lkl/kernel/vmlinux.lds.S           |  51 +++
 32 files changed, 1220 insertions(+)
 create mode 100644 arch/um/lkl/.gitignore
 create mode 100644 arch/um/lkl/Kconfig
 create mode 100644 arch/um/lkl/Kconfig.debug
 create mode 100644 arch/um/lkl/Makefile
 create mode 100644 arch/um/lkl/Makefile.um
 create mode 100644 arch/um/lkl/configs/lkl_defconfig
 create mode 100644 arch/um/lkl/include/asm/Kbuild
 create mode 100644 arch/um/lkl/include/asm/bitsperlong.h
 create mode 100644 arch/um/lkl/include/asm/byteorder.h
 create mode 100644 arch/um/lkl/include/asm/cpu.h
 create mode 100644 arch/um/lkl/include/asm/elf.h
 create mode 100644 arch/um/lkl/include/asm/mutex.h
 create mode 100644 arch/um/lkl/include/asm/processor.h
 create mode 100644 arch/um/lkl/include/asm/ptrace.h
 create mode 100644 arch/um/lkl/include/asm/sched.h
 create mode 100644 arch/um/lkl/include/asm/syscalls.h
 create mode 100644 arch/um/lkl/include/asm/syscalls_32.h
 create mode 100644 arch/um/lkl/include/asm/tlb.h
 create mode 100644 arch/um/lkl/include/asm/uaccess.h
 create mode 100644 arch/um/lkl/include/asm/unistd_32.h
 create mode 100644 arch/um/lkl/include/asm/vmlinux.lds.h
 create mode 100644 arch/um/lkl/include/asm/xor.h
 create mode 100644 arch/um/lkl/include/uapi/asm/Kbuild
 create mode 100644 arch/um/lkl/include/uapi/asm/bitsperlong.h
 create mode 100644 arch/um/lkl/include/uapi/asm/byteorder.h
 create mode 100644 arch/um/lkl/include/uapi/asm/siginfo.h
 create mode 100644 arch/um/lkl/include/uapi/asm/swab.h
 create mode 100644 arch/um/lkl/include/uapi/asm/syscalls.h
 create mode 100644 arch/um/lkl/kernel/asm-offsets.c
 create mode 100644 arch/um/lkl/kernel/misc.c
 create mode 100644 arch/um/lkl/kernel/vmlinux.lds.S

diff --git a/MAINTAINERS b/MAINTAINERS
index e7a47b5210fd..6832972ad54b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9369,6 +9369,14 @@ F:	Documentation/core-api/atomic_ops.rst
 F:	Documentation/core-api/refcount-vs-atomic.rst
 F:	Documentation/memory-barriers.txt
 
+LINUX KERNEL LIBRARY
+M:	Octavian Purdila <octavian.purdila@intel.com>
+M:	Hajime Tazaki <thehajime@gmail.com>
+L:	linux-kernel-library@freelists.org
+S:	Maintained
+F:	arch/um/lkl/
+F:	tools/lkl/
+
 LIS3LV02D ACCELEROMETER DRIVER
 M:	Eric Piel <eric.piel@tremplin-utc.net>
 S:	Maintained
diff --git a/arch/um/lkl/.gitignore b/arch/um/lkl/.gitignore
new file mode 100644
index 000000000000..ced1c60d8235
--- /dev/null
+++ b/arch/um/lkl/.gitignore
@@ -0,0 +1,2 @@
+kernel/vmlinux.lds
+include/generated
diff --git a/arch/um/lkl/Kconfig b/arch/um/lkl/Kconfig
new file mode 100644
index 000000000000..1e68e474a21b
--- /dev/null
+++ b/arch/um/lkl/Kconfig
@@ -0,0 +1,96 @@
+# SPDX-License-Identifier: GPL-2.0
+
+config UML_LKL
+       def_bool y
+       depends on !SMP && !MMU && !COREDUMP && !SECCOMP && !UPROBES && !COMPAT && !USER_RETURN_NOTIFIER
+       select ARCH_THREAD_STACK_ALLOCATOR
+       select RWSEM_GENERIC_SPINLOCK
+       select GENERIC_ATOMIC64
+       select GENERIC_HWEIGHT
+       select FLATMEM
+       select FLAT_NODE_MEM_MAP
+       select GENERIC_CLOCKEVENTS
+       select GENERIC_CPU_DEVICES
+       select NO_HZ_IDLE
+       select NO_PREEMPT
+       select ARCH_WANT_FRAME_POINTERS
+       select HAS_DMA
+       select DMA_DIRECT_OPS
+       select PHYS_ADDR_T_64BIT if 64BIT
+       select 64BIT if "$(OUTPUT_FORMAT)" = "elf64-x86-64"
+       select 64BIT if "$(OUTPUT_FORMAT)" = "pe-x86-64"
+       select HAVE_UNDERSCORE_SYMBOL_PREFIX if "$(OUTPUT_FORMAT)" = "pe-i386"
+       select 64BIT if "$(OUTPUT_FORMAT)" = "elf64-x86-64-freebsd"
+       select 64BIT if "$(OUTPUT_FORMAT)" = "elf64-littleaarch64"
+       select NET
+       select MULTIUSER
+       select INET
+       select IPV6
+       select IP_PNP
+       select IP_PNP_DHCP
+       select TCP_CONG_ADVANCED
+       select TCP_CONG_BBR
+       select HIGH_RES_TIMERS
+       select NET_SCHED
+       select NET_SCH_FQ
+       select IP_MULTICAST
+       select IPV6_MULTICAST
+       select IP_MULTIPLE_TABLES
+       select IPV6_MULTIPLE_TABLES
+       select IP_ROUTE_MULTIPATH
+       select IPV6_ROUTE_MULTIPATH
+       select IP_ADVANCED_ROUTER
+       select IPV6_ADVANCED_ROUTER
+       select ARCH_NO_COHERENT_DMA_MMAP
+       select HAVE_MEMBLOCK
+       select NO_BOOTMEM
+
+config OUTPUT_FORMAT
+       string "Output format"
+       default "$(OUTPUT_FORMAT)"
+
+config ARCH_DMA_ADDR_T_64BIT
+       def_bool 64BIT
+
+config 64BIT
+       def_bool n
+
+config COREDUMP
+       def_bool n
+
+config BIG_ENDIAN
+       def_bool n
+
+config GENERIC_CSUM
+       def_bool y
+
+config GENERIC_HWEIGHT
+       def_bool y
+
+config NO_IOPORT_MAP
+       def_bool y
+
+config RWSEM_GENERIC_SPINLOCK
+	bool
+	default y
+
+config HAVE_UNDERSCORE_SYMBOL_PREFIX
+       bool
+       help
+         Some architectures generate an _ in front of C symbols; things like
+         module loading and assembly files need to know about this.
+
+config HZ
+        int
+        default 100
+
+config CONSOLE_LOGLEVEL_QUIET
+	int "quiet console loglevel (1-15)"
+	range 1 15
+	default "4"
+	help
+	  loglevel to use when "quiet" is passed on the kernel commandline.
+
+	  When "quiet" is passed on the kernel commandline this loglevel
+	  will be used as the loglevel. IOW passing "quiet" will be the
+	  equivalent of passing "loglevel=<CONSOLE_LOGLEVEL_QUIET>"
diff --git a/arch/um/lkl/Kconfig.debug b/arch/um/lkl/Kconfig.debug
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/arch/um/lkl/Makefile b/arch/um/lkl/Makefile
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/arch/um/lkl/Makefile.um b/arch/um/lkl/Makefile.um
new file mode 100644
index 000000000000..612705870e82
--- /dev/null
+++ b/arch/um/lkl/Makefile.um
@@ -0,0 +1,70 @@
+# SPDX-License-Identifier: GPL-2.0
+
+include $(HOST_DIR)/auto.conf
+
+SRCARCH := um/$(SUBARCH)
+ARCH_INCLUDE += -I$(srctree)/$(HOST_DIR)/um/include
+LINUXINCLUDE := $(subst $(ARCH_DIR),$(HOST_DIR),$(LINUXINCLUDE)) $(ARCH_INCLUDE)
+KBUILD_CFLAGS += -fno-builtin
+
+ifneq (,$(filter $(OUTPUT_FORMAT),elf64-x86-64 elf32-i386 elf64-x86-64-freebsd elf32-littlearm elf64-littleaarch64))
+KBUILD_CFLAGS += -fPIC
+else ifneq (,$(filter $(OUTPUT_FORMAT),pe-i386 pe-x86-64 ))
+ifneq ($(OUTPUT_FORMAT),pe-x86-64)
+prefix=_
+endif
+# workaround for #include_next<stdarg.h> errors
+LINUXINCLUDE := -isystem $(HOST_DIR)/include/system $(LINUXINCLUDE)
+# workaround for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52991
+KBUILD_CFLAGS += -mno-ms-bitfields
+else
+$(error Unrecognized platform: $(OUTPUT_FORMAT))
+endif
+
+ifeq ($(shell uname -s), Linux)
+NPROC=$(shell nproc)
+else # e.g., FreeBSD
+NPROC=$(shell sysctl -n hw.ncpu)
+endif
+
+LDFLAGS_vmlinux += -r
+LKL_ENTRY_POINTS := lkl_start_kernel lkl_sys_halt lkl_syscall lkl_trigger_irq \
+	lkl_get_free_irq lkl_put_irq lkl_is_running lkl_bug lkl_printf
+
+ifeq ($(OUTPUT_FORMAT),elf32-i386)
+LKL_ENTRY_POINTS += \
+	__x86.get_pc_thunk.bx __x86.get_pc_thunk.dx __x86.get_pc_thunk.ax \
+	__x86.get_pc_thunk.cx __x86.get_pc_thunk.si __x86.get_pc_thunk.di
+endif
+
+core-y += $(HOST_DIR)/kernel/
+core-y += $(HOST_DIR)/mm/
+
+all: lkl.o
+
+lkl.o: vmlinux
+	$(OBJCOPY) -R .eh_frame -R .syscall_defs $(foreach sym,$(LKL_ENTRY_POINTS),-G$(prefix)$(sym)) vmlinux lkl.o
+
+$(HOST_DIR)/include/generated/uapi/asm/syscall_defs.h: vmlinux
+	$(OBJCOPY) -j .syscall_defs -O binary --set-section-flags .syscall_defs=alloc $< $@
+	$(Q) export tmpfile=$(shell mktemp); \
+	sed 's/\x0//g' $@ > $$tmpfile; mv $$tmpfile $@ ; rm -f $$tmpfile
+
+install: lkl.o headers $(HOST_DIR)/include/generated/uapi/asm/syscall_defs.h
+	@echo "  INSTALL	$(INSTALL_PATH)/lib/lkl.o"
+	@mkdir -p $(INSTALL_PATH)/lib/
+	@cp lkl.o $(INSTALL_PATH)/lib/
+	@$(srctree)/$(HOST_DIR)/scripts/headers_install.py \
+		$(subst -j,-j$(NPROC),$(findstring -j,$(MAKEFLAGS))) \
+		$(INSTALL_PATH)/include
+
+archheaders:
+	$(Q)$(MAKE) -f $(srctree)/Makefile ARCH=$(SRCARCH) asm-generic archheaders
+
+archclean:
+	$(Q)rm -rf $(srctree)/$(HOST_DIR)/include/generated
+	$(Q)$(MAKE) $(clean)=$(boot)
+
+define archhelp
+  echo '  install	- Install library and headers to INSTALL_PATH/{lib,include}'
+endef
diff --git a/arch/um/lkl/configs/lkl_defconfig b/arch/um/lkl/configs/lkl_defconfig
new file mode 100644
index 000000000000..f91380beee7c
--- /dev/null
+++ b/arch/um/lkl/configs/lkl_defconfig
@@ -0,0 +1,95 @@
+# CONFIG_LOCALVERSION_AUTO is not set
+CONFIG_NO_HZ_IDLE=y
+# CONFIG_SYSFS_SYSCALL is not set
+CONFIG_KALLSYMS_USE_DATA_SECTION=y
+CONFIG_KALLSYMS_ALL=y
+# CONFIG_BASE_FULL is not set
+# CONFIG_FUTEX is not set
+# CONFIG_SIGNALFD is not set
+# CONFIG_TIMERFD is not set
+# CONFIG_AIO is not set
+# CONFIG_ADVISE_SYSCALLS is not set
+CONFIG_EMBEDDED=y
+# CONFIG_VM_EVENT_COUNTERS is not set
+# CONFIG_COMPAT_BRK is not set
+# CONFIG_BLK_DEV_BSG is not set
+CONFIG_NET=y
+CONFIG_INET=y
+# CONFIG_WIRELESS is not set
+# CONFIG_UEVENT_HELPER is not set
+# CONFIG_FW_LOADER is not set
+CONFIG_VIRTIO_BLK=y
+CONFIG_NETDEVICES=y
+CONFIG_VIRTIO_NET=y
+# CONFIG_ETHERNET is not set
+# CONFIG_WLAN is not set
+# CONFIG_VT is not set
+CONFIG_VIRTIO_MMIO=y
+CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES=y
+CONFIG_EXT4_FS=y
+CONFIG_EXT4_FS_POSIX_ACL=y
+CONFIG_EXT4_FS_SECURITY=y
+CONFIG_XFS_FS=y
+CONFIG_XFS_POSIX_ACL=y
+CONFIG_BTRFS_FS=y
+CONFIG_BTRFS_FS_POSIX_ACL=y
+# CONFIG_FILE_LOCKING is not set
+# CONFIG_DNOTIFY is not set
+# CONFIG_INOTIFY_USER is not set
+CONFIG_VFAT_FS=y
+CONFIG_NLS_CODEPAGE_437=y
+CONFIG_NLS_CODEPAGE_737=y
+CONFIG_NLS_CODEPAGE_775=y
+CONFIG_NLS_CODEPAGE_850=y
+CONFIG_NLS_CODEPAGE_852=y
+CONFIG_NLS_CODEPAGE_855=y
+CONFIG_NLS_CODEPAGE_857=y
+CONFIG_NLS_CODEPAGE_860=y
+CONFIG_NLS_CODEPAGE_861=y
+CONFIG_NLS_CODEPAGE_862=y
+CONFIG_NLS_CODEPAGE_863=y
+CONFIG_NLS_CODEPAGE_864=y
+CONFIG_NLS_CODEPAGE_865=y
+CONFIG_NLS_CODEPAGE_866=y
+CONFIG_NLS_CODEPAGE_869=y
+CONFIG_NLS_CODEPAGE_936=y
+CONFIG_NLS_CODEPAGE_950=y
+CONFIG_NLS_CODEPAGE_932=y
+CONFIG_NLS_CODEPAGE_949=y
+CONFIG_NLS_CODEPAGE_874=y
+CONFIG_NLS_ISO8859_8=y
+CONFIG_NLS_CODEPAGE_1250=y
+CONFIG_NLS_CODEPAGE_1251=y
+CONFIG_NLS_ASCII=y
+CONFIG_NLS_ISO8859_1=y
+CONFIG_NLS_ISO8859_2=y
+CONFIG_NLS_ISO8859_3=y
+CONFIG_NLS_ISO8859_4=y
+CONFIG_NLS_ISO8859_5=y
+CONFIG_NLS_ISO8859_6=y
+CONFIG_NLS_ISO8859_7=y
+CONFIG_NLS_ISO8859_9=y
+CONFIG_NLS_ISO8859_13=y
+CONFIG_NLS_ISO8859_14=y
+CONFIG_NLS_ISO8859_15=y
+CONFIG_NLS_KOI8_R=y
+CONFIG_NLS_KOI8_U=y
+CONFIG_NLS_MAC_ROMAN=y
+CONFIG_NLS_MAC_CELTIC=y
+CONFIG_NLS_MAC_CENTEURO=y
+CONFIG_NLS_MAC_CROATIAN=y
+CONFIG_NLS_MAC_CYRILLIC=y
+CONFIG_NLS_MAC_GAELIC=y
+CONFIG_NLS_MAC_GREEK=y
+CONFIG_NLS_MAC_ICELAND=y
+CONFIG_NLS_MAC_INUIT=y
+CONFIG_NLS_MAC_ROMANIAN=y
+CONFIG_NLS_MAC_TURKISH=y
+CONFIG_NLS_UTF8=y
+CONFIG_HZ_100=y
+CONFIG_CRYPTO_ANSI_CPRNG=y
+CONFIG_PRINTK_TIME=y
+CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_REDUCED=y
+# CONFIG_ENABLE_WARN_DEPRECATED is not set
+# CONFIG_ENABLE_MUST_CHECK is not set
diff --git a/arch/um/lkl/include/asm/Kbuild b/arch/um/lkl/include/asm/Kbuild
new file mode 100644
index 000000000000..f6308985c61c
--- /dev/null
+++ b/arch/um/lkl/include/asm/Kbuild
@@ -0,0 +1,80 @@
+generic-y += atomic.h
+generic-y += barrier.h
+generic-y += bitops.h
+generic-y += bug.h
+generic-y += bugs.h
+generic-y += cache.h
+generic-y += cacheflush.h
+generic-y += checksum.h
+generic-y += cmpxchg-local.h
+generic-y += cmpxchg.h
+generic-y += compat.h
+generic-y += cputime.h
+generic-y += current.h
+generic-y += delay.h
+generic-y += device.h
+generic-y += div64.h
+generic-y += dma.h
+generic-y += dma-mapping.h
+generic-y += emergency-restart.h
+generic-y += errno.h
+generic-y += extable.h
+generic-y += exec.h
+generic-y += ftrace.h
+generic-y += futex.h
+generic-y += hardirq.h
+generic-y += hw_irq.h
+generic-y += ioctl.h
+generic-y += ipcbuf.h
+generic-y += irq_regs.h
+generic-y += irqflags.h
+generic-y += irq_work.h
+generic-y += kdebug.h
+generic-y += kmap_types.h
+generic-y += linkage.h
+generic-y += local.h
+generic-y += local64.h
+generic-y += mcs_spinlock.h
+generic-y += mmiowb.h
+generic-y += mmu.h
+generic-y += mmu_context.h
+generic-y += module.h
+generic-y += msgbuf.h
+generic-y += param.h
+generic-y += parport.h
+generic-y += pci.h
+generic-y += percpu.h
+generic-y += pgalloc.h
+generic-y += poll.h
+generic-y += preempt.h
+generic-y += resource.h
+generic-y += rwsem.h
+generic-y += scatterlist.h
+generic-y += seccomp.h
+generic-y += sections.h
+generic-y += segment.h
+generic-y += sembuf.h
+generic-y += serial.h
+generic-y += shmbuf.h
+generic-y += signal.h
+generic-y += simd.h
+generic-y += sizes.h
+generic-y += socket.h
+generic-y += sockios.h
+generic-y += stat.h
+generic-y += statfs.h
+generic-y += string.h
+generic-y += swab.h
+generic-y += switch_to.h
+generic-y += syscall.h
+generic-y += termbits.h
+generic-y += termios.h
+generic-y += time.h
+generic-y += timex.h
+generic-y += tlbflush.h
+generic-y += topology.h
+generic-y += trace_clock.h
+generic-y += unaligned.h
+generic-y += user.h
+generic-y += word-at-a-time.h
+generic-y += kprobes.h
diff --git a/arch/um/lkl/include/asm/bitsperlong.h b/arch/um/lkl/include/asm/bitsperlong.h
new file mode 100644
index 000000000000..5745d5e51274
--- /dev/null
+++ b/arch/um/lkl/include/asm/bitsperlong.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __LKL_BITSPERLONG_H
+#define __LKL_BITSPERLONG_H
+
+#include <uapi/asm/bitsperlong.h>
+
+#define BITS_PER_LONG __BITS_PER_LONG
+
+#define BITS_PER_LONG_LONG 64
+
+#endif
diff --git a/arch/um/lkl/include/asm/byteorder.h b/arch/um/lkl/include/asm/byteorder.h
new file mode 100644
index 000000000000..5d0c4efaa44b
--- /dev/null
+++ b/arch/um/lkl/include/asm/byteorder.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_BYTEORDER_H
+#define _ASM_LKL_BYTEORDER_H
+
+#include <uapi/asm/byteorder.h>
+
+#endif /* _ASM_LKL_BYTEORDER_H */
diff --git a/arch/um/lkl/include/asm/cpu.h b/arch/um/lkl/include/asm/cpu.h
new file mode 100644
index 000000000000..d2b8c501c7b1
--- /dev/null
+++ b/arch/um/lkl/include/asm/cpu.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_CPU_H
+#define _ASM_LKL_CPU_H
+
+int lkl_cpu_get(void);
+void lkl_cpu_put(void);
+int lkl_cpu_try_run_irq(int irq);
+int lkl_cpu_init(void);
+void lkl_cpu_shutdown(void);
+void lkl_cpu_wait_shutdown(void);
+void lkl_cpu_change_owner(lkl_thread_t owner);
+void lkl_cpu_set_irqs_pending(void);
+
+#endif /* _ASM_LKL_CPU_H */
diff --git a/arch/um/lkl/include/asm/elf.h b/arch/um/lkl/include/asm/elf.h
new file mode 100644
index 000000000000..bb2456d638f4
--- /dev/null
+++ b/arch/um/lkl/include/asm/elf.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_ELF_H
+#define _ASM_LKL_ELF_H
+
+#define elf_check_arch(x) 0
+
+#ifdef CONFIG_64BIT
+#define ELF_CLASS ELFCLASS64
+#else
+#define ELF_CLASS ELFCLASS32
+#endif
+
+#define elf_gregset_t long
+#define elf_fpregset_t double
+#endif
diff --git a/arch/um/lkl/include/asm/mutex.h b/arch/um/lkl/include/asm/mutex.h
new file mode 100644
index 000000000000..492d04183f9c
--- /dev/null
+++ b/arch/um/lkl/include/asm/mutex.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_MUTEX_H
+#define _ASM_LKL_MUTEX_H
+
+#include <asm-generic/mutex-dec.h>
+
+#endif
diff --git a/arch/um/lkl/include/asm/processor.h b/arch/um/lkl/include/asm/processor.h
new file mode 100644
index 000000000000..c1aa8b3a266e
--- /dev/null
+++ b/arch/um/lkl/include/asm/processor.h
@@ -0,0 +1,60 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_PROCESSOR_H
+#define _ASM_LKL_PROCESSOR_H
+
+struct task_struct;
+
+static inline void cpu_relax(void)
+{
+	unsigned long flags;
+
+	/* Since this is usually called in a tight loop waiting for some
+	 * external condition (e.g. jiffies), let's run interrupts now to
+	 * allow the external condition to propagate.
+	 */
+	local_irq_save(flags);
+	local_irq_restore(flags);
+}
+
+#define current_text_addr() ({ __label__ _l; _l: &&_l; })
+
+static inline unsigned long thread_saved_pc(struct task_struct *tsk)
+{
+	return 0;
+}
+
+static inline void release_thread(struct task_struct *dead_task)
+{
+}
+
+static inline void prepare_to_copy(struct task_struct *tsk)
+{
+}
+
+static inline unsigned long get_wchan(struct task_struct *p)
+{
+	return 0;
+}
+
+static inline void flush_thread(void)
+{
+}
+
+static inline void trap_init(void)
+{
+}
+
+struct thread_struct { };
+
+#define INIT_THREAD { }
+
+#define task_pt_regs(tsk) ((struct pt_regs *)(NULL))
+
+/* We don't have strict user/kernel spaces */
+#define TASK_SIZE ((unsigned long)-1)
+#define TASK_UNMAPPED_BASE 0
+
+#define KSTK_EIP(tsk) (0)
+#define KSTK_ESP(tsk) (0)
+
+#endif
diff --git a/arch/um/lkl/include/asm/ptrace.h b/arch/um/lkl/include/asm/ptrace.h
new file mode 100644
index 000000000000..28199be26dc0
--- /dev/null
+++ b/arch/um/lkl/include/asm/ptrace.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_PTRACE_H
+#define _ASM_LKL_PTRACE_H
+
+#include <linux/errno.h>
+
+struct task_struct;
+
+#define user_mode(regs) 0
+#define kernel_mode(regs) 1
+#define profile_pc(regs) 0
+#define instruction_pointer(regs) 0
+#define user_stack_pointer(regs) 0
+
+static inline long arch_ptrace(struct task_struct *child, long request,
+			       unsigned long addr, unsigned long data)
+{
+	return -EINVAL;
+}
+
+static inline void ptrace_disable(struct task_struct *child)
+{
+}
+
+#endif
diff --git a/arch/um/lkl/include/asm/sched.h b/arch/um/lkl/include/asm/sched.h
new file mode 100644
index 000000000000..4c2635921ec8
--- /dev/null
+++ b/arch/um/lkl/include/asm/sched.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_SCHED_H
+#define _ASM_LKL_SCHED_H
+
+#include <linux/sched.h>
+#include <uapi/asm/host_ops.h>
+
+static inline void thread_sched_jb(void)
+{
+	if (test_ti_thread_flag(current_thread_info(), TIF_HOST_THREAD)) {
+		set_ti_thread_flag(current_thread_info(), TIF_SCHED_JB);
+		set_current_state(TASK_UNINTERRUPTIBLE);
+		lkl_ops->jmp_buf_set(&current_thread_info()->sched_jb,
+				     schedule);
+	} else {
+		lkl_bug("%s() can be used only for host task\n", __func__);
+	}
+}
+
+void switch_to_host_task(struct task_struct *);
+int host_task_stub(void *unused);
+
+#endif /* _ASM_LKL_SCHED_H */
diff --git a/arch/um/lkl/include/asm/syscalls.h b/arch/um/lkl/include/asm/syscalls.h
new file mode 100644
index 000000000000..2eaa870a9020
--- /dev/null
+++ b/arch/um/lkl/include/asm/syscalls.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_SYSCALLS_H
+#define _ASM_LKL_SYSCALLS_H
+
+int syscalls_init(void);
+void syscalls_cleanup(void);
+long lkl_syscall(long no, long *params);
+void wakeup_idle_host_task(void);
+
+#define sys_mmap sys_mmap_pgoff
+#define sys_mmap2 sys_mmap_pgoff
+#define sys_clone sys_ni_syscall
+#define sys_vfork sys_ni_syscall
+#define sys_rt_sigreturn sys_ni_syscall
+
+#include <asm-generic/syscalls.h>
+
+#endif /* _ASM_LKL_SYSCALLS_H */
diff --git a/arch/um/lkl/include/asm/syscalls_32.h b/arch/um/lkl/include/asm/syscalls_32.h
new file mode 100644
index 000000000000..0e1a7649c81b
--- /dev/null
+++ b/arch/um/lkl/include/asm/syscalls_32.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_SYSCALLS_32_H
+#define _ASM_SYSCALLS_32_H
+
+#include <linux/compiler.h>
+#include <linux/linkage.h>
+#include <linux/types.h>
+#include <linux/signal.h>
+
+#if __BITS_PER_LONG == 32
+
+/* kernel/syscalls_32.c */
+asmlinkage long sys32_truncate64(const char __user *, unsigned long,
+				 unsigned long);
+asmlinkage long sys32_ftruncate64(unsigned int, unsigned long, unsigned long);
+
+#ifdef CONFIG_MMU
+struct mmap_arg_struct32;
+asmlinkage long sys32_mmap(struct mmap_arg_struct32 __user *);
+#endif
+
+asmlinkage long sys32_wait4(pid_t, unsigned int __user *, int,
+			    struct rusage __user *);
+
+asmlinkage long sys32_pread64(unsigned int, char __user *, u32, u32, u32);
+asmlinkage long sys32_pwrite64(unsigned int, const char __user *, u32, u32,
+			       u32);
+
+long sys32_fadvise64_64(int a, __u32 b, __u32 c, __u32 d, __u32 e, int f);
+
+asmlinkage ssize_t sys32_readahead(int, unsigned int, unsigned int, size_t);
+asmlinkage long sys32_sync_file_range(int, unsigned int, unsigned int,
+				      unsigned int, unsigned int, unsigned int);
+asmlinkage long sys32_sync_file_range2(int, unsigned int, unsigned int,
+				       unsigned int, unsigned int,
+				       unsigned int);
+asmlinkage long sys32_fadvise64(int, unsigned int, unsigned int, size_t, int);
+asmlinkage long sys32_fallocate(int, int, unsigned int, unsigned int,
+				unsigned int, unsigned int);
+
+#endif /* __BITS_PER_LONG */
+
+#endif /* _ASM_SYSCALLS_32_H */
diff --git a/arch/um/lkl/include/asm/tlb.h b/arch/um/lkl/include/asm/tlb.h
new file mode 100644
index 000000000000..d474890d317d
--- /dev/null
+++ b/arch/um/lkl/include/asm/tlb.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_TLB_H
+#define _ASM_LKL_TLB_H
+
+#define tlb_start_vma(tlb, vma)				do { } while (0)
+#define tlb_end_vma(tlb, vma)				do { } while (0)
+#define __tlb_remove_tlb_entry(tlb, pte, address)	do { } while (0)
+#define tlb_flush(tlb)					do { } while (0)
+
+#include <asm-generic/tlb.h>
+
+#endif /* _ASM_LKL_TLB_H */
diff --git a/arch/um/lkl/include/asm/uaccess.h b/arch/um/lkl/include/asm/uaccess.h
new file mode 100644
index 000000000000..f267ac3be8b3
--- /dev/null
+++ b/arch/um/lkl/include/asm/uaccess.h
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_UACCESS_H
+#define _ASM_LKL_UACCESS_H
+
+/* copied from old include/asm-generic/uaccess.h */
+static inline __must_check long
+raw_copy_from_user(void *to, const void __user *from, unsigned long n)
+{
+	if (__builtin_constant_p(n)) {
+		switch (n) {
+		case 1:
+			*(u8 *)to = *(u8 __force *)from;
+			return 0;
+		case 2:
+			*(u16 *)to = *(u16 __force *)from;
+			return 0;
+		case 4:
+			*(u32 *)to = *(u32 __force *)from;
+			return 0;
+#ifdef CONFIG_64BIT
+		case 8:
+			*(u64 *)to = *(u64 __force *)from;
+			return 0;
+#endif
+		default:
+			break;
+		}
+	}
+
+	memcpy(to, (const void __force *)from, n);
+	return 0;
+}
+
+static inline __must_check long
+raw_copy_to_user(void __user *to, const void *from, unsigned long n)
+{
+	if (__builtin_constant_p(n)) {
+		switch (n) {
+		case 1:
+			*(u8 __force *)to = *(u8 *)from;
+			return 0;
+		case 2:
+			*(u16 __force *)to = *(u16 *)from;
+			return 0;
+		case 4:
+			*(u32 __force *)to = *(u32 *)from;
+			return 0;
+#ifdef CONFIG_64BIT
+		case 8:
+			*(u64 __force *)to = *(u64 *)from;
+			return 0;
+#endif
+		default:
+			break;
+		}
+	}
+
+	memcpy((void __force *)to, from, n);
+	return 0;
+}
+
+#include <asm-generic/uaccess.h>
+
+#endif
diff --git a/arch/um/lkl/include/asm/unistd_32.h b/arch/um/lkl/include/asm/unistd_32.h
new file mode 100644
index 000000000000..8582a55e61e2
--- /dev/null
+++ b/arch/um/lkl/include/asm/unistd_32.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#include <asm/bitsperlong.h>
+
+#ifndef __SYSCALL
+#define __SYSCALL(x, y)
+#endif
+
+#if __BITS_PER_LONG == 32
+__SYSCALL(__NR3264_truncate, sys32_truncate64)
+__SYSCALL(__NR3264_ftruncate, sys32_ftruncate64)
+
+#ifdef CONFIG_MMU
+__SYSCALL(__NR3264_mmap, sys32_mmap)
+#endif
+
+__SYSCALL(__NR_wait4, sys32_wait4)
+
+__SYSCALL(__NR_pread64, sys32_pread64)
+__SYSCALL(__NR_pwrite64, sys32_pwrite64)
+
+__SYSCALL(__NR_readahead, sys32_readahead)
+#ifdef __ARCH_WANT_SYNC_FILE_RANGE2
+__SYSCALL(__NR_sync_file_range2, sys32_sync_file_range2)
+#else
+__SYSCALL(__NR_sync_file_range, sys32_sync_file_range)
+#endif
+/* mm/fadvise.c */
+__SYSCALL(__NR3264_fadvise64, sys32_fadvise64_64)
+__SYSCALL(__NR_fallocate, sys32_fallocate)
+
+#endif
diff --git a/arch/um/lkl/include/asm/vmlinux.lds.h b/arch/um/lkl/include/asm/vmlinux.lds.h
new file mode 100644
index 000000000000..a3c285882dc4
--- /dev/null
+++ b/arch/um/lkl/include/asm/vmlinux.lds.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_VMLINUX_LDS_H
+#define _LKL_VMLINUX_LDS_H
+
+/* we encode our own __ro_after_init section */
+#define RO_AFTER_INIT_DATA
+
+#ifdef __MINGW32__
+#define RODATA_SECTION .rdata
+#endif
+
+#include <asm-generic/vmlinux.lds.h>
+
+#endif
diff --git a/arch/um/lkl/include/asm/xor.h b/arch/um/lkl/include/asm/xor.h
new file mode 100644
index 000000000000..286ce75b5d9d
--- /dev/null
+++ b/arch/um/lkl/include/asm/xor.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_XOR_H
+#define _ASM_LKL_XOR_H
+
+#include <asm-generic/xor.h>
+
+#define XOR_SELECT_TEMPLATE(x) (&xor_block_8regs)
+
+#endif /* _ASM_LKL_XOR_H */
diff --git a/arch/um/lkl/include/uapi/asm/Kbuild b/arch/um/lkl/include/uapi/asm/Kbuild
new file mode 100644
index 000000000000..39d9a1f2e8f5
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/Kbuild
@@ -0,0 +1,9 @@
+# UAPI Header export list
+
+generic-y += elf.h
+generic-y += kvm_para.h
+generic-y += shmparam.h
+generic-y += timex.h
+
+# no header-y since we need special user headers handling
+# see $(HOST_DIR)/scripts/headers_install.py (invoked from Makefile.um)
diff --git a/arch/um/lkl/include/uapi/asm/bitsperlong.h b/arch/um/lkl/include/uapi/asm/bitsperlong.h
new file mode 100644
index 000000000000..8b4ebf2b0264
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/bitsperlong.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_UAPI_LKL_BITSPERLONG_H
+#define _ASM_UAPI_LKL_BITSPERLONG_H
+
+#ifdef CONFIG_64BIT
+#define __BITS_PER_LONG 64
+#else
+#define __BITS_PER_LONG 32
+#endif
+
+#define __ARCH_WANT_STAT64
+
+#endif /* _ASM_UAPI_LKL_BITSPERLONG_H */
diff --git a/arch/um/lkl/include/uapi/asm/byteorder.h b/arch/um/lkl/include/uapi/asm/byteorder.h
new file mode 100644
index 000000000000..3c4a58d2062f
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/byteorder.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_UAPI_LKL_BYTEORDER_H
+#define _ASM_UAPI_LKL_BYTEORDER_H
+
+#if defined(CONFIG_BIG_ENDIAN)
+#include <linux/byteorder/big_endian.h>
+#else
+#include <linux/byteorder/little_endian.h>
+#endif
+
+#endif /* _ASM_UAPI_LKL_BYTEORDER_H */
diff --git a/arch/um/lkl/include/uapi/asm/siginfo.h b/arch/um/lkl/include/uapi/asm/siginfo.h
new file mode 100644
index 000000000000..811916cf42c8
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/siginfo.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_LKL_SIGINFO_H
+#define _ASM_LKL_SIGINFO_H
+
+#ifdef CONFIG_64BIT
+#define __ARCH_SI_PREAMBLE_SIZE	(4 * sizeof(int))
+#endif
+
+#include <asm-generic/siginfo.h>
+
+#endif /* _ASM_LKL_SIGINFO_H */
diff --git a/arch/um/lkl/include/uapi/asm/swab.h b/arch/um/lkl/include/uapi/asm/swab.h
new file mode 100644
index 000000000000..1a1773e1bd35
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/swab.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_LKL_SWAB_H
+#define _ASM_LKL_SWAB_H
+
+#ifndef __arch_swab32
+#define __arch_swab32(x) ___constant_swab32(x)
+#endif
+
+#include <asm-generic/swab.h>
+
+#endif /* _ASM_LKL_SWAB_H */
diff --git a/arch/um/lkl/include/uapi/asm/syscalls.h b/arch/um/lkl/include/uapi/asm/syscalls.h
new file mode 100644
index 000000000000..a81534ffccb7
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/syscalls.h
@@ -0,0 +1,348 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_UAPI_LKL_SYSCALLS_H
+#define _ASM_UAPI_LKL_SYSCALLS_H
+
+#include <autoconf.h>
+#include <linux/types.h>
+
+typedef __kernel_uid32_t	qid_t;
+typedef __kernel_fd_set		fd_set;
+typedef __kernel_mode_t		mode_t;
+typedef unsigned short		umode_t;
+typedef __u32			nlink_t;
+typedef __kernel_off_t		off_t;
+typedef __kernel_pid_t		pid_t;
+typedef __kernel_key_t		key_t;
+typedef __kernel_suseconds_t	suseconds_t;
+typedef __kernel_timer_t	timer_t;
+typedef __kernel_clockid_t	clockid_t;
+typedef __kernel_mqd_t		mqd_t;
+typedef __kernel_uid32_t	uid_t;
+typedef __kernel_gid32_t	gid_t;
+typedef __kernel_uid16_t        uid16_t;
+typedef __kernel_gid16_t        gid16_t;
+typedef unsigned long		uintptr_t;
+#ifdef CONFIG_UID16
+typedef __kernel_old_uid_t	old_uid_t;
+typedef __kernel_old_gid_t	old_gid_t;
+#endif
+typedef __kernel_loff_t		loff_t;
+typedef __kernel_size_t		size_t;
+typedef __kernel_ssize_t	ssize_t;
+typedef __kernel_time_t		time_t;
+typedef __kernel_clock_t	clock_t;
+typedef __u32			u32;
+typedef __s32			s32;
+typedef __u64			u64;
+typedef __s64			s64;
+
+#define __user
+
+#include <asm/unistd.h>
+/* Temporarily undefine system calls that don't have data types defined in
+ * UAPI headers
+ */
+#undef __NR_kexec_load
+#undef __NR_getcpu
+#undef __NR_sched_getattr
+#undef __NR_sched_setattr
+#undef __NR_sched_setparam
+#undef __NR_sched_getparam
+#undef __NR_sched_setscheduler
+#undef __NR_name_to_handle_at
+#undef __NR_open_by_handle_at
+
+/* deprecated system calls */
+#undef __NR_epoll_create
+#undef __NR_epoll_wait
+#undef __NR_access
+#undef __NR_chmod
+#undef __NR_chown
+#undef __NR_lchown
+#undef __NR_open
+#undef __NR_creat
+#undef __NR_readlink
+#undef __NR_pipe
+#undef __NR_mknod
+#undef __NR_mkdir
+#undef __NR_rmdir
+#undef __NR_unlink
+#undef __NR_symlink
+#undef __NR_link
+#undef __NR_rename
+#undef __NR_getdents
+#undef __NR_select
+#undef __NR_poll
+#undef __NR_dup2
+#undef __NR_futimesat
+#undef __NR_utimes
+#undef __NR_ustat
+#undef __NR_eventfd
+#undef __NR_bdflush
+#undef __NR_send
+#undef __NR_recv
+
+#undef __NR_umount
+#define __NR_umount __NR_umount2
+
+#ifdef CONFIG_64BIT
+#define __NR_newfstat __NR3264_fstat
+#define __NR_newfstatat __NR3264_fstatat
+#endif
+
+#define __NR_mmap_pgoff __NR3264_mmap
+
+#include <linux/time.h>
+#include <linux/times.h>
+#include <linux/timex.h>
+#include <linux/capability.h>
+#define __KERNEL__ /* to pull in S_ definitions */
+#include <linux/stat.h>
+#undef __KERNEL__
+#include <linux/errno.h>
+#include <linux/fcntl.h>
+#include <linux/fs.h>
+#include <asm/statfs.h>
+#include <asm/stat.h>
+#include <linux/bpf.h>
+#include <linux/msg.h>
+#include <linux/resource.h>
+#include <linux/sysinfo.h>
+#include <linux/shm.h>
+#include <linux/aio_abi.h>
+#include <linux/socket.h>
+#include <linux/perf_event.h>
+#include <linux/sem.h>
+#include <linux/futex.h>
+#include <linux/poll.h>
+#include <linux/mqueue.h>
+#include <linux/eventpoll.h>
+#include <linux/uio.h>
+#include <asm/signal.h>
+#include <asm/siginfo.h>
+#include <linux/utime.h>
+#include <asm/socket.h>
+#include <linux/icmp.h>
+#include <linux/ip.h>
+
+/* Define data structures used in system calls that are not defined in UAPI
+ * headers
+ */
+struct sockaddr {
+	unsigned short int sa_family;
+	char sa_data[14];
+};
+
+#define __UAPI_DEF_IF_NET_DEVICE_FLAGS_LOWER_UP_DORMANT_ECHO 1
+#define __UAPI_DEF_IF_IFNAMSIZ	1
+#define __UAPI_DEF_IF_NET_DEVICE_FLAGS 1
+#define __UAPI_DEF_IF_IFREQ	1
+#define __UAPI_DEF_IF_IFMAP	1
+#include <linux/if.h>
+#define __UAPI_DEF_IN_IPPROTO	1
+#define __UAPI_DEF_IN_ADDR	1
+#define __UAPI_DEF_IN6_ADDR	1
+#define __UAPI_DEF_IP_MREQ	1
+#define __UAPI_DEF_IN_PKTINFO	1
+#define __UAPI_DEF_SOCKADDR_IN	1
+#define __UAPI_DEF_IN_CLASS	1
+#include <linux/in.h>
+#include <linux/in6.h>
+#include <linux/sockios.h>
+#include <linux/route.h>
+#include <linux/ipv6_route.h>
+#include <linux/ipv6.h>
+#include <linux/netlink.h>
+#include <linux/neighbour.h>
+#include <linux/rtnetlink.h>
+#include <linux/fib_rules.h>
+
+#include <linux/kdev_t.h>
+#include <asm/irq.h>
+#include <linux/virtio_blk.h>
+#include <linux/virtio_net.h>
+#include <linux/virtio_ring.h>
+#include <linux/pkt_sched.h>
+#include <linux/io_uring.h>
+
+struct user_msghdr {
+	void		__user *msg_name;
+	int		msg_namelen;
+	struct iovec	__user *msg_iov;
+	__kernel_size_t	msg_iovlen;
+	void		__user *msg_control;
+	__kernel_size_t	msg_controllen;
+	unsigned int	msg_flags;
+};
+
+typedef __u32 key_serial_t;
+
+struct mmsghdr {
+	struct user_msghdr  msg_hdr;
+	unsigned int        msg_len;
+};
+
+struct linux_dirent64 {
+	u64		d_ino;
+	s64		d_off;
+	unsigned short	d_reclen;
+	unsigned char	d_type;
+	char		d_name[0];
+};
+
+struct linux_dirent {
+	unsigned long	d_ino;
+	unsigned long	d_off;
+	unsigned short	d_reclen;
+	char		d_name[1];
+};
+
+struct ustat {
+	__kernel_daddr_t	f_tfree;
+	__kernel_ino_t		f_tinode;
+	char			f_fname[6];
+	char			f_fpack[6];
+};
+
+typedef __kernel_rwf_t		rwf_t;
+
+#define AF_UNSPEC       0
+#define AF_UNIX         1
+#define AF_LOCAL        1
+#define AF_INET         2
+#define AF_AX25         3
+#define AF_IPX          4
+#define AF_APPLETALK    5
+#define AF_NETROM       6
+#define AF_BRIDGE       7
+#define AF_ATMPVC       8
+#define AF_X25          9
+#define AF_INET6        10
+#define AF_ROSE         11
+#define AF_DECnet       12
+#define AF_NETBEUI      13
+#define AF_SECURITY     14
+#define AF_KEY          15
+#define AF_NETLINK      16
+#define AF_ROUTE        AF_NETLINK
+#define AF_PACKET       17
+#define AF_ASH          18
+#define AF_ECONET       19
+#define AF_ATMSVC       20
+#define AF_RDS          21
+#define AF_SNA          22
+#define AF_IRDA         23
+#define AF_PPPOX        24
+#define AF_WANPIPE      25
+#define AF_LLC          26
+#define AF_IB           27
+#define AF_MPLS         28
+#define AF_CAN          29
+#define AF_TIPC         30
+#define AF_BLUETOOTH    31
+#define AF_IUCV         32
+#define AF_RXRPC        33
+#define AF_ISDN         34
+#define AF_PHONET       35
+#define AF_IEEE802154   36
+#define AF_CAIF         37
+#define AF_ALG          38
+#define AF_NFC          39
+#define AF_VSOCK        40
+
+#define SOCK_STREAM		1
+#define SOCK_DGRAM		2
+#define SOCK_RAW		3
+#define SOCK_RDM		4
+#define SOCK_SEQPACKET		5
+#define SOCK_DCCP		6
+#define SOCK_PACKET		10
+
+#define MSG_TRUNC 0x20
+#define MSG_DONTWAIT 0x40
+
+/* avoid collision with system header defines */
+#define sa_handler sa_handler
+#define st_atime st_atime
+#define st_mtime st_mtime
+#define st_ctime st_ctime
+#define s_addr s_addr
+
+long lkl_syscall(long no, long *params);
+long lkl_sys_halt(void);
+
+#define __MAP0(m, ...)
+#define __MAP1(m, t, a) m(t, a)
+#define __MAP2(m, t, a, ...) m(t, a), __MAP1(m, __VA_ARGS__)
+#define __MAP3(m, t, a, ...) m(t, a), __MAP2(m, __VA_ARGS__)
+#define __MAP4(m, t, a, ...) m(t, a), __MAP3(m, __VA_ARGS__)
+#define __MAP5(m, t, a, ...) m(t, a), __MAP4(m, __VA_ARGS__)
+#define __MAP6(m, t, a, ...) m(t, a), __MAP5(m, __VA_ARGS__)
+#define __MAP(n, ...) __MAP##n(__VA_ARGS__)
+
+#define __SC_LONG(t, a) (long)a
+#define __SC_TABLE(t, a) {sizeof(t), (long long)(a)}
+#define __SC_DECL(t, a) t a
+
+#define LKL_SYSCALL0(name)					       \
+	static inline long lkl_sys##name(void)			       \
+	{							       \
+		long params[6];					       \
+		return lkl_syscall(__lkl__NR##name, params);	       \
+	}
+
+#if __BITS_PER_LONG == 32
+#define LKL_SYSCALLx(x, name, ...)					\
+	static inline							\
+	long lkl_sys##name(__MAP(x, __SC_DECL, __VA_ARGS__))		\
+	{								\
+		struct {						\
+			unsigned int size;				\
+			long long value;				\
+		} lkl_params[x] = { __MAP(x, __SC_TABLE, __VA_ARGS__) }; \
+		long sys_params[6], i, k;				\
+		for (i = k = 0; i < x && k < 6; i++, k++) {		\
+			if (lkl_params[i].size > sizeof(long) &&	\
+			    k + 1 < 6) {				\
+				sys_params[k] =				\
+					(long)(lkl_params[i].value & (-1UL)); \
+				k++;					\
+				sys_params[k] =				\
+					(long)(lkl_params[i].value >>	\
+					       __BITS_PER_LONG);	\
+			} else {					\
+				sys_params[k] = (long)(lkl_params[i].value); \
+			}						\
+		}							\
+		return lkl_syscall(__lkl__NR##name, sys_params);	\
+	}
+#else
+#define LKL_SYSCALLx(x, name, ...)					\
+	static inline							\
+	long lkl_sys##name(__MAP(x, __SC_DECL, __VA_ARGS__))		\
+	{								\
+		long lkl_params[6] = { __MAP(x, __SC_LONG, __VA_ARGS__) }; \
+		return lkl_syscall(__lkl__NR##name, lkl_params);	\
+	}
+#endif
+
+#define SYSCALL_DEFINE0(name, ...) LKL_SYSCALL0(name)
+#define SYSCALL_DEFINE1(name, ...) LKL_SYSCALLx(1, name, __VA_ARGS__)
+#define SYSCALL_DEFINE2(name, ...) LKL_SYSCALLx(2, name, __VA_ARGS__)
+#define SYSCALL_DEFINE3(name, ...) LKL_SYSCALLx(3, name, __VA_ARGS__)
+#define SYSCALL_DEFINE4(name, ...) LKL_SYSCALLx(4, name, __VA_ARGS__)
+#define SYSCALL_DEFINE5(name, ...) LKL_SYSCALLx(5, name, __VA_ARGS__)
+#define SYSCALL_DEFINE6(name, ...) LKL_SYSCALLx(6, name, __VA_ARGS__)
+
+#if __BITS_PER_LONG == 32
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wpointer-to-int-cast"
+#endif
+
+#include <asm/syscall_defs.h>
+
+#if __BITS_PER_LONG == 32
+#pragma GCC diagnostic pop
+#endif
+
+#endif
diff --git a/arch/um/lkl/kernel/asm-offsets.c b/arch/um/lkl/kernel/asm-offsets.c
new file mode 100644
index 000000000000..6be0763698dc
--- /dev/null
+++ b/arch/um/lkl/kernel/asm-offsets.c
@@ -0,0 +1,2 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Dummy asm-offsets.c file. Required by kbuild and ready to be used - hint! */
diff --git a/arch/um/lkl/kernel/misc.c b/arch/um/lkl/kernel/misc.c
new file mode 100644
index 000000000000..60f048f02ae6
--- /dev/null
+++ b/arch/um/lkl/kernel/misc.c
@@ -0,0 +1,60 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/kallsyms.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/seq_file.h>
+#include <asm/ptrace.h>
+#include <asm/host_ops.h>
+
+#ifdef CONFIG_PRINTK
+void dump_stack(void)
+{
+	unsigned long dummy;
+	unsigned long *stack = &dummy;
+	unsigned long addr;
+
+	pr_info("Call Trace:\n");
+	while (((long)stack & (THREAD_SIZE - 1)) != 0) {
+		addr = *stack;
+		if (__kernel_text_address(addr)) {
+			pr_info("%p:  [<%08lx>] %pS", stack, addr,
+				(void *)addr);
+			pr_cont("\n");
+		}
+		stack++;
+	}
+	pr_info("\n");
+}
+#endif
+
+void show_regs(struct pt_regs *regs)
+{
+}
+
+#ifdef CONFIG_PROC_FS
+static void *cpuinfo_start(struct seq_file *m, loff_t *pos)
+{
+	return NULL;
+}
+
+static void *cpuinfo_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	return NULL;
+}
+
+static void cpuinfo_stop(struct seq_file *m, void *v)
+{
+}
+
+static int show_cpuinfo(struct seq_file *m, void *v)
+{
+	return 0;
+}
+
+const struct seq_operations cpuinfo_op = {
+	.start	= cpuinfo_start,
+	.next	= cpuinfo_next,
+	.stop	= cpuinfo_stop,
+	.show	= show_cpuinfo,
+};
+#endif
diff --git a/arch/um/lkl/kernel/vmlinux.lds.S b/arch/um/lkl/kernel/vmlinux.lds.S
new file mode 100644
index 000000000000..efe420f38110
--- /dev/null
+++ b/arch/um/lkl/kernel/vmlinux.lds.S
@@ -0,0 +1,51 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#include <asm/vmlinux.lds.h>
+#include <asm/thread_info.h>
+#include <asm/page.h>
+#include <asm/cache.h>
+#include <linux/export.h>
+
+OUTPUT_FORMAT(CONFIG_OUTPUT_FORMAT)
+
+VMLINUX_SYMBOL(jiffies) = VMLINUX_SYMBOL(jiffies_64);
+
+SECTIONS
+{
+	VMLINUX_SYMBOL(__init_begin) = .;
+	HEAD_TEXT_SECTION
+	INIT_TEXT_SECTION(PAGE_SIZE)
+	INIT_DATA_SECTION(16)
+	PERCPU_SECTION(L1_CACHE_BYTES)
+	VMLINUX_SYMBOL(__init_end) = .;
+
+	VMLINUX_SYMBOL(_stext) = .;
+	VMLINUX_SYMBOL(_text) = . ;
+	VMLINUX_SYMBOL(text) = . ;
+	.text      :
+	{
+		TEXT_TEXT
+		SCHED_TEXT
+		LOCK_TEXT
+		CPUIDLE_TEXT
+	}
+	VMLINUX_SYMBOL(_etext) = .;
+
+	VMLINUX_SYMBOL(_sdata) = .;
+	RO_DATA_SECTION(PAGE_SIZE)
+	RW_DATA_SECTION(L1_CACHE_BYTES, PAGE_SIZE, THREAD_SIZE)
+	VMLINUX_SYMBOL(_edata) = .;
+
+	VMLINUX_SYMBOL(__start_ro_after_init) = .;
+	.data..ro_after_init : { *(.data..ro_after_init)}
+	EXCEPTION_TABLE(16)
+	VMLINUX_SYMBOL(__end_ro_after_init) = .;
+	NOTES
+
+	BSS_SECTION(0, 0, 0)
+	VMLINUX_SYMBOL(_end) = .;
+
+	STABS_DEBUG
+	DWARF_DEBUG
+
+	DISCARDS
+}
-- 
2.20.1 (Apple Git-117)


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um



* [RFC PATCH 04/47] lkl: host interface
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (2 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 03/47] lkl: architecture skeleton for Linux kernel library Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 05/47] lkl: memory handling Hajime Tazaki
                   ` (45 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, Yuan Liu, Patrick Collins,
	Pierre-Hugues Husson, Michael Zimmermann, Hajime Tazaki

From: Octavian Purdila <tavi.purdila@gmail.com>

This patch introduces the host operations that define the interface
between the LKL and the host. These operations must be provided either
by a host library or by the application itself.
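
As an illustration only, a host-side implementation could look roughly like
the sketch below. It is not part of this patch: struct demo_host_ops is a
hypothetical, reduced mirror of struct lkl_host_operations that keeps just
three of the callbacks (print, mem_alloc, mem_free), backed by the C library,
and demo_puts() imitates the lkl_puts() macro. All demo_* names are
assumptions made for the example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical, reduced mirror of struct lkl_host_operations: only three
 * of the callbacks declared in the patch are reproduced here.
 */
struct demo_host_ops {
	void (*print)(const char *str, int len);
	void *(*mem_alloc)(unsigned long size);
	void (*mem_free)(void *mem);
};

/* Bytes handed to the print callback so far (illustration only). */
int demo_printed;

static void demo_print(const char *str, int len)
{
	fwrite(str, 1, (size_t)len, stdout);
	demo_printed += len;
}

static void *demo_mem_alloc(unsigned long size)
{
	return malloc(size);
}

static void demo_mem_free(void *mem)
{
	free(mem);
}

/* The table an application would hand over when starting the kernel. */
struct demo_host_ops demo_ops = {
	.print = demo_print,
	.mem_alloc = demo_mem_alloc,
	.mem_free = demo_mem_free,
};

/* Same shape as lkl_puts(); print is optional, so check it first. */
void demo_puts(const struct demo_host_ops *ops, const char *text)
{
	if (ops->print)
		ops->print(text, (int)strlen(text));
}
```

The point of the function-pointer table is that the kernel side never calls
the C library directly; everything host specific goes through the table, so a
different host only has to supply a different set of callbacks.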

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Pierre-Hugues Husson <phh@phh.me>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 arch/um/lkl/include/asm/host_ops.h      |  12 ++
 arch/um/lkl/include/uapi/asm/host_ops.h | 153 ++++++++++++++++++++++++
 2 files changed, 165 insertions(+)
 create mode 100644 arch/um/lkl/include/asm/host_ops.h
 create mode 100644 arch/um/lkl/include/uapi/asm/host_ops.h

diff --git a/arch/um/lkl/include/asm/host_ops.h b/arch/um/lkl/include/asm/host_ops.h
new file mode 100644
index 000000000000..a31b10c33a5b
--- /dev/null
+++ b/arch/um/lkl/include/asm/host_ops.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_HOST_OPS_H
+#define _ASM_LKL_HOST_OPS_H
+
+#include "irq.h"
+#include <uapi/asm/host_ops.h>
+
+extern struct lkl_host_operations *lkl_ops;
+
+#define lkl_puts(text) lkl_ops->print(text, strlen(text))
+
+#endif
diff --git a/arch/um/lkl/include/uapi/asm/host_ops.h b/arch/um/lkl/include/uapi/asm/host_ops.h
new file mode 100644
index 000000000000..5f26e61f4b18
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/host_ops.h
@@ -0,0 +1,153 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_UAPI_LKL_HOST_OPS_H
+#define _ASM_UAPI_LKL_HOST_OPS_H
+
+/* Defined in {posix,nt}-host.c */
+struct lkl_mutex;
+struct lkl_sem;
+struct lkl_tls_key;
+typedef unsigned long lkl_thread_t;
+struct lkl_jmp_buf {
+	unsigned long buf[32];
+};
+
+/**
+ * lkl_host_operations - host operations used by the Linux kernel
+ *
+ * These operations must be provided by a host library or by the application
+ * itself.
+ *
+ * @virtio_devices - string containing the list of virtio devices in virtio mmio
+ * command line format. This string is appended to the kernel command line and
+ * is provided here for convenience to be implemented by the host library.
+ *
+ * @print - optional operation that receives console messages
+ *
+ * @panic - called during a kernel panic
+ *
+ * @sem_alloc - allocate a host semaphore and initialize it to count
+ * @sem_free - free a host semaphore
+ * @sem_up - perform an up operation on the semaphore
+ * @sem_down - perform a down operation on the semaphore
+ *
+ * @mutex_alloc - allocate and initialize a host mutex; the recursive parameter
+ * determines if the mutex is recursive or not
+ * @mutex_free - free a host mutex
+ * @mutex_lock - acquire the mutex
+ * @mutex_unlock - release the mutex
+ *
+ * @thread_create - create a new thread and run f(arg) in its context; returns a
+ * thread handle or 0 if the thread could not be created
+ * @thread_detach - on POSIX systems, free up resources held by
+ * pthreads. Noop on Win32.
+ * @thread_exit - terminates the current thread
+ * @thread_join - wait for the given thread to terminate. Returns 0
+ * for success, -1 otherwise
+ *
+ * @tls_alloc - allocate a thread local storage key; returns 0 if successful; if
+ * destructor is not NULL it will be called when a thread terminates with its
+ * argument set to the current thread local storage value
+ * @tls_free - frees a thread local storage key; returns 0 if successful
+ * @tls_set - associate data to the thread local storage key; returns 0 if
+ * successful
+ * @tls_get - return data associated with the thread local storage key or NULL
+ * on error
+ *
+ * @mem_alloc - allocate memory
+ * @mem_free - free memory
+ *
+ * @timer_alloc - allocate a host timer that runs fn(arg) when the timer
+ * fires.
+ * @timer_free - disarm and free the timer
+ * @timer_set_oneshot - arm the timer to fire once, after delta ns.
+ * @timer_set_periodic - arm the timer to fire periodically, with a period of
+ * delta ns.
+ *
+ * @ioremap - searches for an I/O memory region identified by addr and size and
+ * returns a pointer to the start of the address range that can be used by
+ * iomem_access
+ * @iomem_access - reads or writes to an I/O memory region; addr must be in the
+ * range returned by ioremap
+ *
+ * @gettid - returns the host thread id of the caller, which need not
+ * be the same as the handle returned by thread_create
+ *
+ * @jmp_buf_set - runs the given function and sets up a jump-back point by saving
+ * the context in the jump buffer; jmp_buf_longjmp can be called from the given
+ * function or any callee of that function to return to the jump-back
+ * point
+ *
+ * NOTE: we can't return from jmp_buf_set before calling jmp_buf_longjmp or
+ * otherwise the saved context (stack) is not going to be valid, so we must pass
+ * the function that will eventually call longjmp here
+ *
+ * @jmp_buf_longjmp - perform a jump back to the saved jump buffer
+ */
+struct lkl_host_operations {
+	const char *virtio_devices;
+
+	void (*print)(const char *str, int len);
+	void (*panic)(void);
+
+	struct lkl_sem *(*sem_alloc)(int count);
+	void (*sem_free)(struct lkl_sem *sem);
+	void (*sem_up)(struct lkl_sem *sem);
+	void (*sem_down)(struct lkl_sem *sem);
+
+	struct lkl_mutex *(*mutex_alloc)(int recursive);
+	void (*mutex_free)(struct lkl_mutex *mutex);
+	void (*mutex_lock)(struct lkl_mutex *mutex);
+	void (*mutex_unlock)(struct lkl_mutex *mutex);
+
+	lkl_thread_t (*thread_create)(void (*f)(void *), void *arg);
+	void (*thread_detach)(void);
+	void (*thread_exit)(void);
+	int (*thread_join)(lkl_thread_t tid);
+	lkl_thread_t (*thread_self)(void);
+	int (*thread_equal)(lkl_thread_t a, lkl_thread_t b);
+
+	struct lkl_tls_key *(*tls_alloc)(void (*destructor)(void *));
+	void (*tls_free)(struct lkl_tls_key *key);
+	int (*tls_set)(struct lkl_tls_key *key, void *data);
+	void *(*tls_get)(struct lkl_tls_key *key);
+
+	void *(*mem_alloc)(unsigned long mem);
+	void (*mem_free)(void *mem);
+
+	unsigned long long (*time)(void);
+
+	void *(*timer_alloc)(void (*fn)(void *), void *arg);
+	int (*timer_set_oneshot)(void *timer, unsigned long delta);
+	void (*timer_free)(void *timer);
+
+	void *(*ioremap)(long addr, int size);
+	int (*iomem_access)(const volatile void *addr, void *val, int size,
+			    int write);
+
+	long (*gettid)(void);
+
+	void (*jmp_buf_set)(struct lkl_jmp_buf *jmpb, void (*f)(void));
+	void (*jmp_buf_longjmp)(struct lkl_jmp_buf *jmpb, int val);
+};
+
+/**
+ * lkl_start_kernel - registers the host operations and starts the kernel
+ *
+ * The function returns only after the kernel is shutdown with lkl_sys_halt.
+ *
+ * @lkl_ops - pointer to host operations
+ * @cmd_line - format for command line string that is going to be used to
+ * generate the Linux kernel command line
+ */
+int lkl_start_kernel(struct lkl_host_operations *lkl_ops, const char *cmd_line,
+		     ...);
+
+/**
+ * lkl_is_running - returns 1 if the kernel is currently running
+ */
+int lkl_is_running(void);
+
+int lkl_printf(const char *fmt, ...);
+void lkl_bug(const char *fmt, ...);
+
+#endif
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 05/47] lkl: memory handling
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (3 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 04/47] lkl: host interface Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 06/47] lkl: kernel threads support Hajime Tazaki
                   ` (44 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um
  Cc: H . K . Jerry Chu, Levente Kurusa, Octavian Purdila,
	Hajime Tazaki, Akira Moroo, Yuan Liu

From: Octavian Purdila <tavi.purdila@gmail.com>

LKL is a non-MMU architecture and hence there is not much work left to
do other than initializing the boot allocator and providing the page
and page table definitions.

The backing memory is allocated via a host operation and the memory
size to be used is specified when the kernel is started, in the
lkl_start_kernel call.
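
For reference, the alignment arithmetic that bootmem_init applies to the
host-provided buffer can be tried in isolation with the sketch below. It is
an illustration, not code from this patch: DEMO_PAGE_SIZE is assumed to be
4096 (the real PAGE_SIZE comes from asm-generic/page.h), and struct
demo_region and demo_trim_region() are hypothetical names.

```c
/* Assumed page size for the example; the kernel uses its own PAGE_SIZE. */
#define DEMO_PAGE_SIZE 4096UL
#define DEMO_PAGE_ALIGN(x) \
	(((x) + DEMO_PAGE_SIZE - 1) & ~(DEMO_PAGE_SIZE - 1))

struct demo_region {
	unsigned long start;
	unsigned long size;
};

/*
 * Mirror of the trimming step in bootmem_init(): if the host returned an
 * unaligned buffer, skip forward to the next page boundary and round the
 * remaining size down to a whole number of pages. An already aligned
 * start is left untouched, as in the patch. The caller must pass a size
 * larger than the alignment gap, otherwise the subtraction wraps.
 */
struct demo_region demo_trim_region(unsigned long start, unsigned long size)
{
	struct demo_region r = { start, size };

	if (DEMO_PAGE_ALIGN(start) != start) {
		r.size -= DEMO_PAGE_ALIGN(start) - start;
		r.start = DEMO_PAGE_ALIGN(start);
		r.size = (r.size / DEMO_PAGE_SIZE) * DEMO_PAGE_SIZE;
	}
	return r;
}
```

With a buffer at 0x1010 of size 0x3000, 0xff0 bytes are skipped to reach
0x2000 and the remaining 0x2010 bytes are rounded down to 0x2000, i.e. two
usable pages.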

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Levente Kurusa <levex@linux.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 arch/um/lkl/include/asm/page.h    | 14 +++++++
 arch/um/lkl/include/asm/pgtable.h | 62 +++++++++++++++++++++++++++++
 arch/um/lkl/mm/bootmem.c          | 66 +++++++++++++++++++++++++++++++
 3 files changed, 142 insertions(+)
 create mode 100644 arch/um/lkl/include/asm/page.h
 create mode 100644 arch/um/lkl/include/asm/pgtable.h
 create mode 100644 arch/um/lkl/mm/bootmem.c

diff --git a/arch/um/lkl/include/asm/page.h b/arch/um/lkl/include/asm/page.h
new file mode 100644
index 000000000000..e77f3da22031
--- /dev/null
+++ b/arch/um/lkl/include/asm/page.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_PAGE_H
+#define _ASM_LKL_PAGE_H
+
+#define CONFIG_KERNEL_RAM_BASE_ADDRESS memory_start
+
+#ifndef __ASSEMBLY__
+void free_mem(void);
+void bootmem_init(unsigned long mem_size);
+#endif
+
+#include <asm-generic/page.h>
+
+#endif /* _ASM_LKL_PAGE_H */
diff --git a/arch/um/lkl/include/asm/pgtable.h b/arch/um/lkl/include/asm/pgtable.h
new file mode 100644
index 000000000000..b790296abfac
--- /dev/null
+++ b/arch/um/lkl/include/asm/pgtable.h
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_PGTABLE_H
+#define _LKL_PGTABLE_H
+
+#include <asm-generic/4level-fixup.h>
+
+/*
+ * (C) Copyright 2000-2002, Greg Ungerer <gerg@snapgear.com>
+ */
+
+#include <linux/slab.h>
+#include <asm/processor.h>
+#include <asm/io.h>
+
+#define pgd_present(pgd)	(1)
+#define pgd_none(pgd)		(0)
+#define pgd_bad(pgd)		(0)
+#define pgd_clear(pgdp)
+#define kern_addr_valid(addr)	(1)
+#define	pmd_offset(a, b)	((void *)0)
+
+#define PAGE_NONE		__pgprot(0)
+#define PAGE_SHARED		__pgprot(0)
+#define PAGE_COPY		__pgprot(0)
+#define PAGE_READONLY		__pgprot(0)
+#define PAGE_KERNEL		__pgprot(0)
+
+void paging_init(void);
+#define swapper_pg_dir		((pgd_t *)0)
+
+#define __swp_type(x)		(0)
+#define __swp_offset(x)		(0)
+#define __swp_entry(typ, off)	((swp_entry_t) { ((typ) | ((off) << 7)) })
+#define __pte_to_swp_entry(pte)	((swp_entry_t) { pte_val(pte) })
+#define __swp_entry_to_pte(x)	((pte_t) { (x).val })
+
+/*
+ * ZERO_PAGE is a global shared page that is always zero: used
+ * for zero-mapped memory areas etc..
+ */
+extern void *empty_zero_page;
+#define ZERO_PAGE(vaddr)	(virt_to_page(empty_zero_page))
+
+/*
+ * No page table caches to initialise.
+ */
+#define pgtable_cache_init()	do { } while (0)
+
+/*
+ * All 32bit addresses are effectively valid for vmalloc...
+ * Sort of meaningless for non-VM targets.
+ */
+#define	VMALLOC_START		0
+#define	VMALLOC_END		0xffffffff
+#define	KMAP_START		0
+#define	KMAP_END		0xffffffff
+
+#include <asm-generic/pgtable.h>
+
+#define check_pgt_cache()	do { } while (0)
+
+#endif
diff --git a/arch/um/lkl/mm/bootmem.c b/arch/um/lkl/mm/bootmem.c
new file mode 100644
index 000000000000..39dd0d22b44e
--- /dev/null
+++ b/arch/um/lkl/mm/bootmem.c
@@ -0,0 +1,66 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/memblock.h>
+#include <linux/mm.h>
+#include <linux/swap.h>
+
+unsigned long memory_start, memory_end;
+static unsigned long _memory_start, mem_size;
+
+void *empty_zero_page;
+
+void __init bootmem_init(unsigned long mem_sz)
+{
+	mem_size = mem_sz;
+
+	_memory_start = (unsigned long)lkl_ops->mem_alloc(mem_size);
+	memory_start = _memory_start;
+	WARN_ON(!memory_start);
+	memory_end = memory_start + mem_size;
+
+	if (PAGE_ALIGN(memory_start) != memory_start) {
+		mem_size -= PAGE_ALIGN(memory_start) - memory_start;
+		memory_start = PAGE_ALIGN(memory_start);
+		mem_size = (mem_size / PAGE_SIZE) * PAGE_SIZE;
+	}
+	pr_info("memblock address range: 0x%lx - 0x%lx\n", memory_start,
+		memory_start + mem_size);
+	/*
+	 * Give all the memory to the bootmap allocator, tell it to put the
+	 * boot mem_map at the start of memory.
+	 */
+	max_low_pfn = virt_to_pfn(memory_end);
+	min_low_pfn = virt_to_pfn(memory_start);
+	memblock_add(memory_start, mem_size);
+
+	empty_zero_page = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
+	memset((void *)empty_zero_page, 0, PAGE_SIZE);
+
+	{
+		unsigned long zones_size[MAX_NR_ZONES] = {0, };
+
+		zones_size[ZONE_NORMAL] = (mem_size) >> PAGE_SHIFT;
+		free_area_init(zones_size);
+	}
+}
+
+void __init mem_init(void)
+{
+	max_mapnr = (((unsigned long)high_memory) - PAGE_OFFSET) >> PAGE_SHIFT;
+	/* this will put all memory onto the freelists */
+	totalram_pages_add(memblock_free_all());
+	pr_info("Memory available: %luk/%luk RAM\n",
+		(nr_free_pages() << PAGE_SHIFT) >> 10, mem_size >> 10);
+}
+
+/*
+ * In our case __init memory is not part of the page allocator so there is
+ * nothing to free.
+ */
+void free_initmem(void)
+{
+}
+
+void free_mem(void)
+{
+	lkl_ops->mem_free((void *)_memory_start);
+}
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 06/47] lkl: kernel threads support
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (4 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 05/47] lkl: memory handling Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 07/47] lkl: interrupt support Hajime Tazaki
                   ` (43 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Lai Jiangshan, Hajime Tazaki, Patrick Collins,
	Akira Moroo, Yuan Liu

From: Octavian Purdila <tavi.purdila@gmail.com>

LKL does not support user processes but it must support kernel threads
as part of the normal kernel workflow. It uses host operations to
create and terminate host threads that are going to run the kernel
threads. It also uses semaphores to synchronize those threads and to
allow the Linux kernel scheduler to control how the kernel threads
run.

Each kernel thread runs in a host thread and has a host semaphore
associated with it - the thread's scheduling semaphore. The semaphore
counter is initialized to 0. The first thing a kernel thread does
after getting spawned, before running any kernel code, is to perform a
down operation to block the thread.

The kernel controls host thread scheduling by performing up and down
operations on the scheduling semaphore. In __switch_to an up
operation on the next thread is performed to wake up a blocked thread,
and a down operation is performed on the prev thread to block it.

A thread is terminated by marking it as dead in free_thread_stack (via
kill_thread) and performing an up operation on its scheduling semaphore,
at which point the marked thread will terminate itself.
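
The handoff described above can be tried outside the kernel with plain POSIX
threads and semaphores. The sketch below is an illustration under those
assumptions, not code from this patch: it uses sem_t and pthread_t directly
instead of the lkl_ops callbacks, and every demo_* name is hypothetical. One
worker and the calling thread take turns; each turn appends a letter to a
trace so the resulting order can be inspected.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the scheduling fields of struct thread_info. */
struct demo_thread {
	sem_t sched_sem;	/* scheduling semaphore, starts at 0 */
	bool dead;		/* set by the terminator before the final up */
	pthread_t tid;
	char name;
};

static struct demo_thread demo_main, demo_worker;
static char demo_trace[16];
static int demo_len;

static void demo_log(char c)
{
	if (demo_len < (int)sizeof(demo_trace) - 1)
		demo_trace[demo_len++] = c;
}

/* Wake up next, then block prev: the core of the context switch. */
static void demo_switch(struct demo_thread *prev, struct demo_thread *next)
{
	sem_post(&next->sched_sem);
	sem_wait(&prev->sched_sem);
}

static void *demo_worker_fn(void *arg)
{
	struct demo_thread *self = arg;

	/* First action after being spawned: block until scheduled. */
	sem_wait(&self->sched_sem);
	while (!self->dead) {
		demo_log(self->name);
		demo_switch(self, &demo_main);
	}
	return NULL;
}

/* Run the given number of handoffs and return the recorded order. */
const char *demo_run(int rounds)
{
	int i;

	demo_len = 0;
	memset(demo_trace, 0, sizeof(demo_trace));
	sem_init(&demo_main.sched_sem, 0, 0);
	sem_init(&demo_worker.sched_sem, 0, 0);
	demo_main.name = 'M';
	demo_worker.name = 'W';
	demo_worker.dead = false;
	pthread_create(&demo_worker.tid, NULL, demo_worker_fn, &demo_worker);

	for (i = 0; i < rounds; i++) {
		demo_log('M');
		demo_switch(&demo_main, &demo_worker);
	}

	/* Termination: mark the thread, wake it, wait for it to exit. */
	demo_worker.dead = true;
	sem_post(&demo_worker.sched_sem);
	pthread_join(demo_worker.tid, NULL);

	sem_destroy(&demo_worker.sched_sem);
	sem_destroy(&demo_main.sched_sem);
	return demo_trace;
}
```

Because exactly one thread is runnable at any time, the trace needs no extra
locking; the semaphores both order the accesses and decide who runs. The
example needs a host with unnamed POSIX semaphores and may have to be linked
with -pthread.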

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 arch/um/lkl/include/asm/thread_info.h |  70 ++++++++
 arch/um/lkl/kernel/cpu.c              | 223 +++++++++++++++++++++++++
 arch/um/lkl/kernel/threads.c          | 227 ++++++++++++++++++++++++++
 3 files changed, 520 insertions(+)
 create mode 100644 arch/um/lkl/include/asm/thread_info.h
 create mode 100644 arch/um/lkl/kernel/cpu.c
 create mode 100644 arch/um/lkl/kernel/threads.c

diff --git a/arch/um/lkl/include/asm/thread_info.h b/arch/um/lkl/include/asm/thread_info.h
new file mode 100644
index 000000000000..da4e75fc7b10
--- /dev/null
+++ b/arch/um/lkl/include/asm/thread_info.h
@@ -0,0 +1,70 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_THREAD_INFO_H
+#define _ASM_LKL_THREAD_INFO_H
+
+#define THREAD_SIZE	       (4096)
+
+#ifndef __ASSEMBLY__
+#include <asm/types.h>
+#include <asm/processor.h>
+#include <asm/host_ops.h>
+
+typedef struct {
+	unsigned long seg;
+} mm_segment_t;
+
+struct thread_info {
+	struct task_struct *task;
+	unsigned long flags;
+	int preempt_count;
+	mm_segment_t addr_limit;
+	struct lkl_sem *sched_sem;
+	struct lkl_jmp_buf sched_jb;
+	bool dead;
+	lkl_thread_t tid;
+	struct task_struct *prev_sched;
+	unsigned long stackend;
+};
+
+#define INIT_THREAD_INFO(tsk)				\
+{							\
+	.task		= &tsk,				\
+	.preempt_count	= INIT_PREEMPT_COUNT,		\
+	.flags		= 0,				\
+	.addr_limit	= KERNEL_DS,			\
+}
+
+/* how to get the thread information struct from C */
+extern struct thread_info *_current_thread_info;
+static inline struct thread_info *current_thread_info(void)
+{
+	return _current_thread_info;
+}
+
+/* thread information allocation */
+unsigned long *alloc_thread_stack_node(struct task_struct *, int node);
+void free_thread_stack(struct task_struct *tsk);
+
+void threads_init(void);
+void threads_cleanup(void);
+
+#define TIF_SYSCALL_TRACE		0
+#define TIF_NOTIFY_RESUME		1
+#define TIF_SIGPENDING			2
+#define TIF_NEED_RESCHED		3
+#define TIF_RESTORE_SIGMASK		4
+#define TIF_MEMDIE			5
+#define TIF_NOHZ			6
+#define TIF_SCHED_JB			7
+#define TIF_HOST_THREAD			8
+
+#define __HAVE_THREAD_FUNCTIONS
+
+#define task_thread_info(task)	((struct thread_info *)(task)->stack)
+#define task_stack_page(task)	((task)->stack)
+void setup_thread_stack(struct task_struct *p, struct task_struct *org);
+#define end_of_stack(p) (&task_thread_info(p)->stackend)
+
+#endif /* __ASSEMBLY__ */
+
+#endif
diff --git a/arch/um/lkl/kernel/cpu.c b/arch/um/lkl/kernel/cpu.c
new file mode 100644
index 000000000000..125af3b2d5dd
--- /dev/null
+++ b/arch/um/lkl/kernel/cpu.c
@@ -0,0 +1,223 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/kernel.h>
+#include <linux/sched/stat.h>
+#include <asm/host_ops.h>
+#include <asm/cpu.h>
+#include <asm/thread_info.h>
+#include <asm/unistd.h>
+#include <asm/sched.h>
+#include <asm/syscalls.h>
+
+/*
+ * This structure is used to get access to the "LKL CPU" that allows us to run
+ * Linux code. Because we have to deal with various synchronization requirements
+ * between idle thread, system calls, interrupts, "reentrancy", CPU shutdown,
+ * imbalance wake up (i.e. acquire the CPU from one thread and release it from
+ * another), we can't use a simple synchronization mechanism such as (recursive)
+ * mutex or semaphore. Instead, we use a mutex and a bunch of status data plus a
+ * semaphore.
+ */
+static struct lkl_cpu {
+	/* lock that protects the CPU status data */
+	struct lkl_mutex *lock;
+	/*
+	 * Since we must free the cpu lock during shutdown we need a
+	 * synchronization algorithm between lkl_cpu_shutdown() and the CPU
+	 * access functions since lkl_cpu_get() gets called from thread
+	 * destructor callback functions which may be scheduled after
+	 * lkl_cpu_shutdown() has freed the cpu lock.
+	 *
+	 * An atomic counter is used to keep track of the number of running
+	 * CPU access functions and allow the shutdown function to wait for
+	 * them.
+	 *
+	 * The shutdown function adds MAX_THREADS to this counter which allows
+	 * the CPU access functions to check if the shutdown process has
+	 * started.
+	 *
+	 * This algorithm assumes that we never have more than MAX_THREADS
+	 * requesting CPU access.
+	 */
+	#define MAX_THREADS 1000000
+	unsigned int shutdown_gate;
+	bool irqs_pending;
+	/* number of threads waiting for the CPU */
+	unsigned int sleepers;
+	/* number of times the current thread got the CPU */
+	unsigned int count;
+	/* current thread that owns the CPU */
+	lkl_thread_t owner;
+	/* semaphore for threads waiting for the CPU */
+	struct lkl_sem *sem;
+	/* semaphore used for shutdown */
+	struct lkl_sem *shutdown_sem;
+} cpu;
+
+static int __cpu_try_get_lock(int n)
+{
+	lkl_thread_t self;
+
+	if (__sync_fetch_and_add(&cpu.shutdown_gate, n) >= MAX_THREADS)
+		return -2;
+
+	lkl_ops->mutex_lock(cpu.lock);
+
+	if (cpu.shutdown_gate >= MAX_THREADS)
+		return -1;
+
+	self = lkl_ops->thread_self();
+
+	if (cpu.owner && !lkl_ops->thread_equal(cpu.owner, self))
+		return 0;
+
+	cpu.owner = self;
+	cpu.count++;
+
+	return 1;
+}
+
+static void __cpu_try_get_unlock(int lock_ret, int n)
+{
+	if (lock_ret >= -1)
+		lkl_ops->mutex_unlock(cpu.lock);
+	__sync_fetch_and_sub(&cpu.shutdown_gate, n);
+}
+
+void lkl_cpu_change_owner(lkl_thread_t owner)
+{
+	lkl_ops->mutex_lock(cpu.lock);
+	if (cpu.count > 1)
+		lkl_bug("bad count while changing owner\n");
+	cpu.owner = owner;
+	lkl_ops->mutex_unlock(cpu.lock);
+}
+
+int lkl_cpu_get(void)
+{
+	int ret;
+
+	ret = __cpu_try_get_lock(1);
+
+	while (ret == 0) {
+		cpu.sleepers++;
+		__cpu_try_get_unlock(ret, 0);
+		lkl_ops->sem_down(cpu.sem);
+		ret = __cpu_try_get_lock(0);
+	}
+
+	__cpu_try_get_unlock(ret, 1);
+
+	return ret;
+}
+
+void lkl_cpu_put(void)
+{
+	lkl_ops->mutex_lock(cpu.lock);
+
+	if (!cpu.count || !cpu.owner ||
+	    !lkl_ops->thread_equal(cpu.owner, lkl_ops->thread_self()))
+		lkl_bug("%s: unbalanced put\n", __func__);
+
+	while (cpu.irqs_pending && !irqs_disabled()) {
+		cpu.irqs_pending = false;
+		lkl_ops->mutex_unlock(cpu.lock);
+		run_irqs();
+		lkl_ops->mutex_lock(cpu.lock);
+	}
+
+	if (test_ti_thread_flag(current_thread_info(), TIF_HOST_THREAD) &&
+	    !single_task_running() && cpu.count == 1) {
+		if (in_interrupt())
+			lkl_bug("%s: in interrupt\n", __func__);
+		lkl_ops->mutex_unlock(cpu.lock);
+		thread_sched_jb();
+		return;
+	}
+
+	if (--cpu.count > 0) {
+		lkl_ops->mutex_unlock(cpu.lock);
+		return;
+	}
+
+	if (cpu.sleepers) {
+		cpu.sleepers--;
+		lkl_ops->sem_up(cpu.sem);
+	}
+
+	cpu.owner = 0;
+
+	lkl_ops->mutex_unlock(cpu.lock);
+}
+
+int lkl_cpu_try_run_irq(int irq)
+{
+	int ret;
+
+	ret = __cpu_try_get_lock(1);
+	if (!ret) {
+		set_irq_pending(irq);
+		cpu.irqs_pending = true;
+	}
+	__cpu_try_get_unlock(ret, 1);
+
+	return ret;
+}
+
+void lkl_cpu_shutdown(void)
+{
+	__sync_fetch_and_add(&cpu.shutdown_gate, MAX_THREADS);
+}
+
+void lkl_cpu_wait_shutdown(void)
+{
+	lkl_ops->sem_down(cpu.shutdown_sem);
+	lkl_ops->sem_free(cpu.shutdown_sem);
+}
+
+static void lkl_cpu_cleanup(bool shutdown)
+{
+	while (__sync_fetch_and_add(&cpu.shutdown_gate, 0) > MAX_THREADS)
+		;
+
+	if (shutdown)
+		lkl_ops->sem_up(cpu.shutdown_sem);
+	else if (cpu.shutdown_sem)
+		lkl_ops->sem_free(cpu.shutdown_sem);
+	if (cpu.sem)
+		lkl_ops->sem_free(cpu.sem);
+	if (cpu.lock)
+		lkl_ops->mutex_free(cpu.lock);
+}
+
+void arch_cpu_idle(void)
+{
+	if (cpu.shutdown_gate >= MAX_THREADS) {
+		lkl_ops->mutex_lock(cpu.lock);
+		while (cpu.sleepers--)
+			lkl_ops->sem_up(cpu.sem);
+		lkl_ops->mutex_unlock(cpu.lock);
+
+		lkl_cpu_cleanup(true);
+
+		lkl_ops->thread_exit();
+	}
+	/* enable irqs now to allow direct irqs to run */
+	local_irq_enable();
+
+	/* switch to idle_host_task */
+	wakeup_idle_host_task();
+}
+
+int lkl_cpu_init(void)
+{
+	cpu.lock = lkl_ops->mutex_alloc(0);
+	cpu.sem = lkl_ops->sem_alloc(0);
+	cpu.shutdown_sem = lkl_ops->sem_alloc(0);
+
+	if (!cpu.lock || !cpu.sem || !cpu.shutdown_sem) {
+		lkl_cpu_cleanup(false);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
diff --git a/arch/um/lkl/kernel/threads.c b/arch/um/lkl/kernel/threads.c
new file mode 100644
index 000000000000..4fe8c56ae5e0
--- /dev/null
+++ b/arch/um/lkl/kernel/threads.c
@@ -0,0 +1,227 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/sched/task.h>
+#include <linux/sched/signal.h>
+#include <asm/host_ops.h>
+#include <asm/cpu.h>
+#include <asm/sched.h>
+
+static int init_ti(struct thread_info *ti)
+{
+	ti->sched_sem = lkl_ops->sem_alloc(0);
+	if (!ti->sched_sem)
+		return -ENOMEM;
+
+	ti->dead = false;
+	ti->prev_sched = NULL;
+	ti->tid = 0;
+
+	return 0;
+}
+
+unsigned long *alloc_thread_stack_node(struct task_struct *task, int node)
+{
+	struct thread_info *ti;
+
+	ti = kmalloc(sizeof(*ti), GFP_KERNEL);
+	if (!ti)
+		return NULL;
+
+	if (init_ti(ti)) {
+		kfree(ti);
+		return NULL;
+	}
+	ti->task = task;
+
+	return (unsigned long *)ti;
+}
+
+/*
+ * The only new tasks created are kernel threads that have a predefined starting
+ * point thus no stack copy is required.
+ */
+void setup_thread_stack(struct task_struct *p, struct task_struct *org)
+{
+	struct thread_info *ti = task_thread_info(p);
+	struct thread_info *org_ti = task_thread_info(org);
+
+	ti->flags = org_ti->flags;
+	ti->preempt_count = org_ti->preempt_count;
+	ti->addr_limit = org_ti->addr_limit;
+}
+
+static void kill_thread(struct thread_info *ti)
+{
+	if (!test_ti_thread_flag(ti, TIF_HOST_THREAD)) {
+		ti->dead = true;
+		lkl_ops->sem_up(ti->sched_sem);
+		lkl_ops->thread_join(ti->tid);
+	}
+	lkl_ops->sem_free(ti->sched_sem);
+}
+
+void free_thread_stack(struct task_struct *tsk)
+{
+	struct thread_info *ti = task_thread_info(tsk);
+
+	kill_thread(ti);
+	kfree(ti);
+}
+
+struct thread_info *_current_thread_info = &init_thread_union.thread_info;
+
+/*
+ * schedule() expects the return of this function to be the task that we
+ * switched away from. Returning prev is not going to work because we are
+ * actually going to return the previous task that was scheduled before the
+ * task we are going to wake up, and not the current task, e.g.:
+ *
+ * swapper -> init: saved prev on swapper stack is swapper
+ * init -> ksoftirqd0: saved prev on init stack is init
+ * ksoftirqd0 -> swapper: returned prev is swapper
+ */
+static struct task_struct *abs_prev = &init_task;
+
+struct task_struct *__switch_to(struct task_struct *prev,
+				struct task_struct *next)
+{
+	struct thread_info *_prev = task_thread_info(prev);
+	struct thread_info *_next = task_thread_info(next);
+	unsigned long _prev_flags = _prev->flags;
+	struct lkl_jmp_buf _prev_jb;
+
+	_current_thread_info = task_thread_info(next);
+	_next->prev_sched = prev;
+	abs_prev = prev;
+
+	BUG_ON(!_next->tid);
+	lkl_cpu_change_owner(_next->tid);
+
+	if (test_bit(TIF_SCHED_JB, &_prev_flags)) {
+		/* Atomic. Must be done before wakeup next */
+		clear_ti_thread_flag(_prev, TIF_SCHED_JB);
+		_prev_jb = _prev->sched_jb;
+	}
+
+	lkl_ops->sem_up(_next->sched_sem);
+	if (test_bit(TIF_SCHED_JB, &_prev_flags))
+		lkl_ops->jmp_buf_longjmp(&_prev_jb, 1);
+	else
+		lkl_ops->sem_down(_prev->sched_sem);
+
+	if (_prev->dead)
+		lkl_ops->thread_exit();
+
+	return abs_prev;
+}
+
+int host_task_stub(void *unused)
+{
+	return 0;
+}
+
+void switch_to_host_task(struct task_struct *task)
+{
+	if (WARN_ON(!test_tsk_thread_flag(task, TIF_HOST_THREAD)))
+		return;
+
+	task_thread_info(task)->tid = lkl_ops->thread_self();
+
+	if (current == task)
+		return;
+
+	wake_up_process(task);
+	thread_sched_jb();
+	lkl_ops->sem_down(task_thread_info(task)->sched_sem);
+	schedule_tail(abs_prev);
+}
+
+struct thread_bootstrap_arg {
+	struct thread_info *ti;
+	int (*f)(void *arg);
+	void *arg;
+};
+
+static void thread_bootstrap(void *_tba)
+{
+	struct thread_bootstrap_arg *tba = (struct thread_bootstrap_arg *)_tba;
+	struct thread_info *ti = tba->ti;
+	int (*f)(void *) = tba->f;
+	void *arg = tba->arg;
+
+	lkl_ops->sem_down(ti->sched_sem);
+	kfree(tba);
+	if (ti->prev_sched)
+		schedule_tail(ti->prev_sched);
+
+	f(arg);
+	do_exit(0);
+}
+
+int copy_thread(unsigned long clone_flags, unsigned long esp,
+		unsigned long unused, struct task_struct *p)
+{
+	struct thread_info *ti = task_thread_info(p);
+	struct thread_bootstrap_arg *tba;
+
+	if ((int (*)(void *))esp == host_task_stub) {
+		set_ti_thread_flag(ti, TIF_HOST_THREAD);
+		return 0;
+	}
+
+	tba = kmalloc(sizeof(*tba), GFP_KERNEL);
+	if (!tba)
+		return -ENOMEM;
+
+	tba->f = (int (*)(void *))esp;
+	tba->arg = (void *)unused;
+	tba->ti = ti;
+
+	ti->tid = lkl_ops->thread_create(thread_bootstrap, tba);
+	if (!ti->tid) {
+		kfree(tba);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+void show_stack(struct task_struct *task, unsigned long *esp)
+{
+}
+
+/**
+ * This is called before the kernel initializes, so no kernel calls (including
+ * printk) can be made yet.
+ */
+void threads_init(void)
+{
+	int ret;
+	struct thread_info *ti = &init_thread_union.thread_info;
+
+	ret = init_ti(ti);
+	if (ret < 0)
+		lkl_printf("lkl: failed to allocate init schedule semaphore\n");
+
+	ti->tid = lkl_ops->thread_self();
+}
+
+void threads_cleanup(void)
+{
+	struct task_struct *p, *t;
+
+	for_each_process_thread(p, t) {
+		struct thread_info *ti = task_thread_info(t);
+
+		if (t->pid != 1 && !test_ti_thread_flag(ti, TIF_HOST_THREAD))
+			WARN(!(t->flags & PF_KTHREAD),
+			     "non kernel thread task %s\n", t->comm);
+		WARN(t->state == TASK_RUNNING,
+		     "thread %s still running while halting\n", t->comm);
+
+		kill_thread(ti);
+	}
+
+	lkl_ops->sem_free(init_thread_union.thread_info.sched_sem);
+}
-- 
2.20.1 (Apple Git-117)


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um


^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC PATCH 07/47] lkl: interrupt support
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (5 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 06/47] lkl: kernel threads support Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 08/47] lkl: system call interface and application API Hajime Tazaki
                   ` (42 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Michael Zimmermann, Hajime Tazaki, Akira Moroo

From: Octavian Purdila <tavi.purdila@gmail.com>

Add APIs that allow the host to reserve and free an interrupt number
and also to trigger an interrupt.

The trigger operation simply stores the interrupt data in a queue. The
interrupt handler is run later, at the first opportunity, to avoid
races with any kernel threads.

Currently, pending interrupts are run on the first operation that
re-enables interrupts after they were disabled, provided we are not
already in interrupt context.
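
The deferral described above can be sketched in plain userspace C. This
is only an illustration under stated assumptions, not the code from this
patch: the names mark_pending(), run_pending(), trigger() and set_irqs()
are hypothetical, C11 atomics stand in for the __sync builtins, and the
"handler" just logs the irq number.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define STATUS_BITS ((int)(sizeof(unsigned long) * CHAR_BIT))
#define MAX_IRQS    (STATUS_BITS * STATUS_BITS)

/* One bit per irq, plus a summary word with one bit per bitmap word. */
static atomic_ulong pending[STATUS_BITS];
static atomic_ulong pending_index;

static bool irqs_on;		/* models the "interrupts enabled" flag */
static int delivered[16];	/* log of irqs handed to the handler */
static int n_delivered;

/* Stand-in for the real handler: remember which irq was delivered. */
static void handle(int irq)
{
	if (n_delivered < 16)
		delivered[n_delivered++] = irq;
}

/* Record irq as pending: mark its bit first, then the summary bit. */
static void mark_pending(int irq)
{
	assert(irq > 0 && irq < MAX_IRQS);
	atomic_fetch_or(&pending[irq / STATUS_BITS],
			1UL << (irq % STATUS_BITS));
	atomic_fetch_or(&pending_index, 1UL << (irq / STATUS_BITS));
}

/* Drain everything that is pending, lowest irq number first. */
static void run_pending(void)
{
	unsigned long index = atomic_exchange(&pending_index, 0);

	for (int w = 0; index; w++, index >>= 1) {
		unsigned long word;

		if (!(index & 1))
			continue;
		word = atomic_exchange(&pending[w], 0);
		for (int b = 0; word; b++, word >>= 1)
			if (word & 1)
				handle(w * STATUS_BITS + b);
	}
}

/* Deliver immediately when allowed, otherwise queue for later. */
static void trigger(int irq)
{
	if (irqs_on)
		handle(irq);
	else
		mark_pending(irq);
}

/* Re-enabling interrupts is the point where queued irqs are run. */
static void set_irqs(bool on)
{
	if (on && !irqs_on)
		run_pending();
	irqs_on = on;
}
```

In this sketch, calling trigger() while set_irqs(false) is in effect
only sets bits; the handler runs when set_irqs(true) is called. The
summary word keeps the scan cheap when few interrupts are pending.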

When triggering an interrupt the host can also send a void pointer
that is going to be available to the handler routine via
get_irq_regs()->irq_data. This makes it easy to create host <-> kernel
synchronous communication channels and is currently used by the system
call interface.

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 arch/um/lkl/include/asm/irq.h             |  15 ++
 arch/um/lkl/include/uapi/asm/irq.h        |  36 ++++
 arch/um/lkl/include/uapi/asm/sigcontext.h |  16 ++
 arch/um/lkl/kernel/irq.c                  | 193 ++++++++++++++++++++++
 4 files changed, 260 insertions(+)
 create mode 100644 arch/um/lkl/include/asm/irq.h
 create mode 100644 arch/um/lkl/include/uapi/asm/irq.h
 create mode 100644 arch/um/lkl/include/uapi/asm/sigcontext.h
 create mode 100644 arch/um/lkl/kernel/irq.c

diff --git a/arch/um/lkl/include/asm/irq.h b/arch/um/lkl/include/asm/irq.h
new file mode 100644
index 000000000000..36af9e36be1c
--- /dev/null
+++ b/arch/um/lkl/include/asm/irq.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_IRQ_H
+#define _ASM_LKL_IRQ_H
+
+#ifndef __arch_um__
+#define IRQ_STATUS_BITS		(sizeof(long) * 8)
+#define NR_IRQS			((int)(IRQ_STATUS_BITS * IRQ_STATUS_BITS))
+#endif	/* __arch_um__ */
+
+void run_irqs(void);
+void set_irq_pending(int irq);
+
+#include <uapi/asm/irq.h>
+
+#endif
diff --git a/arch/um/lkl/include/uapi/asm/irq.h b/arch/um/lkl/include/uapi/asm/irq.h
new file mode 100644
index 000000000000..666628b233eb
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/irq.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_UAPI_LKL_IRQ_H
+#define _ASM_UAPI_LKL_IRQ_H
+
+/**
+ * lkl_trigger_irq - generate an interrupt
+ *
+ * This function is used by the device host side to signal its Linux counterpart
+ * that some event happened.
+ *
+ * @irq - the irq number to signal
+ */
+int lkl_trigger_irq(int irq);
+
+/**
+ * lkl_get_free_irq - find and reserve a free IRQ number
+ *
+ * This function is called by the host device code to find an unused IRQ number
+ * and reserve it for its own use.
+ *
+ * @user - a string to identify the user
+ * @returns - an irq number that can be used by request_irq or a negative
+ * value in case of an error
+ */
+int lkl_get_free_irq(const char *user);
+
+/**
+ * lkl_put_irq - release an IRQ number previously obtained with lkl_get_free_irq
+ *
+ * @irq - irq number to release
+ * @user - string identifying the user; should be the same as the one passed to
+ * lkl_get_free_irq when the irq number was obtained
+ */
+void lkl_put_irq(int irq, const char *user);
+
+#endif
diff --git a/arch/um/lkl/include/uapi/asm/sigcontext.h b/arch/um/lkl/include/uapi/asm/sigcontext.h
new file mode 100644
index 000000000000..2f4848843d1d
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/sigcontext.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_UAPI_LKL_SIGCONTEXT_H
+#define _ASM_UAPI_LKL_SIGCONTEXT_H
+
+#include <asm/ptrace.h>
+
+struct pt_regs {
+	void *irq_data;
+};
+
+struct sigcontext {
+	struct pt_regs regs;
+	unsigned long oldmask;
+};
+
+#endif
diff --git a/arch/um/lkl/kernel/irq.c b/arch/um/lkl/kernel/irq.c
new file mode 100644
index 000000000000..e3b59e46ca50
--- /dev/null
+++ b/arch/um/lkl/kernel/irq.c
@@ -0,0 +1,193 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/irq.h>
+#include <linux/hardirq.h>
+#include <asm/irq_regs.h>
+#include <linux/sched.h>
+#include <linux/seq_file.h>
+#include <linux/tick.h>
+#include <asm/irqflags.h>
+#include <asm/host_ops.h>
+#include <asm/cpu.h>
+
+/*
+ * To avoid much overhead we use an indirect approach: the irqs are marked using
+ * a bitmap (array of longs) and a summary of the modified bits is kept in a
+ * separate "index" long - one bit for each long in the array. Thus we can
+ * support 4096 irqs on 64bit platforms and 1024 irqs on 32bit platforms.
+ *
+ * Whenever an irq is triggered both the array and the index are updated. To
+ * find which irqs were triggered we first search the index and then the
+ * corresponding part of the array.
+ */
+static unsigned long irq_status[NR_IRQS / IRQ_STATUS_BITS];
+static unsigned long irq_index_status;
+
+static inline unsigned long test_and_clear_irq_index_status(void)
+{
+	if (!irq_index_status)
+		return 0;
+	return __sync_fetch_and_and(&irq_index_status, 0);
+}
+
+static inline unsigned long test_and_clear_irq_status(int index)
+{
+	if (!irq_status[index])
+		return 0;
+	return __sync_fetch_and_and(&irq_status[index], 0);
+}
+
+void set_irq_pending(int irq)
+{
+	int index = irq / IRQ_STATUS_BITS;
+	int bit = irq % IRQ_STATUS_BITS;
+
+	__sync_fetch_and_or(&irq_status[index], BIT(bit));
+	__sync_fetch_and_or(&irq_index_status, BIT(index));
+}
+
+static struct irq_info {
+	const char *user;
+} irqs[NR_IRQS];
+
+static bool irqs_enabled;
+
+static struct pt_regs dummy;
+
+static void run_irq(int irq)
+{
+	unsigned long flags;
+	struct pt_regs *old_regs = set_irq_regs((struct pt_regs *)&dummy);
+
+	/* interrupt handlers need to run with interrupts disabled */
+	local_irq_save(flags);
+	irq_enter();
+	generic_handle_irq(irq);
+	irq_exit();
+	set_irq_regs(old_regs);
+	local_irq_restore(flags);
+}
+
+/**
+ * This function can be called from arbitrary host threads, so do not
+ * issue any Linux calls (e.g. printk) if lkl_cpu_get() was not issued
+ * before.
+ */
+int lkl_trigger_irq(int irq)
+{
+	int ret;
+
+	if (irq <= 0 || irq >= NR_IRQS)
+		return -EINVAL;
+
+	ret = lkl_cpu_try_run_irq(irq);
+	if (ret <= 0)
+		return ret;
+
+	/*
+	 * Since this can be called from Linux context (e.g. lkl_trigger_irq ->
+	 * IRQ -> softirq -> lkl_trigger_irq) make sure we are actually allowed
+	 * to run irqs at this point
+	 */
+	if (!irqs_enabled) {
+		set_irq_pending(irq);
+		lkl_cpu_put();
+		return 0;
+	}
+
+	run_irq(irq);
+
+	lkl_cpu_put();
+
+	return 0;
+}
+
+static inline void for_each_bit(unsigned long word, void (*f)(int, int), int j)
+{
+	int i = 0;
+
+	while (word) {
+		if (word & 1)
+			f(i, j);
+		word >>= 1;
+		i++;
+	}
+}
+
+static inline void deliver_irq(int bit, int index)
+{
+	run_irq(index * IRQ_STATUS_BITS + bit);
+}
+
+static inline void check_irq_status(int i, int unused)
+{
+	for_each_bit(test_and_clear_irq_status(i), deliver_irq, i);
+}
+
+void run_irqs(void)
+{
+	for_each_bit(test_and_clear_irq_index_status(), check_irq_status, 0);
+}
+
+int show_interrupts(struct seq_file *p, void *v)
+{
+	return 0;
+}
+
+int lkl_get_free_irq(const char *user)
+{
+	int i;
+	int ret = -EBUSY;
+
+	/* 0 is not a valid IRQ */
+	for (i = 1; i < NR_IRQS; i++) {
+		if (!irqs[i].user) {
+			irqs[i].user = user;
+			irq_set_chip_and_handler(i, &dummy_irq_chip,
+						 handle_simple_irq);
+			ret = i;
+			break;
+		}
+	}
+
+	return ret;
+}
+
+void lkl_put_irq(int i, const char *user)
+{
+	if (!irqs[i].user || strcmp(irqs[i].user, user) != 0) {
+		WARN(1, "%s tried to release %s's irq %d", user, irqs[i].user, i);
+		return;
+	}
+
+	irqs[i].user = NULL;
+}
+
+unsigned long arch_local_save_flags(void)
+{
+	return irqs_enabled;
+}
+
+void arch_local_irq_restore(unsigned long flags)
+{
+	if (flags == ARCH_IRQ_ENABLED && irqs_enabled == ARCH_IRQ_DISABLED &&
+	    !in_interrupt())
+		run_irqs();
+	irqs_enabled = flags;
+}
+
+void init_IRQ(void)
+{
+	int i;
+
+	for (i = 0; i < NR_IRQS; i++)
+		irq_set_chip_and_handler(i, &dummy_irq_chip, handle_simple_irq);
+
+	pr_info("lkl: irqs initialized\n");
+}
+
+void cpu_yield_to_irqs(void)
+{
+	cpu_relax();
+}
-- 
2.20.1 (Apple Git-117)



^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC PATCH 08/47] lkl: system call interface and application API
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (6 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 07/47] lkl: interrupt support Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 09/47] lkl: timers, time and delay support Hajime Tazaki
                   ` (41 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um
  Cc: Conrad Meyer, Octavian Purdila, Jens Staal, Lai Jiangshan,
	Akira Moroo, Yuan Liu, Patrick Collins, Pierre-Hugues Husson,
	Michael Zimmermann, Luca Dariz, Hajime Tazaki

From: Octavian Purdila <tavi.purdila@gmail.com>

The LKL application API is based on the kernel system call interface
in order to offer a stable API to applications. Note that we can't
offer the full Linux system call interface due to LKL limitations such
as lack of virtual memory, signals, user processes, etc.
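
As a rough illustration of a system-call-style API, the sketch below
dispatches a call number plus six packed arguments through a table.
It is a simplified userspace model, not the code added by this patch:
NR_CALLS, NR_ADD, add_call() and dispatch() are hypothetical names, and
NULL table slots are resolved at dispatch time rather than with a GNU
range initializer.

```c
#include <assert.h>
#include <errno.h>

#define NR_CALLS 8	/* hypothetical table size */
#define NR_ADD   3	/* hypothetical call number */

typedef long (*handler_t)(long, long, long, long, long, long);

/* Fallback for numbers that have no implementation. */
static long ni_call(long a, long b, long c, long d, long e, long f)
{
	(void)a; (void)b; (void)c; (void)d; (void)e; (void)f;
	return -ENOSYS;
}

/* Example handler: only the first two arguments are used. */
static long add_call(long a, long b, long c, long d, long e, long f)
{
	(void)c; (void)d; (void)e; (void)f;
	return a + b;
}

/* NULL slots are treated as "not implemented" at dispatch time. */
static const handler_t table[NR_CALLS] = {
	[NR_ADD] = add_call,
};

/* Look the handler up by number and pass the six packed arguments. */
static long dispatch(long no, const long params[6])
{
	handler_t h;

	assert(params);
	if (no < 0 || no >= NR_CALLS)
		return -ENOSYS;

	h = table[no] ? table[no] : ni_call;
	return h(params[0], params[1], params[2], params[3],
		 params[4], params[5]);
}
```

A caller fills an array of six longs and passes it with the call
number; out-of-range or unimplemented numbers yield -ENOSYS.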

The host is using the LKL interrupt mechanism (lkl_trigger_irq) to
initiate a system call. The system call is executed in the context of
the init process.

To avoid collisions between the Linux API and the LKL API (e.g. struct
stat, MKNOD, etc.) we use a Python script to modify the user headers
and to prefix all of the global symbols (structures, typedefs,
defines) with LKL, lkl, _LKL, _lkl, __LKL or __lkl.
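
The prefixing rule can be restated as a small C function. This is a
sketch that mirrors lkl_prefix() from the Python script in this patch;
the C names all_upper() and add_prefix() are hypothetical and exist
only for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True if s has at least one letter and no lowercase letters. */
static bool all_upper(const char *s)
{
	bool seen = false;

	for (; *s; s++) {
		if (islower((unsigned char)*s))
			return false;
		if (isupper((unsigned char)*s))
			seen = true;
	}
	return seen;
}

/* Write the prefixed form of w into out; returns out for convenience. */
static char *add_prefix(const char *w, char *out, size_t len)
{
	const char *lead = "";

	assert(w && out && len > 0);

	/* Keep any leading underscores in front of the LKL/lkl tag. */
	if (strncmp(w, "__", 2) == 0)
		lead = "__";
	else if (w[0] == '_')
		lead = "_";

	/* Upper-case symbols get "LKL", everything else gets "lkl". */
	snprintf(out, len, "%s%s%s%s", lead,
		 all_upper(w) ? "LKL" : "lkl",
		 w[0] == '_' ? "" : "_", w);
	return out;
}
```

With this rule, "stat" becomes "lkl_stat", "MKNOD" becomes "LKL_MKNOD"
and "__u32" becomes "__lkl__u32".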

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Jens Staal <staal1978@gmail.com>
Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Luca Dariz <luca.dariz@gmail.com>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Pierre-Hugues Husson <phh@phh.me>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 arch/um/lkl/include/asm/unistd.h       |  29 +++
 arch/um/lkl/include/uapi/asm/unistd.h  |  18 ++
 arch/um/lkl/kernel/syscalls.c          | 246 +++++++++++++++++++++++++
 arch/um/lkl/kernel/syscalls_32.c       | 159 ++++++++++++++++
 arch/um/lkl/scripts/headers_install.py | 195 ++++++++++++++++++++
 5 files changed, 647 insertions(+)
 create mode 100644 arch/um/lkl/include/asm/unistd.h
 create mode 100644 arch/um/lkl/include/uapi/asm/unistd.h
 create mode 100644 arch/um/lkl/kernel/syscalls.c
 create mode 100644 arch/um/lkl/kernel/syscalls_32.c
 create mode 100755 arch/um/lkl/scripts/headers_install.py

diff --git a/arch/um/lkl/include/asm/unistd.h b/arch/um/lkl/include/asm/unistd.h
new file mode 100644
index 000000000000..c0efc68bf41f
--- /dev/null
+++ b/arch/um/lkl/include/asm/unistd.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#include <uapi/asm/unistd.h>
+
+__SYSCALL(__NR_virtio_mmio_device_add, sys_virtio_mmio_device_add)
+
+#define __SC_ASCII(t, a) #t "," #a
+
+#define __ASCII_MAP0(m, ...)
+#define __ASCII_MAP1(m, t, a) m(t, a)
+#define __ASCII_MAP2(m, t, a, ...) m(t, a) "," __ASCII_MAP1(m, __VA_ARGS__)
+#define __ASCII_MAP3(m, t, a, ...) m(t, a) "," __ASCII_MAP2(m, __VA_ARGS__)
+#define __ASCII_MAP4(m, t, a, ...) m(t, a) "," __ASCII_MAP3(m, __VA_ARGS__)
+#define __ASCII_MAP5(m, t, a, ...) m(t, a) "," __ASCII_MAP4(m, __VA_ARGS__)
+#define __ASCII_MAP6(m, t, a, ...) m(t, a) "," __ASCII_MAP5(m, __VA_ARGS__)
+#define __ASCII_MAP(n, ...) __ASCII_MAP##n(__VA_ARGS__)
+
+#ifdef __MINGW32__
+#define SECTION_ATTRS "n0"
+#else
+#define SECTION_ATTRS "a"
+#endif
+
+#define __SYSCALL_DEFINE_ARCH(x, name, ...)				\
+	asm(".section .syscall_defs,\"" SECTION_ATTRS "\"\n"		\
+	    ".ascii \"#ifdef __NR" #name "\\n\"\n"			\
+	    ".ascii \"SYSCALL_DEFINE" #x "(" #name ","			\
+	    __ASCII_MAP(x, __SC_ASCII, __VA_ARGS__) ")\\n\"\n"		\
+	    ".ascii \"#endif\\n\"\n"					\
+	    ".section .text\n");
diff --git a/arch/um/lkl/include/uapi/asm/unistd.h b/arch/um/lkl/include/uapi/asm/unistd.h
new file mode 100644
index 000000000000..561a7036821e
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/unistd.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#define __ARCH_WANT_SYSCALL_NO_AT
+#define __ARCH_WANT_SYSCALL_DEPRECATED
+#define __ARCH_WANT_SYSCALL_NO_FLAGS
+#define __ARCH_WANT_RENAMEAT
+#define __ARCH_WANT_NEW_STAT
+#define __ARCH_WANT_SET_GET_RLIMIT
+#define __ARCH_WANT_TIME32_SYSCALLS
+
+#include <asm/bitsperlong.h>
+
+#if __BITS_PER_LONG == 64
+#define __ARCH_WANT_SYS_NEWFSTATAT
+#endif
+
+#include <asm-generic/unistd.h>
+
+#define __NR_virtio_mmio_device_add		(__NR_arch_specific_syscall + 0)
diff --git a/arch/um/lkl/kernel/syscalls.c b/arch/um/lkl/kernel/syscalls.c
new file mode 100644
index 000000000000..ce3923baa655
--- /dev/null
+++ b/arch/um/lkl/kernel/syscalls.c
@@ -0,0 +1,246 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/stat.h>
+#include <linux/irq.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/jhash.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/net.h>
+#include <linux/task_work.h>
+#include <linux/syscalls.h>
+#include <linux/kthread.h>
+#include <linux/platform_device.h>
+#include <asm/host_ops.h>
+#include <asm/syscalls.h>
+#include <asm/syscalls_32.h>
+#include <asm/cpu.h>
+#include <asm/sched.h>
+
+static asmlinkage long sys_virtio_mmio_device_add(long base, long size,
+						  unsigned int irq);
+
+typedef long (*syscall_handler_t)(long arg1, ...);
+
+#undef __SYSCALL
+#define __SYSCALL(nr, sym)[nr] = (syscall_handler_t)sym,
+
+static syscall_handler_t syscall_table[__NR_syscalls] = {
+	[0 ... __NR_syscalls - 1] = (syscall_handler_t)sys_ni_syscall,
+#include <asm/unistd.h>
+
+#if __BITS_PER_LONG == 32
+#include <asm/unistd_32.h>
+#endif
+};
+
+static long run_syscall(long no, long *params)
+{
+	long ret;
+
+	if (no < 0 || no >= __NR_syscalls)
+		return -ENOSYS;
+
+	ret = syscall_table[no](params[0], params[1], params[2], params[3],
+				params[4], params[5]);
+
+	task_work_run();
+
+	return ret;
+}
+
+
+#define CLONE_FLAGS (CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_THREAD |	\
+		     CLONE_SIGHAND | SIGCHLD)
+
+static int host_task_id;
+static struct task_struct *host0;
+
+static int new_host_task(struct task_struct **task)
+{
+	pid_t pid;
+
+	switch_to_host_task(host0);
+
+	pid = kernel_thread(host_task_stub, NULL, CLONE_FLAGS);
+	if (pid < 0)
+		return pid;
+
+	rcu_read_lock();
+	*task = find_task_by_pid_ns(pid, &init_pid_ns);
+	rcu_read_unlock();
+
+	host_task_id++;
+
+	snprintf((*task)->comm, sizeof((*task)->comm), "host%d", host_task_id);
+
+	return 0;
+}
+static void exit_task(void)
+{
+	do_exit(0);
+}
+
+static void del_host_task(void *arg)
+{
+	struct task_struct *task = (struct task_struct *)arg;
+	struct thread_info *ti = task_thread_info(task);
+
+	if (lkl_cpu_get() < 0)
+		return;
+
+	switch_to_host_task(task);
+	host_task_id--;
+	set_ti_thread_flag(ti, TIF_SCHED_JB);
+	lkl_ops->jmp_buf_set(&ti->sched_jb, exit_task);
+}
+
+static struct lkl_tls_key *task_key;
+
+long lkl_syscall(long no, long *params)
+{
+	struct task_struct *task = host0;
+	long ret;
+
+	ret = lkl_cpu_get();
+	if (ret < 0)
+		return ret;
+
+	if (lkl_ops->tls_get) {
+		task = lkl_ops->tls_get(task_key);
+		if (!task) {
+			ret = new_host_task(&task);
+			if (ret)
+				goto out;
+			lkl_ops->tls_set(task_key, task);
+		}
+	}
+
+	switch_to_host_task(task);
+
+	ret = run_syscall(no, params);
+
+	if (no == __NR_reboot) {
+		thread_sched_jb();
+		return ret;
+	}
+
+out:
+	lkl_cpu_put();
+
+	return ret;
+}
+
+static struct task_struct *idle_host_task;
+
+/* called from idle: must not fail, must not block */
+void wakeup_idle_host_task(void)
+{
+	if (!need_resched() && idle_host_task)
+		wake_up_process(idle_host_task);
+}
+
+static int idle_host_task_loop(void *unused)
+{
+	struct thread_info *ti = task_thread_info(current);
+
+	snprintf(current->comm, sizeof(current->comm), "idle_host_task");
+	set_thread_flag(TIF_HOST_THREAD);
+	idle_host_task = current;
+
+	for (;;) {
+		lkl_cpu_put();
+		lkl_ops->sem_down(ti->sched_sem);
+		if (idle_host_task == NULL) {
+			lkl_ops->thread_exit();
+			return 0;
+		}
+		schedule_tail(ti->prev_sched);
+	}
+}
+
+int syscalls_init(void)
+{
+	snprintf(current->comm, sizeof(current->comm), "host0");
+	set_thread_flag(TIF_HOST_THREAD);
+	host0 = current;
+
+	if (lkl_ops->tls_alloc) {
+		task_key = lkl_ops->tls_alloc(del_host_task);
+		if (!task_key)
+			return -1;
+	}
+
+	if (kernel_thread(idle_host_task_loop, NULL, CLONE_FLAGS) < 0) {
+		if (lkl_ops->tls_free)
+			lkl_ops->tls_free(task_key);
+		return -1;
+	}
+
+	return 0;
+}
+
+void syscalls_cleanup(void)
+{
+	if (idle_host_task) {
+		struct thread_info *ti = task_thread_info(idle_host_task);
+
+		idle_host_task = NULL;
+		lkl_ops->sem_up(ti->sched_sem);
+		lkl_ops->thread_join(ti->tid);
+	}
+
+	if (lkl_ops->tls_free)
+		lkl_ops->tls_free(task_key);
+}
+
+SYSCALL_DEFINE3(virtio_mmio_device_add, long, base, long, size, unsigned int,
+		irq)
+{
+	struct platform_device *pdev;
+	int ret;
+
+	struct resource res[] = {
+		[0] = {
+				.start = base,
+				.end = base + size - 1,
+				.flags = IORESOURCE_MEM,
+			},
+		[1] = {
+				.start = irq,
+				.end = irq,
+				.flags = IORESOURCE_IRQ,
+			},
+	};
+
+	pdev = platform_device_alloc("virtio-mmio", PLATFORM_DEVID_AUTO);
+	if (!pdev) {
+		/* pdev is NULL here, so dev_err(&pdev->dev) cannot be used */
+		pr_err("%s: Unable to allocate virtio-mmio device\n",
+		       __func__);
+		return -ENOMEM;
+	}
+
+	ret = platform_device_add_resources(pdev, res, ARRAY_SIZE(res));
+	if (ret) {
+		dev_err(&pdev->dev, "%s: Unable to add resources for %s%d\n",
+			__func__, pdev->name, pdev->id);
+		goto exit_device_put;
+	}
+
+	ret = platform_device_add(pdev);
+	if (ret < 0) {
+		dev_err(&pdev->dev, "%s: Unable to add %s%d\n", __func__,
+			pdev->name, pdev->id);
+		goto exit_release_pdev;
+	}
+
+	return pdev->id;
+
+exit_release_pdev:
+	platform_device_del(pdev);
+exit_device_put:
+	platform_device_put(pdev);
+
+	return ret;
+}
diff --git a/arch/um/lkl/kernel/syscalls_32.c b/arch/um/lkl/kernel/syscalls_32.c
new file mode 100644
index 000000000000..a4271593c338
--- /dev/null
+++ b/arch/um/lkl/kernel/syscalls_32.c
@@ -0,0 +1,159 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * sys_ia32.c: Conversion between 32bit and 64bit native syscalls. Based on
+ *             sys_sparc32
+ *
+ * Copyright (C) 2000		VA Linux Co
+ * Copyright (C) 2000		Don Dugger <n0ano@valinux.com>
+ * Copyright (C) 1999		Arun Sharma <arun.sharma@intel.com>
+ * Copyright (C) 1997,1998	Jakub Jelinek (jj@sunsite.mff.cuni.cz)
+ * Copyright (C) 1997		David S. Miller (davem@caip.rutgers.edu)
+ * Copyright (C) 2000		Hewlett-Packard Co.
+ * Copyright (C) 2000		David Mosberger-Tang <davidm@hpl.hp.com>
+ * Copyright (C) 2000,2001,2002	Andi Kleen, SuSE Labs (x86-64 port)
+ *
+ * These routines maintain argument size conversion between 32bit and 64bit
+ * environment. In 2.5 most of this should be moved to a generic directory.
+ *
+ * This file assumes that there is a hole at the end of user address space.
+ *
+ * Some of the functions are LE specific currently. These are
+ * hopefully all marked.  This should be fixed.
+ */
+
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/fs.h>
+#include <linux/file.h>
+#include <linux/signal.h>
+#include <linux/syscalls.h>
+#include <linux/times.h>
+#include <linux/utsname.h>
+#include <linux/mm.h>
+#include <linux/uio.h>
+#include <linux/poll.h>
+#include <linux/personality.h>
+#include <linux/stat.h>
+#include <linux/rwsem.h>
+#include <linux/compat.h>
+#include <linux/vfs.h>
+#include <linux/ptrace.h>
+#include <linux/highuid.h>
+#include <linux/sysctl.h>
+#include <linux/slab.h>
+#include <asm/types.h>
+#include <linux/atomic.h>
+#include <asm/syscalls_32.h>
+
+#define AA(__x)		((unsigned long)(__x))
+
+#if __BITS_PER_LONG == 32
+
+asmlinkage long sys32_truncate64(const char __user *filename,
+				 unsigned long offset_low,
+				 unsigned long offset_high)
+{
+	return sys_truncate64(filename,
+			      ((loff_t)offset_high << 32) | offset_low);
+}
+
+asmlinkage long sys32_ftruncate64(unsigned int fd, unsigned long offset_low,
+				  unsigned long offset_high)
+{
+	return sys_ftruncate64(fd, ((loff_t)offset_high << 32) | offset_low);
+}
+
+#ifdef CONFIG_MMU
+/*
+ * Linux/i386 didn't use to be able to handle more than
+ * 4 system call parameters, so these system calls used a memory
+ * block for parameter passing..
+ */
+
+struct mmap_arg_struct32 {
+	unsigned int addr;
+	unsigned int len;
+	unsigned int prot;
+	unsigned int flags;
+	unsigned int fd;
+	unsigned int offset;
+};
+
+asmlinkage long sys32_mmap(struct mmap_arg_struct32 __user *arg)
+{
+	struct mmap_arg_struct32 a;
+
+	if (copy_from_user(&a, arg, sizeof(a)))
+		return -EFAULT;
+
+	if (a.offset & ~PAGE_MASK)
+		return -EINVAL;
+
+	return sys_mmap_pgoff(a.addr, a.len, a.prot, a.flags, a.fd,
+			      a.offset >> PAGE_SHIFT);
+}
+#endif
+
+asmlinkage long sys32_wait4(pid_t pid, unsigned int __user *stat_addr,
+			    int options, struct rusage __user *ru)
+{
+	return sys_wait4(pid, stat_addr, options, ru);
+}
+
+asmlinkage long sys32_pread64(unsigned int fd, char __user *ubuf, u32 count,
+			      u32 poslo, u32 poshi)
+{
+	return sys_pread64(fd, ubuf, count,
+			   ((loff_t)AA(poshi) << 32) | AA(poslo));
+}
+
+asmlinkage long sys32_pwrite64(unsigned int fd, const char __user *ubuf,
+			       u32 count, u32 poslo, u32 poshi)
+{
+	return sys_pwrite64(fd, ubuf, count,
+			    ((loff_t)AA(poshi) << 32) | AA(poslo));
+}
+
+/*
+ * Some system calls that need sign extended arguments. This could be
+ * done by a generic wrapper.
+ */
+long sys32_fadvise64_64(int fd, __u32 offset_low, __u32 offset_high,
+			__u32 len_low, __u32 len_high, int advice)
+{
+	return sys_fadvise64_64(fd, (((u64)offset_high) << 32) | offset_low,
+				(((u64)len_high) << 32) | len_low, advice);
+}
+
+asmlinkage ssize_t sys32_readahead(int fd, unsigned int off_lo,
+				   unsigned int off_hi, size_t count)
+{
+	return sys_readahead(fd, ((u64)off_hi << 32) | off_lo, count);
+}
+
+asmlinkage long sys32_sync_file_range(int fd, unsigned int off_low,
+				      unsigned int off_hi, unsigned int n_low,
+				      unsigned int n_hi, unsigned int flags)
+{
+	return sys_sync_file_range(fd, ((u64)off_hi << 32) | off_low,
+				   ((u64)n_hi << 32) | n_low, flags);
+}
+
+asmlinkage long sys32_sync_file_range2(int fd, unsigned int flags,
+				       unsigned int off_low,
+				       unsigned int off_hi, unsigned int n_low,
+				       unsigned int n_hi)
+{
+	return sys_sync_file_range(fd, ((u64)off_hi << 32) | off_low,
+				   ((u64)n_hi << 32) | n_low, flags);
+}
+
+asmlinkage long sys32_fallocate(int fd, int mode, unsigned int offset_lo,
+				unsigned int offset_hi, unsigned int len_lo,
+				unsigned int len_hi)
+{
+	return sys_fallocate(fd, mode, ((u64)offset_hi << 32) | offset_lo,
+			     ((u64)len_hi << 32) | len_lo);
+}
+
+#endif
diff --git a/arch/um/lkl/scripts/headers_install.py b/arch/um/lkl/scripts/headers_install.py
new file mode 100755
index 000000000000..17a4d2b00681
--- /dev/null
+++ b/arch/um/lkl/scripts/headers_install.py
@@ -0,0 +1,195 @@
+#!/usr/bin/env python
+# SPDX-License-Identifier: GPL-2.0
+import re, os, sys, argparse, multiprocessing, fnmatch
+
+srctree = os.environ["srctree"]
+objtree = os.environ["objtree"]
+header_paths = [ "include/uapi/", "arch/um/lkl/include/uapi/",
+                 "arch/um/lkl/include/generated/uapi/", "include/generated/" ]
+
+headers = set()
+includes = set()
+
+def relpath2abspath(relpath):
+    if "generated" in relpath:
+        return objtree + "/" + relpath
+    else:
+        return srctree + "/" + relpath
+
+def find_headers(path):
+    headers.add(path)
+    f = open(relpath2abspath(path))
+    for l in f.readlines():
+        m = re.search("#include <(.*)>", l)
+        try:
+            i = m.group(1)
+            for p in header_paths:
+                if os.access(relpath2abspath(p + i), os.R_OK):
+                    if p + i not in headers:
+                        includes.add(i)
+                        headers.add(p + i)
+                        find_headers(p + i)
+        except:
+            pass
+    f.close()
+
+def has_lkl_prefix(w):
+  return w.startswith("lkl") or w.startswith("_lkl") or w.startswith("__lkl") \
+         or w.startswith("LKL") or w.startswith("_LKL") or w.startswith("__LKL")
+
+def find_symbols(regexp, store):
+    for h in headers:
+        f = open(h)
+        for l in f.readlines():
+            m = regexp.search(l)
+            if not m:
+                continue
+            for e in reversed(m.groups()):
+                if e:
+                    if not has_lkl_prefix(e):
+                        store.add(e)
+                    break
+        f.close()
+
+def find_ml_symbols(regexp, store):
+    for h in headers:
+        for i in regexp.finditer(open(h).read()):
+            for j in reversed(i.groups()):
+                if j:
+                    if not has_lkl_prefix(j):
+                        store.add(j)
+                    break
+
+def find_enums(block_regexp, symbol_regexp, store):
+    for h in headers:
+        # remove comments
+        content = re.sub(re.compile("(\/\*(\*(?!\/)|[^*])*\*\/)", re.S|re.M), " ", open(h).read())
+        # remove preprocessor lines
+        clean_content = ""
+        for l in content.split("\n"):
+            if re.match("\s*#", l):
+                continue
+            clean_content += l + "\n"
+        for i in block_regexp.finditer(clean_content):
+            for j in reversed(i.groups()):
+                if j:
+                    for k in symbol_regexp.finditer(j):
+                        for l in k.groups():
+                            if l:
+                                if not has_lkl_prefix(l):
+                                    store.add(l)
+                                break
+
+def lkl_prefix(w):
+    r = ""
+
+    if w.startswith("__"):
+        r = "__"
+    elif w.startswith("_"):
+        r = "_"
+
+    if w.isupper():
+        r += "LKL"
+    else:
+        r += "lkl"
+
+    if not w.startswith("_"):
+        r += "_"
+
+    r += w
+
+    return r
+
+def replace(h):
+    content = open(h).read()
+    for i in includes:
+        search_str = "(#[ \t]*include[ \t]*[<\"][ \t]*)" + i + "([ \t]*[>\"])"
+        replace_str = "\\1" + "lkl/" + i + "\\2"
+        content = re.sub(search_str, replace_str, content)
+    tmp = ""
+    for w in re.split("(\W+)", content):
+        if w in defines:
+            w = lkl_prefix(w)
+        tmp += w
+    content = tmp
+    for s in structs:
+        search_str = "(\W?struct\s+)" + s + "(\W)"
+        replace_str = "\\1" + lkl_prefix(s) + "\\2"
+        content = re.sub(search_str, replace_str, content, flags = re.MULTILINE)
+    for s in unions:
+        search_str = "(\W?union\s+)" + s + "(\W)"
+        replace_str = "\\1" + lkl_prefix(s) + "\\2"
+        content = re.sub(search_str, replace_str, content, flags = re.MULTILINE)
+    open(h, 'w').write(content)
+
+parser = argparse.ArgumentParser(description='install lkl headers')
+parser.add_argument('path', help='path to install to', )
+parser.add_argument('-j', '--jobs', help='number of parallel jobs', default=1, type=int)
+args = parser.parse_args()
+
+find_headers("arch/um/lkl/include/uapi/asm/syscalls.h")
+headers.add("arch/um/lkl/include/uapi/asm/host_ops.h")
+
+if 'LKL_INSTALL_ADDITIONAL_HEADERS' in os.environ:
+    with open(os.environ['LKL_INSTALL_ADDITIONAL_HEADERS'], 'r') as f:
+        for line in f.readlines():
+            line = line.split('#', 1)[0].strip()
+            if line != '':
+                headers.add(line)
+
+new_headers = set()
+
+for h in headers:
+    dir = os.path.dirname(h)
+    out_dir = args.path + "/" + re.sub("(arch/um/lkl/include/uapi/|arch/um/lkl/include/generated/uapi/|include/uapi/|include/generated/uapi/|include/generated)(.*)", "lkl/\\2", dir)
+    try:
+        os.makedirs(out_dir)
+    except:
+        pass
+    print("  INSTALL\t%s" % (out_dir + "/" + os.path.basename(h)))
+    os.system(srctree+"/scripts/headers_install.sh %s %s" % (os.path.abspath(h),
+                                                       out_dir + "/" + os.path.basename(h)))
+    new_headers.add(out_dir + "/" + os.path.basename(h))
+
+headers = new_headers
+
+defines = set()
+structs = set()
+unions = set()
+
+p = re.compile("#[ \t]*define[ \t]*(\w+)")
+find_symbols(p, defines)
+p = re.compile("typedef.*(\(\*(\w+)\)\(.*\)\s*|\W+(\w+)\s*|\s+(\w+)\(.*\)\s*);")
+find_symbols(p, defines)
+p = re.compile("typedef\s+(struct|union)\s+\w*\s*{[^\\{\}]*}\W*(\w+)\s*;", re.M|re.S)
+find_ml_symbols(p, defines)
+defines.add("siginfo_t")
+defines.add("sigevent_t")
+p = re.compile("struct\s+(\w+)\s*\{")
+find_symbols(p, structs)
+structs.add("iovec")
+p = re.compile("union\s+(\w+)\s*\{")
+find_symbols(p, unions)
+p = re.compile("static\s+__inline__(\s+\w+)+\s+(\w+)\([^)]*\)\s")
+find_symbols(p, defines)
+p = re.compile("static\s+__always_inline(\s+\w+)+\s+(\w+)\([^)]*\)\s")
+find_symbols(p, defines)
+p = re.compile("enum\s+(\w*)\s*{([^}]*)}", re.M|re.S)
+q = re.compile("(\w+)\s*(,|=[^,]*|$)", re.M|re.S)
+find_enums(p, q, defines)
+
+# needed for i386
+defines.add("__NR_stime")
+
+def process_header(h):
+    print("  REPLACE\t%s" % (out_dir + "/" + os.path.basename(h)))
+    replace(h)
+
+p = multiprocessing.Pool(args.jobs)
+try:
+    p.map_async(process_header, headers).wait(999999)
+    p.close()
+except:
+    p.terminate()
+finally:
+    p.join()
-- 
2.20.1 (Apple Git-117)


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um



* [RFC PATCH 09/47] lkl: timers, time and delay support
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (7 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 08/47] lkl: system call interface and application API Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 10/47] lkl: memory mapped I/O support Hajime Tazaki
                   ` (40 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Michael Zimmermann, Hajime Tazaki, Akira Moroo

From: Octavian Purdila <tavi.purdila@gmail.com>

Add a clockevent driver based on the host timer operations, plus a
clocksource driver and udelay support based on the host time operation.
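
As a rough illustration of the time side only, the sketch below shows how
a host "time" operation returning nanoseconds can back both a busy-wait
delay and a clock that starts at zero from a recorded boot timestamp. The
names host_time_ns, busy_delay_ns and since_boot_ns are hypothetical and
are not part of this patch; clock_gettime(CLOCK_MONOTONIC) is assumed as
the host time source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical host "time" operation: monotonic time in nanoseconds. */
static unsigned long long host_time_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (unsigned long long)ts.tv_sec * NSEC_PER_SEC +
	       (unsigned long long)ts.tv_nsec;
}

/* Busy-wait until at least nsecs have elapsed, in the spirit of __ndelay(). */
static void busy_delay_ns(unsigned long nsecs)
{
	unsigned long long start = host_time_ns();

	while (host_time_ns() - start < nsecs)
		;
}

/* Time since a recorded boot timestamp; 0 until the timestamp is set. */
static unsigned long long since_boot_ns(unsigned long long boot_time)
{
	if (!boot_time)
		return 0;

	return host_time_ns() - boot_time;
}

/* Small self-check showing how the helpers fit together. */
static void delay_selftest(void)
{
	unsigned long long boot = host_time_ns();

	assert(since_boot_ns(0) == 0);
	busy_delay_ns(1000000UL);		/* spin for 1 ms */
	assert(since_boot_ns(boot) >= 1000000ULL);
}
```

The delay compares elapsed time rather than absolute deadlines, which
keeps the loop correct even if start + nsecs would overflow.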

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 arch/um/lkl/kernel/time.c | 145 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 145 insertions(+)
 create mode 100644 arch/um/lkl/kernel/time.c

diff --git a/arch/um/lkl/kernel/time.c b/arch/um/lkl/kernel/time.c
new file mode 100644
index 000000000000..b8320e1bfa53
--- /dev/null
+++ b/arch/um/lkl/kernel/time.c
@@ -0,0 +1,145 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/clocksource.h>
+#include <linux/clockchips.h>
+#include <linux/jiffies.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <asm/host_ops.h>
+
+static unsigned long long boot_time;
+
+void __ndelay(unsigned long nsecs)
+{
+	unsigned long long start = lkl_ops->time();
+
+	while (lkl_ops->time() < start + nsecs)
+		;
+}
+
+void __udelay(unsigned long usecs)
+{
+	__ndelay(usecs * NSEC_PER_USEC);
+}
+
+void __const_udelay(unsigned long xloops)
+{
+	__udelay(xloops / 0x10c7ul);
+}
+
+void calibrate_delay(void)
+{
+}
+
+void read_persistent_clock(struct timespec *ts)
+{
+	*ts = ns_to_timespec(lkl_ops->time());
+}
+
+/*
+ * Scheduler clock - returns current time in nanosec units.
+ *
+ */
+unsigned long long sched_clock(void)
+{
+	if (!boot_time)
+		return 0;
+
+	return lkl_ops->time() - boot_time;
+}
+
+static u64 clock_read(struct clocksource *cs)
+{
+	return lkl_ops->time();
+}
+
+static struct clocksource clocksource = {
+	.name	= "lkl",
+	.rating = 499,
+	.read	= clock_read,
+	.flags	= CLOCK_SOURCE_IS_CONTINUOUS,
+	.mask	= CLOCKSOURCE_MASK(64),
+};
+
+static void *timer;
+
+static int timer_irq;
+
+static void timer_fn(void *arg)
+{
+	lkl_trigger_irq(timer_irq);
+}
+
+static int clockevent_set_state_shutdown(struct clock_event_device *evt)
+{
+	if (timer) {
+		lkl_ops->timer_free(timer);
+		timer = NULL;
+	}
+
+	return 0;
+}
+
+static int clockevent_set_state_oneshot(struct clock_event_device *evt)
+{
+	timer = lkl_ops->timer_alloc(timer_fn, NULL);
+	if (!timer)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static irqreturn_t timer_irq_handler(int irq, void *dev_id)
+{
+	struct clock_event_device *dev = (struct clock_event_device *)dev_id;
+
+	dev->event_handler(dev);
+
+	return IRQ_HANDLED;
+}
+
+static int clockevent_next_event(unsigned long ns,
+				 struct clock_event_device *evt)
+{
+	return lkl_ops->timer_set_oneshot(timer, ns);
+}
+
+static struct clock_event_device clockevent = {
+	.name			= "lkl",
+	.features		= CLOCK_EVT_FEAT_ONESHOT,
+	.set_state_oneshot	= clockevent_set_state_oneshot,
+	.set_next_event		= clockevent_next_event,
+	.set_state_shutdown	= clockevent_set_state_shutdown,
+};
+
+static struct irqaction irq0  = {
+	.handler	= timer_irq_handler,
+	.flags		= IRQF_NOBALANCING | IRQF_TIMER,
+	.dev_id		= &clockevent,
+	.name		= "timer"
+};
+
+void __init time_init(void)
+{
+	int ret;
+
+	if (!lkl_ops->timer_alloc || !lkl_ops->timer_free ||
+	    !lkl_ops->timer_set_oneshot || !lkl_ops->time) {
+		pr_err("lkl: no time or timer support provided by host\n");
+		return;
+	}
+
+	timer_irq = lkl_get_free_irq("timer");
+	setup_irq(timer_irq, &irq0);
+
+	ret = clocksource_register_khz(&clocksource, 1000000);
+	if (ret)
+		pr_err("lkl: unable to register clocksource\n");
+
+	clockevents_config_and_register(&clockevent, NSEC_PER_SEC, 1,
+					ULONG_MAX);
+
+	boot_time = lkl_ops->time();
+	pr_info("lkl: time and timers initialized (irq%d)\n", timer_irq);
+}
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 10/47] lkl: memory mapped I/O support
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (8 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 09/47] lkl: timers, time and delay support Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 11/47] lkl: basic kernel console support Hajime Tazaki
                   ` (39 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Akira Moroo

From: Octavian Purdila <tavi.purdila@gmail.com>

All memory mapped I/O access is redirected to the host via the
iomem_access host operation. The host can set up the memory mapped I/O
region via the ioremap operation.

This allows the host to implement support for various devices, such as
block or network devices.
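
To make the redirection concrete, here is a minimal host-side sketch,
assuming a single hypothetical device whose registers live in a small
host buffer. The names host_ioremap, host_iomem_access, raw_readl and
raw_writel are illustrative only; the argument order of the access
function mirrors the calls made in this patch (address, value pointer,
size, write flag).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define REGION_SIZE 16

/* Hypothetical device: a register file backed by host memory. */
static uint8_t regs[REGION_SIZE];

/* Hypothetical host ioremap: only one region, at offset 0, is known. */
static void *host_ioremap(long offset, int size)
{
	if (offset != 0 || size <= 0 || size > REGION_SIZE)
		return NULL;

	return regs;
}

/* Hypothetical host iomem_access: bounds-check, then copy in or out. */
static int host_iomem_access(const volatile void *addr, void *val,
			     int size, int write)
{
	uintptr_t base = (uintptr_t)regs;
	uintptr_t a = (uintptr_t)addr;

	if (size <= 0 || size > REGION_SIZE || a < base ||
	    a - base > (uintptr_t)(REGION_SIZE - size))
		return -1;

	if (write)
		memcpy(regs + (a - base), val, (size_t)size);
	else
		memcpy(val, regs + (a - base), (size_t)size);

	return 0;
}

/* Accessors shaped like the __raw_readl/__raw_writel wrappers. */
static uint32_t raw_readl(const volatile void *addr)
{
	uint32_t value = 0;

	(void)host_iomem_access(addr, &value, sizeof(value), 0);
	return value;
}

static void raw_writel(uint32_t value, volatile void *addr)
{
	(void)host_iomem_access(addr, &value, sizeof(value), 1);
}

/* Small self-check: a write is visible to a later read. */
static void iomem_selftest(void)
{
	char *io = host_ioremap(0, REGION_SIZE);
	uint32_t v = 0;

	assert(io != NULL);
	raw_writel(0xdeadbeefu, io + 4);
	assert(raw_readl(io + 4) == 0xdeadbeefu);
	/* An access that runs past the region is rejected. */
	assert(host_iomem_access(io + 14, &v, sizeof(v), 0) == -1);
}
```

A real host would presumably look the address up in a table of regions
and dispatch to a per-device handler instead of a plain memcpy.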

Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 arch/um/lkl/include/asm/io.h | 104 +++++++++++++++++++++++++++++++++++
 1 file changed, 104 insertions(+)
 create mode 100644 arch/um/lkl/include/asm/io.h

diff --git a/arch/um/lkl/include/asm/io.h b/arch/um/lkl/include/asm/io.h
new file mode 100644
index 000000000000..33d4e1a7feb2
--- /dev/null
+++ b/arch/um/lkl/include/asm/io.h
@@ -0,0 +1,104 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_IO_H
+#define _ASM_LKL_IO_H
+
+#include <asm/bug.h>
+#include <asm/host_ops.h>
+
+#define __raw_readb __raw_readb
+static inline u8 __raw_readb(const volatile void __iomem *addr)
+{
+	int ret;
+	u8 value;
+
+	ret = lkl_ops->iomem_access(addr, &value, sizeof(value), 0);
+	WARN(ret, "error reading iomem %p", addr);
+
+	return value;
+}
+
+#define __raw_readw __raw_readw
+static inline u16 __raw_readw(const volatile void __iomem *addr)
+{
+	int ret;
+	u16 value;
+
+	ret = lkl_ops->iomem_access(addr, &value, sizeof(value), 0);
+	WARN(ret, "error reading iomem %p", addr);
+
+	return value;
+}
+
+#define __raw_readl __raw_readl
+static inline u32 __raw_readl(const volatile void __iomem *addr)
+{
+	int ret;
+	u32 value;
+
+	ret = lkl_ops->iomem_access(addr, &value, sizeof(value), 0);
+	WARN(ret, "error reading iomem %p", addr);
+
+	return value;
+}
+
+#ifdef CONFIG_64BIT
+#define __raw_readq __raw_readq
+static inline u64 __raw_readq(const volatile void __iomem *addr)
+{
+	int ret;
+	u64 value;
+
+	ret = lkl_ops->iomem_access(addr, &value, sizeof(value), 0);
+	WARN(ret, "error reading iomem %p", addr);
+
+	return value;
+}
+#endif /* CONFIG_64BIT */
+
+#define __raw_writeb __raw_writeb
+static inline void __raw_writeb(u8 value, volatile void __iomem *addr)
+{
+	int ret;
+
+	ret = lkl_ops->iomem_access(addr, &value, sizeof(value), 1);
+	WARN(ret, "error writing iomem %p", addr);
+}
+
+#define __raw_writew __raw_writew
+static inline void __raw_writew(u16 value, volatile void __iomem *addr)
+{
+	int ret;
+
+	ret = lkl_ops->iomem_access(addr, &value, sizeof(value), 1);
+	WARN(ret, "error writing iomem %p", addr);
+}
+
+#define __raw_writel __raw_writel
+static inline void __raw_writel(u32 value, volatile void __iomem *addr)
+{
+	int ret;
+
+	ret = lkl_ops->iomem_access(addr, &value, sizeof(value), 1);
+	WARN(ret, "error writing iomem %p", addr);
+}
+
+#ifdef CONFIG_64BIT
+#define __raw_writeq __raw_writeq
+static inline void __raw_writeq(u64 value, volatile void __iomem *addr)
+{
+	int ret;
+
+	ret = lkl_ops->iomem_access(addr, &value, sizeof(value), 1);
+	WARN(ret, "error writing iomem %p", addr);
+}
+#endif /* CONFIG_64BIT */
+
+#define ioremap ioremap
+static inline void __iomem *ioremap(phys_addr_t offset, size_t size)
+{
+	return (void __iomem *)lkl_ops->ioremap(offset, size);
+}
+
+#include <asm-generic/io.h>
+
+#endif /* _ASM_LKL_IO_H */
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 11/47] lkl: basic kernel console support
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (9 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 10/47] lkl: memory mapped I/O support Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 12/47] lkl: initialization and cleanup Hajime Tazaki
                   ` (38 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Akira Moroo

From: Octavian Purdila <tavi.purdila@gmail.com>

Write operations are forwarded to the host print operation.
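
The pattern is small enough to sketch outside the kernel. In the example
below, struct host_ops, capture_print and console_write_sketch are
hypothetical stand-ins; the only behaviour taken from this patch is that
the write path calls the host print hook when one is provided and
silently drops the output otherwise.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, cut-down host operations table. */
struct host_ops {
	void (*print)(const char *str, int len);
};

static char log_buf[64];
static unsigned int log_len;

/* Hypothetical host print hook that records output in a buffer. */
static void capture_print(const char *str, int len)
{
	if (len < 0 || log_len + (unsigned int)len > sizeof(log_buf))
		return;

	memcpy(log_buf + log_len, str, (size_t)len);
	log_len += (unsigned int)len;
}

/* Console write path: hand the text to the host hook, if there is one. */
static void console_write_sketch(const struct host_ops *ops,
				 const char *str, unsigned int len)
{
	if (ops->print)
		ops->print(str, (int)len);
}

/* Small self-check covering both the missing and the present hook. */
static void console_selftest(void)
{
	struct host_ops with = { .print = capture_print };
	struct host_ops without = { .print = NULL };

	console_write_sketch(&without, "dropped", 7);
	assert(log_len == 0);

	console_write_sketch(&with, "boot ok\n", 8);
	assert(log_len == 8);
	assert(memcmp(log_buf, "boot ok\n", 8) == 0);
}
```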

Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 arch/um/lkl/kernel/console.c | 42 ++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)
 create mode 100644 arch/um/lkl/kernel/console.c

diff --git a/arch/um/lkl/kernel/console.c b/arch/um/lkl/kernel/console.c
new file mode 100644
index 000000000000..54d7f756c6da
--- /dev/null
+++ b/arch/um/lkl/kernel/console.c
@@ -0,0 +1,42 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/console.h>
+#include <asm/host_ops.h>
+
+static void console_write(struct console *con, const char *str,
+			  unsigned int len)
+{
+	if (lkl_ops->print)
+		lkl_ops->print(str, len);
+}
+
+#ifdef CONFIG_LKL_EARLY_CONSOLE
+static struct console lkl_boot_console = {
+	.name	= "lkl_boot_console",
+	.write	= console_write,
+	.flags	= CON_PRINTBUFFER | CON_BOOT,
+	.index	= -1,
+};
+
+int __init lkl_boot_console_init(void)
+{
+	register_console(&lkl_boot_console);
+	return 0;
+}
+early_initcall(lkl_boot_console_init);
+#endif
+
+static struct console lkl_console = {
+	.name	= "lkl_console",
+	.write	= console_write,
+	.flags	= CON_PRINTBUFFER,
+	.index	= -1,
+};
+
+static int __init lkl_console_init(void)
+{
+	register_console(&lkl_console);
+	return 0;
+}
+core_initcall(lkl_console_init);
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 12/47] lkl: initialization and cleanup
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (10 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 11/47] lkl: basic kernel console support Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 13/47] lkl: plug in the build system Hajime Tazaki
                   ` (37 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Hajime Tazaki, Patrick Collins,
	Michael Zimmermann, Akira Moroo, Yuan Liu

From: Octavian Purdila <tavi.purdila@gmail.com>

Add the lkl_start_kernel and lkl_sys_halt APIs, which start and stop
the Linux kernel, respectively.

lkl_start_kernel creates a separate thread that will run the initial
and idle kernel thread. It waits for the kernel to complete
initialization before returning, to avoid races with system calls
issued by the host application.

During the setup phase, we create "/init" in the initial ramfs root
filesystem to avoid mounting the "real" rootfs, since ramfs is good
enough for now.

lkl_sys_halt will shut down the kernel, terminate all threads and
free all host resources used by the kernel before returning.

This patch also introduces idle CPU handling since it is closely
related to the shutdown process. A host semaphore is used to wait for
new interrupts when the kernel switches the CPU to idle, to avoid
wasting host CPU cycles. When the kernel is shut down we terminate the
idle thread at the first CPU idle event.
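
The start-up handshake described above can be sketched with plain
pthreads. This is an illustration under assumptions, not the code in
this patch: struct host_sem, sem_up, sem_down, run_kernel and
start_kernel_sketch are hypothetical names, and the "kernel" here only
sets a flag before signalling that initialization is complete.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical host semaphore built from a mutex and a condition variable. */
struct host_sem {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	int count;
};

static struct host_sem init_sem = {
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0
};
static int kernel_ready;

static void sem_up(struct host_sem *s)
{
	pthread_mutex_lock(&s->lock);
	s->count++;
	pthread_cond_signal(&s->cond);
	pthread_mutex_unlock(&s->lock);
}

static void sem_down(struct host_sem *s)
{
	pthread_mutex_lock(&s->lock);
	while (s->count == 0)
		pthread_cond_wait(&s->cond, &s->lock);
	s->count--;
	pthread_mutex_unlock(&s->lock);
}

/* Stand-in for the kernel thread: initialize, then wake the caller. */
static void *run_kernel(void *arg)
{
	(void)arg;
	kernel_ready = 1;
	sem_up(&init_sem);
	return NULL;
}

/* Start the "kernel" and return only once it reports it is initialized. */
static int start_kernel_sketch(void)
{
	pthread_t tid;

	if (pthread_create(&tid, NULL, run_kernel, NULL))
		return -1;

	sem_down(&init_sem);
	pthread_join(tid, NULL);
	return 0;
}

/* Small self-check: after start returns, initialization has happened. */
static void boot_selftest(void)
{
	assert(start_kernel_sketch() == 0);
	assert(kernel_ready == 1);
}
```

The same semaphore shape would also serve the idle case: the idle path
blocks in sem_down and an interrupt source calls sem_up to wake it.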

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 arch/um/lkl/include/asm/setup.h |   7 ++
 arch/um/lkl/kernel/setup.c      | 193 ++++++++++++++++++++++++++++++++
 2 files changed, 200 insertions(+)
 create mode 100644 arch/um/lkl/include/asm/setup.h
 create mode 100644 arch/um/lkl/kernel/setup.c

diff --git a/arch/um/lkl/include/asm/setup.h b/arch/um/lkl/include/asm/setup.h
new file mode 100644
index 000000000000..b40955208cc6
--- /dev/null
+++ b/arch/um/lkl/include/asm/setup.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_SETUP_H
+#define _ASM_LKL_SETUP_H
+
+#define COMMAND_LINE_SIZE 4096
+
+#endif
diff --git a/arch/um/lkl/kernel/setup.c b/arch/um/lkl/kernel/setup.c
new file mode 100644
index 000000000000..1bf973d36307
--- /dev/null
+++ b/arch/um/lkl/kernel/setup.c
@@ -0,0 +1,193 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/binfmts.h>
+#include <linux/init.h>
+#include <linux/init_task.h>
+#include <linux/personality.h>
+#include <linux/reboot.h>
+#include <linux/fs.h>
+#include <linux/start_kernel.h>
+#include <linux/syscalls.h>
+#include <linux/tick.h>
+#include <asm/host_ops.h>
+#include <asm/irq.h>
+#include <asm/unistd.h>
+#include <asm/syscalls.h>
+#include <asm/cpu.h>
+
+struct lkl_host_operations *lkl_ops;
+static char cmd_line[COMMAND_LINE_SIZE];
+static void *init_sem;
+static int is_running;
+void (*pm_power_off)(void) = NULL;
+static unsigned long mem_size = 64 * 1024 * 1024;
+
+static long lkl_panic_blink(int state)
+{
+	lkl_ops->panic();
+	return 0;
+}
+
+static int __init setup_mem_size(char *str)
+{
+	mem_size = memparse(str, NULL);
+	return 0;
+}
+early_param("mem", setup_mem_size);
+
+void __init setup_arch(char **cl)
+{
+	*cl = cmd_line;
+	panic_blink = lkl_panic_blink;
+	parse_early_param();
+	bootmem_init(mem_size);
+}
+
+static void __init lkl_run_kernel(void *arg)
+{
+	threads_init();
+	lkl_cpu_get();
+	start_kernel();
+}
+
+int __init lkl_start_kernel(struct lkl_host_operations *ops, const char *fmt,
+			    ...)
+{
+	va_list ap;
+	int ret;
+
+	lkl_ops = ops;
+
+	va_start(ap, fmt);
+	ret = vsnprintf(boot_command_line, COMMAND_LINE_SIZE, fmt, ap);
+	va_end(ap);
+
+	if (ops->virtio_devices)
+		strscpy(boot_command_line + ret, ops->virtio_devices,
+			COMMAND_LINE_SIZE - ret);
+
+	memcpy(cmd_line, boot_command_line, COMMAND_LINE_SIZE);
+
+	init_sem = lkl_ops->sem_alloc(0);
+	if (!init_sem)
+		return -ENOMEM;
+
+	ret = lkl_cpu_init();
+	if (ret)
+		goto out_free_init_sem;
+
+	ret = lkl_ops->thread_create(lkl_run_kernel, NULL);
+	if (!ret) {
+		ret = -ENOMEM;
+		goto out_free_init_sem;
+	}
+
+	lkl_ops->sem_down(init_sem);
+	lkl_ops->sem_free(init_sem);
+	current_thread_info()->tid = lkl_ops->thread_self();
+	lkl_cpu_change_owner(current_thread_info()->tid);
+
+	lkl_cpu_put();
+	is_running = 1;
+
+	return 0;
+
+out_free_init_sem:
+	lkl_ops->sem_free(init_sem);
+
+	return ret;
+}
+
+int lkl_is_running(void)
+{
+	return is_running;
+}
+
+void machine_halt(void)
+{
+	lkl_cpu_shutdown();
+}
+
+void machine_power_off(void)
+{
+	machine_halt();
+}
+
+void machine_restart(char *unused)
+{
+	machine_halt();
+}
+
+long lkl_sys_halt(void)
+{
+	long err;
+	long params[6] = {
+		LINUX_REBOOT_MAGIC1,
+		LINUX_REBOOT_MAGIC2,
+		LINUX_REBOOT_CMD_RESTART,
+	};
+
+	err = lkl_syscall(__NR_reboot, params);
+	if (err < 0)
+		return err;
+
+	is_running = false;
+
+	lkl_cpu_wait_shutdown();
+
+	syscalls_cleanup();
+	threads_cleanup();
+	/* Shutdown the clockevents source. */
+	tick_suspend_local();
+	free_mem();
+	lkl_ops->thread_join(current_thread_info()->tid);
+
+	return 0;
+}
+
+static int lkl_run_init(struct linux_binprm *bprm);
+
+static struct linux_binfmt lkl_run_init_binfmt = {
+	.module		= THIS_MODULE,
+	.load_binary	= lkl_run_init,
+};
+
+static int lkl_run_init(struct linux_binprm *bprm)
+{
+	int ret;
+
+	if (strcmp("/init", bprm->filename) != 0)
+		return -EINVAL;
+
+	ret = flush_old_exec(bprm);
+	if (ret)
+		return ret;
+	set_personality(PER_LINUX);
+	setup_new_exec(bprm);
+	install_exec_creds(bprm);
+
+	set_binfmt(&lkl_run_init_binfmt);
+
+	init_pid_ns.child_reaper = NULL;
+
+	syscalls_init();
+
+	lkl_ops->sem_up(init_sem);
+	lkl_ops->thread_exit();
+
+	return 0;
+}
+
+/* skip mounting the "real" rootfs. ramfs is good enough. */
+static int __init fs_setup(void)
+{
+	int fd;
+
+	fd = sys_open("/init", O_CREAT, 0700);
+	WARN_ON(fd < 0);
+	sys_close(fd);
+
+	register_binfmt(&lkl_run_init_binfmt);
+
+	return 0;
+}
+late_initcall(fs_setup);
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 13/47] lkl: plug in the build system
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (11 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 12/47] lkl: initialization and cleanup Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 14/47] lkl tools: skeleton for host side library, tests and tools Hajime Tazaki
                   ` (36 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Akira Moroo

From: Octavian Purdila <tavi.purdila@gmail.com>

Basic Makefiles for building LKL. Add a new architecture specific
target for installing the resulting library files and headers.

Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 arch/um/Makefile.um         | 117 ++++++++++++++++++++++++++++++++++++
 arch/um/lkl/auto.conf       |   1 +
 arch/um/lkl/kernel/Makefile |   4 ++
 arch/um/lkl/mm/Makefile     |   1 +
 4 files changed, 123 insertions(+)
 create mode 100644 arch/um/Makefile.um
 create mode 100644 arch/um/lkl/auto.conf
 create mode 100644 arch/um/lkl/kernel/Makefile
 create mode 100644 arch/um/lkl/mm/Makefile

diff --git a/arch/um/Makefile.um b/arch/um/Makefile.um
new file mode 100644
index 000000000000..24a088e5df04
--- /dev/null
+++ b/arch/um/Makefile.um
@@ -0,0 +1,117 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# This file is included by the global makefile so that you can add your own
+# architecture-specific flags and dependencies.
+#
+# Copyright (C) 2002 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
+# Licensed under the GPL
+#
+
+core-y			+= $(ARCH_DIR)/kernel/		\
+			 $(ARCH_DIR)/drivers/		\
+			 $(ARCH_DIR)/os-$(OS)/
+
+ifdef CONFIG_64BIT
+	KBUILD_CFLAGS += -mcmodel=large
+endif
+
+SHARED_HEADERS	:= $(ARCH_DIR)/include/shared
+ARCH_INCLUDE	:= -I$(srctree)/$(SHARED_HEADERS)
+ARCH_INCLUDE	+= -I$(srctree)/$(HOST_DIR)/um/shared
+KBUILD_CPPFLAGS += -I$(srctree)/$(HOST_DIR)/um
+
+# -Dvmap=kernel_vmap prevents anything from referencing the libpcap.o symbol so
+# named - it's a common symbol in libpcap, so we get a binary which crashes.
+#
+# Same things for in6addr_loopback and mktime - found in libc. For these two we
+# only get link-time error, luckily.
+#
+# -Dlongjmp=kernel_longjmp prevents anything from referencing the libpthread.a
+# embedded copy of longjmp, same thing for setjmp.
+#
+# These apply to USER_CFLAGS too.
+
+KBUILD_CFLAGS += $(CFLAGS) $(CFLAGS-y) -D__arch_um__ \
+	$(ARCH_INCLUDE) $(MODE_INCLUDE) -Dvmap=kernel_vmap	\
+	-Dlongjmp=kernel_longjmp -Dsetjmp=kernel_setjmp \
+	-Din6addr_loopback=kernel_in6addr_loopback \
+	-Din6addr_any=kernel_in6addr_any -Dstrrchr=kernel_strrchr
+
+KBUILD_AFLAGS += $(ARCH_INCLUDE)
+
+USER_CFLAGS = $(patsubst $(KERNEL_DEFINES),,$(patsubst -I%,,$(KBUILD_CFLAGS))) \
+		$(ARCH_INCLUDE) $(MODE_INCLUDE) $(filter -I%,$(CFLAGS)) \
+		-D_FILE_OFFSET_BITS=64 -idirafter $(srctree)/include \
+		-idirafter $(obj)/include -D__KERNEL__ -D__UM_HOST__
+
+#This will adjust *FLAGS accordingly to the platform.
+include $(ARCH_DIR)/Makefile-os-$(OS)
+
+KBUILD_CPPFLAGS += -I$(srctree)/$(HOST_DIR)/include \
+		   -I$(srctree)/$(HOST_DIR)/include/uapi \
+		   -I$(objtree)/$(HOST_DIR)/include/generated \
+		   -I$(objtree)/$(HOST_DIR)/include/generated/uapi
+
+# -Derrno=kernel_errno - This turns all kernel references to errno into
+# kernel_errno to separate them from the libc errno.  This allows -fno-common
+# in KBUILD_CFLAGS.  Otherwise, it would cause ld to complain about the two different
+# errnos.
+# These apply to kernelspace only.
+#
+# strip leading and trailing whitespace to make the USER_CFLAGS removal of these
+# defines more robust
+
+KERNEL_DEFINES = $(strip -Derrno=kernel_errno -Dsigprocmask=kernel_sigprocmask \
+			 -Dmktime=kernel_mktime $(ARCH_KERNEL_DEFINES))
+KBUILD_CFLAGS += $(KERNEL_DEFINES)
+
+PHONY += linux
+
+all: linux
+
+linux: vmlinux
+	@echo '  LINK $@'
+	$(Q)ln -f $< $@
+
+define archhelp
+  echo '* linux		- Binary kernel image (./linux) - for backward'
+  echo '		   compatibility only, this creates a hard link to the'
+  echo '		   real kernel binary, the "vmlinux" binary you'
+  echo '		   find in the kernel root.'
+endef
+
+archheaders:
+	$(Q)$(MAKE) -f $(srctree)/Makefile ARCH=$(HEADER_ARCH) asm-generic archheaders
+
+archprepare:
+	$(Q)$(MAKE) $(build)=$(HOST_DIR)/um include/generated/user_constants.h
+
+LINK-$(CONFIG_LD_SCRIPT_STATIC) += -static
+LINK-$(CONFIG_LD_SCRIPT_DYN) += -Wl,-rpath,/lib $(call cc-option, -no-pie)
+
+CFLAGS_NO_HARDENING := $(call cc-option, -fno-PIC,) $(call cc-option, -fno-pic,) \
+	$(call cc-option, -fno-stack-protector,) \
+	$(call cc-option, -fno-stack-protector-all,)
+
+# Options used by linker script
+export LDS_START      := $(START)
+export LDS_ELF_ARCH   := $(ELF_ARCH)
+export LDS_ELF_FORMAT := $(ELF_FORMAT)
+
+# The wrappers will select whether using "malloc" or the kernel allocator.
+LINK_WRAPS = -Wl,--wrap,malloc -Wl,--wrap,free -Wl,--wrap,calloc
+
+LD_FLAGS_CMDLINE = $(foreach opt,$(KBUILD_LDFLAGS),-Wl,$(opt))
+
+# Used by link-vmlinux.sh which has special support for um link
+export CFLAGS_vmlinux := $(LINK-y) $(LINK_WRAPS) $(LD_FLAGS_CMDLINE)
+
+# When cleaning we don't include .config, so we don't include
+# TT or skas makefiles and don't clean skas_ptregs.h.
+CLEAN_FILES += linux x.i gmon.out
+
+archclean:
+	@find . \( -name '*.bb' -o -name '*.bbg' -o -name '*.da' \
+		-o -name '*.gcov' \) -type f -print | xargs rm -f
+
+export USER_CFLAGS CFLAGS_NO_HARDENING OS DEV_NULL_PATH
diff --git a/arch/um/lkl/auto.conf b/arch/um/lkl/auto.conf
new file mode 100644
index 000000000000..4bfd65a02d73
--- /dev/null
+++ b/arch/um/lkl/auto.conf
@@ -0,0 +1 @@
+export OUTPUT_FORMAT=$(shell $(LD) -r -print-output-format)
diff --git a/arch/um/lkl/kernel/Makefile b/arch/um/lkl/kernel/Makefile
new file mode 100644
index 000000000000..ef489f2f7176
--- /dev/null
+++ b/arch/um/lkl/kernel/Makefile
@@ -0,0 +1,4 @@
+extra-y := vmlinux.lds
+
+obj-y = setup.o threads.o irq.o time.o syscalls.o misc.o console.o \
+	syscalls_32.o cpu.o
diff --git a/arch/um/lkl/mm/Makefile b/arch/um/lkl/mm/Makefile
new file mode 100644
index 000000000000..2af6e3051897
--- /dev/null
+++ b/arch/um/lkl/mm/Makefile
@@ -0,0 +1 @@
+obj-y = bootmem.o
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 14/47] lkl tools: skeleton for host side library, tests and tools
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (12 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 13/47] lkl: plug in the build system Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 15/47] lkl tools: host lib: add utilities functions Hajime Tazaki
                   ` (35 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um
  Cc: H . K . Jerry Chu, Xiao Jia, Conrad Meyer, Octavian Purdila,
	Motomu Utsumi, Akira Moroo, Petros Angelatos, Yuan Liu,
	Thomas Liebetraut, Mark Stillwell, Patrick Collins,
	Ben Wolsieffer, Michael Zimmermann, Luca Dariz, Hajime Tazaki

From: Octavian Purdila <tavi.purdila@gmail.com>

This patch adds the skeleton for the host library, tests and
application examples.

The host library implements the host operations needed by LKL and is
split into host-dependent parts (tied to a specific host, e.g. POSIX
hosts) and host-independent parts (which work on all supported hosts).
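
One way to picture the split, as a sketch with hypothetical names
(struct host_ops, posix_print and host_printf are not taken from this
patch): the host-dependent part supplies a raw output routine, while the
host-independent part does the formatting and only reaches the host
through the operations table.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical, cut-down host operations table. */
struct host_ops {
	void (*print)(const char *str, int len);
};

/* Host-dependent part (POSIX flavour): raw output to stdout. */
static void posix_print(const char *str, int len)
{
	fwrite(str, 1, (size_t)len, stdout);
}

/* Host-independent part: formatting only; output goes through ops. */
static int host_printf(const struct host_ops *ops, const char *fmt, ...)
{
	char buf[128];
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, sizeof(buf), fmt, ap);
	va_end(ap);

	if (n < 0)
		return n;
	if (n >= (int)sizeof(buf))
		n = (int)sizeof(buf) - 1;	/* output was truncated */
	if (ops->print)
		ops->print(buf, n);

	return n;
}

/* Small self-check: the helper works with and without a print hook. */
static void host_printf_selftest(void)
{
	struct host_ops posix = { .print = posix_print };
	struct host_ops none = { .print = NULL };

	assert(host_printf(&posix, "irq%d\n", 3) == 5);
	assert(host_printf(&none, "x") == 1);
}
```

Porting to another host then means providing a different print routine;
host_printf itself stays unchanged.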

Signed-off-by: Ben Wolsieffer <benwolsieffer@gmail.com>
Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Luca Dariz <luca.dariz@gmail.com>
Signed-off-by: Mark Stillwell <mark@stillwell.me>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Thomas Liebetraut <thomas@tommie-lie.de>
Signed-off-by: Xiao Jia <xiaoj@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/.gitignore         |  14 +
 tools/lkl/Build              |   6 +
 tools/lkl/Makefile           | 130 +++++
 tools/lkl/Makefile.autoconf  | 114 +++++
 tools/lkl/Targets            |  27 +
 tools/lkl/include/.gitignore |   1 +
 tools/lkl/include/lkl.h      | 928 +++++++++++++++++++++++++++++++++++
 tools/lkl/include/lkl_host.h | 160 ++++++
 tools/lkl/lib/.gitignore     |   3 +
 tools/lkl/lib/Build          |  25 +
 10 files changed, 1408 insertions(+)
 create mode 100644 tools/lkl/.gitignore
 create mode 100644 tools/lkl/Build
 create mode 100644 tools/lkl/Makefile
 create mode 100644 tools/lkl/Makefile.autoconf
 create mode 100644 tools/lkl/Targets
 create mode 100644 tools/lkl/include/.gitignore
 create mode 100644 tools/lkl/include/lkl.h
 create mode 100644 tools/lkl/include/lkl_host.h
 create mode 100644 tools/lkl/lib/.gitignore
 create mode 100644 tools/lkl/lib/Build

diff --git a/tools/lkl/.gitignore b/tools/lkl/.gitignore
new file mode 100644
index 000000000000..796785986336
--- /dev/null
+++ b/tools/lkl/.gitignore
@@ -0,0 +1,14 @@
+tests/boot
+fs2tar
+cptofs
+cpfromfs
+lklfuse
+tests/valgrind*.xml
+*.exe
+*.dll
+tests/net-test
+tests/disk
+Makefile.conf
+include/lkl_autoconf.h
+tests/autoconf.sh
+*.pyc
diff --git a/tools/lkl/Build b/tools/lkl/Build
new file mode 100644
index 000000000000..6048440d0e1b
--- /dev/null
+++ b/tools/lkl/Build
@@ -0,0 +1,6 @@
+CFLAGS_lklfuse.o += -D_FILE_OFFSET_BITS=64
+
+cptofs-$(LKL_HOST_CONFIG_ARCHIVE) += cptofs.o
+fs2tar-$(LKL_HOST_CONFIG_ARCHIVE) += fs2tar.o
+lklfuse-$(LKL_HOST_CONFIG_FUSE) += lklfuse.o
+
diff --git a/tools/lkl/Makefile b/tools/lkl/Makefile
new file mode 100644
index 000000000000..7e0cb0d01bf2
--- /dev/null
+++ b/tools/lkl/Makefile
@@ -0,0 +1,130 @@
+# Do not use make's built-in rules
+# (this improves performance and avoids hard-to-debug behaviour);
+# also do not print "Entering directory..." messages from make
+.SUFFIXES:
+MAKEFLAGS += -r --no-print-directory
+
+KCONFIG?=defconfig
+
+ifneq ($(silent),1)
+  ifneq ($(V),1)
+	QUIET_AUTOCONF       = @echo '  AUTOCONF '$@;
+	Q = @
+  endif
+endif
+
+PREFIX   := /usr
+
+ifeq (,$(srctree))
+  srctree := $(patsubst %/,%,$(dir $(shell pwd)))
+  srctree := $(patsubst %/,%,$(dir $(srctree)))
+endif
+export srctree
+
+-include ../scripts/Makefile.include
+
+# OUTPUT fixup should be *after* include ../scripts/Makefile.include
+ifneq ($(OUTPUT),)
+  OUTPUT := $(OUTPUT)/tools/lkl/
+else
+  OUTPUT := $(CURDIR)/
+endif
+export OUTPUT
+
+
+all:
+
+conf: $(OUTPUT)Makefile.conf
+
+$(OUTPUT)Makefile.conf: Makefile.autoconf
+	$(call QUIET_AUTOCONF, headers)$(MAKE) -f Makefile.autoconf -s
+
+-include $(OUTPUT)Makefile.conf
+
+export CFLAGS += -I$(OUTPUT)/include -Iinclude -Wall -g -O2 -Wextra \
+	 -Wno-unused-parameter \
+	 -Wno-missing-field-initializers -fno-strict-aliasing
+
+-include Targets
+
+TARGETS := $(progs-y:%=$(OUTPUT)%$(EXESUF))
+TARGETS += $(libs-y:%=$(OUTPUT)%$(SOSUF))
+all: $(TARGETS)
+
+# this workaround is for FreeBSD
+bin/stat:
+ifeq ($(LKL_HOST_CONFIG_BSD),y)
+	$(Q)ln -sf `which gnustat` bin/stat
+	$(Q)ln -sf `which gsed` bin/sed
+else
+	$(Q)touch bin/stat
+endif
+
+# rule to build lkl.o
+$(OUTPUT)lib/lkl.o: bin/stat
+	$(Q)$(MAKE) -C ../.. ARCH=um SUBARCH=lkl $(KOPT) $(KCONFIG)
+# this workaround is for arm32 linker (ld.gold)
+	$(Q)export PATH=$(srctree)/tools/lkl/bin/:${PATH} ;\
+	$(MAKE) -C ../.. ARCH=um SUBARCH=lkl $(KOPT) install INSTALL_PATH=$(OUTPUT)
+
+# rules to link libs
+$(OUTPUT)%$(SOSUF): LDFLAGS += -shared
+$(OUTPUT)%$(SOSUF): $(OUTPUT)%-in.o $(OUTPUT)liblkl.a
+	$(QUIET_LINK)$(CC) $(LDFLAGS) $(LDFLAGS_$*-y) -o $@ $^ $(LDLIBS) $(LDLIBS_$*-y)
+
+# liblkl is special
+$(OUTPUT)liblkl$(SOSUF): $(OUTPUT)%-in.o $(OUTPUT)lib/lkl.o
+$(OUTPUT)liblkl.a: $(OUTPUT)lib/liblkl-in.o $(OUTPUT)lib/lkl.o
+	$(QUIET_AR)$(AR) -rc $@ $^
+
+# rule to link programs
+$(OUTPUT)%$(EXESUF): $(OUTPUT)%-in.o $(OUTPUT)liblkl.a
+	$(QUIET_LINK)$(CC) $(LDFLAGS) $(LDFLAGS_$*-y) -o $@ $^ $(LDLIBS) $(LDLIBS_$*-y)
+
+# rule to build objects
+$(OUTPUT)%-in.o: $(OUTPUT)lib/lkl.o FORCE
+	$(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(patsubst %/,%,$(dir $*)) obj=$(notdir $*)
+
+
+$(OUTPUT)cpfromfs$(EXESUF): cptofs$(EXESUF)
+	$(Q)if ! [ -e $@ ]; then ln -s $< $@; fi
+
+clean:
+	$(call QUIET_CLEAN, objects)find $(OUTPUT) -name '*.o' -delete -o -name '\.*.cmd'\
+	 -delete -o -name '\.*.d' -delete
+	$(call QUIET_CLEAN, headers)$(RM) -r $(OUTPUT)/include/lkl/
+	$(call QUIET_CLEAN, liblkl.a)$(RM) $(OUTPUT)/liblkl.a
+	$(call QUIET_CLEAN, targets)$(RM) $(TARGETS) bin/stat
+
+clean-conf: clean
+	$(call QUIET_CLEAN, Makefile.conf)$(RM) $(OUTPUT)/Makefile.conf
+
+headers_install: $(TARGETS)
+	$(call QUIET_INSTALL, headers) \
+	    install -d $(DESTDIR)$(PREFIX)/include ; \
+	    install -m 644 include/lkl.h include/lkl_host.h $(OUTPUT)include/lkl_autoconf.h \
+	      include/lkl_config.h $(DESTDIR)$(PREFIX)/include ; \
+	    cp -r $(OUTPUT)include/lkl $(DESTDIR)$(PREFIX)/include
+
+libraries_install: $(libs-y:%=$(OUTPUT)%$(SOSUF)) $(OUTPUT)liblkl.a
+	$(call QUIET_INSTALL, libraries) \
+	    install -d $(DESTDIR)$(PREFIX)/lib ; \
+	    install -m 644 $^ $(DESTDIR)$(PREFIX)/lib
+
+programs_install: $(progs-y:%=$(OUTPUT)%$(EXESUF))
+	$(call QUIET_INSTALL, programs) \
+	    install -d $(DESTDIR)$(PREFIX)/bin ; \
+	    install -m 755 $^ $(DESTDIR)$(PREFIX)/bin
+
+install: headers_install libraries_install programs_install
+
+
+run-tests:
+	./tests/run.py $(tests)
+
+FORCE: ;
+.PHONY: all clean FORCE run-tests
+.PHONY: headers_install libraries_install programs_install install
+.NOTPARALLEL : lib/lkl.o
+.SECONDARY:
+
diff --git a/tools/lkl/Makefile.autoconf b/tools/lkl/Makefile.autoconf
new file mode 100644
index 000000000000..1c3a053a8e94
--- /dev/null
+++ b/tools/lkl/Makefile.autoconf
@@ -0,0 +1,114 @@
+POSIX_HOSTS=elf64-x86-64 elf32-i386 elf64-x86-64-freebsd elf32-littlearm elf64-littleaarch64
+NT_HOSTS=pe-i386 pe-x86-64
+
+define set_autoconf_var
+  $(shell echo "#define LKL_HOST_CONFIG_$(1) $(2)" \
+	  >> $(OUTPUT)/include/lkl_autoconf.h)
+  $(shell echo "LKL_HOST_CONFIG_$(1)=$(2)" >> $(OUTPUT)/tests/autoconf.sh)
+  export LKL_HOST_CONFIG_$(1)=$(2)
+endef
+
+define find_include
+  $(eval include_paths=$(shell $(CC) -E -Wp,-v -xc /dev/null 2>&1 | grep '^ '))
+  $(foreach f, $(include_paths), $(wildcard $(f)/$(1)))
+endef
+
+define is_defined
+$(shell $(CC) -dM -E - </dev/null | grep $(1))
+endef
+
+define android_host
+  $(call set_autoconf_var,ANDROID,y)
+endef
+
+define bsd_host
+  $(call set_autoconf_var,BSD,y)
+endef
+
+define arm_host
+  $(call set_autoconf_var,ARM,y)
+endef
+
+define aarch64_host
+  $(call set_autoconf_var,AARCH64,y)
+endef
+
+define virtio_net_dpdk
+  $(call set_autoconf_var,VIRTIO_NET_DPDK,y)
+  RTE_SDK ?= $(OUTPUT)/dpdk-17.02
+  RTE_TARGET ?= build
+  DPDK_LIBS = -lrte_pmd_vmxnet3_uio -lrte_pmd_ixgbe -lrte_pmd_e1000
+  DPDK_LIBS += -lrte_pmd_virtio
+  DPDK_LIBS += -lrte_timer -lrte_hash -lrte_mbuf -lrte_ethdev -lrte_eal
+  DPDK_LIBS += -lrte_mempool -lrte_ring -lrte_pmd_ring
+  DPDK_LIBS += -lrte_kvargs -lrte_net
+  CFLAGS += -I$$(RTE_SDK)/$$(RTE_TARGET)/include -msse4.2 -mpopcnt
+  LDFLAGS +=-L$$(RTE_SDK)/$$(RTE_TARGET)/lib
+  LDFLAGS +=-Wl,--whole-archive $$(DPDK_LIBS) -Wl,--no-whole-archive -lm -ldl
+endef
+
+define virtio_net_vde
+  $(call set_autoconf_var,VIRTIO_NET_VDE,y)
+  LDLIBS += $(shell pkg-config --libs vdeplug)
+endef
+
+define posix_host
+  $(call set_autoconf_var,POSIX,y)
+  $(call set_autoconf_var,VIRTIO_NET,y)
+  LDFLAGS += -pie
+  CFLAGS += -fPIC -pthread
+  SOSUF := .so
+  $(if $(call is_defined,__ANDROID__),$(call android_host),LDLIBS += -lrt -lpthread)
+  $(if $(filter $(1),elf64-x86-64-freebsd),$(call bsd_host))
+  $(if $(filter $(1),elf32-littlearm),$(call arm_host))
+  $(if $(filter $(1),elf64-littleaarch64),$(call aarch64_host))
+  $(if $(filter yes,$(dpdk)),$(call virtio_net_dpdk))
+  $(if $(filter yes,$(vde)),$(call virtio_net_vde))
+  $(if $(strip $(call find_include,fuse.h)),$(call set_autoconf_var,FUSE,y))
+  $(if $(strip $(call find_include,archive.h)),$(call set_autoconf_var,ARCHIVE,y))
+  $(if $(strip $(call find_include,linux/if_tun.h)),$(call set_autoconf_var,VIRTIO_NET_MACVTAP,y))
+  $(if $(filter $(1),elf64-x86-64-freebsd),$(call set_autoconf_var,NEEDS_LARGP,y))
+  $(if $(filter $(1),elf32-i386),$(call set_autoconf_var,I386,y))
+endef
+
+define nt64_host
+  $(call set_autoconf_var,NEEDS_LARGP,y)
+  CFLAGS += -Wl,--enable-auto-image-base -Wl,--image-base -Wl,0x10000000 \
+  	 -Wl,--out-implib=$(OUTPUT)liblkl.dll.a -Wl,--export-all-symbols \
+	 -Wl,--enable-auto-import
+  LDFLAGS +=-Wl,--image-base -Wl,0x10000000 -Wl,--enable-auto-image-base \
+   	   -Wl,--out-implib=$(OUTPUT)liblkl.dll.a -Wl,--export-all-symbols \
+	   -Wl,--enable-auto-import
+endef
+
+define nt_host
+  $(call set_autoconf_var,NT,y)
+  KOPT = "KALLSYMS_EXTRA_PASS=1"
+  LDLIBS += -lws2_32
+  EXESUF := .exe
+  SOSUF := .dll
+  CFLAGS += -Iinclude/mingw32
+  $(if $(filter $(1),pe-x86-64),$(call nt64_host))
+endef
+
+define do_autoconf
+  export CROSS_COMPILE := $(CROSS_COMPILE)
+  export CC := $(CROSS_COMPILE)gcc
+  export LD := $(CROSS_COMPILE)ld
+  export AR := $(CROSS_COMPILE)ar
+  $(eval LD := $(CROSS_COMPILE)ld)
+  $(eval CC := $(CROSS_COMPILE)gcc)
+  $(eval LD_FMT := $(shell $(LD) -r -print-output-format))
+  $(if $(filter $(LD_FMT),$(POSIX_HOSTS)),$(call posix_host,$(LD_FMT)))
+  $(if $(filter $(LD_FMT),$(NT_HOSTS)),$(call nt_host,$(LD_FMT)))
+endef
+
+export do_autoconf
+
+
+$(OUTPUT)Makefile.conf: Makefile.autoconf
+	$(shell mkdir -p $(OUTPUT)/include)
+	$(shell mkdir -p $(OUTPUT)/tests)
+	$(shell echo -n "" > $(OUTPUT)/include/lkl_autoconf.h)
+	$(shell echo -n "" > $(OUTPUT)/tests/autoconf.sh)
+	@echo "$$do_autoconf" > $(OUTPUT)/Makefile.conf
diff --git a/tools/lkl/Targets b/tools/lkl/Targets
new file mode 100644
index 000000000000..e6394fae4526
--- /dev/null
+++ b/tools/lkl/Targets
@@ -0,0 +1,27 @@
+libs-y += lib/liblkl
+
+ifneq ($(LKL_HOST_CONFIG_BSD),y)
+libs-$(LKL_HOST_CONFIG_POSIX) += lib/hijack/liblkl-hijack
+endif
+LDFLAGS_lib/hijack/liblkl-hijack-y += -shared -nodefaultlibs
+LDLIBS_lib/hijack/liblkl-hijack-y += -ldl
+LDLIBS_lib/hijack/liblkl-hijack-$(LKL_HOST_CONFIG_ARM) += -lgcc -lc
+LDLIBS_lib/hijack/liblkl-hijack-$(LKL_HOST_CONFIG_AARCH64) += -lc
+LDLIBS_lib/hijack/liblkl-hijack-$(LKL_HOST_CONFIG_I386) += -lc_nonshared
+
+progs-$(LKL_HOST_CONFIG_FUSE) += lklfuse
+LDLIBS_lklfuse-y := -lfuse
+
+progs-$(LKL_HOST_CONFIG_ARCHIVE) += fs2tar
+LDLIBS_fs2tar-y := -larchive
+LDLIBS_fs2tar-$(LKL_HOST_CONFIG_NEEDS_LARGP) += -largp
+
+
+progs-$(LKL_HOST_CONFIG_ARCHIVE) += cptofs
+LDLIBS_cptofs-y := -larchive
+LDLIBS_cptofs-$(LKL_HOST_CONFIG_NEEDS_LARGP) += -largp
+
+progs-y += tests/boot
+progs-y += tests/disk
+progs-y += tests/net-test
+
diff --git a/tools/lkl/include/.gitignore b/tools/lkl/include/.gitignore
new file mode 100644
index 000000000000..c41a463c898d
--- /dev/null
+++ b/tools/lkl/include/.gitignore
@@ -0,0 +1 @@
+lkl/
\ No newline at end of file
diff --git a/tools/lkl/include/lkl.h b/tools/lkl/include/lkl.h
new file mode 100644
index 000000000000..65f151f9c047
--- /dev/null
+++ b/tools/lkl/include/lkl.h
@@ -0,0 +1,928 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_H
+#define _LKL_H
+
+#include "lkl_autoconf.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define _LKL_LIBC_COMPAT_H
+
+#ifdef __cplusplus
+#define class __lkl__class
+#endif
+
+/*
+ * Avoid collisions between Android which defines __unused and
+ * linux/icmp.h which uses __unused as a structure field.
+ */
+#pragma push_macro("__unused")
+#undef __unused
+
+#include <lkl/asm/syscalls.h>
+
+#pragma pop_macro("__unused")
+
+#ifdef __cplusplus
+#undef class
+#endif
+
+#if defined(__MINGW32__)
+#define strtok_r strtok_s
+#define inet_pton lkl_inet_pton
+
+int inet_pton(int af, const char *src, void *dst);
+#endif
+
+#if __LKL__BITS_PER_LONG == 64
+#define lkl_sys_fstatat lkl_sys_newfstatat
+#define lkl_sys_fstat lkl_sys_newfstat
+
+#else
+#define __lkl__NR_fcntl __lkl__NR_fcntl64
+
+#define lkl_stat lkl_stat64
+#define lkl_sys_stat lkl_sys_stat64
+#define lkl_sys_lstat lkl_sys_lstat64
+#define lkl_sys_truncate lkl_sys_truncate64
+#define lkl_sys_ftruncate lkl_sys_ftruncate64
+#define lkl_sys_sendfile lkl_sys_sendfile64
+#define lkl_sys_fstatat lkl_sys_fstatat64
+#define lkl_sys_fstat lkl_sys_fstat64
+#define lkl_sys_fcntl lkl_sys_fcntl64
+
+#define lkl_statfs lkl_statfs64
+
+static inline int lkl_sys_statfs(const char *path, struct lkl_statfs *buf)
+{
+	return lkl_sys_statfs64(path, sizeof(*buf), buf);
+}
+
+static inline int lkl_sys_fstatfs(unsigned int fd, struct lkl_statfs *buf)
+{
+	return lkl_sys_fstatfs64(fd, sizeof(*buf), buf);
+}
+
+#define lkl_sys_nanosleep lkl_sys_nanosleep_time32
+static inline int lkl_sys_nanosleep_time32(struct lkl_timespec *rqtp,
+					   struct lkl_timespec *rmtp)
+{
+	long p[6] = {(long)rqtp, (long)rmtp, 0, 0, 0, 0};
+
+	return lkl_syscall(__lkl__NR_nanosleep, p);
+}
+
+#endif
+
+static inline int lkl_sys_stat(const char *path, struct lkl_stat *buf)
+{
+	return lkl_sys_fstatat(LKL_AT_FDCWD, path, buf, 0);
+}
+
+static inline int lkl_sys_lstat(const char *path, struct lkl_stat *buf)
+{
+	return lkl_sys_fstatat(LKL_AT_FDCWD, path, buf,
+			       LKL_AT_SYMLINK_NOFOLLOW);
+}
+
+#ifdef __lkl__NR_llseek
+/**
+ * lkl_sys_lseek - wrapper for lkl_sys_llseek
+ */
+static inline long long lkl_sys_lseek(unsigned int fd, __lkl__kernel_loff_t off,
+				      unsigned int whence)
+{
+	long long res;
+	long ret = lkl_sys_llseek(fd, off >> 32, off & 0xffffffff, &res,
+				  whence);
+
+	return ret < 0 ? ret : res;
+}
+#endif
+
+static inline void *lkl_sys_mmap(void *addr, size_t length, int prot, int flags,
+				 int fd, off_t offset)
+{
+	return (void *)lkl_sys_mmap_pgoff((long)addr, length, prot, flags, fd,
+					  offset >> 12);
+}
+
+#define lkl_sys_mmap2 lkl_sys_mmap_pgoff
+
+#ifdef __lkl__NR_openat
+/**
+ * lkl_sys_open - wrapper for lkl_sys_openat
+ */
+static inline long lkl_sys_open(const char *file, int flags, int mode)
+{
+	return lkl_sys_openat(LKL_AT_FDCWD, file, flags, mode);
+}
+
+/**
+ * lkl_sys_creat - wrapper for lkl_sys_openat
+ */
+static inline long lkl_sys_creat(const char *file, int mode)
+{
+	return lkl_sys_openat(LKL_AT_FDCWD, file,
+			      LKL_O_CREAT|LKL_O_WRONLY|LKL_O_TRUNC, mode);
+}
+#endif
+
+
+#ifdef __lkl__NR_faccessat
+/**
+ * lkl_sys_access - wrapper for lkl_sys_faccessat
+ */
+static inline long lkl_sys_access(const char *file, int mode)
+{
+	return lkl_sys_faccessat(LKL_AT_FDCWD, file, mode);
+}
+#endif
+
+#ifdef __lkl__NR_fchownat
+/**
+ * lkl_sys_chown - wrapper for lkl_sys_fchownat
+ */
+static inline long lkl_sys_chown(const char *path, lkl_uid_t uid, lkl_gid_t gid)
+{
+	return lkl_sys_fchownat(LKL_AT_FDCWD, path, uid, gid, 0);
+}
+#endif
+
+#ifdef __lkl__NR_fchmodat
+/**
+ * lkl_sys_chmod - wrapper for lkl_sys_fchmodat
+ */
+static inline long lkl_sys_chmod(const char *path, mode_t mode)
+{
+	return lkl_sys_fchmodat(LKL_AT_FDCWD, path, mode);
+}
+#endif
+
+#ifdef __lkl__NR_linkat
+/**
+ * lkl_sys_link - wrapper for lkl_sys_linkat
+ */
+static inline long lkl_sys_link(const char *existing, const char *new)
+{
+	return lkl_sys_linkat(LKL_AT_FDCWD, existing, LKL_AT_FDCWD, new, 0);
+}
+#endif
+
+#ifdef __lkl__NR_unlinkat
+/**
+ * lkl_sys_unlink - wrapper for lkl_sys_unlinkat
+ */
+static inline long lkl_sys_unlink(const char *path)
+{
+	return lkl_sys_unlinkat(LKL_AT_FDCWD, path, 0);
+}
+#endif
+
+#ifdef __lkl__NR_symlinkat
+/**
+ * lkl_sys_symlink - wrapper for lkl_sys_symlinkat
+ */
+static inline long lkl_sys_symlink(const char *existing, const char *new)
+{
+	return lkl_sys_symlinkat(existing, LKL_AT_FDCWD, new);
+}
+#endif
+
+#ifdef __lkl__NR_readlinkat
+/**
+ * lkl_sys_readlink - wrapper for lkl_sys_readlinkat
+ */
+static inline long lkl_sys_readlink(const char *path, char *buf, size_t bufsize)
+{
+	return lkl_sys_readlinkat(LKL_AT_FDCWD, path, buf, bufsize);
+}
+#endif
+
+#ifdef __lkl__NR_renameat
+/**
+ * lkl_sys_rename - wrapper for lkl_sys_renameat
+ */
+static inline long lkl_sys_rename(const char *old, const char *new)
+{
+	return lkl_sys_renameat(LKL_AT_FDCWD, old, LKL_AT_FDCWD, new);
+}
+#endif
+
+#ifdef __lkl__NR_mkdirat
+/**
+ * lkl_sys_mkdir - wrapper for lkl_sys_mkdirat
+ */
+static inline long lkl_sys_mkdir(const char *path, mode_t mode)
+{
+	return lkl_sys_mkdirat(LKL_AT_FDCWD, path, mode);
+}
+#endif
+
+#ifdef __lkl__NR_unlinkat
+/**
+ * lkl_sys_rmdir - wrapper for lkl_sys_unlinkat
+ */
+static inline long lkl_sys_rmdir(const char *path)
+{
+	return lkl_sys_unlinkat(LKL_AT_FDCWD, path, LKL_AT_REMOVEDIR);
+}
+#endif
+
+#ifdef __lkl__NR_mknodat
+/**
+ * lkl_sys_mknod - wrapper for lkl_sys_mknodat
+ */
+static inline long lkl_sys_mknod(const char *path, mode_t mode, dev_t dev)
+{
+	return lkl_sys_mknodat(LKL_AT_FDCWD, path, mode, dev);
+}
+#endif
+
+#ifdef __lkl__NR_pipe2
+/**
+ * lkl_sys_pipe - wrapper for lkl_sys_pipe2
+ */
+static inline long lkl_sys_pipe(int fd[2])
+{
+	return lkl_sys_pipe2(fd, 0);
+}
+#endif
+
+#ifdef __lkl__NR_sendto
+/**
+ * lkl_sys_send - wrapper for lkl_sys_sendto
+ */
+static inline long lkl_sys_send(int fd, void *buf, size_t len, int flags)
+{
+	return lkl_sys_sendto(fd, buf, len, flags, 0, 0);
+}
+#endif
+
+#ifdef __lkl__NR_recvfrom
+/**
+ * lkl_sys_recv - wrapper for lkl_sys_recvfrom
+ */
+static inline long lkl_sys_recv(int fd, void *buf, size_t len, int flags)
+{
+	return lkl_sys_recvfrom(fd, buf, len, flags, 0, 0);
+}
+#endif
+
+#ifdef __lkl__NR_pselect6
+/**
+ * lkl_sys_select - wrapper for lkl_sys_pselect6
+ */
+static inline long lkl_sys_select(int n, lkl_fd_set *rfds, lkl_fd_set *wfds,
+				  lkl_fd_set *efds, struct lkl_timeval *tv)
+{
+	long data[2] = { 0, _LKL_NSIG/8 };
+	struct lkl_timespec ts;
+	lkl_time_t extra_secs;
+	const lkl_time_t max_time = (1ULL << (8 * sizeof(lkl_time_t) - 1)) - 1;
+
+	if (tv) {
+		if (tv->tv_sec < 0 || tv->tv_usec < 0)
+			return -LKL_EINVAL;
+
+		extra_secs = tv->tv_usec / 1000000;
+		ts.tv_nsec = tv->tv_usec % 1000000 * 1000;
+		ts.tv_sec = extra_secs > max_time - tv->tv_sec ?
+			max_time : tv->tv_sec + extra_secs;
+	}
+	return lkl_sys_pselect6(n, rfds, wfds, efds, tv ?
+				(struct __lkl__kernel_timespec *)&ts : 0, data);
+}
+#endif
+
+#ifdef __lkl__NR_ppoll
+/**
+ * lkl_sys_poll - wrapper for lkl_sys_ppoll
+ */
+static inline long lkl_sys_poll(struct lkl_pollfd *fds, int n, int timeout)
+{
+	return lkl_sys_ppoll(fds, n, timeout >= 0 ?
+			     (struct __lkl__kernel_timespec *)
+			     &((struct lkl_timespec){ .tv_sec = timeout/1000,
+				   .tv_nsec = timeout%1000*1000000 }) : 0,
+			     0, _LKL_NSIG/8);
+}
+#endif
+
+#ifdef __lkl__NR_epoll_create1
+/**
+ * lkl_sys_epoll_create - wrapper for lkl_sys_epoll_create1
+ */
+static inline long lkl_sys_epoll_create(int size)
+{
+	return lkl_sys_epoll_create1(0);
+}
+#endif
+
+#ifdef __lkl__NR_epoll_pwait
+/**
+ * lkl_sys_epoll_wait - wrapper for lkl_sys_epoll_pwait
+ */
+static inline long lkl_sys_epoll_wait(int fd, struct lkl_epoll_event *ev,
+				      int cnt, int to)
+{
+	return lkl_sys_epoll_pwait(fd, ev, cnt, to, 0, _LKL_NSIG/8);
+}
+#endif
+
+
+
+/**
+ * lkl_strerror - returns a string describing the given error code
+ *
+ * @err - error code
+ * @returns - string for the given error code
+ */
+const char *lkl_strerror(int err);
+
+/**
+ * lkl_perror - prints a string describing the given error code
+ *
+ * @msg - prefix for the error message
+ * @err - error code
+ */
+void lkl_perror(char *msg, int err);
+
+/**
+ * struct lkl_dev_blk_ops - block device host operations, defined in lkl_host.h.
+ */
+struct lkl_dev_blk_ops;
+
+/**
+ * lkl_disk - host disk handle
+ *
+ * @dev - a pointer to 'virtio_blk_dev' structure for this disk
+ * @fd - a POSIX file descriptor that can be used by preadv/pwritev
+ * @handle - an NT file handle that can be used by ReadFile/WriteFile
+ */
+struct lkl_disk {
+	void *dev;
+	union {
+		int fd;
+		void *handle;
+	};
+	struct lkl_dev_blk_ops *ops;
+};
+
+/**
+ * lkl_disk_add - add a new disk
+ *
+ * @disk - the host disk handle
+ * @returns a disk id (0 is valid) or a strictly negative value in case of error
+ */
+int lkl_disk_add(struct lkl_disk *disk);
+
+/**
+ * lkl_disk_remove - remove a disk
+ *
+ * This function makes a cleanup of the @disk's virtio_dev structure
+ * that was initialized by lkl_disk_add before.
+ *
+ * @disk - the host disk handle
+ */
+int lkl_disk_remove(struct lkl_disk disk);
+
+/**
+ * lkl_encode_dev_from_sysfs - extract device id from sysfs
+ *
+ * This function returns the device id for the given sysfs dev node.
+ * The content of the node has to be in the form 'MAJOR:MINOR'.
+ * Also, this function expects an absolute path which means that sysfs
+ * already has to be mounted at the given path.
+ *
+ * @sysfs_path - absolute path to the sysfs dev node
+ * @pdevid - pointer to memory where dev id will be returned
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_encode_dev_from_sysfs(const char *sysfs_path, uint32_t *pdevid);
+
+/**
+ * lkl_get_virtio_blkdev - get device id of a disk (partition)
+ *
+ * This function returns the device id for the given disk.
+ *
+ * @disk_id - the disk id identifying the disk
+ * @part - disk partition or zero for full disk
+ * @pdevid - pointer to memory where dev id will be returned
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_get_virtio_blkdev(int disk_id, unsigned int part, uint32_t *pdevid);
+
+
+/**
+ * lkl_mount_dev - mount a disk
+ *
+ * This function creates a device file for the given disk, creates a mount
+ * point and mounts the device over the mount point.
+ *
+ * @disk_id - the disk id identifying the disk to be mounted
+ * @part - disk partition or zero for full disk
+ * @fs_type - filesystem type
+ * @flags - mount flags
+ * @opts - additional filesystem specific mount options
+ * @mnt_str - a string that will be filled by this function with the path where
+ * the filesystem has been mounted
+ * @mnt_str_len - size of mnt_str
+ * @returns - 0 on success, a negative value on error
+ */
+long lkl_mount_dev(unsigned int disk_id, unsigned int part, const char *fs_type,
+		   int flags, const char *opts,
+		   char *mnt_str, unsigned int mnt_str_len);
+
+/**
+ * lkl_umount_dev - umount a disk
+ *
+ * This function unmounts the given disk and removes the device file and the
+ * mount point.
+ *
+ * @disk_id - the disk id identifying the disk to be unmounted
+ * @part - disk partition or zero for full disk
+ * @flags - umount flags
+ * @timeout_ms - timeout to wait for the kernel to flush closed files so that
+ * umount can succeed
+ * @returns - 0 on success, a negative value on error
+ */
+long lkl_umount_dev(unsigned int disk_id, unsigned int part, int flags,
+		    long timeout_ms);
+
+/**
+ * lkl_umount_timeout - umount filesystem with timeout
+ *
+ * @path - the path to unmount
+ * @flags - umount flags
+ * @timeout_ms - timeout to wait for the kernel to flush closed files so that
+ * umount can succeed
+ * @returns - 0 on success, a negative value on error
+ */
+long lkl_umount_timeout(char *path, int flags, long timeout_ms);
+
+/**
+ * lkl_opendir - open a directory
+ *
+ * @path - directory path
+ * @err - pointer to store the error in case of failure
+ * @returns - a handle to be used when calling lkl_readdir
+ */
+struct lkl_dir *lkl_opendir(const char *path, int *err);
+
+/**
+ * lkl_fdopendir - open a directory
+ *
+ * @fd - file descriptor
+ * @err - pointer to store the error in case of failure
+ * @returns - a handle to be used when calling lkl_readdir
+ */
+struct lkl_dir *lkl_fdopendir(int fd, int *err);
+
+/**
+ * lkl_rewinddir - reset directory stream
+ *
+ * @dir - the directory handler as returned by lkl_opendir
+ */
+void lkl_rewinddir(struct lkl_dir *dir);
+
+/**
+ * lkl_closedir - close the directory
+ *
+ * @dir - the directory handler as returned by lkl_opendir
+ */
+int lkl_closedir(struct lkl_dir *dir);
+
+/**
+ * lkl_readdir - get the next available entry of the directory
+ *
+ * @dir - the directory handler as returned by lkl_opendir
+ * @returns - a lkl_linux_dirent64 entry, or NULL if the end of the directory
+ * stream is reached or an error occurred; check lkl_errdir() to distinguish
+ * between an error and the end of the directory stream
+ */
+struct lkl_linux_dirent64 *lkl_readdir(struct lkl_dir *dir);
+
+/**
+ * lkl_errdir - checks if an error occurred during the last lkl_readdir call
+ *
+ * @dir - the directory handler as returned by lkl_opendir
+ * @returns - 0 if no error occurred, or a negative value otherwise
+ */
+int lkl_errdir(struct lkl_dir *dir);
+
+/**
+ * lkl_dirfd - gets the file descriptor associated with the directory handle
+ *
+ * @dir - the directory handle as returned by lkl_opendir
+ * @returns - a positive value, which is the LKL file descriptor associated with
+ * the directory handle, or a negative value otherwise
+ */
+int lkl_dirfd(struct lkl_dir *dir);
+
+/**
+ * lkl_if_up - activate network interface
+ *
+ * @ifindex - the ifindex of the interface
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_if_up(int ifindex);
+
+/**
+ * lkl_if_down - deactivate network interface
+ *
+ * @ifindex - the ifindex of the interface
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_if_down(int ifindex);
+
+/**
+ * lkl_if_set_mtu - set MTU on interface
+ *
+ * @ifindex - the ifindex of the interface
+ * @mtu - the requested MTU size
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_if_set_mtu(int ifindex, int mtu);
+
+/**
+ * lkl_if_set_ipv4 - set IPv4 address on interface
+ *
+ * @ifindex - the ifindex of the interface
+ * @addr - 4-byte IP address (i.e., struct in_addr)
+ * @netmask_len - prefix length of the @addr
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_if_set_ipv4(int ifindex, unsigned int addr, unsigned int netmask_len);
+
+/**
+ * lkl_set_ipv4_gateway - add an IPv4 default route
+ *
+ * @addr - 4-byte IP address of the gateway (i.e., struct in_addr)
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_set_ipv4_gateway(unsigned int addr);
+
+/**
+ * lkl_if_set_ipv4_gateway - add an IPv4 default route in rule table
+ *
+ * @ifindex - the ifindex of the interface, used for tableid calculation
+ * @addr - 4-byte IP address of the interface
+ * @netmask_len - prefix length of the @addr
+ * @gw_addr - 4-byte IP address of the gateway
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_if_set_ipv4_gateway(int ifindex, unsigned int addr,
+		unsigned int netmask_len, unsigned int gw_addr);
+
+/**
+ * lkl_if_set_ipv6 - set IPv6 address on interface
+ * must be called after interface is up.
+ *
+ * @ifindex - the ifindex of the interface
+ * @addr - 16-byte IPv6 address (i.e., struct in6_addr)
+ * @netprefix_len - prefix length of the @addr
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_if_set_ipv6(int ifindex, void *addr, unsigned int netprefix_len);
+
+/**
+ * lkl_set_ipv6_gateway - add an IPv6 default route
+ *
+ * @addr - 16-byte IPv6 address of the gateway (i.e., struct in6_addr)
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_set_ipv6_gateway(void *addr);
+
+/**
+ * lkl_if_set_ipv6_gateway - add an IPv6 default route in rule table
+ *
+ * @ifindex - the ifindex of the interface, used for tableid calculation
+ * @addr - 16-byte IP address of the interface
+ * @netmask_len - prefix length of the @addr
+ * @gw_addr - 16-byte IP address of the gateway (i.e., struct in6_addr)
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_if_set_ipv6_gateway(int ifindex, void *addr,
+		unsigned int netmask_len, void *gw_addr);
+
+/**
+ * lkl_ifname_to_ifindex - obtain ifindex of an interface by name
+ *
+ * @name - name of the interface
+ * @returns - the ifindex of the interface if no error occurred
+ */
+int lkl_ifname_to_ifindex(const char *name);
+
+/**
+ * lkl_netdev - host network device handle, defined in lkl_host.h.
+ */
+struct lkl_netdev;
+
+/**
+ * lkl_netdev_args - arguments to lkl_netdev_add
+ * @mac - optional MAC address for the device
+ * @offload - offload bits for the device
+ */
+struct lkl_netdev_args {
+	void *mac;
+	unsigned int offload;
+};
+
+/**
+ * lkl_netdev_add - add a new network device
+ *
+ * Must be called before calling lkl_start_kernel.
+ *
+ * @nd - the network device host handle
+ * @args - arguments that configure the netdev. Can be NULL
+ * @returns a network device id (0 is valid) or a strictly negative value in
+ * case of error
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET
+int lkl_netdev_add(struct lkl_netdev *nd, struct lkl_netdev_args *args);
+#else
+static inline int lkl_netdev_add(struct lkl_netdev *nd,
+				 struct lkl_netdev_args *args)
+{
+	return -LKL_ENOSYS;
+}
+#endif
+
+/**
+ * lkl_netdev_remove - remove a previously added network device
+ *
+ * Attempts to release all resources held by a network device created
+ * via lkl_netdev_add.
+ *
+ * @id - the network device id, as returned by @lkl_netdev_add
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET
+void lkl_netdev_remove(int id);
+#else
+static inline void lkl_netdev_remove(int id)
+{
+}
+#endif
+
+/**
+ * lkl_netdev_free - frees a network device
+ *
+ * @nd - the network device to free
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET
+void lkl_netdev_free(struct lkl_netdev *nd);
+#else
+static inline void lkl_netdev_free(struct lkl_netdev *nd)
+{
+}
+#endif
+
+/**
+ * lkl_netdev_get_ifindex - retrieve the interface index for a given network
+ * device id
+ *
+ * @id - the network device id
+ * @returns the interface index or a strictly negative value in case of error
+ */
+int lkl_netdev_get_ifindex(int id);
+
+/**
+ * lkl_netdev_tap_create - create TAP net_device for the virtio net backend
+ *
+ * @ifname - interface name for the TAP device. needs to be configured
+ * on host in advance
+ * @offload - offload bits for the device
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET
+struct lkl_netdev *lkl_netdev_tap_create(const char *ifname, int offload);
+#else
+static inline struct lkl_netdev *
+lkl_netdev_tap_create(const char *ifname, int offload)
+{
+	return NULL;
+}
+#endif
+
+/**
+ * lkl_netdev_dpdk_create - create DPDK net_device for the virtio net backend
+ *
+ * @ifname - interface name for the DPDK device. The name for the DPDK device
+ * is only used internally.
+ * @offload - offload bits for the device
+ * @mac - mac address pointer of dpdk-ed device
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET_DPDK
+struct lkl_netdev *lkl_netdev_dpdk_create(const char *ifname, int offload,
+					unsigned char *mac);
+#else
+static inline struct lkl_netdev *
+lkl_netdev_dpdk_create(const char *ifname, int offload, unsigned char *mac)
+{
+	return NULL;
+}
+#endif
+
+/**
+ * lkl_netdev_vde_create - create VDE net_device for the virtio net backend
+ *
+ * @switch_path - path to the VDE switch directory. Needs to be started on host
+ * in advance.
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET_VDE
+struct lkl_netdev *lkl_netdev_vde_create(const char *switch_path);
+#else
+static inline struct lkl_netdev *lkl_netdev_vde_create(const char *switch_path)
+{
+	return NULL;
+}
+#endif
+
+/**
+ * lkl_netdev_raw_create - create raw socket net_device for the virtio net
+ *                         backend
+ *
+ * @ifname - interface name for the snoop device.
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET
+struct lkl_netdev *lkl_netdev_raw_create(const char *ifname);
+#else
+static inline struct lkl_netdev *lkl_netdev_raw_create(const char *ifname)
+{
+	return NULL;
+}
+#endif
+
+/**
+ * lkl_netdev_macvtap_create - create macvtap net_device for the virtio
+ * net backend
+ *
+ * @path - a file name for the macvtap device. needs to be configured
+ * on host in advance
+ * @offload - offload bits for the device
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET_MACVTAP
+struct lkl_netdev *lkl_netdev_macvtap_create(const char *path, int offload);
+#else
+static inline struct lkl_netdev *
+lkl_netdev_macvtap_create(const char *path, int offload)
+{
+	return NULL;
+}
+#endif
+
+/**
+ * lkl_netdev_pipe_create - create pipe net_device for the virtio
+ * net backend
+ *
+ * @ifname - a file name for the rx and tx pipe device. needs to be configured
+ * on host in advance. delimiter is "|". e.g. "rx_name|tx_name".
+ * @offload - offload bits for the device
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET
+struct lkl_netdev *lkl_netdev_pipe_create(const char *ifname, int offload);
+#else
+static inline struct lkl_netdev *
+lkl_netdev_pipe_create(const char *ifname, int offload)
+{
+	return NULL;
+}
+#endif
+
+/*
+ * lkl_register_dbg_handler - register a signal handler that loads a debug lib.
+ *
+ * The signal handler is triggered by Ctrl-Z. It creates a new pthread which
+ * calls dbg_entrance().
+ *
+ * If you run the program from shell script, make sure you ignore SIGTSTP by
+ * "trap '' TSTP" in the shell script.
+ */
+void lkl_register_dbg_handler(void);
+
+/**
+ * lkl_add_neighbor - add a permanent arp entry
+ * @ifindex - the ifindex of the interface
+ * @af - address family of the ip address. Must be LKL_AF_INET or LKL_AF_INET6
+ * @addr - ip address of the entry in network byte order
+ * @mac - mac address of the entry
+ */
+int lkl_add_neighbor(int ifindex, int af, void *addr, void *mac);
+
+/**
+ * lkl_mount_fs - mount a file system type like proc, sys
+ * @fstype - file system type. e.g. proc, sys
+ * @returns - 0 on success. 1 if it's already mounted. negative on failure.
+ */
+int lkl_mount_fs(char *fstype);
+
+/**
+ * lkl_if_add_ip - add an ip address
+ * @ifindex - the ifindex of the interface
+ * @af - address family of the ip address. Must be LKL_AF_INET or LKL_AF_INET6
+ * @addr - ip address of the entry in network byte order
+ * @netprefix_len - prefix length of the @addr
+ */
+int lkl_if_add_ip(int ifindex, int af, void *addr, unsigned int netprefix_len);
+
+/**
+ * lkl_if_del_ip - delete an ip address
+ * @ifindex - the ifindex of the interface
+ * @af - address family of the ip address. Must be LKL_AF_INET or LKL_AF_INET6
+ * @addr - ip address of the entry in network byte order
+ * @netprefix_len - prefix length of the @addr
+ */
+int lkl_if_del_ip(int ifindex, int af, void *addr, unsigned int netprefix_len);
+
+/**
+ * lkl_add_gateway - add a gateway
+ * @af - address family of the ip address. Must be LKL_AF_INET or LKL_AF_INET6
+ * @gwaddr - 4-byte IP address of the gateway (i.e., struct in_addr)
+ */
+int lkl_add_gateway(int af, void *gwaddr);
+
+/**
+ * XXX Should I use OIF selector?
+ * temporary table idx = ifindex * 2 + 0 <- ipv4
+ * temporary table idx = ifindex * 2 + 1 <- ipv6
+ */
+/**
+ * lkl_if_add_rule_from_saddr - create an ip rule table with "from" selector
+ * @ifindex - the ifindex of the interface, used for table id calculation
+ * @af - address family of the ip address. Must be LKL_AF_INET or LKL_AF_INET6
+ * @saddr - network byte order ip address, "from" selector address of this rule
+ */
+int lkl_if_add_rule_from_saddr(int ifindex, int af, void *saddr);
+
+/**
+ * lkl_if_add_gateway - add gateway to rule table
+ * @ifindex - the ifindex of the interface, used for table id calculation
+ * @af - address family of the ip address. Must be LKL_AF_INET or LKL_AF_INET6
+ * @gwaddr - 4-byte IP address of the gateway (i.e., struct in_addr)
+ */
+int lkl_if_add_gateway(int ifindex, int af, void *gwaddr);
+
+/**
+ * lkl_if_add_linklocal - add linklocal route to rule table
+ * @ifindex - the ifindex of the interface, used for table id calculation
+ * @af - address family of the ip address. Must be LKL_AF_INET or LKL_AF_INET6
+ * @addr - ip address of the entry in network byte order
+ * @netprefix_len - prefix length of the @addr
+ */
+int lkl_if_add_linklocal(int ifindex, int af,  void *addr, int netprefix_len);
+
+/**
+ * lkl_if_wait_ipv6_dad - wait for DAD to be done for an ipv6 address
+ * must be called after interface is up
+ *
+ * @ifindex - the ifindex of the interface
+ * @addr - ip address of the entry in network byte order
+ */
+int lkl_if_wait_ipv6_dad(int ifindex, void *addr);
+
+/**
+ * lkl_set_fd_limit - set the maximum number of file descriptors allowed
+ * @fd_limit - fd max limit
+ */
+int lkl_set_fd_limit(unsigned int fd_limit);
+
+/**
+ * lkl_qdisc_add - set qdisc rule onto an interface
+ *
+ * @ifindex - the ifindex of the interface
+ * @root - the name of root class (e.g., "root");
+ * @type - the type of qdisc (e.g., "fq")
+ */
+int lkl_qdisc_add(int ifindex, const char *root, const char *type);
+
+/**
+ * lkl_qdisc_parse_add - Add a qdisc entry for an interface with strings
+ *
+ * @ifindex - the ifindex of the interface
+ * @entries - strings of qdisc configurations in the form of
+ *            "root|type;root|type;..."
+ */
+void lkl_qdisc_parse_add(int ifindex, const char *entries);
+
+/**
+ * lkl_sysctl - write a sysctl value
+ *
+ * @path - the path to a sysctl entry (e.g., "net.ipv4.tcp_wmem")
+ * @value - the value of the sysctl (e.g., "4096 87380 2147483647")
+ */
+int lkl_sysctl(const char *path, const char *value);
+
+/**
+ * lkl_sysctl_parse_write - Configure sysctl parameters with strings
+ *
+ * @sysctls - sysctl parameters in the form of "key=value;..."
+ */
+void lkl_sysctl_parse_write(const char *sysctls);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/tools/lkl/include/lkl_host.h b/tools/lkl/include/lkl_host.h
new file mode 100644
index 000000000000..ab9c3f2a69fb
--- /dev/null
+++ b/tools/lkl/include/lkl_host.h
@@ -0,0 +1,160 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_HOST_H
+#define _LKL_HOST_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <lkl/asm/host_ops.h>
+#include <lkl.h>
+
+extern struct lkl_host_operations lkl_host_ops;
+
+/**
+ * lkl_printf - print a message via the host print operation
+ *
+ * @fmt: printf like format string
+ */
+int lkl_printf(const char *fmt, ...);
+
+extern char lkl_virtio_devs[4096];
+
+#ifdef LKL_HOST_CONFIG_POSIX
+#include <sys/uio.h>
+#else
+struct iovec {
+	void *iov_base;
+	size_t iov_len;
+};
+#endif
+
+extern struct lkl_dev_blk_ops lkl_dev_blk_ops;
+
+/**
+ * struct lkl_blk_req - block device request
+ *
+ * @type: type of request
+ * @prio: priority of request - currently unused
+ * @sector: offset in units of 512 bytes for read / write requests
+ * @buf: an array of buffers to be used for read / write requests
+ * @count: the number of buffers
+ */
+struct lkl_blk_req {
+#define LKL_DEV_BLK_TYPE_READ		0
+#define LKL_DEV_BLK_TYPE_WRITE		1
+#define LKL_DEV_BLK_TYPE_FLUSH		4
+#define LKL_DEV_BLK_TYPE_FLUSH_OUT	5
+	unsigned int type;
+	unsigned int prio;
+	unsigned long long sector;
+	struct iovec *buf;
+	int count;
+};
+
+/**
+ * struct lkl_dev_blk_ops - block device host operations
+ */
+struct lkl_dev_blk_ops {
+	/**
+	 * @get_capacity: returns the disk capacity in bytes
+	 *
+	 * @disk - the disk for which the capacity is requested;
+	 * @res - pointer to receive the capacity, in bytes;
+	 * @returns - 0 in case of success, negative value in case of error
+	 */
+	int (*get_capacity)(struct lkl_disk disk, unsigned long long *res);
+#define LKL_DEV_BLK_STATUS_OK		0
+#define LKL_DEV_BLK_STATUS_IOERR	1
+#define LKL_DEV_BLK_STATUS_UNSUP	2
+	/**
+	 * @request: issue a block request
+	 *
+	 * @disk - the disk the request is issued to;
+	 * @req - a request described by &struct lkl_blk_req
+	 */
+	int (*request)(struct lkl_disk disk, struct lkl_blk_req *req);
+};
+
+struct lkl_netdev {
+	struct lkl_dev_net_ops *ops;
+	int id;
+	uint8_t has_vnet_hdr: 1;
+};
+
+/**
+ * struct lkl_dev_net_ops - network device host operations
+ */
+struct lkl_dev_net_ops {
+	/**
+	 * @tx: writes a L2 packet into the net device
+	 *
+	 * The data buffer can only hold 0 or 1 complete packets.
+	 *
+	 * @nd - pointer to the network device;
+	 * @iov - pointer to the buffer vector;
+	 * @cnt - # of vectors in iov.
+	 *
+	 * @returns number of bytes transmitted
+	 */
+	int (*tx)(struct lkl_netdev *nd, struct iovec *iov, int cnt);
+
+	/**
+	 * @rx: reads a packet from the net device.
+	 *
+	 * It must only read one complete packet if present.
+	 *
+	 * If the buffer is too small for the packet, the implementation may
+	 * decide to drop it or trim it.
+	 *
+	 * @nd - pointer to the network device
+	 * @iov - pointer to the buffer vector to store the packet
+	 * @cnt - # of vectors in iov.
+	 *
+	 * @returns number of bytes read for success or < 0 if error
+	 */
+	int (*rx)(struct lkl_netdev *nd, struct iovec *iov, int cnt);
+
+#define LKL_DEV_NET_POLL_RX		1
+#define LKL_DEV_NET_POLL_TX		2
+#define LKL_DEV_NET_POLL_HUP		4
+
+	/**
+	 * @poll: polls a net device
+	 *
+	 * Supports the following events: LKL_DEV_NET_POLL_RX
+	 * (readable), LKL_DEV_NET_POLL_TX (writable) or
+	 * LKL_DEV_NET_POLL_HUP (the close operation has been issued
+	 * and we need to clean up). Blocks until one event is
+	 * available.
+	 *
+	 * @nd - pointer to the network device
+	 *
+	 * @returns - LKL_DEV_NET_POLL_RX, LKL_DEV_NET_POLL_TX,
+	 * LKL_DEV_NET_POLL_HUP or a negative value for errors
+	 */
+	int (*poll)(struct lkl_netdev *nd);
+
+	/**
+	 * @poll_hup: make poll wake up and return LKL_DEV_NET_POLL_HUP
+	 *
+	 * @nd - pointer to the network device
+	 */
+	void (*poll_hup)(struct lkl_netdev *nd);
+
+	/**
+	 * @free: frees a network device
+	 *
+	 * Implementation must release its resources and free the network device
+	 * structure.
+	 *
+	 * @nd - pointer to the network device
+	 */
+	void (*free)(struct lkl_netdev *nd);
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/tools/lkl/lib/.gitignore b/tools/lkl/lib/.gitignore
new file mode 100644
index 000000000000..427ae0273fdd
--- /dev/null
+++ b/tools/lkl/lib/.gitignore
@@ -0,0 +1,3 @@
+lkl.o
+liblkl.a
+
diff --git a/tools/lkl/lib/Build b/tools/lkl/lib/Build
new file mode 100644
index 000000000000..719c7308c830
--- /dev/null
+++ b/tools/lkl/lib/Build
@@ -0,0 +1,25 @@
+CFLAGS_posix-host.o += -D_FILE_OFFSET_BITS=64
+CFLAGS_virtio_net_vde.o += $(shell pkg-config --cflags vdeplug 2>/dev/null)
+CFLAGS_nt-host.o += -D_WIN32_WINNT=0x0600
+
+liblkl-y += fs.o
+liblkl-y += iomem.o
+liblkl-y += net.o
+liblkl-y += jmp_buf.o
+liblkl-$(LKL_HOST_CONFIG_POSIX) += posix-host.o
+liblkl-$(LKL_HOST_CONFIG_NT) += nt-host.o
+liblkl-y += utils.o
+liblkl-y += virtio_blk.o
+liblkl-y += virtio.o
+liblkl-y += dbg.o
+liblkl-y += dbg_handler.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET) += virtio_net.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET) += virtio_net_fd.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET) += virtio_net_tap.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET) += virtio_net_raw.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET_MACVTAP) += virtio_net_macvtap.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET_DPDK) += virtio_net_dpdk.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET_VDE) += virtio_net_vde.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET) += virtio_net_pipe.o
+liblkl-y += ../../perf/pmu-events/jsmn.o
+liblkl-y += config.o
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 15/47] lkl tools: host lib: add utilities functions
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (13 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 14/47] lkl tools: skeleton for host side library, tests and tools Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 16/47] lkl tools: host lib: memory mapped I/O helpers Hajime Tazaki
                   ` (34 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um
  Cc: Conrad Meyer, Octavian Purdila, Motomu Utsumi, Hajime Tazaki,
	Patrick Collins, Akira Moroo, Yuan Liu

From: Octavian Purdila <tavi.purdila@gmail.com>

Add basic utility functions: one that returns a string for a kernel
error code, and an fprintf-like function that uses the host print
operation. The latter is useful for informing the user about errors
that occur in the host library.
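
For illustration only (not part of this patch), here is a minimal
sketch of how an fprintf-like helper can sit on top of a host print
hook. The names sketch_host_ops, sketch_printf and sketch_capture are
hypothetical, and the real lkl_printf() in utils.c may differ in detail.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the host operations table; only print is modelled. */
struct sketch_host_ops {
	void (*print)(const char *str, int len);
};

static struct sketch_host_ops sketch_ops;

/* Example hook: keeps the most recent message in a static buffer. */
static char sketch_last[128];

static void sketch_capture(const char *str, int len)
{
	snprintf(sketch_last, sizeof(sketch_last), "%.*s", len, str);
}

/* Format into a heap buffer, then hand the bytes to the host print hook. */
static int sketch_printf(const char *fmt, ...)
{
	va_list ap;
	char *buf;
	int n;

	if (!sketch_ops.print)
		return 0;	/* no hook installed: printing is disabled */

	va_start(ap, fmt);
	n = vsnprintf(NULL, 0, fmt, ap);	/* first pass: measure */
	va_end(ap);
	if (n < 0)
		return n;

	buf = malloc(n + 1);
	if (!buf)
		return -1;

	va_start(ap, fmt);
	vsnprintf(buf, n + 1, fmt, ap);		/* second pass: format */
	va_end(ap);

	sketch_ops.print(buf, n);
	free(buf);
	return n;
}
```

Routing every message through a single hook means that clearing the
hook silences all output at once, which is how this sketch models
turning debug output off.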

Other configuration and debug utilities are also added.
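
As an illustration of the kind of parsing the configuration helpers do
(this is not the code in this patch), the standalone sketch below
converts a colon-separated MAC string into six bytes. It returns 1 on
success, 0 when no string is given and -1 on malformed input, the same
convention parse_mac_str() in config.c follows. sketch_parse_mac and
SKETCH_ETH_ALEN are hypothetical names, and this variant uses strtol()
end pointers instead of strtok_r(), so it does not modify its input.

```c
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_ETH_ALEN 6	/* hypothetical, mirrors the usual 6-byte MAC */

/*
 * Parse "aa:bb:cc:dd:ee:ff" into six bytes.
 * Returns 1 on success, 0 if mac_str is NULL, -1 on malformed input.
 * Note: strtol() also tolerates leading blanks and a "0x" prefix.
 */
static int sketch_parse_mac(const char *mac_str, uint8_t mac[SKETCH_ETH_ALEN])
{
	const char *p = mac_str;
	char *end;
	long val;
	int i;

	if (!mac_str)
		return 0;

	for (i = 0; i < SKETCH_ETH_ALEN; i++) {
		val = strtol(p, &end, 16);
		if (end == p || val < 0 || val > 0xff)
			return -1;	/* missing or out-of-range octet */
		mac[i] = (uint8_t)val;
		if (i < SKETCH_ETH_ALEN - 1) {
			if (*end != ':')
				return -1;	/* too few octets */
			p = end + 1;
		}
	}

	return *end == '\0' ? 1 : -1;	/* trailing data: too long */
}
```

Checking the character left after the sixth octet is what rejects
over-long input such as a seven-octet address.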

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/include/lkl_config.h |  61 +++
 tools/lkl/lib/config.c         | 793 +++++++++++++++++++++++++++++++++
 tools/lkl/lib/dbg.c            | 300 +++++++++++++
 tools/lkl/lib/dbg_handler.c    |  44 ++
 tools/lkl/lib/endian.h         |  31 ++
 tools/lkl/lib/jmp_buf.c        |  14 +
 tools/lkl/lib/jmp_buf.h        |   8 +
 tools/lkl/lib/utils.c          | 266 +++++++++++
 8 files changed, 1517 insertions(+)
 create mode 100644 tools/lkl/include/lkl_config.h
 create mode 100644 tools/lkl/lib/config.c
 create mode 100644 tools/lkl/lib/dbg.c
 create mode 100644 tools/lkl/lib/dbg_handler.c
 create mode 100644 tools/lkl/lib/endian.h
 create mode 100644 tools/lkl/lib/jmp_buf.c
 create mode 100644 tools/lkl/lib/jmp_buf.h
 create mode 100644 tools/lkl/lib/utils.c

diff --git a/tools/lkl/include/lkl_config.h b/tools/lkl/include/lkl_config.h
new file mode 100644
index 000000000000..d3edf8b414cf
--- /dev/null
+++ b/tools/lkl/include/lkl_config.h
@@ -0,0 +1,61 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_LIB_CONFIG_H
+#define _LKL_LIB_CONFIG_H
+
+#define LKL_CONFIG_JSON_TOKEN_MAX 300
+
+struct lkl_config_iface {
+	struct lkl_config_iface *next;
+	struct lkl_netdev *nd;
+
+	/* OBSOLETE: should use IFTYPE and IFPARAMS */
+	char *iftap;
+	char *iftype;
+	char *ifparams;
+	char *ifmtu_str;
+	char *ifip;
+	char *ifipv6;
+	char *ifgateway;
+	char *ifgateway6;
+	char *ifmac_str;
+	char *ifnetmask_len;
+	char *ifnetmask6_len;
+	char *ifoffload_str;
+	char *ifneigh_entries;
+	char *ifqdisc_entries;
+};
+
+struct lkl_config {
+	int ifnum;
+	struct lkl_config_iface *ifaces;
+
+	char *gateway;
+	char *gateway6;
+	char *debug;
+	char *mount;
+	/* single_cpu mode:
+	 * 0: Don't pin to single CPU (default).
+	 * 1: Pin only LKL kernel threads to single CPU.
+	 * 2: Pin all LKL threads to single CPU including all LKL kernel threads
+	 * and device polling threads. Avoid this mode if having busy polling
+	 * threads.
+	 *
+	 * mode 2 can achieve better TCP_RR but worse TCP_STREAM than mode 1.
+	 * You should choose the best for your application and virtio device
+	 * type.
+	 */
+	char *single_cpu;
+	char *sysctls;
+	char *boot_cmdline;
+	char *dump;
+	char *delay_main;
+};
+
+int lkl_load_config_json(struct lkl_config *cfg, char *jstr);
+int lkl_load_config_env(struct lkl_config *cfg);
+void lkl_show_config(struct lkl_config *cfg);
+int lkl_load_config_pre(struct lkl_config *cfg);
+int lkl_load_config_post(struct lkl_config *cfg);
+int lkl_unload_config(struct lkl_config *cfg);
+
+#endif /* _LKL_LIB_CONFIG_H */
diff --git a/tools/lkl/lib/config.c b/tools/lkl/lib/config.c
new file mode 100644
index 000000000000..76fccd598db9
--- /dev/null
+++ b/tools/lkl/lib/config.c
@@ -0,0 +1,793 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdlib.h>
+#define _HAVE_STRING_ARCH_strtok_r
+#include <string.h>
+#include <lkl_host.h>
+#include <lkl_config.h>
+
+#include "../../perf/pmu-events/jsmn.h"
+
+static int jsoneq(const char *json, jsmntok_t *tok, const char *s)
+{
+	if (tok->type == JSMN_STRING &&
+		(int) strlen(s) == tok->end - tok->start &&
+		strncmp(json + tok->start, s, tok->end - tok->start) == 0) {
+		return 0;
+	}
+	return -1;
+}
+
+static int cfgcpy(char **to, char *from)
+{
+	if (!from)
+		return 0;
+	if (*to)
+		free(*to);
+	*to = (char *)malloc((strlen(from) + 1) * sizeof(char));
+	if (*to == NULL) {
+		lkl_printf("malloc failed\n");
+		return -1;
+	}
+	strcpy(*to, from);
+	return 0;
+}
+
+static int cfgncpy(char **to, char *from, int len)
+{
+	if (!from)
+		return 0;
+	if (*to)
+		free(*to);
+	*to = (char *)malloc((len + 1) * sizeof(char));
+	if (*to == NULL) {
+		lkl_printf("malloc failed\n");
+		return -1;
+	}
+	strncpy(*to, from, len + 1);
+	(*to)[len] = '\0';
+	return 0;
+}
+
+static int parse_ifarr(struct lkl_config *cfg,
+		jsmntok_t *toks, char *jstr, int startpos)
+{
+	int ifidx, pos, posend, ret;
+	char **cfgptr;
+	struct lkl_config_iface *iface, *prev = NULL;
+
+	if (!cfg || !toks || !jstr)
+		return -1;
+	pos = startpos;
+	pos++;
+	if (toks[pos].type != JSMN_ARRAY) {
+		lkl_printf("unexpected json type, json array expected\n");
+		return -1;
+	}
+
+	cfg->ifnum = toks[pos].size;
+	pos++;
+	iface = cfg->ifaces;
+
+	for (ifidx = 0; ifidx < cfg->ifnum; ifidx++) {
+		if (toks[pos].type != JSMN_OBJECT) {
+			lkl_printf("object json type expected\n");
+			return -1;
+		}
+
+		posend = pos + toks[pos].size;
+		pos++;
+		iface = calloc(1, sizeof(struct lkl_config_iface));
+		if (!iface)
+			return -1;
+		if (prev)
+			prev->next = iface;
+		else
+			cfg->ifaces = iface;
+		prev = iface;
+
+		for (; pos < posend; pos += 2) {
+			if (toks[pos].type != JSMN_STRING) {
+				lkl_printf("string json type expected\n");
+				return -1;
+			}
+			if (jsoneq(jstr, &toks[pos], "type") == 0) {
+				cfgptr = &iface->iftype;
+			} else if (jsoneq(jstr, &toks[pos], "param") == 0) {
+				cfgptr = &iface->ifparams;
+			} else if (jsoneq(jstr, &toks[pos], "mtu") == 0) {
+				cfgptr = &iface->ifmtu_str;
+			} else if (jsoneq(jstr, &toks[pos], "ip") == 0) {
+				cfgptr = &iface->ifip;
+			} else if (jsoneq(jstr, &toks[pos], "ipv6") == 0) {
+				cfgptr = &iface->ifipv6;
+			} else if (jsoneq(jstr, &toks[pos], "ifgateway") == 0) {
+				cfgptr = &iface->ifgateway;
+			} else if (jsoneq(jstr, &toks[pos],
+							"ifgateway6") == 0) {
+				cfgptr = &iface->ifgateway6;
+			} else if (jsoneq(jstr, &toks[pos], "mac") == 0) {
+				cfgptr = &iface->ifmac_str;
+			} else if (jsoneq(jstr, &toks[pos], "masklen") == 0) {
+				cfgptr = &iface->ifnetmask_len;
+			} else if (jsoneq(jstr, &toks[pos], "masklen6") == 0) {
+				cfgptr = &iface->ifnetmask6_len;
+			} else if (jsoneq(jstr, &toks[pos], "neigh") == 0) {
+				cfgptr = &iface->ifneigh_entries;
+			} else if (jsoneq(jstr, &toks[pos], "qdisc") == 0) {
+				cfgptr = &iface->ifqdisc_entries;
+			} else if (jsoneq(jstr, &toks[pos], "offload") == 0) {
+				cfgptr = &iface->ifoffload_str;
+			} else {
+				lkl_printf("unexpected key: %.*s\n",
+						toks[pos].end-toks[pos].start,
+						jstr + toks[pos].start);
+				return -1;
+			}
+			ret = cfgncpy(cfgptr, jstr + toks[pos+1].start,
+					toks[pos+1].end-toks[pos+1].start);
+			if (ret < 0)
+				return ret;
+		}
+	}
+	return pos - startpos;
+}
+
+#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
+
+int lkl_load_config_json(struct lkl_config *cfg, char *jstr)
+{
+	int pos, ret;
+	char **cfgptr;
+	jsmn_parser jp;
+	jsmntok_t toks[LKL_CONFIG_JSON_TOKEN_MAX];
+
+	if (!cfg || !jstr)
+		return -1;
+	jsmn_init(&jp);
+	ret = jsmn_parse(&jp, jstr, strlen(jstr), toks, ARRAY_SIZE(toks));
+	if (ret != JSMN_SUCCESS) {
+		lkl_printf("failed to parse json\n");
+		return -1;
+	}
+	if (toks[0].type != JSMN_OBJECT) {
+		lkl_printf("object json type expected\n");
+		return -1;
+	}
+	for (pos = 1; pos < jp.toknext; pos++) {
+		if (toks[pos].type != JSMN_STRING) {
+			lkl_printf("string json type expected\n");
+			return -1;
+		}
+		if (jsoneq(jstr, &toks[pos], "interfaces") == 0) {
+			ret = parse_ifarr(cfg, toks, jstr, pos);
+			if (ret < 0)
+				return ret;
+			pos += ret;
+			pos--;
+			continue;
+		}
+		if (jsoneq(jstr, &toks[pos], "gateway") == 0) {
+			cfgptr = &cfg->gateway;
+		} else if (jsoneq(jstr, &toks[pos], "gateway6") == 0) {
+			cfgptr = &cfg->gateway6;
+		} else if (jsoneq(jstr, &toks[pos], "debug") == 0) {
+			cfgptr = &cfg->debug;
+		} else if (jsoneq(jstr, &toks[pos], "mount") == 0) {
+			cfgptr = &cfg->mount;
+		} else if (jsoneq(jstr, &toks[pos], "singlecpu") == 0) {
+			cfgptr = &cfg->single_cpu;
+		} else if (jsoneq(jstr, &toks[pos], "sysctl") == 0) {
+			cfgptr = &cfg->sysctls;
+		} else if (jsoneq(jstr, &toks[pos], "boot_cmdline") == 0) {
+			cfgptr = &cfg->boot_cmdline;
+		} else if (jsoneq(jstr, &toks[pos], "dump") == 0) {
+			cfgptr = &cfg->dump;
+		} else if (jsoneq(jstr, &toks[pos], "delay_main") == 0) {
+			cfgptr = &cfg->delay_main;
+		} else {
+			lkl_printf("unexpected key in json %.*s\n",
+					toks[pos].end-toks[pos].start,
+					jstr + toks[pos].start);
+			return -1;
+		}
+		pos++;
+		ret = cfgncpy(cfgptr, jstr + toks[pos].start,
+				toks[pos].end-toks[pos].start);
+		if (ret < 0)
+			return ret;
+	}
+	return 0;
+}
+
+void lkl_show_config(struct lkl_config *cfg)
+{
+	struct lkl_config_iface *iface;
+	int i = 0;
+
+	if (!cfg)
+		return;
+	lkl_printf("gateway: %s\n", cfg->gateway);
+	lkl_printf("gateway6: %s\n", cfg->gateway6);
+	lkl_printf("debug: %s\n", cfg->debug);
+	lkl_printf("mount: %s\n", cfg->mount);
+	lkl_printf("singlecpu: %s\n", cfg->single_cpu);
+	lkl_printf("sysctl: %s\n", cfg->sysctls);
+	lkl_printf("cmdline: %s\n", cfg->boot_cmdline);
+	lkl_printf("dump: %s\n", cfg->dump);
+	lkl_printf("delay: %s\n", cfg->delay_main);
+
+	for (iface = cfg->ifaces; iface; iface = iface->next, i++) {
+		lkl_printf("ifmac[%d] = %s\n", i, iface->ifmac_str);
+		lkl_printf("ifmtu[%d] = %s\n", i, iface->ifmtu_str);
+		lkl_printf("iftype[%d] = %s\n", i, iface->iftype);
+		lkl_printf("ifparam[%d] = %s\n", i, iface->ifparams);
+		lkl_printf("ifip[%d] = %s\n", i, iface->ifip);
+		lkl_printf("ifmasklen[%d] = %s\n", i, iface->ifnetmask_len);
+		lkl_printf("ifgateway[%d] = %s\n", i, iface->ifgateway);
+		lkl_printf("ifip6[%d] = %s\n", i, iface->ifipv6);
+		lkl_printf("ifmasklen6[%d] = %s\n", i, iface->ifnetmask6_len);
+		lkl_printf("ifgateway6[%d] = %s\n", i, iface->ifgateway6);
+		lkl_printf("ifoffload[%d] = %s\n", i, iface->ifoffload_str);
+		lkl_printf("ifneigh[%d] = %s\n", i, iface->ifneigh_entries);
+		lkl_printf("ifqdisc[%d] = %s\n", i, iface->ifqdisc_entries);
+	}
+}
+
+int lkl_load_config_env(struct lkl_config *cfg)
+{
+	int ret;
+	char *envtap = getenv("LKL_HIJACK_NET_TAP");
+	char *enviftype = getenv("LKL_HIJACK_NET_IFTYPE");
+	char *envifparams = getenv("LKL_HIJACK_NET_IFPARAMS");
+	char *envmtu_str = getenv("LKL_HIJACK_NET_MTU");
+	char *envip = getenv("LKL_HIJACK_NET_IP");
+	char *envipv6 = getenv("LKL_HIJACK_NET_IPV6");
+	char *envifgateway = getenv("LKL_HIJACK_NET_IFGATEWAY");
+	char *envifgateway6 = getenv("LKL_HIJACK_NET_IFGATEWAY6");
+	char *envmac_str = getenv("LKL_HIJACK_NET_MAC");
+	char *envnetmask_len = getenv("LKL_HIJACK_NET_NETMASK_LEN");
+	char *envnetmask6_len = getenv("LKL_HIJACK_NET_NETMASK6_LEN");
+	char *envgateway = getenv("LKL_HIJACK_NET_GATEWAY");
+	char *envgateway6 = getenv("LKL_HIJACK_NET_GATEWAY6");
+	char *envdebug = getenv("LKL_HIJACK_DEBUG");
+	char *envmount = getenv("LKL_HIJACK_MOUNT");
+	char *envneigh_entries = getenv("LKL_HIJACK_NET_NEIGHBOR");
+	char *envqdisc_entries = getenv("LKL_HIJACK_NET_QDISC");
+	char *envsingle_cpu = getenv("LKL_HIJACK_SINGLE_CPU");
+	char *envoffload_str = getenv("LKL_HIJACK_OFFLOAD");
+	char *envsysctls = getenv("LKL_HIJACK_SYSCTL");
+	char *envboot_cmdline = getenv("LKL_HIJACK_BOOT_CMDLINE") ? : "";
+	char *envdump = getenv("LKL_HIJACK_DUMP");
+	struct lkl_config_iface *iface;
+
+	if (!cfg)
+		return -1;
+	if (envtap || enviftype)
+		cfg->ifnum = 1;
+	iface = calloc(1, sizeof(struct lkl_config_iface));
+	if (!iface)
+		return -1;
+	cfg->ifaces = iface;	/* link it so pre/post/unload can see it */
+	ret = cfgcpy(&iface->iftap, envtap);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->iftype, enviftype);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifparams, envifparams);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifmtu_str, envmtu_str);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifip, envip);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifipv6, envipv6);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifgateway, envifgateway);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifgateway6, envifgateway6);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifmac_str, envmac_str);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifnetmask_len, envnetmask_len);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifnetmask6_len, envnetmask6_len);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifoffload_str, envoffload_str);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifneigh_entries, envneigh_entries);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifqdisc_entries, envqdisc_entries);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->gateway, envgateway);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->gateway6, envgateway6);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->debug, envdebug);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->mount, envmount);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->single_cpu, envsingle_cpu);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->sysctls, envsysctls);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->boot_cmdline, envboot_cmdline);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->dump, envdump);
+	if (ret < 0)
+		return ret;
+	return 0;
+}
+
+static int parse_mac_str(char *mac_str, __lkl__u8 mac[LKL_ETH_ALEN])
+{
+	char delim[] = ":";
+	char *saveptr = NULL, *token = NULL;
+	int i = 0;
+
+	if (!mac_str)
+		return 0;
+
+	for (token = strtok_r(mac_str, delim, &saveptr);
+	     i < LKL_ETH_ALEN; i++) {
+		if (!token) {
+			/* The address is too short */
+			return -1;
+		}
+
+		mac[i] = (__lkl__u8) strtol(token, NULL, 16);
+		token = strtok_r(NULL, delim, &saveptr);
+	}
+
+	if (token) {
+		/* The address is too long */
+		return -1;
+	}
+
+	return 1;
+}
+
+/* Add permanent neighbor entries in the form of "ip|mac;ip|mac;..." */
+static void add_neighbor(int ifindex, char *entries)
+{
+	char *saveptr = NULL, *saveptr2 = NULL, *token = NULL;
+	char *ip = NULL, *mac_str = NULL;
+	int ret = 0;
+	__lkl__u8 mac[LKL_ETH_ALEN];
+	char ip_addr[16];
+	int af;
+
+	for (token = strtok_r(entries, ";", &saveptr); token;
+	     token = strtok_r(NULL, ";", &saveptr)) {
+		ip = strtok_r(token, "|", &saveptr2);
+		mac_str = strtok_r(NULL, "|", &saveptr2);
+		if (!ip || !mac_str || strtok_r(NULL, "|", &saveptr2))
+			return;
+
+		af = LKL_AF_INET;
+		ret = inet_pton(LKL_AF_INET, ip, ip_addr);
+		if (ret == 0) {
+			ret = inet_pton(LKL_AF_INET6, ip, ip_addr);
+			af = LKL_AF_INET6;
+		}
+		if (ret != 1) {
+			lkl_printf("Bad ip address: %s\n", ip);
+			return;
+		}
+
+		ret = parse_mac_str(mac_str, mac);
+		if (ret != 1) {
+			lkl_printf("Failed to parse mac: %s\n", mac_str);
+			return;
+		}
+		ret = lkl_add_neighbor(ifindex, af, ip_addr, mac);
+		if (ret) {
+			lkl_printf("Failed to add neighbor entry: %s\n",
+				   lkl_strerror(ret));
+			return;
+		}
+	}
+}
+
+/* We don't have an easy way to make FILE*s out of our fds, so we
+ * can't use e.g. fgets
+ */
+static int dump_file(char *path)
+{
+	int ret = -1, bytes_read = 0;
+	char str[1024] = { 0 };
+	int fd;
+
+	fd = lkl_sys_open(path, LKL_O_RDONLY, 0);
+
+	if (fd < 0) {
+		lkl_printf("%s lkl_sys_open %s: %s\n",
+			   __func__, path, lkl_strerror(fd));
+		return -1;
+	}
+
+	/* Need to print this out in order to make sense of the output */
+	lkl_printf("Reading from %s:\n==========\n", path);
+	while ((ret = lkl_sys_read(fd, str, sizeof(str) - 1)) > 0)
+		bytes_read += lkl_printf("%.*s", ret, str);
+	lkl_printf("==========\n");
+
+	if (ret) {
+		lkl_printf("%s lkl_sys_read %s: %s\n",
+			   __func__, path, lkl_strerror(ret));
+		return -1;
+	}
+
+	return 0;
+}
+
+static void mount_cmds_exec(char *_cmds, int (*callback)(char *))
+{
+	char *saveptr = NULL, *token;
+	int ret = 0;
+	char *cmds = strdup(_cmds);
+
+	token = strtok_r(cmds, ",", &saveptr);
+
+	while (token && ret >= 0) {
+		ret = callback(token);
+		token = strtok_r(NULL, ",", &saveptr);
+	}
+
+	if (ret < 0)
+		lkl_printf("%s: failed parsing %s\n", __func__, _cmds);
+
+	free(cmds);
+}
+
+static int lkl_config_netdev_create(struct lkl_config *cfg,
+				    struct lkl_config_iface *iface)
+{
+	int ret, offload = 0;
+	struct lkl_netdev_args nd_args;
+	__lkl__u8 mac[LKL_ETH_ALEN] = {0};
+	struct lkl_netdev *nd = NULL;
+
+	if (iface->ifoffload_str)
+		offload = strtol(iface->ifoffload_str, NULL, 0);
+	memset(&nd_args, 0, sizeof(struct lkl_netdev_args));
+
+	if (iface->iftap) {
+		lkl_printf("WARN: LKL_HIJACK_NET_TAP is now obsolete.\n");
+		lkl_printf("use LKL_HIJACK_NET_IFTYPE and PARAMS\n");
+		nd = lkl_netdev_tap_create(iface->iftap, offload);
+	}
+
+	if (!nd && iface->iftype && iface->ifparams) {
+		if ((strcmp(iface->iftype, "tap") == 0)) {
+			nd = lkl_netdev_tap_create(iface->ifparams, offload);
+		} else if ((strcmp(iface->iftype, "macvtap") == 0)) {
+			nd = lkl_netdev_macvtap_create(iface->ifparams,
+						       offload);
+		} else if ((strcmp(iface->iftype, "dpdk") == 0)) {
+			nd = lkl_netdev_dpdk_create(iface->ifparams, offload,
+						    mac);
+		} else if ((strcmp(iface->iftype, "pipe") == 0)) {
+			nd = lkl_netdev_pipe_create(iface->ifparams, offload);
+		} else {
+			if (offload) {
+				lkl_printf("WARN: %s isn't supported on %s\n",
+					   "LKL_HIJACK_OFFLOAD",
+					   iface->iftype);
+				lkl_printf(
+					"WARN: Disabling offload features.\n");
+			}
+			offload = 0;
+		}
+		if (strcmp(iface->iftype, "vde") == 0)
+			nd = lkl_netdev_vde_create(iface->ifparams);
+		if (strcmp(iface->iftype, "raw") == 0)
+			nd = lkl_netdev_raw_create(iface->ifparams);
+	}
+
+	if (nd) {
+		if ((mac[0] != 0) || (mac[1] != 0) ||
+				(mac[2] != 0) || (mac[3] != 0) ||
+				(mac[4] != 0) || (mac[5] != 0)) {
+			nd_args.mac = mac;
+		} else {
+			ret = parse_mac_str(iface->ifmac_str, mac);
+
+			if (ret < 0) {
+				lkl_printf("failed to parse mac\n");
+				return -1;
+			} else if (ret > 0) {
+				nd_args.mac = mac;
+			} else {
+				nd_args.mac = NULL;
+			}
+		}
+
+		nd_args.offload = offload;
+		ret = lkl_netdev_add(nd, &nd_args);
+		if (ret < 0) {
+			lkl_printf("failed to add netdev: %s\n",
+				   lkl_strerror(ret));
+			return -1;
+		}
+		nd->id = ret;
+		iface->nd = nd;
+	}
+	return 0;
+}
+
+static int lkl_config_netdev_configure(struct lkl_config *cfg,
+				       struct lkl_config_iface *iface)
+{
+	int ret, nd_ifindex = -1;
+	struct lkl_netdev *nd = iface->nd;
+
+	if (!nd) {
+		lkl_printf("no netdev available %s\n", iface->ifparams ?
+			   iface->ifparams : "(null)");
+		return -1;
+	}
+
+	if (nd->id >= 0) {
+		nd_ifindex = lkl_netdev_get_ifindex(nd->id);
+		if (nd_ifindex > 0)
+			lkl_if_up(nd_ifindex);
+		else
+			lkl_printf(
+				"failed to get ifindex for netdev id %d: %s\n",
+				nd->id, lkl_strerror(nd_ifindex));
+	}
+
+	if (nd_ifindex >= 0 && iface->ifmtu_str) {
+		int mtu = atoi(iface->ifmtu_str);
+
+		ret = lkl_if_set_mtu(nd_ifindex, mtu);
+		if (ret < 0)
+			lkl_printf("failed to set MTU: %s\n",
+				   lkl_strerror(ret));
+	}
+
+	if (nd_ifindex >= 0 && iface->ifip && iface->ifnetmask_len) {
+		unsigned int addr = LKL_INADDR_NONE;
+
+		if (inet_pton(LKL_AF_INET, iface->ifip,
+			      (struct lkl_in_addr *)&addr) != 1)
+			lkl_printf("Invalid ipv4 address: %s\n", iface->ifip);
+
+		int nmlen = atoi(iface->ifnetmask_len);
+
+		if (addr != LKL_INADDR_NONE && nmlen > 0 && nmlen < 32) {
+			ret = lkl_if_set_ipv4(nd_ifindex, addr, nmlen);
+			if (ret < 0)
+				lkl_printf("failed to set IPv4 address: %s\n",
+					   lkl_strerror(ret));
+		}
+		if (iface->ifgateway) {
+			unsigned int gwaddr = LKL_INADDR_NONE;
+
+			if (inet_pton(LKL_AF_INET, iface->ifgateway,
+				      (struct lkl_in_addr *)&gwaddr) != 1)
+				lkl_printf("Invalid ipv4 gateway: %s\n",
+					   iface->ifgateway);
+
+			if (gwaddr != LKL_INADDR_NONE) {
+				ret = lkl_if_set_ipv4_gateway(nd_ifindex,
+						addr, nmlen, gwaddr);
+				if (ret < 0)
+					lkl_printf(
+						"failed to set v4 if gw: %s\n",
+						lkl_strerror(ret));
+			}
+		}
+	}
+
+	if (nd_ifindex >= 0 && iface->ifipv6 &&
+			iface->ifnetmask6_len) {
+		struct lkl_in6_addr addr;
+		unsigned int pflen = atoi(iface->ifnetmask6_len);
+
+		if (inet_pton(LKL_AF_INET6, iface->ifipv6,
+			      (struct lkl_in6_addr *)&addr) != 1) {
+			lkl_printf("Invalid ipv6 addr: %s\n",
+				   iface->ifipv6);
+		}  else {
+			ret = lkl_if_set_ipv6(nd_ifindex, &addr, pflen);
+			if (ret < 0)
+				lkl_printf("failed to set IPv6 address: %s\n",
+					   lkl_strerror(ret));
+		}
+		if (iface->ifgateway6) {
+			char gwaddr[16];
+
+			if (inet_pton(LKL_AF_INET6, iface->ifgateway6,
+								gwaddr) != 1) {
+				lkl_printf("Invalid ipv6 gateway: %s\n",
+					   iface->ifgateway6);
+			} else {
+				ret = lkl_if_set_ipv6_gateway(nd_ifindex,
+						&addr, pflen, gwaddr);
+				if (ret < 0)
+					lkl_printf(
+						"failed to set v6 if gw: %s\n",
+						lkl_strerror(ret));
+			}
+		}
+	}
+
+	if (nd_ifindex >= 0 && iface->ifneigh_entries)
+		add_neighbor(nd_ifindex, iface->ifneigh_entries);
+
+	if (nd_ifindex >= 0 && iface->ifqdisc_entries)
+		lkl_qdisc_parse_add(nd_ifindex, iface->ifqdisc_entries);
+
+	return 0;
+}
+
+static void free_cfgparam(char *cfgparam)
+{
+	if (cfgparam)
+		free(cfgparam);
+}
+
+static int lkl_clean_config(struct lkl_config *cfg)
+{
+	struct lkl_config_iface *iface;
+
+	if (!cfg)
+		return -1;
+
+	for (iface = cfg->ifaces; iface; iface = iface->next) {
+		free_cfgparam(iface->iftap);
+		free_cfgparam(iface->iftype);
+		free_cfgparam(iface->ifparams);
+		free_cfgparam(iface->ifmtu_str);
+		free_cfgparam(iface->ifip);
+		free_cfgparam(iface->ifipv6);
+		free_cfgparam(iface->ifgateway);
+		free_cfgparam(iface->ifgateway6);
+		free_cfgparam(iface->ifmac_str);
+		free_cfgparam(iface->ifnetmask_len);
+		free_cfgparam(iface->ifnetmask6_len);
+		free_cfgparam(iface->ifoffload_str);
+		free_cfgparam(iface->ifneigh_entries);
+		free_cfgparam(iface->ifqdisc_entries);
+	}
+	free_cfgparam(cfg->gateway);
+	free_cfgparam(cfg->gateway6);
+	free_cfgparam(cfg->debug);
+	free_cfgparam(cfg->mount);
+	free_cfgparam(cfg->single_cpu);
+	free_cfgparam(cfg->sysctls);
+	free_cfgparam(cfg->boot_cmdline);
+	free_cfgparam(cfg->dump);
+	free_cfgparam(cfg->delay_main);
+	return 0;
+}
+
+
+int lkl_load_config_pre(struct lkl_config *cfg)
+{
+	int lkl_debug, ret;
+	struct lkl_config_iface *iface;
+
+	if (!cfg)
+		return 0;
+
+	if (cfg->debug)
+		lkl_debug = strtol(cfg->debug, NULL, 0);
+
+	if (!cfg->debug || (lkl_debug == 0))
+		lkl_host_ops.print = NULL;
+
+	for (iface = cfg->ifaces; iface; iface = iface->next) {
+		ret = lkl_config_netdev_create(cfg, iface);
+		if (ret < 0)
+			return -1;
+	}
+
+	return 0;
+}
+
+int lkl_load_config_post(struct lkl_config *cfg)
+{
+	int ret;
+	struct lkl_config_iface *iface;
+
+	if (!cfg)
+		return 0;
+
+	if (cfg->mount)
+		mount_cmds_exec(cfg->mount, lkl_mount_fs);
+
+	for (iface = cfg->ifaces; iface; iface = iface->next) {
+		ret = lkl_config_netdev_configure(cfg, iface);
+		if (ret < 0)
+			break;
+	}
+
+	if (cfg->gateway) {
+		unsigned int gwaddr = LKL_INADDR_NONE;
+
+		if (inet_pton(LKL_AF_INET, cfg->gateway,
+			      (struct lkl_in_addr *)&gwaddr) != 1)
+			lkl_printf("Invalid ipv4 gateway: %s\n", cfg->gateway);
+
+		if (gwaddr != LKL_INADDR_NONE) {
+			ret = lkl_set_ipv4_gateway(gwaddr);
+			if (ret < 0)
+				lkl_printf("failed to set IPv4 gateway: %s\n",
+					   lkl_strerror(ret));
+		}
+	}
+
+	if (cfg->gateway6) {
+		char gw[16];
+
+		if (inet_pton(LKL_AF_INET6, cfg->gateway6, gw) != 1) {
+			lkl_printf("Invalid ipv6 gateway: %s\n", cfg->gateway6);
+		} else {
+			ret = lkl_set_ipv6_gateway(gw);
+			if (ret < 0)
+				lkl_printf("failed to set IPv6 gateway: %s\n",
+					   lkl_strerror(ret));
+		}
+	}
+
+	if (cfg->sysctls)
+		lkl_sysctl_parse_write(cfg->sysctls);
+
+	/* put a delay before calling main() */
+	if (cfg->delay_main) {
+		unsigned long delay = strtoul(cfg->delay_main, NULL, 10);
+
+		if (delay == ~0UL)
+			lkl_printf("got invalid delay_main value (%s)\n",
+				   cfg->delay_main);
+		else {
+			lkl_printf("sleeping %lu usec\n", delay);
+			usleep(delay);
+		}
+	}
+
+	return 0;
+}
+
+int lkl_unload_config(struct lkl_config *cfg)
+{
+	struct lkl_config_iface *iface;
+
+	if (cfg) {
+		if (cfg->dump)
+			mount_cmds_exec(cfg->dump, dump_file);
+
+		for (iface = cfg->ifaces; iface; iface = iface->next) {
+			if (iface->nd) {
+				if (iface->nd->id >= 0)
+					lkl_netdev_remove(iface->nd->id);
+				lkl_netdev_free(iface->nd);
+			}
+		}
+
+		lkl_clean_config(cfg);
+	}
+
+	return 0;
+}
diff --git a/tools/lkl/lib/dbg.c b/tools/lkl/lib/dbg.c
new file mode 100644
index 000000000000..b613353bce5c
--- /dev/null
+++ b/tools/lkl/lib/dbg.c
@@ -0,0 +1,300 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <errno.h>
+#include <lkl.h>
+#include <limits.h>
+#include <string.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+static const char *PROMOTE = "$";
+#define str(x) #x
+#define xstr(s) str(s)
+#define MAX_BUF 100
+static char cmd[MAX_BUF];
+static char argv[10][MAX_BUF];
+static int argc;
+static char cur_dir[MAX_BUF] = "/";
+
+static char *normalize_path(const char *src, size_t src_len)
+{
+	char *res;
+	unsigned int res_len;
+	const char *ptr = src;
+	const char *end = &src[src_len];
+	const char *next;
+
+	res = malloc((src_len > 0 ? src_len : 1) + 1);
+	res_len = 0;
+
+	for (ptr = src; ptr < end; ptr = next+1) {
+		size_t len;
+
+		next = memchr(ptr, '/', end-ptr);
+		if (next == NULL)
+			next = end;
+
+		len = next-ptr;
+		switch (len) {
+		case 2:
+			if (ptr[0] == '.' && ptr[1] == '.') {
+				/* res is not NUL-terminated: scan back */
+				while (res_len > 0 &&
+				       res[--res_len] != '/')
+					;
+				continue;
+			}
+			break;
+		case 1:
+			if (ptr[0] == '.')
+				continue;
+			break;
+		case 0:
+			continue;
+		}
+		res[res_len++] = '/';
+		memcpy(&res[res_len], ptr, len);
+		res_len += len;
+	}
+	if (res_len == 0)
+		res[res_len++] = '/';
+	res[res_len] = '\0';
+	return res;
+}
+
+static void build_path(char *path)
+{
+	char *npath;
+
+	strcpy(path, cur_dir);
+	if (argc >= 1) {
+		if (argv[0][0] == '/')
+			strncpy(path, argv[0], LKL_PATH_MAX);
+		else {
+			strncat(path, "/", LKL_PATH_MAX - strlen(path) - 1);
+			strncat(path, argv[0], LKL_PATH_MAX - strlen(path) - 1);
+		}
+	}
+	npath = normalize_path(path, strlen(path));
+	strcpy(path, npath);
+	free(npath);
+}
+
+static void help(void)
+{
+	const char *msg =
+		"cat FILE\n"
+		"\tShow content of FILE\n"
+		"cd [DIR]\n"
+		"\tChange directory to DIR\n"
+		"exit\n"
+		"\tExit the debug session\n"
+		"help\n"
+		"\tShow this message\n"
+		"ls [DIR]\n"
+		"\tList files in DIR\n"
+		"mount FSTYPE\n"
+		"\tMount FSTYPE as /FSTYPE\n"
+		"overwrite FILE\n"
+		"\tOverwrite content of FILE from stdin\n"
+		"pwd\n"
+		"\tShow current directory\n"
+		;
+	printf("%s", msg);
+}
+
+static void ls(void)
+{
+	char path[LKL_PATH_MAX];
+	struct lkl_dir *dir;
+	struct lkl_linux_dirent64 *de;
+	int err;
+
+	build_path(path);
+	dir = lkl_opendir(path, &err);
+	if (dir) {
+		do {
+			de = lkl_readdir(dir);
+			if (de) {
+				printf("%s\n", de->d_name);
+			} else {
+				err = lkl_errdir(dir);
+				if (err != 0) {
+					fprintf(stderr, "%s\n",
+						lkl_strerror(err));
+				}
+				break;
+			}
+		} while (1);
+		lkl_closedir(dir);
+	} else {
+		fprintf(stderr, "%s: %s\n", path, lkl_strerror(err));
+	}
+}
+
+static void cd(void)
+{
+	char path[LKL_PATH_MAX];
+	struct lkl_dir *dir;
+	int err;
+
+	build_path(path);
+	dir = lkl_opendir(path, &err);
+	if (dir) {
+		strcpy(cur_dir, path);
+		lkl_closedir(dir);
+	} else {
+		fprintf(stderr, "%s: %s\n", path, lkl_strerror(err));
+	}
+}
+
+static void mount(void)
+{
+	char *fstype;
+	int ret = 0;
+
+	if (argc != 1) {
+		fprintf(stderr, "%s\n", "One argument is needed.");
+		return;
+	}
+
+	fstype = argv[0];
+	ret = lkl_mount_fs(fstype);
+	if (ret == 1)
+		fprintf(stderr, "%s is already mounted.\n", fstype);
+}
+
+static void cat(void)
+{
+	char path[LKL_PATH_MAX];
+	int ret;
+	char buf[1024];
+	int fd;
+
+	if (argc != 1) {
+		fprintf(stderr, "%s\n", "One argument is needed.");
+		return;
+	}
+
+	build_path(path);
+	fd = lkl_sys_open(path, LKL_O_RDONLY, 0);
+
+	if (fd < 0) {
+		fprintf(stderr, "lkl_sys_open %s: %s\n",
+			path, lkl_strerror(fd));
+		return;
+	}
+
+	while ((ret = lkl_sys_read(fd, buf, sizeof(buf) - 1)) > 0) {
+		buf[ret] = '\0';
+		printf("%s", buf);
+	}
+
+	if (ret) {
+		fprintf(stderr, "lkl_sys_read %s: %s\n",
+			path, lkl_strerror(ret));
+	}
+	lkl_sys_close(fd);
+}
+
+static void overwrite(void)
+{
+	char path[LKL_PATH_MAX];
+	int ret;
+	int fd;
+	char buf[1024];
+
+	build_path(path);
+	fd = lkl_sys_open(path, LKL_O_WRONLY | LKL_O_CREAT, 0);
+	if (fd < 0) {
+		fprintf(stderr, "lkl_sys_open %s: %s\n",
+			path, lkl_strerror(fd));
+		return;
+	}
+	printf("Input the content and stop by hitting Ctrl-D:\n");
+	while (fgets(buf, 1023, stdin)) {
+		ret = lkl_sys_write(fd, buf, strlen(buf));
+		if (ret < 0) {
+			fprintf(stderr, "lkl_sys_write %s: %s\n",
+				path, lkl_strerror(ret));
+		}
+	}
+	lkl_sys_close(fd);
+}
+
+static void pwd(void)
+{
+	printf("%s\n", cur_dir);
+}
+
+static int parse_cmd(char *input)
+{
+	char *token;
+
+	token = strtok(input, " ");
+	if (token)
+		strcpy(cmd, token);
+	else
+		return -1;
+
+	argc = 0;
+	token = strtok(NULL, " ");
+	while (token) {
+		if (argc >= 10) {
+			fprintf(stderr, "Too many args > 10\n");
+			return -1;
+		}
+		strcpy(argv[argc++], token);
+		token = strtok(NULL, " ");
+	}
+	return 0;
+}
+
+static void run_cmd(void)
+{
+	if (strcmp(cmd, "cat") == 0)
+		cat();
+	else if (strcmp(cmd, "cd") == 0)
+		cd();
+	else if (strcmp(cmd, "help") == 0)
+		help();
+	else if (strcmp(cmd, "ls") == 0)
+		ls();
+	else if (strcmp(cmd, "mount") == 0)
+		mount();
+	else if (strcmp(cmd, "overwrite") == 0)
+		overwrite();
+	else if (strcmp(cmd, "pwd") == 0)
+		pwd();
+	else
+		fprintf(stderr, "Unknown command: %s\n", cmd);
+}
+
+void dbg_entrance(void)
+{
+	char input[MAX_BUF + 1];
+	int ret;
+	int c;
+
+	printf("Type help to see a list of commands\n");
+	do {
+		printf("%s ", PROMOTE);
+		ret = scanf("%" xstr(MAX_BUF) "[^\n]", input);
+		while ((c = getchar()) != '\n' && c != EOF)
+			;
+		if (ret == 0)
+			continue;
+		if (ret != 1 && errno != EINTR) {
+			perror("scanf");
+			continue;
+		}
+		if (strlen(input) == MAX_BUF) {
+			fprintf(stderr, "Too long input > %d\n", MAX_BUF - 1);
+			continue;
+		}
+		if (parse_cmd(input))
+			continue;
+		if (strcmp(cmd, "exit") == 0)
+			break;
+		run_cmd();
+	} while (1);
+}
diff --git a/tools/lkl/lib/dbg_handler.c b/tools/lkl/lib/dbg_handler.c
new file mode 100644
index 000000000000..01d165a5fc1e
--- /dev/null
+++ b/tools/lkl/lib/dbg_handler.c
@@ -0,0 +1,44 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <lkl_host.h>
+
+extern void dbg_entrance(void);
+static int dbg_running;
+
+static void dbg_thread(void *arg)
+{
+	lkl_host_ops.thread_detach();
+	printf("======Enter Debug======\n");
+	dbg_entrance();
+	printf("======Exit Debug======\n");
+	dbg_running = 0;
+}
+
+void dbg_handler(int signum)
+{
+	/* We don't care about the possible race on dbg_running. */
+	if (dbg_running) {
+		fprintf(stderr, "A debug lib is running\n");
+		return;
+	}
+	dbg_running = 1;
+	lkl_host_ops.thread_create(&dbg_thread, NULL);
+}
+
+#ifndef __MINGW32__
+#include <signal.h>
+void lkl_register_dbg_handler(void)
+{
+	struct sigaction sa = { 0 };
+
+	sigemptyset(&sa.sa_mask);
+	sa.sa_handler = dbg_handler;
+	if (sigaction(SIGTSTP, &sa, NULL) == -1)
+		perror("sigaction");
+}
+#else
+void lkl_register_dbg_handler(void)
+{
+	fprintf(stderr, "%s is not implemented.\n", __func__);
+}
+#endif
diff --git a/tools/lkl/lib/endian.h b/tools/lkl/lib/endian.h
new file mode 100644
index 000000000000..aaccfa0edb65
--- /dev/null
+++ b/tools/lkl/lib/endian.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_LIB_ENDIAN_H
+#define _LKL_LIB_ENDIAN_H
+
+#if defined(__FreeBSD__)
+#include <sys/endian.h>
+#elif defined(__ANDROID__)
+#include <sys/endian.h>
+#elif defined(__MINGW32__)
+#include <winsock.h>
+#define le32toh(x) (x)
+#define le16toh(x) (x)
+#define htole32(x) (x)
+#define htole16(x) (x)
+#define le64toh(x) (x)
+#define htobe32(x) htonl(x)
+#define htobe16(x) htons(x)
+#define be32toh(x) ntohl(x)
+#define be16toh(x) ntohs(x)
+#else
+#include <endian.h>
+#endif
+
+#ifndef htonl
+#define htonl(x) htobe32(x)
+#define htons(x) htobe16(x)
+#define ntohl(x) be32toh(x)
+#define ntohs(x) be16toh(x)
+#endif
+
+#endif /* _LKL_LIB_ENDIAN_H */
diff --git a/tools/lkl/lib/jmp_buf.c b/tools/lkl/lib/jmp_buf.c
new file mode 100644
index 000000000000..f6bdd7e4bd83
--- /dev/null
+++ b/tools/lkl/lib/jmp_buf.c
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <setjmp.h>
+#include <lkl_host.h>
+
+void jmp_buf_set(struct lkl_jmp_buf *jmpb, void (*f)(void))
+{
+	if (!setjmp(*((jmp_buf *)jmpb->buf)))
+		f();
+}
+
+void jmp_buf_longjmp(struct lkl_jmp_buf *jmpb, int val)
+{
+	longjmp(*((jmp_buf *)jmpb->buf), val);
+}
diff --git a/tools/lkl/lib/jmp_buf.h b/tools/lkl/lib/jmp_buf.h
new file mode 100644
index 000000000000..8782cbaaf51f
--- /dev/null
+++ b/tools/lkl/lib/jmp_buf.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_LIB_JMP_BUF_H
+#define _LKL_LIB_JMP_BUF_H
+
+void jmp_buf_set(struct lkl_jmp_buf *jmpb, void (*f)(void));
+void jmp_buf_longjmp(struct lkl_jmp_buf *jmpb, int val);
+
+#endif
diff --git a/tools/lkl/lib/utils.c b/tools/lkl/lib/utils.c
new file mode 100644
index 000000000000..7de92bbe5475
--- /dev/null
+++ b/tools/lkl/lib/utils.c
@@ -0,0 +1,266 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdarg.h>
+#include <stdio.h>
+#include <string.h>
+#include <lkl_host.h>
+
+static const char * const lkl_err_strings[] = {
+	"Success",
+	"Operation not permitted",
+	"No such file or directory",
+	"No such process",
+	"Interrupted system call",
+	"I/O error",
+	"No such device or address",
+	"Argument list too long",
+	"Exec format error",
+	"Bad file number",
+	"No child processes",
+	"Try again",
+	"Out of memory",
+	"Permission denied",
+	"Bad address",
+	"Block device required",
+	"Device or resource busy",
+	"File exists",
+	"Cross-device link",
+	"No such device",
+	"Not a directory",
+	"Is a directory",
+	"Invalid argument",
+	"File table overflow",
+	"Too many open files",
+	"Not a typewriter",
+	"Text file busy",
+	"File too large",
+	"No space left on device",
+	"Illegal seek",
+	"Read-only file system",
+	"Too many links",
+	"Broken pipe",
+	"Math argument out of domain of func",
+	"Math result not representable",
+	"Resource deadlock would occur",
+	"File name too long",
+	"No record locks available",
+	"Invalid system call number",
+	"Directory not empty",
+	"Too many symbolic links encountered",
+	"Bad error code", /* EWOULDBLOCK is EAGAIN */
+	"No message of desired type",
+	"Identifier removed",
+	"Channel number out of range",
+	"Level 2 not synchronized",
+	"Level 3 halted",
+	"Level 3 reset",
+	"Link number out of range",
+	"Protocol driver not attached",
+	"No CSI structure available",
+	"Level 2 halted",
+	"Invalid exchange",
+	"Invalid request descriptor",
+	"Exchange full",
+	"No anode",
+	"Invalid request code",
+	"Invalid slot",
+	"Bad error code", /* EDEADLOCK is EDEADLK */
+	"Bad font file format",
+	"Device not a stream",
+	"No data available",
+	"Timer expired",
+	"Out of streams resources",
+	"Machine is not on the network",
+	"Package not installed",
+	"Object is remote",
+	"Link has been severed",
+	"Advertise error",
+	"Srmount error",
+	"Communication error on send",
+	"Protocol error",
+	"Multihop attempted",
+	"RFS specific error",
+	"Not a data message",
+	"Value too large for defined data type",
+	"Name not unique on network",
+	"File descriptor in bad state",
+	"Remote address changed",
+	"Can not access a needed shared library",
+	"Accessing a corrupted shared library",
+	".lib section in a.out corrupted",
+	"Attempting to link in too many shared libraries",
+	"Cannot exec a shared library directly",
+	"Illegal byte sequence",
+	"Interrupted system call should be restarted",
+	"Streams pipe error",
+	"Too many users",
+	"Socket operation on non-socket",
+	"Destination address required",
+	"Message too long",
+	"Protocol wrong type for socket",
+	"Protocol not available",
+	"Protocol not supported",
+	"Socket type not supported",
+	"Operation not supported on transport endpoint",
+	"Protocol family not supported",
+	"Address family not supported by protocol",
+	"Address already in use",
+	"Cannot assign requested address",
+	"Network is down",
+	"Network is unreachable",
+	"Network dropped connection because of reset",
+	"Software caused connection abort",
+	"Connection reset by peer",
+	"No buffer space available",
+	"Transport endpoint is already connected",
+	"Transport endpoint is not connected",
+	"Cannot send after transport endpoint shutdown",
+	"Too many references: cannot splice",
+	"Connection timed out",
+	"Connection refused",
+	"Host is down",
+	"No route to host",
+	"Operation already in progress",
+	"Operation now in progress",
+	"Stale file handle",
+	"Structure needs cleaning",
+	"Not a XENIX named type file",
+	"No XENIX semaphores available",
+	"Is a named type file",
+	"Remote I/O error",
+	"Quota exceeded",
+	"No medium found",
+	"Wrong medium type",
+	"Operation Canceled",
+	"Required key not available",
+	"Key has expired",
+	"Key has been revoked",
+	"Key was rejected by service",
+	"Owner died",
+	"State not recoverable",
+	"Operation not possible due to RF-kill",
+	"Memory page has hardware error",
+};
+
+const char *lkl_strerror(int err)
+{
+	if (err < 0)
+		err = -err;
+
+	if ((size_t)err >= sizeof(lkl_err_strings) / sizeof(const char *))
+		return "Bad error code";
+
+	return lkl_err_strings[err];
+}
+
+void lkl_perror(char *msg, int err)
+{
+	const char *err_msg = lkl_strerror(err);
+	/* We need to use 'real' printf because lkl_host_ops.print can
+	 * be turned off when debugging is off.
+	 */
+	lkl_printf("%s: %s\n", msg, err_msg);
+}
+
+static int lkl_vprintf(const char *fmt, va_list args)
+{
+	int n;
+	char *buffer;
+	va_list copy;
+
+	if (!lkl_host_ops.print)
+		return 0;
+
+	va_copy(copy, args);
+	n = vsnprintf(NULL, 0, fmt, copy);
+	va_end(copy);
+
+	buffer = lkl_host_ops.mem_alloc(n + 1);
+	if (!buffer)
+		return -1;
+
+	vsnprintf(buffer, n + 1, fmt, args);
+
+	lkl_host_ops.print(buffer, n);
+	lkl_host_ops.mem_free(buffer);
+
+	return n;
+}
+
+int lkl_printf(const char *fmt, ...)
+{
+	int n;
+	va_list args;
+
+	va_start(args, fmt);
+	n = lkl_vprintf(fmt, args);
+	va_end(args);
+
+	return n;
+}
+
+void lkl_bug(const char *fmt, ...)
+{
+	va_list args;
+
+	va_start(args, fmt);
+	lkl_vprintf(fmt, args);
+	va_end(args);
+
+	lkl_host_ops.panic();
+}
+#ifndef __arch_um__
+int lkl_sysctl(const char *path, const char *value)
+{
+	int ret;
+	int fd;
+	char *delim, *p;
+	char full_path[256];
+
+	lkl_mount_fs("proc");
+
+	snprintf(full_path, sizeof(full_path), "/proc/sys/%s", path);
+	p = full_path;
+	while ((delim = strstr(p, "."))) {
+		*delim = '/';
+		p = delim + 1;
+	}
+
+	fd = lkl_sys_open(full_path, LKL_O_WRONLY | LKL_O_CREAT, 0);
+	if (fd < 0) {
+		lkl_printf("lkl_sys_open %s: %s\n",
+			   full_path, lkl_strerror(fd));
+		return -1;
+	}
+	ret = lkl_sys_write(fd, value, strlen(value));
+	if (ret < 0) {
+		lkl_printf("lkl_sys_write %s: %s\n",
+			full_path, lkl_strerror(ret));
+	}
+
+	lkl_sys_close(fd);
+
+	return 0;
+}
+
+/* Configure sysctl parameters given in the form "key=value;key=value;..." */
+void lkl_sysctl_parse_write(const char *sysctls)
+{
+	char *saveptr = NULL, *token = NULL;
+	char *key = NULL, *value = NULL;
+	char strings[256];
+	int ret = 0;
+
+	snprintf(strings, sizeof(strings), "%s", sysctls);
+	for (token = strtok_r(strings, ";", &saveptr); token;
+	     token = strtok_r(NULL, ";", &saveptr)) {
+		key = strtok(token, "=");
+		value = strtok(NULL, "=");
+		ret = lkl_sysctl(key, value);
+		if (ret) {
+			lkl_printf("Failed to configure sysctl entries: %s\n",
+				   lkl_strerror(ret));
+			return;
+		}
+	}
+}
+#endif
-- 
2.20.1 (Apple Git-117)


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um



* [RFC PATCH 16/47] lkl tools: host lib: memory mapped I/O helpers
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (14 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 15/47] lkl tools: host lib: add utilities functions Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 17/47] lkl tools: host lib: virtio devices Hajime Tazaki
                   ` (33 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Akira Moroo

From: Octavian Purdila <tavi.purdila@gmail.com>

This patch adds helpers for implementing the memory mapped I/O host
operations, for use by code that implements host devices. Generic
lkl_ioremap and lkl_iomem_access host operations are provided that
allow multiplexing multiple memory mapped I/O regions.

The host device code can create a new memory mapped I/O region with
register_iomem(). Read and write access functions need to be provided
by the caller.
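
As a rough illustration of the multiplexing scheme, here is a standalone
sketch that does not use the LKL headers. It mirrors the layout in iomem.c
(region index in the bits above bit 24, offset in the low 24 bits), but the
names demo_register(), demo_access() and the toy device are hypothetical
stand-ins, not the functions added by this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed to match the patch: 24 offset bits, up to 256 regions. */
#define OFFSET_BITS 24
#define MAX_REGIONS 256

struct iomem_ops {
	int (*read)(void *data, int offset, void *res, int size);
	int (*write)(void *data, int offset, void *value, int size);
};

static struct region {
	void *data;
	int size;
	const struct iomem_ops *ops;
} regions[MAX_REGIONS];

/* Returns a fake base address carrying the region index, or 0 on failure. */
static uintptr_t demo_register(void *data, int size,
			       const struct iomem_ops *ops)
{
	uintptr_t i;

	if (size > (1 << OFFSET_BITS) - 1)
		return 0;

	/* Index 0 is never handed out, so a zero address means "none". */
	for (i = 1; i < MAX_REGIONS; i++) {
		if (!regions[i].ops) {
			regions[i].data = data;
			regions[i].size = size;
			regions[i].ops = ops;
			return i << OFFSET_BITS;
		}
	}
	return 0;
}

/* Splits the address into index and offset, then calls the region's ops. */
static int demo_access(uintptr_t addr, void *res, int size, int write)
{
	uintptr_t index = addr >> OFFSET_BITS;
	int offset = (int)(addr & (((uintptr_t)1 << OFFSET_BITS) - 1));
	struct region *r;

	if (index >= MAX_REGIONS)
		return -1;
	r = &regions[index];
	if (!r->ops || offset + size > r->size)
		return -1;

	return write ? r->ops->write(r->data, offset, res, size)
		     : r->ops->read(r->data, offset, res, size);
}

/* Toy device: 16 bytes of register space backed by plain memory. */
static uint8_t toy_regs[16];

static int toy_read(void *data, int offset, void *res, int size)
{
	memcpy(res, (uint8_t *)data + offset, size);
	return 0;
}

static int toy_write(void *data, int offset, void *value, int size)
{
	memcpy((uint8_t *)data + offset, value, size);
	return 0;
}

static const struct iomem_ops toy_ops = {
	.read = toy_read,
	.write = toy_write,
};

/* Writes a 32-bit value at offset 4 and reads it back; 0 on success. */
static int demo_roundtrip(void)
{
	uintptr_t base = demo_register(toy_regs, (int)sizeof(toy_regs),
				       &toy_ops);
	uint32_t in = 0x12345678, out = 0;

	assert(base != 0);
	if (demo_access(base + 4, &in, sizeof(in), 1) < 0)
		return -1;
	if (demo_access(base + 4, &out, sizeof(out), 0) < 0)
		return -1;
	return out == in ? 0 : -1;
}
```

Calling demo_roundtrip() from a main() exercises one write and one read
through the table. Because the region index lives in the address itself, an
access needs no search, only a shift and a mask.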

Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/lib/iomem.c | 88 +++++++++++++++++++++++++++++++++++++++++++
 tools/lkl/lib/iomem.h | 15 ++++++++
 2 files changed, 103 insertions(+)
 create mode 100644 tools/lkl/lib/iomem.c
 create mode 100644 tools/lkl/lib/iomem.h

diff --git a/tools/lkl/lib/iomem.c b/tools/lkl/lib/iomem.c
new file mode 100644
index 000000000000..2301fe4e5ad5
--- /dev/null
+++ b/tools/lkl/lib/iomem.c
@@ -0,0 +1,88 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <string.h>
+#include <stdint.h>
+#include <lkl_host.h>
+
+#include "iomem.h"
+
+#define IOMEM_OFFSET_BITS		24
+#define MAX_IOMEM_REGIONS		256
+
+#define IOMEM_ADDR_TO_INDEX(addr) \
+	(((uintptr_t)(addr)) >> IOMEM_OFFSET_BITS)
+#define IOMEM_ADDR_TO_OFFSET(addr) \
+	(((uintptr_t)(addr)) & ((1 << IOMEM_OFFSET_BITS) - 1))
+#define IOMEM_INDEX_TO_ADDR(i) \
+	((void *)((uintptr_t)(i) << IOMEM_OFFSET_BITS))
+
+static struct iomem_region {
+	void *data;
+	int size;
+	const struct lkl_iomem_ops *ops;
+} iomem_regions[MAX_IOMEM_REGIONS];
+
+void *register_iomem(void *data, int size, const struct lkl_iomem_ops *ops)
+{
+	int i;
+
+	if (size > (1 << IOMEM_OFFSET_BITS) - 1)
+		return NULL;
+
+	for (i = 1; i < MAX_IOMEM_REGIONS; i++)
+		if (!iomem_regions[i].ops)
+			break;
+
+	if (i >= MAX_IOMEM_REGIONS)
+		return NULL;
+
+	iomem_regions[i].data = data;
+	iomem_regions[i].size = size;
+	iomem_regions[i].ops = ops;
+	return IOMEM_INDEX_TO_ADDR(i);
+}
+
+void unregister_iomem(void *base)
+{
+	unsigned int index = IOMEM_ADDR_TO_INDEX(base);
+
+	if (index >= MAX_IOMEM_REGIONS) {
+		lkl_printf("%s: invalid iomem_addr %p\n", __func__, base);
+		return;
+	}
+
+	iomem_regions[index].size = 0;
+	iomem_regions[index].ops = NULL;
+}
+
+void *lkl_ioremap(long addr, int size)
+{
+	int index = IOMEM_ADDR_TO_INDEX(addr);
+	struct iomem_region *iomem = &iomem_regions[index];
+
+	if (index >= MAX_IOMEM_REGIONS)
+		return NULL;
+
+	if (iomem->ops && size <= iomem->size)
+		return IOMEM_INDEX_TO_ADDR(index);
+
+	return NULL;
+}
+
+int lkl_iomem_access(const volatile void *addr, void *res, int size, int write)
+{
+	int index = IOMEM_ADDR_TO_INDEX(addr);
+	struct iomem_region *iomem = &iomem_regions[index];
+	int offset = IOMEM_ADDR_TO_OFFSET(addr);
+	int ret;
+
+	if (index >= MAX_IOMEM_REGIONS || !iomem_regions[index].ops ||
+	    offset + size > iomem_regions[index].size)
+		return -1;
+
+	if (write)
+		ret = iomem->ops->write(iomem->data, offset, res, size);
+	else
+		ret = iomem->ops->read(iomem->data, offset, res, size);
+
+	return ret;
+}
diff --git a/tools/lkl/lib/iomem.h b/tools/lkl/lib/iomem.h
new file mode 100644
index 000000000000..0ad80ccc2626
--- /dev/null
+++ b/tools/lkl/lib/iomem.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_LIB_IOMEM_H
+#define _LKL_LIB_IOMEM_H
+
+struct lkl_iomem_ops {
+	int (*read)(void *data, int offset, void *res, int size);
+	int (*write)(void *data, int offset, void *value, int size);
+};
+
+void *register_iomem(void *data, int size, const struct lkl_iomem_ops *ops);
+void unregister_iomem(void *iomem_base);
+void *lkl_ioremap(long addr, int size);
+int lkl_iomem_access(const volatile void *addr, void *res, int size, int write);
+
+#endif /* _LKL_LIB_IOMEM_H */
-- 
2.20.1 (Apple Git-117)





* [RFC PATCH 17/47] lkl tools: host lib: virtio devices
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (15 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 16/47] lkl tools: host lib: memory mapped I/O helpers Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 18/47] lkl tools: host lib: virtio block device Hajime Tazaki
                   ` (32 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um
  Cc: H . K . Jerry Chu, Conrad Meyer, Octavian Purdila, Akira Moroo,
	Yuan Liu, Patrick Collins, Michael Zimmermann, Hajime Tazaki

From: Octavian Purdila <tavi.purdila@gmail.com>

Add helpers for implementing host virtio devices. They use the memory
mapped I/O helpers to interact with the Linux MMIO virtio transport
driver and offer support for setting up and adding a new virtio device,
dispatching requests from the incoming queues, and completing
requests.
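
When a request is completed, the device side has to decide whether to
interrupt the driver. With the event index feature, that decision comes
down to the wrap-safe comparison done by vring_need_event() in the Linux
UAPI header linux/virtio_ring.h. The sketch below restates that arithmetic
in a standalone helper; the names need_event() and demo_need_event() are
hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 when event_idx lies in the half-open window [old_idx, new_idx),
 * computed modulo 2^16 so that it keeps working when the indices wrap.
 */
static int need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
	return (uint16_t)(new_idx - event_idx - 1) <
	       (uint16_t)(new_idx - old_idx);
}

/* A few hand-checked cases, including one that crosses the 16-bit wrap. */
static void demo_need_event(void)
{
	/* used idx moved 5 -> 8; driver asked for an irq at entry 6 */
	assert(need_event(6, 8, 5) == 1);
	/* driver asked for entry 8, which has not been filled yet */
	assert(need_event(8, 8, 5) == 0);
	/* entry 4 was already covered by an earlier signal */
	assert(need_event(4, 8, 5) == 0);
	/* used idx moved 0xfffe -> 0x0001, event at 0xffff */
	assert(need_event(0xffff, 0x0001, 0xfffe) == 1);
}
```

Here old_idx plays the role of last_used_idx_signaled and new_idx the
current used index, so an interrupt is raised only the first time the used
index passes the entry the driver asked about.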

All added virtio devices are stored in lkl_virtio_devs as strings, per
the Linux MMIO virtio transport driver command line specification.
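
For reference, the virtio_mmio kernel parameter takes entries of the form
virtio_mmio.device=<size>@<baseaddr>:<irq>[:<id>]. The sketch below shows
how one such entry could be formatted. The helper name
format_virtio_mmio_dev() and the values used are hypothetical; the exact
string built by virtio.c is not reproduced here.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Formats one " virtio_mmio.device=<size>@<baseaddr>:<irq>" entry into buf.
 * Returns the snprintf() result, i.e. the length the full entry needs.
 */
static int format_virtio_mmio_dev(char *buf, size_t len,
				  unsigned int mmio_size,
				  uintptr_t base, int irq)
{
	return snprintf(buf, len, " virtio_mmio.device=%u@0x%" PRIxPTR ":%d",
			mmio_size, base, irq);
}

/* Formats a made-up device: 512 bytes of MMIO at 0x1000000, irq 5. */
static void demo_format(void)
{
	char buf[64];
	int n = format_virtio_mmio_dev(buf, sizeof(buf), 0x200,
				       (uintptr_t)0x1000000, 5);

	assert(n > 0 && (size_t)n < sizeof(buf));
	assert(strcmp(buf, " virtio_mmio.device=512@0x1000000:5") == 0);
}
```

The leading space lets several entries be appended to one command line
string without extra separators.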

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/lib/virtio.c | 631 +++++++++++++++++++++++++++++++++++++++++
 tools/lkl/lib/virtio.h |  93 ++++++
 2 files changed, 724 insertions(+)
 create mode 100644 tools/lkl/lib/virtio.c
 create mode 100644 tools/lkl/lib/virtio.h

diff --git a/tools/lkl/lib/virtio.c b/tools/lkl/lib/virtio.c
new file mode 100644
index 000000000000..4b3dbba607c3
--- /dev/null
+++ b/tools/lkl/lib/virtio.c
@@ -0,0 +1,631 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <string.h>
+#include <stdio.h>
+#include <stdbool.h>
+#include <inttypes.h>
+#include <lkl_host.h>
+#include <lkl/linux/virtio_ring.h>
+#include "iomem.h"
+#include "virtio.h"
+#include "endian.h"
+
+#define VIRTIO_DEV_MAGIC		0x74726976
+#define VIRTIO_DEV_VERSION		2
+
+#define VIRTIO_MMIO_MAGIC_VALUE		0x000
+#define VIRTIO_MMIO_VERSION		0x004
+#define VIRTIO_MMIO_DEVICE_ID		0x008
+#define VIRTIO_MMIO_VENDOR_ID		0x00c
+#define VIRTIO_MMIO_DEVICE_FEATURES	0x010
+#define VIRTIO_MMIO_DEVICE_FEATURES_SEL 0x014
+#define VIRTIO_MMIO_DRIVER_FEATURES	0x020
+#define VIRTIO_MMIO_DRIVER_FEATURES_SEL 0x024
+#define VIRTIO_MMIO_QUEUE_SEL		0x030
+#define VIRTIO_MMIO_QUEUE_NUM_MAX	0x034
+#define VIRTIO_MMIO_QUEUE_NUM		0x038
+#define VIRTIO_MMIO_QUEUE_READY		0x044
+#define VIRTIO_MMIO_QUEUE_NOTIFY	0x050
+#define VIRTIO_MMIO_INTERRUPT_STATUS	0x060
+#define VIRTIO_MMIO_INTERRUPT_ACK	0x064
+#define VIRTIO_MMIO_STATUS		0x070
+#define VIRTIO_MMIO_QUEUE_DESC_LOW	0x080
+#define VIRTIO_MMIO_QUEUE_DESC_HIGH	0x084
+#define VIRTIO_MMIO_QUEUE_AVAIL_LOW	0x090
+#define VIRTIO_MMIO_QUEUE_AVAIL_HIGH	0x094
+#define VIRTIO_MMIO_QUEUE_USED_LOW	0x0a0
+#define VIRTIO_MMIO_QUEUE_USED_HIGH	0x0a4
+#define VIRTIO_MMIO_CONFIG_GENERATION	0x0fc
+#define VIRTIO_MMIO_CONFIG		0x100
+#define VIRTIO_MMIO_INT_VRING		0x01
+#define VIRTIO_MMIO_INT_CONFIG		0x02
+
+#define BIT(x) (1ULL << (x))
+
+#define virtio_panic(msg, ...) do {					\
+		lkl_printf("LKL virtio error: " msg, ##__VA_ARGS__);	\
+		lkl_host_ops.panic();					\
+	} while (0)
+
+struct virtio_queue {
+	uint32_t num_max;
+	uint32_t num;
+	uint32_t ready;
+	uint32_t max_merge_len;
+
+	struct lkl_vring_desc *desc;
+	struct lkl_vring_avail *avail;
+	struct lkl_vring_used *used;
+	uint16_t last_avail_idx;
+	uint16_t last_used_idx_signaled;
+};
+
+struct _virtio_req {
+	struct virtio_req req;
+	struct virtio_dev *dev;
+	struct virtio_queue *q;
+	uint16_t idx;
+};
+
+
+static inline uint16_t virtio_get_used_event(struct virtio_queue *q)
+{
+	return q->avail->ring[q->num];
+}
+
+static inline void virtio_set_avail_event(struct virtio_queue *q, uint16_t val)
+{
+	*((uint16_t *)&q->used->ring[q->num]) = val;
+}
+
+static inline void virtio_deliver_irq(struct virtio_dev *dev)
+{
+	dev->int_status |= VIRTIO_MMIO_INT_VRING;
+	/* Make sure all memory writes before are visible to the driver. */
+	__sync_synchronize();
+	lkl_trigger_irq(dev->irq);
+}
+
+static inline uint16_t virtio_get_used_idx(struct virtio_queue *q)
+{
+	return le16toh(q->used->idx);
+}
+
+static inline void virtio_add_used(struct virtio_queue *q, uint16_t used_idx,
+				   uint16_t avail_idx, uint16_t len)
+{
+	uint16_t desc_idx = q->avail->ring[avail_idx & (q->num - 1)];
+
+	used_idx = used_idx & (q->num - 1);
+	q->used->ring[used_idx].id = desc_idx;
+	q->used->ring[used_idx].len = htole16(len);
+}
+
+/*
+ * Make sure all memory writes before are visible to the driver before updating
+ * the idx.  We need it here even though we already have one in
+ * virtio_deliver_irq() because there might already be a driver thread reading
+ * the idx and dequeuing used buffers.
+ */
+static inline void virtio_sync_used_idx(struct virtio_queue *q, uint16_t idx)
+{
+	__sync_synchronize();
+	q->used->idx = htole16(idx);
+}
+
+#define min_len(a, b) ((a) < (b) ? (a) : (b))
+
+void virtio_req_complete(struct virtio_req *req, uint32_t len)
+{
+	int send_irq = 0;
+	struct _virtio_req *_req = container_of(req, struct _virtio_req, req);
+	struct virtio_queue *q = _req->q;
+	uint16_t avail_idx = _req->idx;
+	uint16_t used_idx = virtio_get_used_idx(_req->q);
+	int i;
+
+	/*
+	 * We've potentially used up multiple (non-chained) descriptors and have
+	 * to create one "used" entry for each descriptor we've consumed.
+	 */
+	for (i = 0; i < req->buf_count; i++) {
+		uint16_t used_len;
+
+		if (!q->max_merge_len)
+			used_len = len;
+		else
+			used_len = min_len(len,  req->buf[i].iov_len);
+
+		virtio_add_used(q, used_idx++, avail_idx++, used_len);
+
+		len -= used_len;
+		if (!len)
+			break;
+	}
+	virtio_sync_used_idx(q, used_idx);
+	q->last_avail_idx = avail_idx;
+
+	/*
+	 * Triggers the irq whenever there is no available buffer.
+	 */
+	if (q->last_avail_idx == le16toh(q->avail->idx))
+		send_irq = 1;
+
+	/*
+	 * There are two rings: q->avail and q->used for each of the rx and tx
+	 * queues that are used to pass buffers between kernel driver and the
+	 * virtio device implementation.
+	 *
+	 * Kernel maintains the first one and adds buffers to it. In rx queue,
+	 * it's empty buffers kernel offers to store received packets. In tx
+	 * queue, it's buffers containing packets to transmit. Kernel notifies
+	 * the device by mmio write (see VIRTIO_MMIO_QUEUE_NOTIFY below).
+	 *
+	 * The virtio device (here in this file) maintains the
+	 * q->used and appends buffer to it after consuming it from q->avail.
+	 *
+	 * The device needs to notify the driver by triggering irq here. The
+	 * LKL_VIRTIO_RING_F_EVENT_IDX is enabled in this implementation so
+	 * kernel can set virtio_get_used_event(q) to tell the device to "only
+	 * trigger the irq when this item in q->used ring is populated."
+	 *
+	 * Because driver and device run in two different threads, when the
+	 * driver sets virtio_get_used_event(q), q->used->idx may already have
+	 * advanced past it. So we need to trigger the irq when
+	 * virtio_get_used_event(q) < q->used->idx.
+	 *
+	 * To avoid unnecessary irqs for each packet after
+	 * virtio_get_used_event(q) < q->used->idx, last_used_idx_signaled is
+	 * stored and the irq is only triggered if
+	 * last_used_idx_signaled <= virtio_get_used_event(q) < q->used->idx
+	 *
+	 * This is what lkl_vring_need_event() checks and it even covers the
+	 * case when those numbers wrap around.
+	 */
+	if (send_irq || lkl_vring_need_event(le16toh(virtio_get_used_event(q)),
+					     virtio_get_used_idx(q),
+					     q->last_used_idx_signaled)) {
+		q->last_used_idx_signaled = virtio_get_used_idx(q);
+		virtio_deliver_irq(_req->dev);
+	}
+}
+
+/*
+ * Grab the vring_desc from the queue at the appropriate index in the
+ * queue's circular buffer, converting from little-endian to
+ * the host's endianness.
+ */
+static inline
+struct lkl_vring_desc *vring_desc_at_le_idx(struct virtio_queue *q,
+					    __lkl__virtio16 le_idx)
+{
+	return &q->desc[le16toh(le_idx) & (q->num - 1)];
+}
+
+static inline
+struct lkl_vring_desc *vring_desc_at_avail_idx(struct virtio_queue *q,
+					       uint16_t idx)
+{
+	uint16_t desc_idx = q->avail->ring[idx & (q->num - 1)];
+
+	return vring_desc_at_le_idx(q, desc_idx);
+}
+
+/* Initialize buf to hold the same info as the vring_desc */
+static void add_dev_buf_from_vring_desc(struct virtio_req *req,
+					struct lkl_vring_desc *vring_desc)
+{
+	struct iovec *buf = &req->buf[req->buf_count++];
+
+	buf->iov_base = (void *)(uintptr_t)le64toh(vring_desc->addr);
+	buf->iov_len = le32toh(vring_desc->len);
+
+	if (!(buf->iov_base && buf->iov_len))
+		virtio_panic("bad vring_desc: %p %zu\n",
+			     buf->iov_base, buf->iov_len);
+
+	req->total_len += buf->iov_len;
+}
+
+static struct lkl_vring_desc *get_next_desc(struct virtio_queue *q,
+					    struct lkl_vring_desc *desc,
+					    uint16_t *idx)
+{
+	uint16_t desc_idx;
+
+	if (q->max_merge_len) {
+		if (++(*idx) == le16toh(q->avail->idx))
+			return NULL;
+		desc_idx = q->avail->ring[*idx & (q->num - 1)];
+		return vring_desc_at_le_idx(q, desc_idx);
+	}
+
+	if (!(le16toh(desc->flags) & LKL_VRING_DESC_F_NEXT))
+		return NULL;
+	return vring_desc_at_le_idx(q, desc->next);
+}
+
+/*
+ * Below there are two distinctly different (per packet) buffer allocation
+ * schemes for us to deal with:
+ *
+ * 1. One or more descriptors chained through "next" as indicated by the
+ *    LKL_VRING_DESC_F_NEXT flag,
+ * 2. One or more descriptors from the ring sequentially, as many as are
+ *    available and needed. This is the RX only "mergeable_rx_bufs" mode.
+ *    The mode is entered when the VIRTIO_NET_F_MRG_RXBUF device feature
+ *    is enabled.
+ */
+static int virtio_process_one(struct virtio_dev *dev, int qidx)
+{
+	struct virtio_queue *q = &dev->queue[qidx];
+	uint16_t idx = q->last_avail_idx;
+	struct _virtio_req _req = {
+		.dev = dev,
+		.q = q,
+		.idx = idx,
+	};
+	struct virtio_req *req = &_req.req;
+	struct lkl_vring_desc *desc = vring_desc_at_avail_idx(q, _req.idx);
+
+	do {
+		add_dev_buf_from_vring_desc(req, desc);
+		if (q->max_merge_len && req->total_len > q->max_merge_len)
+			break;
+		desc = get_next_desc(q, desc, &idx);
+	} while (desc && req->buf_count < VIRTIO_REQ_MAX_BUFS);
+
+	if (desc && le16toh(desc->flags) & LKL_VRING_DESC_F_NEXT)
+		virtio_panic("too many chained bufs");
+
+	return dev->ops->enqueue(dev, qidx, req);
+}
+
+/* NB: we can enter this function two different ways in the case of
+ * netdevs --- either through a tx/rx thread poll (which the LKL
+ * scheduler knows nothing about) or through virtio_write called
+ * inside an interrupt handler, so to be safe, it's not enough to
+ * synchronize only the tx/rx polling threads.
+ *
+ * At the moment, it seems like only netdevs require the
+ * synchronization we do here (i.e. locking around operations on a
+ * particular virtqueue, with dev->ops->acquire_queue), since they
+ * have these two different entry points, one of which isn't managed
+ * by the LKL scheduler. So only devs corresponding to netdevs will
+ * have non-NULL acquire/release_queue.
+ *
+ * In the future, this may change. If you see errors thrown in virtio
+ * driver code by block/console devices, you should be suspicious of
+ * the synchronization going on here.
+ */
+void virtio_process_queue(struct virtio_dev *dev, uint32_t qidx)
+{
+	struct virtio_queue *q = &dev->queue[qidx];
+
+	if (!q->ready)
+		return;
+
+	if (dev->ops->acquire_queue)
+		dev->ops->acquire_queue(dev, qidx);
+
+	while (q->last_avail_idx != le16toh(q->avail->idx)) {
+		/*
+		 * Make sure following loads happen after loading
+		 * q->avail->idx.
+		 */
+		__sync_synchronize();
+		if (virtio_process_one(dev, qidx) < 0)
+			break;
+		if (q->last_avail_idx == le16toh(q->avail->idx))
+			virtio_set_avail_event(q, q->avail->idx);
+	}
+
+	if (dev->ops->release_queue)
+		dev->ops->release_queue(dev, qidx);
+}
+
+static inline uint32_t virtio_read_device_features(struct virtio_dev *dev)
+{
+	if (dev->device_features_sel)
+		return (uint32_t)(dev->device_features >> 32);
+
+	return (uint32_t)dev->device_features;
+}
+
+static inline void virtio_write_driver_features(struct virtio_dev *dev,
+						uint32_t val)
+{
+	uint64_t tmp;
+
+	if (dev->driver_features_sel) {
+		tmp = dev->driver_features & 0xFFFFFFFF;
+		dev->driver_features = tmp | (uint64_t)val << 32;
+	} else {
+		tmp = dev->driver_features & 0xFFFFFFFF00000000;
+		dev->driver_features = tmp | val;
+	}
+}
+
+static int virtio_read(void *data, int offset, void *res, int size)
+{
+	uint32_t val;
+	struct virtio_dev *dev = (struct virtio_dev *)data;
+
+	if (offset >= VIRTIO_MMIO_CONFIG) {
+		offset -= VIRTIO_MMIO_CONFIG;
+		if (offset + size > dev->config_len)
+			return -LKL_EINVAL;
+		memcpy(res, dev->config_data + offset, size);
+		return 0;
+	}
+
+	if (size != sizeof(uint32_t))
+		return -LKL_EINVAL;
+
+	switch (offset) {
+	case VIRTIO_MMIO_MAGIC_VALUE:
+		val = VIRTIO_DEV_MAGIC;
+		break;
+	case VIRTIO_MMIO_VERSION:
+		val = VIRTIO_DEV_VERSION;
+		break;
+	case VIRTIO_MMIO_DEVICE_ID:
+		val = dev->device_id;
+		break;
+	case VIRTIO_MMIO_VENDOR_ID:
+		val = dev->vendor_id;
+		break;
+	case VIRTIO_MMIO_DEVICE_FEATURES:
+		val = virtio_read_device_features(dev);
+		break;
+	case VIRTIO_MMIO_QUEUE_NUM_MAX:
+		val = dev->queue[dev->queue_sel].num_max;
+		break;
+	case VIRTIO_MMIO_QUEUE_READY:
+		val = dev->queue[dev->queue_sel].ready;
+		break;
+	case VIRTIO_MMIO_INTERRUPT_STATUS:
+		val = dev->int_status;
+		break;
+	case VIRTIO_MMIO_STATUS:
+		val = dev->status;
+		break;
+	case VIRTIO_MMIO_CONFIG_GENERATION:
+		val = dev->config_gen;
+		break;
+	default:
+		return -1;
+	}
+
+	*(uint32_t *)res = htole32(val);
+
+	return 0;
+}
+
+static inline void set_ptr_low(void **ptr, uint32_t val)
+{
+	uint64_t tmp = (uintptr_t)*ptr;
+
+	tmp = (tmp & 0xFFFFFFFF00000000) | val;
+	*ptr = (void *)(long)tmp;
+}
+
+static inline void set_ptr_high(void **ptr, uint32_t val)
+{
+	uint64_t tmp = (uintptr_t)*ptr;
+
+	tmp = (tmp & 0x00000000FFFFFFFF) | ((uint64_t)val << 32);
+	*ptr = (void *)(long)tmp;
+}
+
+static inline void set_status(struct virtio_dev *dev, uint32_t val)
+{
+	if ((val & LKL_VIRTIO_CONFIG_S_FEATURES_OK) &&
+	    (!(dev->driver_features & BIT(LKL_VIRTIO_F_VERSION_1)) ||
+	     !(dev->driver_features & BIT(LKL_VIRTIO_RING_F_EVENT_IDX)) ||
+	     dev->ops->check_features(dev)))
+		val &= ~LKL_VIRTIO_CONFIG_S_FEATURES_OK;
+	dev->status = val;
+}
+
+static int virtio_write(void *data, int offset, void *res, int size)
+{
+	struct virtio_dev *dev = (struct virtio_dev *)data;
+	struct virtio_queue *q = &dev->queue[dev->queue_sel];
+	uint32_t val;
+	int ret = 0;
+
+	if (offset >= VIRTIO_MMIO_CONFIG) {
+		offset -= VIRTIO_MMIO_CONFIG;
+
+		if (offset + size > dev->config_len)
+			return -LKL_EINVAL;
+		memcpy(dev->config_data + offset, res, size);
+		return 0;
+	}
+
+	if (size != sizeof(uint32_t))
+		return -LKL_EINVAL;
+
+	val = le32toh(*(uint32_t *)res);
+
+	switch (offset) {
+	case VIRTIO_MMIO_DEVICE_FEATURES_SEL:
+		if (val > 1)
+			return -LKL_EINVAL;
+		dev->device_features_sel = val;
+		break;
+	case VIRTIO_MMIO_DRIVER_FEATURES_SEL:
+		if (val > 1)
+			return -LKL_EINVAL;
+		dev->driver_features_sel = val;
+		break;
+	case VIRTIO_MMIO_DRIVER_FEATURES:
+		virtio_write_driver_features(dev, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_SEL:
+		dev->queue_sel = val;
+		break;
+	case VIRTIO_MMIO_QUEUE_NUM:
+		dev->queue[dev->queue_sel].num = val;
+		break;
+	case VIRTIO_MMIO_QUEUE_READY:
+		dev->queue[dev->queue_sel].ready = val;
+		break;
+	case VIRTIO_MMIO_QUEUE_NOTIFY:
+		virtio_process_queue(dev, val);
+		break;
+	case VIRTIO_MMIO_INTERRUPT_ACK:
+		dev->int_status = 0;
+		break;
+	case VIRTIO_MMIO_STATUS:
+		set_status(dev, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_DESC_LOW:
+		set_ptr_low((void **)&q->desc, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_DESC_HIGH:
+		set_ptr_high((void **)&q->desc, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_AVAIL_LOW:
+		set_ptr_low((void **)&q->avail, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_AVAIL_HIGH:
+		set_ptr_high((void **)&q->avail, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_USED_LOW:
+		set_ptr_low((void **)&q->used, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_USED_HIGH:
+		set_ptr_high((void **)&q->used, val);
+		break;
+	default:
+		ret = -1;
+	}
+
+	return ret;
+}
+
+static const struct lkl_iomem_ops virtio_ops = {
+	.read = virtio_read,
+	.write = virtio_write,
+};
+
+char lkl_virtio_devs[4096];
+static char *devs = lkl_virtio_devs;
+static uint32_t lkl_num_virtio_boot_devs;
+
+void virtio_set_queue_max_merge_len(struct virtio_dev *dev, int q, int len)
+{
+	dev->queue[q].max_merge_len = len;
+}
+
+int virtio_dev_setup(struct virtio_dev *dev, int queues, int num_max)
+{
+	int qsize = queues * sizeof(*dev->queue);
+	int avail, mmio_size;
+	int i;
+	int num_bytes;
+	int ret;
+
+	dev->irq = lkl_get_free_irq("virtio");
+	if (dev->irq < 0)
+		return dev->irq;
+
+	dev->int_status = 0;
+	dev->device_features |= BIT(LKL_VIRTIO_F_VERSION_1) |
+		BIT(LKL_VIRTIO_RING_F_EVENT_IDX);
+	dev->queue = lkl_host_ops.mem_alloc(qsize);
+	if (!dev->queue)
+		return -LKL_ENOMEM;
+
+	memset(dev->queue, 0, qsize);
+	for (i = 0; i < queues; i++)
+		dev->queue[i].num_max = num_max;
+
+	mmio_size = VIRTIO_MMIO_CONFIG + dev->config_len;
+	dev->base = register_iomem(dev, mmio_size, &virtio_ops);
+	if (!dev->base) {
+		lkl_host_ops.mem_free(dev->queue);
+		return -LKL_ENOMEM;
+	}
+
+	if (!lkl_is_running()) {
+		avail = sizeof(lkl_virtio_devs) - (devs - lkl_virtio_devs);
+		num_bytes = snprintf(devs, avail,
+				     " virtio_mmio.device=%d@0x%"PRIxPTR":%d",
+				     mmio_size, (uintptr_t) dev->base,
+				     dev->irq);
+		if (num_bytes < 0 || num_bytes >= avail) {
+			lkl_put_irq(dev->irq, "virtio");
+			unregister_iomem(dev->base);
+			lkl_host_ops.mem_free(dev->queue);
+			return -LKL_ENOMEM;
+		}
+		devs += num_bytes;
+		dev->virtio_mmio_id = lkl_num_virtio_boot_devs++;
+	} else {
+		ret =
+		    lkl_sys_virtio_mmio_device_add((long)dev->base, mmio_size,
+						   dev->irq);
+		if (ret < 0) {
+			lkl_printf("can't register mmio device\n");
+			return -1;
+		}
+		dev->virtio_mmio_id = lkl_num_virtio_boot_devs + ret;
+	}
+
+	return 0;
+}
+
+int virtio_dev_cleanup(struct virtio_dev *dev)
+{
+	char devname[100];
+	long fd, ret;
+	long mount_ret;
+
+	if (!lkl_is_running())
+		goto skip_unbind;
+
+	mount_ret = lkl_mount_fs("sysfs");
+	if (mount_ret < 0)
+		return mount_ret;
+
+	if (dev->virtio_mmio_id >= virtio_get_num_bootdevs())
+		ret = snprintf(devname, sizeof(devname), "virtio-mmio.%d.auto",
+			       dev->virtio_mmio_id - virtio_get_num_bootdevs());
+	else
+		ret = snprintf(devname, sizeof(devname), "virtio-mmio.%d",
+			       dev->virtio_mmio_id);
+	if (ret < 0 || (size_t) ret >= sizeof(devname))
+		return -LKL_ENOMEM;
+
+	fd = lkl_sys_open("/sysfs/bus/platform/drivers/virtio-mmio/unbind",
+			  LKL_O_WRONLY, 0);
+	if (fd < 0)
+		return fd;
+
+	ret = lkl_sys_write(fd, devname, strlen(devname));
+	if (ret < 0)
+		return ret;
+
+	ret = lkl_sys_close(fd);
+	if (ret < 0)
+		return ret;
+
+	if (mount_ret == 0) {
+		ret = lkl_sys_umount("/sysfs", 0);
+		if (ret < 0)
+			return ret;
+	}
+
+skip_unbind:
+	lkl_put_irq(dev->irq, "virtio");
+	unregister_iomem(dev->base);
+	lkl_host_ops.mem_free(dev->queue);
+	return 0;
+}
+
+uint32_t virtio_get_num_bootdevs(void)
+{
+	return lkl_num_virtio_boot_devs;
+}
diff --git a/tools/lkl/lib/virtio.h b/tools/lkl/lib/virtio.h
new file mode 100644
index 000000000000..7427aa8fad79
--- /dev/null
+++ b/tools/lkl/lib/virtio.h
@@ -0,0 +1,93 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_LIB_VIRTIO_H
+#define _LKL_LIB_VIRTIO_H
+
+#include <stdint.h>
+#include <lkl_host.h>
+
+#define PAGE_SIZE		4096
+
+/* The following are copied from skbuff.h */
+#if (65536/PAGE_SIZE + 1) < 16
+#define MAX_SKB_FRAGS 16UL
+#else
+#define MAX_SKB_FRAGS (65536/PAGE_SIZE + 1)
+#endif
+
+#define VIRTIO_REQ_MAX_BUFS	(MAX_SKB_FRAGS + 2)
+
+struct virtio_req {
+	uint16_t buf_count;
+	struct iovec buf[VIRTIO_REQ_MAX_BUFS];
+	uint32_t total_len;
+};
+
+struct virtio_dev;
+
+struct virtio_dev_ops {
+	int (*check_features)(struct virtio_dev *dev);
+	/**
+	 * enqueue - queues the request for processing
+	 *
+	 * Note that the current implementation assumes that requests are
+	 * processed synchronously and, as such, @virtio_req_complete must be
+	 * called from this function.
+	 *
+	 * @dev - virtio device
+	 * @q	- queue index
+	 *
+	 * @returns a negative value if the request has not been queued for
+	 * processing, in which case the virtio device is responsible for
+	 * restarting the queue processing by calling @virtio_process_queue at
+	 * a later time; 0 or a positive value means that the request has been
+	 * queued for processing
+	 */
+	int (*enqueue)(struct virtio_dev *dev, int q, struct virtio_req *req);
+	/*
+	 * Acquire/release a lock on the specified queue. Only implemented by
+	 * netdevs, all other devices have NULL acquire/release function
+	 * pointers.
+	 */
+	void (*acquire_queue)(struct virtio_dev *dev, int queue_idx);
+	void (*release_queue)(struct virtio_dev *dev, int queue_idx);
+};
+
+struct virtio_dev {
+	uint32_t device_id;
+	uint32_t vendor_id;
+	uint64_t device_features;
+	uint32_t device_features_sel;
+	uint64_t driver_features;
+	uint32_t driver_features_sel;
+	uint32_t queue_sel;
+	struct virtio_queue *queue;
+	uint32_t queue_notify;
+	uint32_t int_status;
+	uint32_t status;
+	uint32_t config_gen;
+
+	struct virtio_dev_ops *ops;
+	int irq;
+	void *config_data;
+	int config_len;
+	void *base;
+	uint32_t virtio_mmio_id;
+};
+
+int virtio_dev_setup(struct virtio_dev *dev, int queues, int num_max);
+int virtio_dev_cleanup(struct virtio_dev *dev);
+uint32_t virtio_get_num_bootdevs(void);
+/**
+ * virtio_req_complete - complete a virtio request
+ *
+ * @req - the request to be completed
+ * @len - the total size in bytes of the completed request
+ */
+void virtio_req_complete(struct virtio_req *req, uint32_t len);
+void virtio_process_queue(struct virtio_dev *dev, uint32_t qidx);
+void virtio_set_queue_max_merge_len(struct virtio_dev *dev, int q, int len);
+
+#define container_of(ptr, type, member) \
+	((type *)((char *)(ptr) - __builtin_offsetof(type, member)))
+
+#endif /* _LKL_LIB_VIRTIO_H */
-- 
2.20.1 (Apple Git-117)
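
For readers following the 32-bit feature windows handled by
virtio_read_device_features() and virtio_write_driver_features() above,
here is a minimal standalone sketch of the same idea. struct feature_regs
and the feature_window_*() helpers are hypothetical names used only for
illustration; they are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the feature fields of struct virtio_dev. */
struct feature_regs {
	uint64_t features;	/* full 64-bit feature mask */
	uint32_t sel;		/* 0 selects bits 0..31, 1 selects bits 32..63 */
};

/* Return the 32-bit window chosen by sel. */
static uint32_t feature_window_read(const struct feature_regs *r)
{
	assert(r->sel <= 1);
	return (uint32_t)(r->features >> (32 * r->sel));
}

/* Replace only the window chosen by sel, keeping the other half intact. */
static void feature_window_write(struct feature_regs *r, uint32_t val)
{
	unsigned int shift = 32 * r->sel;
	uint64_t mask = (uint64_t)0xFFFFFFFF << shift;

	assert(r->sel <= 1);
	r->features = (r->features & ~mask) | ((uint64_t)val << shift);
}
```

With sel set to 1, writing 0x1 sets bit 32 (VIRTIO_F_VERSION_1 in the
virtio specification) and leaves the low word untouched.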


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um



* [RFC PATCH 18/47] lkl tools: host lib: virtio block device
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (16 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 17/47] lkl tools: host lib: virtio devices Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 19/47] lkl tools: host lib: filesystem helpers Hajime Tazaki
                   ` (31 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Petros Angelatos, Michael Zimmermann, Akira Moroo

From: Octavian Purdila <tavi.purdila@gmail.com>

Host-independent implementation for virtio block devices. The
host-dependent part of the host library must provide an implementation
for lkl_dev_blk_ops.

Disks can be added to the LKL configuration via lkl_disk_add(), a new
LKL application API.
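
As a rough illustration of the split between host-independent code and
host-provided block operations, the sketch below selects a per-disk ops
table when one is supplied, falls back to a default otherwise, and
converts the reported size to 512-byte sectors. All demo_* names are
hypothetical and simplified; the real LKL types and signatures differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_disk;

/* Hypothetical, reduced ops table: only the capacity query is modelled. */
struct demo_blk_ops {
	/* Store the disk size in bytes in *res; return 0 on success. */
	int (*get_capacity)(const struct demo_disk *disk,
			    unsigned long long *res);
};

struct demo_disk {
	unsigned long long size_bytes;
	const struct demo_blk_ops *ops;	/* NULL selects the default ops */
};

static int default_get_capacity(const struct demo_disk *disk,
				unsigned long long *res)
{
	*res = disk->size_bytes;
	return 0;
}

static const struct demo_blk_ops default_ops = {
	.get_capacity = default_get_capacity,
};

/* Pick the per-disk ops when present, else the host default. */
static const struct demo_blk_ops *pick_ops(const struct demo_disk *disk)
{
	return disk->ops ? disk->ops : &default_ops;
}

/* Capacity as a virtio-blk config would advertise it: 512-byte sectors. */
static int disk_capacity_sectors(const struct demo_disk *disk,
				 uint64_t *sectors)
{
	unsigned long long bytes;
	int ret;

	assert(disk && sectors);
	ret = pick_ops(disk)->get_capacity(disk, &bytes);
	if (ret)
		return ret;
	*sectors = bytes / 512;
	return 0;
}
```

A 1 MiB disk with ops left NULL would report 2048 sectors through the
default table.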

Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/lib/virtio_blk.c | 132 +++++++++++++++++++++++++++++++++++++
 1 file changed, 132 insertions(+)
 create mode 100644 tools/lkl/lib/virtio_blk.c

diff --git a/tools/lkl/lib/virtio_blk.c b/tools/lkl/lib/virtio_blk.c
new file mode 100644
index 000000000000..9e23316c5d99
--- /dev/null
+++ b/tools/lkl/lib/virtio_blk.c
@@ -0,0 +1,132 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <lkl_host.h>
+#include "virtio.h"
+#include "endian.h"
+
+struct virtio_blk_dev {
+	struct virtio_dev dev;
+	struct lkl_virtio_blk_config config;
+	struct lkl_dev_blk_ops *ops;
+	struct lkl_disk disk;
+};
+
+struct virtio_blk_req_trailer {
+	uint8_t status;
+};
+
+static int blk_check_features(struct virtio_dev *dev)
+{
+	if (dev->driver_features == dev->device_features)
+		return 0;
+
+	return -LKL_EINVAL;
+}
+
+static int blk_enqueue(struct virtio_dev *dev, int q, struct virtio_req *req)
+{
+	struct virtio_blk_dev *blk_dev;
+	struct lkl_virtio_blk_outhdr *h;
+	struct virtio_blk_req_trailer *t;
+	struct lkl_blk_req lkl_req;
+
+	if (req->buf_count < 3) {
+		lkl_printf("virtio_blk: no status buf\n");
+		goto out;
+	}
+
+	h = req->buf[0].iov_base;
+	t = req->buf[req->buf_count - 1].iov_base;
+	blk_dev = container_of(dev, struct virtio_blk_dev, dev);
+
+	t->status = LKL_DEV_BLK_STATUS_IOERR;
+
+	if (req->buf[0].iov_len != sizeof(*h)) {
+		lkl_printf("virtio_blk: bad header buf\n");
+		goto out;
+	}
+
+	if (req->buf[req->buf_count - 1].iov_len != sizeof(*t)) {
+		lkl_printf("virtio_blk: bad status buf\n");
+		goto out;
+	}
+
+	lkl_req.type = le32toh(h->type);
+	lkl_req.prio = le32toh(h->ioprio);
+	lkl_req.sector = le64toh(h->sector);
+	lkl_req.buf = &req->buf[1];
+	lkl_req.count = req->buf_count - 2;
+
+	t->status = blk_dev->ops->request(blk_dev->disk, &lkl_req);
+
+out:
+	virtio_req_complete(req, 0);
+	return 0;
+}
+
+static struct virtio_dev_ops blk_ops = {
+	.check_features = blk_check_features,
+	.enqueue = blk_enqueue,
+};
+
+
+int lkl_disk_add(struct lkl_disk *disk)
+{
+	struct virtio_blk_dev *dev;
+	unsigned long long capacity;
+	int ret;
+
+	dev = lkl_host_ops.mem_alloc(sizeof(*dev));
+	if (!dev)
+		return -LKL_ENOMEM;
+
+	disk->dev = dev;
+
+	dev->dev.device_id = LKL_VIRTIO_ID_BLOCK;
+	dev->dev.vendor_id = 0;
+	dev->dev.device_features = 0;
+	dev->dev.config_gen = 0;
+	dev->dev.config_data = &dev->config;
+	dev->dev.config_len = sizeof(dev->config);
+	dev->dev.ops = &blk_ops;
+	if (disk->ops)
+		dev->ops = disk->ops;
+	else
+		dev->ops = &lkl_dev_blk_ops;
+	dev->disk = *disk;
+
+	ret = dev->ops->get_capacity(*disk, &capacity);
+	if (ret) {
+		ret = -LKL_ENOMEM;
+		goto out_free;
+	}
+	dev->config.capacity = capacity / 512;
+
+	ret = virtio_dev_setup(&dev->dev, 1, 32);
+	if (ret)
+		goto out_free;
+
+	return dev->dev.virtio_mmio_id;
+
+out_free:
+	lkl_host_ops.mem_free(dev);
+
+	return ret;
+}
+
+int lkl_disk_remove(struct lkl_disk disk)
+{
+	struct virtio_blk_dev *dev;
+	int ret;
+
+	dev = (struct virtio_blk_dev *)disk.dev;
+	if (!dev)
+		return -LKL_EINVAL;
+
+	ret = virtio_dev_cleanup(&dev->dev);
+	if (ret < 0)
+		return ret;
+
+	lkl_host_ops.mem_free(dev);
+
+	return 0;
+}
-- 
2.20.1 (Apple Git-117)
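
For reference, the buffer layout that blk_enqueue() checks (a request
header, one or more data buffers, then a one-byte status trailer) can be
sketched standalone as follows. The demo_* types and
split_blk_request() are hypothetical illustrations, not code from the
patch, and endianness conversion is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Hypothetical mirror of the virtio-blk request header. */
struct demo_blk_outhdr {
	uint32_t type;
	uint32_t ioprio;
	uint64_t sector;
};

struct demo_blk_req {
	const struct demo_blk_outhdr *hdr;
	struct iovec *data;	/* first data buffer */
	int data_count;		/* number of data buffers */
	uint8_t *status;	/* one-byte status written by the device */
};

/* Returns 0 and fills *out when the buffer list has the expected shape. */
static int split_blk_request(struct iovec *buf, int count,
			     struct demo_blk_req *out)
{
	assert(buf && out);

	if (count < 3)
		return -1;	/* need header, at least one data buf, status */
	if (buf[0].iov_len != sizeof(struct demo_blk_outhdr))
		return -1;	/* malformed header buffer */
	if (buf[count - 1].iov_len != 1)
		return -1;	/* malformed status buffer */

	out->hdr = buf[0].iov_base;
	out->data = &buf[1];
	out->data_count = count - 2;
	out->status = buf[count - 1].iov_base;
	return 0;
}
```

Everything between the first and last buffer is treated as payload, which
is why the data count is the buffer count minus two.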



* [RFC PATCH 19/47] lkl tools: host lib: filesystem helpers
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (17 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 18/47] lkl tools: host lib: virtio block device Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 20/47] lkl tools: host lib: posix host operations Hajime Tazaki
                   ` (30 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um
  Cc: Conrad Meyer, Octavian Purdila, Hajime Tazaki,
	Michael Zimmermann, Akira Moroo, Yuan Liu

From: Octavian Purdila <tavi.purdila@gmail.com>

Add LKL application APIs to mount and unmount a filesystem from a
disk added via lkl_disk_add().

Also add open/close/read directory wrappers on top of
lkl_sys_getdents64.
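
To show what a readdir-style wrapper over getdents64 has to do, here is
a small standalone sketch that walks variable-length records in a buffer
by advancing with d_reclen. struct demo_dirent, struct demo_dir and both
helpers are hypothetical; the buffer is filled by hand rather than by a
system call, so this only illustrates the iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record header resembling struct linux_dirent64. */
struct demo_dirent {
	uint64_t d_ino;
	int64_t d_off;
	uint16_t d_reclen;	/* total size of this record in bytes */
	uint8_t d_type;
	char d_name[];		/* NUL-terminated name follows */
};

struct demo_dir {
	_Alignas(8) char buf[256];
	size_t len;		/* bytes of valid records in buf */
	size_t pos;		/* offset of the next record to return */
};

/* Helper for the demo: append one record, padded to a multiple of 8. */
static int demo_dir_append(struct demo_dir *dir, uint64_t ino,
			   const char *name)
{
	size_t reclen = offsetof(struct demo_dirent, d_name) +
			strlen(name) + 1;
	struct demo_dirent *de;

	reclen = (reclen + 7) & ~(size_t)7;
	if (dir->len + reclen > sizeof(dir->buf))
		return -1;

	de = (struct demo_dirent *)(dir->buf + dir->len);
	memset(de, 0, reclen);
	de->d_ino = ino;
	de->d_reclen = (uint16_t)reclen;
	memcpy(de->d_name, name, strlen(name) + 1);
	dir->len += reclen;
	return 0;
}

/* Return the next record, or NULL once the buffer is exhausted. */
static struct demo_dirent *demo_dir_next(struct demo_dir *dir)
{
	struct demo_dirent *de;

	if (dir->pos >= dir->len)
		return NULL;

	de = (struct demo_dirent *)(dir->buf + dir->pos);
	assert(de->d_reclen != 0);	/* a zero length would never advance */
	dir->pos += de->d_reclen;
	return de;
}
```

A real wrapper would refill the buffer with another getdents64 call when
the position reaches the end, instead of returning NULL right away.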

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/lib/fs.c | 433 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 433 insertions(+)
 create mode 100644 tools/lkl/lib/fs.c

diff --git a/tools/lkl/lib/fs.c b/tools/lkl/lib/fs.c
new file mode 100644
index 000000000000..c6f197aec3fb
--- /dev/null
+++ b/tools/lkl/lib/fs.c
@@ -0,0 +1,433 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdarg.h>
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <lkl_host.h>
+
+#include "virtio.h"
+
+#define MAX_FSTYPE_LEN 50
+int lkl_mount_fs(char *fstype)
+{
+	char dir[MAX_FSTYPE_LEN+2] = "/";
+	int flags = 0, ret = 0;
+
+	strncat(dir, fstype, MAX_FSTYPE_LEN);
+
+	/* Create with regular umask */
+	ret = lkl_sys_mkdir(dir, 0xff);
+	if (ret && ret != -LKL_EEXIST) {
+		lkl_perror("mount_fs mkdir", ret);
+		return ret;
+	}
+
+	/* We have no use for nonzero flags right now */
+	ret = lkl_sys_mount("none", dir, fstype, flags, NULL);
+	if (ret && ret != -LKL_EBUSY) {
+		lkl_sys_rmdir(dir);
+		return ret;
+	}
+
+	if (ret == -LKL_EBUSY)
+		return 1;
+	return 0;
+}
+
+static uint32_t new_encode_dev(unsigned int major, unsigned int minor)
+{
+	return (minor & 0xff) | (major << 8) | ((minor & ~0xff) << 12);
+}
+
+static int startswith(const char *str, const char *pre)
+{
+	return strncmp(pre, str, strlen(pre)) == 0;
+}
+
+static int get_node_with_prefix(const char *path, const char *prefix,
+				char *result, unsigned int result_len)
+{
+	struct lkl_dir *dir = NULL;
+	struct lkl_linux_dirent64 *dirent;
+	int ret;
+
+	dir = lkl_opendir(path, &ret);
+	if (!dir)
+		return ret;
+
+	ret = -LKL_ENOENT;
+
+	while ((dirent = lkl_readdir(dir))) {
+		if (startswith(dirent->d_name, prefix)) {
+			if (strlen(dirent->d_name) + 1 > result_len) {
+				ret = -LKL_ENOMEM;
+				break;
+			}
+			memcpy(result, dirent->d_name, strlen(dirent->d_name));
+			result[strlen(dirent->d_name)] = '\0';
+			ret = 0;
+			break;
+		}
+	}
+
+	lkl_closedir(dir);
+
+	return ret;
+}
+
+int lkl_encode_dev_from_sysfs(const char *sysfs_path, uint32_t *pdevid)
+{
+	int ret;
+	long fd;
+	int major, minor;
+	char buf[16] = { 0, };
+	char *bufptr;
+
+	fd = lkl_sys_open(sysfs_path, LKL_O_RDONLY, 0);
+	if (fd < 0)
+		return fd;
+
+	ret = lkl_sys_read(fd, buf, sizeof(buf));
+	if (ret < 0)
+		goto out_close;
+
+	if (ret == sizeof(buf)) {
+		ret = -LKL_ENOBUFS;
+		goto out_close;
+	}
+
+	bufptr = strchr(buf, ':');
+	if (bufptr == NULL) {
+		ret = -LKL_EINVAL;
+		goto out_close;
+	}
+	bufptr[0] = '\0';
+	bufptr++;
+
+	major = atoi(buf);
+	minor = atoi(bufptr);
+
+	*pdevid = new_encode_dev(major, minor);
+	ret = 0;
+
+out_close:
+	lkl_sys_close(fd);
+
+	return ret;
+}
+
+#define SYSFS_DEV_VIRTIO_PLATFORM_PATH \
+	"/sysfs/devices/platform/virtio-mmio.%d.auto"
+#define SYSFS_DEV_VIRTIO_CMDLINE_PATH \
+	"/sysfs/devices/virtio-mmio-cmdline/virtio-mmio.%d"
+
+struct abuf {
+	char *mem, *ptr;
+	unsigned int len;
+};
+
+static int snprintf_append(struct abuf *buf, const char *fmt, ...)
+{
+	int ret;
+	va_list args;
+
+	if (!buf->ptr)
+		buf->ptr = buf->mem;
+
+	va_start(args, fmt);
+	ret = vsnprintf(buf->ptr, buf->len - (buf->ptr - buf->mem), fmt, args);
+	va_end(args);
+
+	if (ret < 0 || (ret >= (buf->len - (buf->ptr - buf->mem))))
+		return -LKL_ENOMEM;
+
+	buf->ptr += ret;
+
+	return 0;
+}
+
+int lkl_get_virtio_blkdev(int disk_id, unsigned int part, uint32_t *pdevid)
+{
+	char sysfs_path[LKL_PATH_MAX];
+	char virtio_name[LKL_PATH_MAX];
+	char disk_name[LKL_PATH_MAX];
+	struct abuf sysfs_path_buf = {
+		.mem = sysfs_path,
+		.len = sizeof(sysfs_path),
+	};
+	char *fmt;
+	int ret;
+
+	if (disk_id < 0)
+		return -LKL_EINVAL;
+
+	ret = lkl_mount_fs("sysfs");
+	if (ret < 0)
+		return ret;
+
+	if ((uint32_t) disk_id >= virtio_get_num_bootdevs()) {
+		fmt = SYSFS_DEV_VIRTIO_PLATFORM_PATH;
+		disk_id -= virtio_get_num_bootdevs();
+	} else {
+		fmt = SYSFS_DEV_VIRTIO_CMDLINE_PATH;
+	}
+
+	ret = snprintf_append(&sysfs_path_buf, fmt, disk_id);
+	if (ret)
+		return ret;
+
+	ret = get_node_with_prefix(sysfs_path, "virtio", virtio_name,
+				   sizeof(virtio_name));
+	if (ret)
+		return ret;
+
+	ret = snprintf_append(&sysfs_path_buf, "/%s/block", virtio_name);
+	if (ret)
+		return ret;
+
+	ret = get_node_with_prefix(sysfs_path, "vd", disk_name,
+				   sizeof(disk_name));
+	if (ret)
+		return ret;
+
+	if (!part)
+		ret = snprintf_append(&sysfs_path_buf, "/%s/dev", disk_name);
+	else
+		ret = snprintf_append(&sysfs_path_buf, "/%s/%s%d/dev",
+				      disk_name, disk_name, part);
+	if (ret)
+		return ret;
+
+	return lkl_encode_dev_from_sysfs(sysfs_path, pdevid);
+}
+
+long lkl_mount_dev(unsigned int disk_id, unsigned int part,
+		   const char *fs_type, int flags,
+		   const char *data, char *mnt_str, unsigned int mnt_str_len)
+{
+	char dev_str[] = { "/dev/xxxxxxxx" };
+	unsigned int dev;
+	int err;
+	char _data[4096]; /* FIXME: PAGE_SIZE is not exported by LKL */
+
+	if (mnt_str_len < sizeof(dev_str))
+		return -LKL_ENOMEM;
+
+	err = lkl_get_virtio_blkdev(disk_id, part, &dev);
+	if (err < 0)
+		return err;
+
+	snprintf(dev_str, sizeof(dev_str), "/dev/%08x", dev);
+	snprintf(mnt_str, mnt_str_len, "/mnt/%08x", dev);
+
+	err = lkl_sys_access("/dev", LKL_S_IRWXO);
+	if (err < 0) {
+		if (err == -LKL_ENOENT)
+			err = lkl_sys_mkdir("/dev", 0700);
+		if (err < 0)
+			return err;
+	}
+
+	err = lkl_sys_mknod(dev_str, LKL_S_IFBLK | 0600, dev);
+	if (err < 0)
+		return err;
+
+	err = lkl_sys_access("/mnt", LKL_S_IRWXO);
+	if (err < 0) {
+		if (err == -LKL_ENOENT)
+			err = lkl_sys_mkdir("/mnt", 0700);
+		if (err < 0)
+			return err;
+	}
+
+	err = lkl_sys_mkdir(mnt_str, 0700);
+	if (err < 0) {
+		lkl_sys_unlink(dev_str);
+		return err;
+	}
+
+	/* kernel always copies a full page */
+	if (data) {
+		strncpy(_data, data, sizeof(_data));
+		_data[sizeof(_data) - 1] = 0;
+	} else {
+		_data[0] = 0;
+	}
+
+	err = lkl_sys_mount(dev_str, mnt_str, (char *)fs_type, flags, _data);
+	if (err < 0) {
+		lkl_sys_unlink(dev_str);
+		lkl_sys_rmdir(mnt_str);
+		return err;
+	}
+
+	return 0;
+}
+
+long lkl_umount_timeout(char *path, int flags, long timeout_ms)
+{
+	long incr = 10000000; /* 10 ms */
+	struct lkl_timespec ts = {
+		.tv_sec = 0,
+		.tv_nsec = incr,
+	};
+	long err;
+
+	do {
+		err = lkl_sys_umount(path, flags);
+		if (err == -LKL_EBUSY) {
+			lkl_sys_nanosleep((struct __lkl__kernel_timespec *)&ts,
+					  NULL);
+			timeout_ms -= incr / 1000000;
+		}
+	} while (err == -LKL_EBUSY && timeout_ms > 0);
+
+	return err;
+}
+
+long lkl_umount_dev(unsigned int disk_id, unsigned int part, int flags,
+		    long timeout_ms)
+{
+	char dev_str[] = { "/dev/xxxxxxxx" };
+	char mnt_str[] = { "/mnt/xxxxxxxx" };
+	unsigned int dev;
+	int err;
+
+	err = lkl_get_virtio_blkdev(disk_id, part, &dev);
+	if (err < 0)
+		return err;
+
+	snprintf(dev_str, sizeof(dev_str), "/dev/%08x", dev);
+	snprintf(mnt_str, sizeof(mnt_str), "/mnt/%08x", dev);
+
+	err = lkl_umount_timeout(mnt_str, flags, timeout_ms);
+	if (err)
+		return err;
+
+	err = lkl_sys_unlink(dev_str);
+	if (err)
+		return err;
+
+	return lkl_sys_rmdir(mnt_str);
+}
+
+struct lkl_dir {
+	int fd;
+	char buf[1024];
+	char *pos;
+	int len;
+};
+
+static struct lkl_dir *lkl_dir_alloc(int *err)
+{
+	struct lkl_dir *dir = lkl_host_ops.mem_alloc(sizeof(struct lkl_dir));
+
+	if (!dir) {
+		*err = -LKL_ENOMEM;
+		return NULL;
+	}
+
+	dir->len = 0;
+	dir->pos = NULL;
+
+	return dir;
+}
+
+struct lkl_dir *lkl_opendir(const char *path, int *err)
+{
+	struct lkl_dir *dir = lkl_dir_alloc(err);
+
+	if (!dir) {
+		*err = -LKL_ENOMEM;
+		return NULL;
+	}
+
+	dir->fd = lkl_sys_open(path, LKL_O_RDONLY | LKL_O_DIRECTORY, 0);
+	if (dir->fd < 0) {
+		*err = dir->fd;
+		lkl_host_ops.mem_free(dir);
+		return NULL;
+	}
+
+	*err = 0;
+
+	return dir;
+}
+
+struct lkl_dir *lkl_fdopendir(int fd, int *err)
+{
+	struct lkl_dir *dir = lkl_dir_alloc(err);
+
+	if (!dir)
+		return NULL;
+
+	dir->fd = fd;
+
+	return dir;
+}
+
+void lkl_rewinddir(struct lkl_dir *dir)
+{
+	lkl_sys_lseek(dir->fd, 0, LKL_SEEK_SET);
+	dir->len = 0;
+	dir->pos = NULL;
+}
+
+int lkl_closedir(struct lkl_dir *dir)
+{
+	int ret;
+
+	ret = lkl_sys_close(dir->fd);
+	lkl_host_ops.mem_free(dir);
+
+	return ret;
+}
+
+struct lkl_linux_dirent64 *lkl_readdir(struct lkl_dir *dir)
+{
+	struct lkl_linux_dirent64 *de;
+
+	if (dir->len < 0)
+		return NULL;
+
+	if (!dir->pos || dir->pos - dir->buf >= dir->len)
+		goto read_buf;
+
+return_de:
+	de = (struct lkl_linux_dirent64 *)dir->pos;
+	dir->pos += de->d_reclen;
+
+	return de;
+
+read_buf:
+	dir->pos = NULL;
+	de = (struct lkl_linux_dirent64 *)dir->buf;
+	dir->len = lkl_sys_getdents64(dir->fd, de, sizeof(dir->buf));
+	if (dir->len <= 0)
+		return NULL;
+
+	dir->pos = dir->buf;
+	goto return_de;
+}
+
+int lkl_errdir(struct lkl_dir *dir)
+{
+	if (dir->len >= 0)
+		return 0;
+
+	return dir->len;
+}
+
+int lkl_dirfd(struct lkl_dir *dir)
+{
+	return dir->fd;
+}
+
+int lkl_set_fd_limit(unsigned int fd_limit)
+{
+	struct lkl_rlimit rlim = {
+		.rlim_cur = fd_limit,
+		.rlim_max = fd_limit,
+	};
+	return lkl_sys_setrlimit(LKL_RLIMIT_NOFILE, &rlim);
+}
-- 
2.20.1 (Apple Git-117)
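
As a worked example of the device number handling in
lkl_encode_dev_from_sysfs(), the sketch below parses a "MAJOR:MINOR"
string and packs it with the same bit layout as new_encode_dev().
encode_dev() and parse_dev_string() are hypothetical names, and the
parsing uses sscanf() here instead of the strchr()/atoi() approach in the
patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Low 8 minor bits, then the major, then the remaining minor bits. */
static uint32_t encode_dev(unsigned int major, unsigned int minor)
{
	return (minor & 0xff) | (major << 8) | ((minor & ~0xffu) << 12);
}

/* Parse "MAJOR:MINOR" as found in a sysfs dev attribute; 0 on success. */
static int parse_dev_string(const char *s, uint32_t *devid)
{
	unsigned int major, minor;

	assert(s && devid);
	if (sscanf(s, "%u:%u", &major, &minor) != 2)
		return -1;
	*devid = encode_dev(major, minor);
	return 0;
}
```

For instance "254:0" encodes to 0xfe00, and a minor of 256 spills into
bit 20, giving 0x100800 for major 8.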



* [RFC PATCH 20/47] lkl tools: host lib: posix host operations
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (18 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 19/47] lkl tools: host lib: filesystem helpers Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 21/47] lkl tools: "boot" test Hajime Tazaki
                   ` (29 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um
  Cc: Conrad Meyer, Octavian Purdila, Akira Moroo, Yuan Liu,
	Thomas Liebetraut, Mark Stillwell, Patrick Collins,
	Pierre-Hugues Husson, Hajime Tazaki

From: Octavian Purdila <tavi.purdila@gmail.com>

Implement LKL host operations for POSIX hosts.
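
One of the host operations is a counting semaphore. On hosts without
POSIX semaphores it can be built from a mutex and a condition variable;
the following is a minimal standalone sketch of that technique. struct
demo_sem and the demo_sem_*() functions are hypothetical names, and
error reporting is reduced to return codes.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical counting semaphore built from a mutex and a condvar. */
struct demo_sem {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	int count;
};

/* Returns 0 on success or a pthread error number. */
static int demo_sem_init(struct demo_sem *sem, int count)
{
	int ret;

	assert(count >= 0);
	ret = pthread_mutex_init(&sem->lock, NULL);
	if (ret)
		return ret;
	ret = pthread_cond_init(&sem->cond, NULL);
	if (ret) {
		pthread_mutex_destroy(&sem->lock);
		return ret;
	}
	sem->count = count;
	return 0;
}

static void demo_sem_up(struct demo_sem *sem)
{
	pthread_mutex_lock(&sem->lock);
	sem->count++;
	pthread_cond_signal(&sem->cond);
	pthread_mutex_unlock(&sem->lock);
}

static void demo_sem_down(struct demo_sem *sem)
{
	pthread_mutex_lock(&sem->lock);
	/* Re-check in a loop: condition variables may wake spuriously. */
	while (sem->count <= 0)
		pthread_cond_wait(&sem->cond, &sem->lock);
	sem->count--;
	pthread_mutex_unlock(&sem->lock);
}
```

The count is only ever touched with the mutex held, so a waiter cannot
miss an increment that happens between its check and its wait.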

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Mark Stillwell <mark@stillwell.me>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Pierre-Hugues Husson <phh@phh.me>
Signed-off-by: Thomas Liebetraut <thomas@tommie-lie.de>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 Makefile                   |   2 +
 tools/lkl/lib/posix-host.c | 439 +++++++++++++++++++++++++++++++++++++
 2 files changed, 441 insertions(+)
 create mode 100644 tools/lkl/lib/posix-host.c

diff --git a/Makefile b/Makefile
index 8c1f7422d0bc..cf88c66fba2e 100644
--- a/Makefile
+++ b/Makefile
@@ -1124,7 +1124,9 @@ archprepare: archheaders archscripts scripts prepare3 outputmakefile \
 	asm-generic $(version_h) $(autoksyms_h) include/generated/utsrelease.h
 
 prepare0: archprepare
+ifeq ($(findstring elf,$(if $(CONFIG_OUTPUT_FORMAT),$(CONFIG_OUTPUT_FORMAT),elf)),elf)
 	$(Q)$(MAKE) $(build)=scripts/mod
+endif
 	$(Q)$(MAKE) $(build)=.
 
 # All the preparing..
diff --git a/tools/lkl/lib/posix-host.c b/tools/lkl/lib/posix-host.c
new file mode 100644
index 000000000000..4d52b06c9944
--- /dev/null
+++ b/tools/lkl/lib/posix-host.c
@@ -0,0 +1,439 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <pthread.h>
+#include <stdlib.h>
+#include <sys/time.h>
+#include <time.h>
+#include <signal.h>
+#include <assert.h>
+#include <unistd.h>
+#include <errno.h>
+#include <string.h>
+#include <time.h>
+#include <stdint.h>
+#include <sys/uio.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/syscall.h>
+#include <poll.h>
+#include <lkl_host.h>
+#include "iomem.h"
+#include "jmp_buf.h"
+
+/* Let's see if the host has semaphore.h */
+#include <unistd.h>
+
+#ifdef _POSIX_SEMAPHORES
+#include <semaphore.h>
+/* TODO(pscollins): We don't support fork() for now, but maybe one day
+ * we will?
+ */
+#define SHARE_SEM 0
+#endif /* _POSIX_SEMAPHORES */
+
+static void print(const char *str, int len)
+{
+	int ret __attribute__((unused));
+
+	ret = write(STDOUT_FILENO, str, len);
+}
+
+struct lkl_mutex {
+	pthread_mutex_t mutex;
+};
+
+struct lkl_sem {
+#ifdef _POSIX_SEMAPHORES
+	sem_t sem;
+#else
+	pthread_mutex_t lock;
+	int count;
+	pthread_cond_t cond;
+#endif /* _POSIX_SEMAPHORES */
+};
+
+struct lkl_tls_key {
+	pthread_key_t key;
+};
+
+#define WARN_UNLESS(exp) do {						\
+		if ((exp) < 0)						\
+			lkl_printf("%s: %s\n", #exp, strerror(errno));	\
+	} while (0)
+
+static int _warn_pthread(int ret, char *str_exp)
+{
+	if (ret > 0)
+		lkl_printf("%s: %s\n", str_exp, strerror(ret));
+
+	return ret;
+}
+
+
+/* pthread_* functions return the error number instead of setting errno */
+#define WARN_PTHREAD(exp) _warn_pthread(exp, #exp)
+
+static struct lkl_sem *sem_alloc(int count)
+{
+	struct lkl_sem *sem;
+
+	sem = malloc(sizeof(*sem));
+	if (!sem)
+		return NULL;
+
+#ifdef _POSIX_SEMAPHORES
+	if (sem_init(&sem->sem, SHARE_SEM, count) < 0) {
+		lkl_printf("sem_init: %s\n", strerror(errno));
+		free(sem);
+		return NULL;
+	}
+#else
+	pthread_mutex_init(&sem->lock, NULL);
+	sem->count = count;
+	WARN_PTHREAD(pthread_cond_init(&sem->cond, NULL));
+#endif /* _POSIX_SEMAPHORES */
+
+	return sem;
+}
+
+static void sem_free(struct lkl_sem *sem)
+{
+#ifdef _POSIX_SEMAPHORES
+	WARN_UNLESS(sem_destroy(&sem->sem));
+#else
+	WARN_PTHREAD(pthread_cond_destroy(&sem->cond));
+	WARN_PTHREAD(pthread_mutex_destroy(&sem->lock));
+#endif /* _POSIX_SEMAPHORES */
+	free(sem);
+}
+
+static void sem_up(struct lkl_sem *sem)
+{
+#ifdef _POSIX_SEMAPHORES
+	WARN_UNLESS(sem_post(&sem->sem));
+#else
+	WARN_PTHREAD(pthread_mutex_lock(&sem->lock));
+	sem->count++;
+	if (sem->count > 0)
+		WARN_PTHREAD(pthread_cond_signal(&sem->cond));
+	WARN_PTHREAD(pthread_mutex_unlock(&sem->lock));
+#endif /* _POSIX_SEMAPHORES */
+
+}
+
+static void sem_down(struct lkl_sem *sem)
+{
+#ifdef _POSIX_SEMAPHORES
+	int err;
+
+	do {
+		err = sem_wait(&sem->sem);
+	} while (err < 0 && errno == EINTR);
+	if (err < 0 && errno != EINTR)
+		lkl_printf("sem_wait: %s\n", strerror(errno));
+#else
+	WARN_PTHREAD(pthread_mutex_lock(&sem->lock));
+	while (sem->count <= 0)
+		WARN_PTHREAD(pthread_cond_wait(&sem->cond, &sem->lock));
+	sem->count--;
+	WARN_PTHREAD(pthread_mutex_unlock(&sem->lock));
+#endif /* _POSIX_SEMAPHORES */
+}
+
+static struct lkl_mutex *mutex_alloc(int recursive)
+{
+	struct lkl_mutex *_mutex = malloc(sizeof(struct lkl_mutex));
+	pthread_mutex_t *mutex = NULL;
+	pthread_mutexattr_t attr;
+
+	if (!_mutex)
+		return NULL;
+
+	mutex = &_mutex->mutex;
+	WARN_PTHREAD(pthread_mutexattr_init(&attr));
+
+	/* PTHREAD_MUTEX_ERRORCHECK is *very* useful for debugging,
+	 * but has some overhead, so we provide an option to turn it
+	 * off.
+	 */
+#ifdef DEBUG
+	if (!recursive)
+		WARN_PTHREAD(pthread_mutexattr_settype(
+				     &attr, PTHREAD_MUTEX_ERRORCHECK));
+#endif /* DEBUG */
+
+	if (recursive)
+		WARN_PTHREAD(pthread_mutexattr_settype(
+				     &attr, PTHREAD_MUTEX_RECURSIVE));
+
+	WARN_PTHREAD(pthread_mutex_init(mutex, &attr));
+
+	return _mutex;
+}
+
+static void mutex_lock(struct lkl_mutex *mutex)
+{
+	WARN_PTHREAD(pthread_mutex_lock(&mutex->mutex));
+}
+
+static void mutex_unlock(struct lkl_mutex *_mutex)
+{
+	pthread_mutex_t *mutex = &_mutex->mutex;
+
+	WARN_PTHREAD(pthread_mutex_unlock(mutex));
+}
+
+static void mutex_free(struct lkl_mutex *_mutex)
+{
+	pthread_mutex_t *mutex = &_mutex->mutex;
+
+	WARN_PTHREAD(pthread_mutex_destroy(mutex));
+	free(_mutex);
+}
+
+static lkl_thread_t thread_create(void (*fn)(void *), void *arg)
+{
+	pthread_t thread;
+
+	if (WARN_PTHREAD(pthread_create(&thread, NULL, (void* (*)(void *))fn,
+					arg)))
+		return 0;
+	else
+		return (lkl_thread_t) thread;
+}
+
+static void thread_detach(void)
+{
+	WARN_PTHREAD(pthread_detach(pthread_self()));
+}
+
+static void thread_exit(void)
+{
+	pthread_exit(NULL);
+}
+
+static int thread_join(lkl_thread_t tid)
+{
+	if (WARN_PTHREAD(pthread_join((pthread_t)tid, NULL)))
+		return (-1);
+	else
+		return 0;
+}
+
+static lkl_thread_t thread_self(void)
+{
+	return (lkl_thread_t)pthread_self();
+}
+
+static int thread_equal(lkl_thread_t a, lkl_thread_t b)
+{
+	return pthread_equal((pthread_t)a, (pthread_t)b);
+}
+
+static struct lkl_tls_key *tls_alloc(void (*destructor)(void *))
+{
+	struct lkl_tls_key *ret = malloc(sizeof(struct lkl_tls_key));
+
+	if (!ret || WARN_PTHREAD(pthread_key_create(&ret->key, destructor))) {
+		free(ret);
+		return NULL;
+	}
+	return ret;
+}
+
+static void tls_free(struct lkl_tls_key *key)
+{
+	WARN_PTHREAD(pthread_key_delete(key->key));
+	free(key);
+}
+
+static int tls_set(struct lkl_tls_key *key, void *data)
+{
+	if (WARN_PTHREAD(pthread_setspecific(key->key, data)))
+		return (-1);
+	return 0;
+}
+
+static void *tls_get(struct lkl_tls_key *key)
+{
+	return pthread_getspecific(key->key);
+}
+
+static unsigned long long time_ns(void)
+{
+	struct timespec ts;
+
+	clock_gettime(CLOCK_MONOTONIC, &ts);
+
+	return (unsigned long long)ts.tv_sec * 1000000000ULL + ts.tv_nsec;
+}
+
+static void *timer_alloc(void (*fn)(void *), void *arg)
+{
+	int err;
+	timer_t timer;
+	struct sigevent se =  {
+		.sigev_notify = SIGEV_THREAD,
+		.sigev_value = {
+			.sival_ptr = arg,
+		},
+		.sigev_notify_function = (void (*)(union sigval))fn,
+	};
+
+	err = timer_create(CLOCK_REALTIME, &se, &timer);
+	if (err)
+		return NULL;
+
+	return (void *)(long)timer;
+}
+
+static int timer_set_oneshot(void *_timer, unsigned long ns)
+{
+	timer_t timer = (timer_t)(long)_timer;
+	struct itimerspec ts = {
+		.it_value = {
+			.tv_sec = ns / 1000000000,
+			.tv_nsec = ns % 1000000000,
+		},
+	};
+
+	return timer_settime(timer, 0, &ts, NULL);
+}
+
+static void timer_free(void *_timer)
+{
+	timer_t timer = (timer_t)(long)_timer;
+
+	timer_delete(timer);
+}
+
+#ifndef __arch_um__
+static void panic(void)
+{
+	assert(0);
+}
+#endif
+
+static long _gettid(void)
+{
+#ifdef	__FreeBSD__
+	return (long)pthread_self();
+#else
+	return syscall(SYS_gettid);
+#endif
+}
+
+struct lkl_host_operations lkl_host_ops = {
+#ifndef __arch_um__
+	.panic = panic,
+#endif
+	.thread_create = thread_create,
+	.thread_detach = thread_detach,
+	.thread_exit = thread_exit,
+	.thread_join = thread_join,
+	.thread_self = thread_self,
+	.thread_equal = thread_equal,
+	.sem_alloc = sem_alloc,
+	.sem_free = sem_free,
+	.sem_up = sem_up,
+	.sem_down = sem_down,
+	.mutex_alloc = mutex_alloc,
+	.mutex_free = mutex_free,
+	.mutex_lock = mutex_lock,
+	.mutex_unlock = mutex_unlock,
+	.tls_alloc = tls_alloc,
+	.tls_free = tls_free,
+	.tls_set = tls_set,
+	.tls_get = tls_get,
+	.time = time_ns,
+	.timer_alloc = timer_alloc,
+	.timer_set_oneshot = timer_set_oneshot,
+	.timer_free = timer_free,
+	.print = print,
+	.mem_alloc = malloc,
+	.mem_free = free,
+	.ioremap = lkl_ioremap,
+	.iomem_access = lkl_iomem_access,
+	.virtio_devices = lkl_virtio_devs,
+	.gettid = _gettid,
+	.jmp_buf_set = jmp_buf_set,
+	.jmp_buf_longjmp = jmp_buf_longjmp,
+};
+
+static int fd_get_capacity(struct lkl_disk disk, unsigned long long *res)
+{
+	off_t off;
+
+	off = lseek(disk.fd, 0, SEEK_END);
+	if (off < 0)
+		return (-1);
+
+	*res = off;
+	return 0;
+}
+
+static int do_rw(ssize_t (*fn)(), struct lkl_disk disk, struct lkl_blk_req *req)
+{
+	off_t off = req->sector * 512;
+	void *addr;
+	int len;
+	int i;
+	int ret = 0;
+
+	for (i = 0; i < req->count; i++) {
+
+		addr = req->buf[i].iov_base;
+		len = req->buf[i].iov_len;
+
+		do {
+			ret = fn(disk.fd, addr, len, off);
+
+			if (ret <= 0) {
+				ret = -1;
+				goto out;
+			}
+
+			addr += ret;
+			len -= ret;
+			off += ret;
+
+		} while (len);
+	}
+
+out:
+	return ret;
+}
+
+static int blk_request(struct lkl_disk disk, struct lkl_blk_req *req)
+{
+	int err = 0;
+
+	switch (req->type) {
+	case LKL_DEV_BLK_TYPE_READ:
+		err = do_rw(pread, disk, req);
+		break;
+	case LKL_DEV_BLK_TYPE_WRITE:
+		err = do_rw(pwrite, disk, req);
+		break;
+	case LKL_DEV_BLK_TYPE_FLUSH:
+	case LKL_DEV_BLK_TYPE_FLUSH_OUT:
+#ifdef __linux__
+		err = fdatasync(disk.fd);
+#else
+		err = fsync(disk.fd);
+#endif
+		break;
+	default:
+		return LKL_DEV_BLK_STATUS_UNSUP;
+	}
+
+	if (err < 0)
+		return LKL_DEV_BLK_STATUS_IOERR;
+
+	return LKL_DEV_BLK_STATUS_OK;
+}
+
+struct lkl_dev_blk_ops lkl_dev_blk_ops = {
+	.get_capacity = fd_get_capacity,
+	.request = blk_request,
+};
+
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 21/47] lkl tools: "boot" test
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (19 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 20/47] lkl tools: host lib: posix host operations Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 22/47] lkl tools: tool that converts a filesystem image to tar Hajime Tazaki
                   ` (28 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um
  Cc: H . K . Jerry Chu, Conrad Meyer, Octavian Purdila, Motomu Utsumi,
	Lai Jiangshan, Akira Moroo, Petros Angelatos, Yuan Liu,
	Thomas Liebetraut, Mark Stillwell, Patrick Collins,
	David Disseldorp, Luca Dariz, Hajime Tazaki

From: Octavian Purdila <tavi.purdila@gmail.com>

Add a simple LKL test application that starts the kernel and performs
simple tests that minimally exercise the LKL API.

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Luca Dariz <luca.dariz@gmail.com>
Signed-off-by: Mark Stillwell <mark@stillwell.me>
Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Thomas Liebetraut <thomas@tommie-lie.de>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/tests/Build             |   3 +
 tools/lkl/tests/boot.c            | 562 ++++++++++++++++++++++++++++++
 tools/lkl/tests/boot.sh           |   9 +
 tools/lkl/tests/cla.c             | 159 +++++++++
 tools/lkl/tests/cla.h             |  33 ++
 tools/lkl/tests/disk.c            | 189 ++++++++++
 tools/lkl/tests/disk.sh           |  70 ++++
 tools/lkl/tests/run.py            | 186 ++++++++++
 tools/lkl/tests/tap13.py          | 209 +++++++++++
 tools/lkl/tests/test.c            | 126 +++++++
 tools/lkl/tests/test.h            |  72 ++++
 tools/lkl/tests/test.sh           | 240 +++++++++++++
 tools/lkl/tests/valgrind.supp     |  85 +++++
 tools/lkl/tests/valgrind2xunit.py |  69 ++++
 14 files changed, 2012 insertions(+)
 create mode 100644 tools/lkl/tests/Build
 create mode 100644 tools/lkl/tests/boot.c
 create mode 100755 tools/lkl/tests/boot.sh
 create mode 100644 tools/lkl/tests/cla.c
 create mode 100644 tools/lkl/tests/cla.h
 create mode 100644 tools/lkl/tests/disk.c
 create mode 100755 tools/lkl/tests/disk.sh
 create mode 100755 tools/lkl/tests/run.py
 create mode 100644 tools/lkl/tests/tap13.py
 create mode 100644 tools/lkl/tests/test.c
 create mode 100644 tools/lkl/tests/test.h
 create mode 100644 tools/lkl/tests/test.sh
 create mode 100644 tools/lkl/tests/valgrind.supp
 create mode 100755 tools/lkl/tests/valgrind2xunit.py
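
Note on parser return values: parse_args() in cla.c treats a negative
return value from a parser callback as failure. Below is a minimal sketch
of an integer parser that follows this convention. The helper name
parse_int_strict() is hypothetical and not part of this patch; it only
illustrates one way to report overflow, trailing junk and a missing value
as -1.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of this patch: parse VALUE into *OUT.
 * Returns 0 on success and -1 on failure, so that a caller testing
 * "ret < 0" notices the error.
 */
static int parse_int_strict(const char *value, int *out)
{
	char *end;
	long v;

	assert(out != NULL);

	/* A missing or empty value is an error, not zero. */
	if (!value || *value == '\0')
		return -1;

	errno = 0;
	v = strtol(value, &end, 0);	/* base 0: accepts 10, 0x10, 010 */
	if (errno != 0)			/* out of range for long */
		return -1;
	if (end == value || *end != '\0') /* no digits, or trailing junk */
		return -1;
	if (v < INT_MIN || v > INT_MAX)	/* does not fit in an int */
		return -1;

	*out = (int)v;
	return 0;
}
```

A parser callback could call such a helper with arg->store cast to int *
and pass the result straight back to parse_args().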

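Note on elapsed-time computation: lkl_test_nanosleep() derives the elapsed
time from two struct timespec samples. For illustration only, the same
delta can be computed in 64-bit integer arithmetic, which avoids any
floating-point rounding. The names timespec_delta_ns() and NSEC_PER_SEC_LL
are hypothetical and do not appear in this patch.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical constant: nanoseconds per second as a long long. */
#define NSEC_PER_SEC_LL 1000000000LL

/*
 * Hypothetical helper: nanoseconds elapsed from START to STOP. The
 * nanosecond difference may be negative; it is folded into the total.
 */
static long long timespec_delta_ns(const struct timespec *start,
				   const struct timespec *stop)
{
	assert(start != NULL && stop != NULL);

	return (long long)(stop->tv_sec - start->tv_sec) * NSEC_PER_SEC_LL +
	       (stop->tv_nsec - start->tv_nsec);
}
```

The result can be compared directly against an expected sleep time given
in nanoseconds.
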
diff --git a/tools/lkl/tests/Build b/tools/lkl/tests/Build
new file mode 100644
index 000000000000..ace86a3d3438
--- /dev/null
+++ b/tools/lkl/tests/Build
@@ -0,0 +1,3 @@
+boot-y += boot.o test.o
+disk-y += disk.o cla.o test.o
+net-test-y += net-test.o cla.o test.o
diff --git a/tools/lkl/tests/boot.c b/tools/lkl/tests/boot.c
new file mode 100644
index 000000000000..74fba648e558
--- /dev/null
+++ b/tools/lkl/tests/boot.c
@@ -0,0 +1,562 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <unistd.h>
+#include <string.h>
+#include <time.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <lkl.h>
+#include <lkl_host.h>
+
+#include <sys/stat.h>
+#include <fcntl.h>
+#if defined(__FreeBSD__)
+#include <net/if.h>
+#include <sys/ioctl.h>
+#elif __linux
+#include <sys/epoll.h>
+#include <sys/ioctl.h>
+#elif __MINGW32__
+#include <windows.h>
+#endif
+
+#include "test.h"
+
+#ifndef __MINGW32__
+#define sleep_ns 87654321
+int lkl_test_nanosleep(void)
+{
+	struct lkl_timespec ts = {
+		.tv_sec = 0,
+		.tv_nsec = sleep_ns,
+	};
+	struct timespec start, stop;
+	long delta;
+	long ret;
+
+	clock_gettime(CLOCK_MONOTONIC, &start);
+	ret = lkl_sys_nanosleep((struct __lkl__kernel_timespec *)&ts, NULL);
+	clock_gettime(CLOCK_MONOTONIC, &stop);
+
+	delta = 1e9*(stop.tv_sec - start.tv_sec) +
+		(stop.tv_nsec - start.tv_nsec);
+
+	lkl_test_logf("sleep %ld, expected sleep %d\n", delta, sleep_ns);
+
+	if (ret == 0 && delta > sleep_ns * 0.9)
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+#endif
+
+LKL_TEST_CALL(getpid, lkl_sys_getpid, 1)
+
+void check_latency(long (*f)(void), long *min, long *max, long *avg)
+{
+	int i;
+	unsigned long long start, stop, sum = 0;
+	static const int count = 1000;
+	long delta;
+
+	*min = 1000000000;
+	*max = -1;
+
+	for (i = 0; i < count; i++) {
+		start = lkl_host_ops.time();
+		f();
+		stop = lkl_host_ops.time();
+		delta = stop - start;
+		if (*min > delta)
+			*min = delta;
+		if (*max < delta)
+			*max = delta;
+		sum += delta;
+	}
+	*avg = sum / count;
+}
+
+static long native_getpid(void)
+{
+#ifdef __MINGW32__
+	GetCurrentProcessId();
+#else
+	getpid();
+#endif
+	return 0;
+}
+
+int lkl_test_syscall_latency(void)
+{
+	long min, max, avg;
+
+	lkl_test_logf("avg/min/max: ");
+
+	check_latency(lkl_sys_getpid, &min, &max, &avg);
+
+	lkl_test_logf("lkl:%ld/%ld/%ld ", avg, min, max);
+
+	check_latency(native_getpid, &min, &max, &avg);
+
+	lkl_test_logf("native:%ld/%ld/%ld\n", avg, min, max);
+
+	return TEST_SUCCESS;
+}
+
+#define access_rights 0721
+
+LKL_TEST_CALL(creat, lkl_sys_creat, 0, "/file", access_rights)
+LKL_TEST_CALL(close, lkl_sys_close, 0, 0);
+LKL_TEST_CALL(failopen, lkl_sys_open, -LKL_ENOENT, "/file2", 0, 0);
+LKL_TEST_CALL(umask, lkl_sys_umask, 022,  0777);
+LKL_TEST_CALL(umask2, lkl_sys_umask, 0777, 0);
+LKL_TEST_CALL(open, lkl_sys_open, 0, "/file", LKL_O_RDWR, 0);
+static const char wrbuf[] = "test";
+LKL_TEST_CALL(write, lkl_sys_write, sizeof(wrbuf), 0, wrbuf, sizeof(wrbuf));
+LKL_TEST_CALL(lseek_cur, lkl_sys_lseek, sizeof(wrbuf), 0, 0, LKL_SEEK_CUR);
+LKL_TEST_CALL(lseek_end, lkl_sys_lseek, sizeof(wrbuf), 0, 0, LKL_SEEK_END);
+LKL_TEST_CALL(lseek_set, lkl_sys_lseek, 0, 0, 0, LKL_SEEK_SET);
+
+int lkl_test_read(void)
+{
+	char buf[10] = { 0, };
+	long ret;
+
+	ret = lkl_sys_read(0, buf, sizeof(buf));
+
+	lkl_test_logf("lkl_sys_read=%ld buf=%s\n", ret, buf);
+
+	if (ret == sizeof(wrbuf) && !strcmp(wrbuf, buf))
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+int lkl_test_fstat(void)
+{
+	struct lkl_stat stat;
+	long ret;
+
+	ret = lkl_sys_fstat(0, &stat);
+
+	lkl_test_logf("lkl_sys_fstat=%ld mode=%o size=%zd\n", ret, stat.st_mode,
+		      stat.st_size);
+
+	if (ret == 0 && stat.st_size == sizeof(wrbuf) &&
+	    stat.st_mode == (access_rights | LKL_S_IFREG))
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+LKL_TEST_CALL(mkdir, lkl_sys_mkdir, 0, "/mnt", access_rights)
+
+int lkl_test_stat(void)
+{
+	struct lkl_stat stat;
+	long ret;
+
+	ret = lkl_sys_stat("/mnt", &stat);
+
+	lkl_test_logf("lkl_sys_stat(\"/mnt\")=%ld mode=%o\n", ret,
+		      stat.st_mode);
+
+	if (ret == 0 && stat.st_mode == (access_rights | LKL_S_IFDIR))
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+static int lkl_test_pipe2(void)
+{
+	int pipe_fds[2];
+	int READ_IDX = 0, WRITE_IDX = 1;
+	const char msg[] = "Hello world!";
+	char str[20];
+	int msg_len_bytes = strlen(msg) + 1;
+	int cmp_res;
+	long ret;
+
+	ret = lkl_sys_pipe2(pipe_fds, LKL_O_NONBLOCK);
+	if (ret) {
+		lkl_test_logf("pipe2: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_write(pipe_fds[WRITE_IDX], msg, msg_len_bytes);
+	if (ret != msg_len_bytes) {
+		if (ret < 0)
+			lkl_test_logf("write error: %s\n", lkl_strerror(ret));
+		else
+			lkl_test_logf("short write: %ld\n", ret);
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_read(pipe_fds[READ_IDX], str, msg_len_bytes);
+	if (ret != msg_len_bytes) {
+		if (ret < 0)
+			lkl_test_logf("read error: %s\n", lkl_strerror(ret));
+		else
+			lkl_test_logf("short read: %ld\n", ret);
+		return TEST_FAILURE;
+	}
+
+	cmp_res = memcmp(msg, str, msg_len_bytes);
+	if (cmp_res) {
+		lkl_test_logf("memcmp failed: %d\n", cmp_res);
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_close(pipe_fds[0]);
+	if (ret) {
+		lkl_test_logf("close error: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_close(pipe_fds[1]);
+	if (ret) {
+		lkl_test_logf("close error: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static int lkl_test_epoll(void)
+{
+	int epoll_fd, pipe_fds[2];
+	int READ_IDX = 0, WRITE_IDX = 1;
+	struct lkl_epoll_event wait_on, read_result;
+	const char msg[] = "Hello world!";
+	long ret;
+
+	memset(&wait_on, 0, sizeof(wait_on));
+	memset(&read_result, 0, sizeof(read_result));
+
+	ret = lkl_sys_pipe2(pipe_fds, LKL_O_NONBLOCK);
+	if (ret) {
+		lkl_test_logf("pipe2 error: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	epoll_fd = lkl_sys_epoll_create(1);
+	if (epoll_fd < 0) {
+		lkl_test_logf("epoll_create error: %s\n", lkl_strerror(epoll_fd));
+		return TEST_FAILURE;
+	}
+
+	wait_on.events = LKL_POLLIN | LKL_POLLOUT;
+	wait_on.data = pipe_fds[READ_IDX];
+
+	ret = lkl_sys_epoll_ctl(epoll_fd, LKL_EPOLL_CTL_ADD, pipe_fds[READ_IDX],
+				&wait_on);
+	if (ret < 0) {
+		lkl_test_logf("epoll_ctl error: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	/* Shouldn't be ready before we have written something */
+	ret = lkl_sys_epoll_wait(epoll_fd, &read_result, 1, 0);
+	if (ret != 0) {
+		if (ret < 0)
+			lkl_test_logf("epoll_wait error: %s\n",
+				      lkl_strerror(ret));
+		else
+			lkl_test_logf("epoll_wait: bad event: 0x%lx\n", ret);
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_write(pipe_fds[WRITE_IDX], msg, strlen(msg) + 1);
+	if (ret < 0) {
+		lkl_test_logf("write error: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	/* We expect exactly 1 fd to be ready immediately */
+	ret = lkl_sys_epoll_wait(epoll_fd, &read_result, 1, 0);
+	if (ret != 1) {
+		if (ret < 0)
+			lkl_test_logf("epoll_wait error: %s\n",
+				      lkl_strerror(ret));
+		else
+			lkl_test_logf("epoll_wait: bad ev no %ld\n", ret);
+		return TEST_FAILURE;
+	}
+
+	/* Already tested reading from pipe2 so no need to do it
+	 * here
+	 */
+
+	return TEST_SUCCESS;
+}
+
+LKL_TEST_CALL(chdir_proc, lkl_sys_chdir, 0, "proc");
+
+static int dir_fd;
+
+static int lkl_test_open_cwd(void)
+{
+	dir_fd = lkl_sys_open(".", LKL_O_RDONLY | LKL_O_DIRECTORY, 0);
+	if (dir_fd < 0) {
+		lkl_test_logf("failed to open current directory: %s\n",
+			      lkl_strerror(dir_fd));
+		return TEST_FAILURE;
+	}
+
+	return TEST_SUCCESS;
+}
+
+/* column where to insert a line break for the list file tests below. */
+#define COL_LINE_BREAK 70
+
+static int lkl_test_getdents64(void)
+{
+	long ret;
+	char buf[1024], *pos;
+	struct lkl_linux_dirent64 *de;
+	int wr;
+
+	de = (struct lkl_linux_dirent64 *)buf;
+	ret = lkl_sys_getdents64(dir_fd, de, sizeof(buf));
+
+	wr = lkl_test_logf("%d ", dir_fd);
+
+	if (ret < 0)
+		return TEST_FAILURE;
+
+	for (pos = buf; pos - buf < ret; pos += de->d_reclen) {
+		de = (struct lkl_linux_dirent64 *)pos;
+
+		wr += lkl_test_logf("%s ", de->d_name);
+		if (wr >= COL_LINE_BREAK) {
+			lkl_test_logf("\n");
+			wr = 0;
+		}
+	}
+
+	return TEST_SUCCESS;
+}
+
+LKL_TEST_CALL(close_dir_fd, lkl_sys_close, 0, dir_fd);
+LKL_TEST_CALL(chdir_root, lkl_sys_chdir, 0, "/");
+LKL_TEST_CALL(mount_fs_proc, lkl_mount_fs, 0, "proc");
+LKL_TEST_CALL(umount_fs_proc, lkl_umount_timeout, 0, "proc", 0, 1000);
+LKL_TEST_CALL(lo_ifup, lkl_if_up, 0, 1);
+
+static int lkl_test_mutex(void)
+{
+	long ret = TEST_SUCCESS;
+	/*
+	 * Can't do much to verify that this works, so we'll just let Valgrind
+	 * warn us on CI if we've made bad memory accesses.
+	 */
+
+	struct lkl_mutex *mutex;
+
+	mutex = lkl_host_ops.mutex_alloc(0);
+	lkl_host_ops.mutex_lock(mutex);
+	lkl_host_ops.mutex_unlock(mutex);
+	lkl_host_ops.mutex_free(mutex);
+
+	mutex = lkl_host_ops.mutex_alloc(1);
+	lkl_host_ops.mutex_lock(mutex);
+	lkl_host_ops.mutex_lock(mutex);
+	lkl_host_ops.mutex_unlock(mutex);
+	lkl_host_ops.mutex_unlock(mutex);
+	lkl_host_ops.mutex_free(mutex);
+
+	return ret;
+}
+
+static int lkl_test_semaphore(void)
+{
+	long ret = TEST_SUCCESS;
+	/*
+	 * Can't do much to verify that this works, so we'll just let Valgrind
+	 * warn us on CI if we've made bad memory accesses.
+	 */
+
+	struct lkl_sem *sem = lkl_host_ops.sem_alloc(1);
+
+	lkl_host_ops.sem_down(sem);
+	lkl_host_ops.sem_up(sem);
+	lkl_host_ops.sem_free(sem);
+
+	return ret;
+}
+
+static int lkl_test_gettid(void)
+{
+	long tid = lkl_host_ops.gettid();
+
+	lkl_test_logf("%ld", tid);
+
+	/* As far as I know, thread IDs are non-zero on all reasonable
+	 * systems.
+	 */
+	if (tid)
+		return TEST_SUCCESS;
+	else
+		return TEST_FAILURE;
+}
+
+static void test_thread(void *data)
+{
+	int *pipe_fds = (int *) data;
+	char tmp[LKL_PIPE_BUF+1];
+	int ret;
+
+	ret = lkl_sys_read(pipe_fds[0], tmp, sizeof(tmp));
+	if (ret < 0)
+		lkl_test_logf("%s: %s\n", __func__, lkl_strerror(ret));
+}
+
+static int lkl_test_syscall_thread(void)
+{
+	int pipe_fds[2];
+	char tmp[LKL_PIPE_BUF+1];
+	long ret;
+	lkl_thread_t tid;
+
+	ret = lkl_sys_pipe2(pipe_fds, 0);
+	if (ret) {
+		lkl_test_logf("pipe2: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_fcntl(pipe_fds[0], LKL_F_SETPIPE_SZ, 1);
+	if (ret < 0) {
+		lkl_test_logf("fcntl setpipe_sz: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	tid = lkl_host_ops.thread_create(test_thread, pipe_fds);
+	if (!tid) {
+		lkl_test_logf("failed to create thread\n");
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_write(pipe_fds[1], tmp, sizeof(tmp));
+	if (ret != sizeof(tmp)) {
+		if (ret < 0)
+			lkl_test_logf("write error: %s\n", lkl_strerror(ret));
+		else
+			lkl_test_logf("short write: %ld\n", ret);
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_host_ops.thread_join(tid);
+	if (ret) {
+		lkl_test_logf("failed to join thread\n");
+		return TEST_FAILURE;
+	}
+
+	return TEST_SUCCESS;
+}
+
+#ifndef __MINGW32__
+static void thread_get_pid(void *unused)
+{
+	lkl_sys_getpid();
+}
+
+static int lkl_test_many_syscall_threads(void)
+{
+	lkl_thread_t tid;
+	int count = 65, ret;
+
+	while (--count > 0) {
+		tid = lkl_host_ops.thread_create(thread_get_pid, NULL);
+		if (!tid) {
+			lkl_test_logf("failed to create thread\n");
+			return TEST_FAILURE;
+		}
+
+		ret = lkl_host_ops.thread_join(tid);
+		if (ret) {
+			lkl_test_logf("failed to join thread\n");
+			return TEST_FAILURE;
+		}
+	}
+
+	return TEST_SUCCESS;
+}
+#endif
+
+static void thread_quit_immediately(void *unused)
+{
+}
+
+static int lkl_test_join(void)
+{
+	lkl_thread_t tid = lkl_host_ops.thread_create(thread_quit_immediately,
+						      NULL);
+	int ret = lkl_host_ops.thread_join(tid);
+
+	if (ret == 0) {
+		lkl_test_logf("joined %ld\n", tid);
+		return TEST_SUCCESS;
+	} else {
+		lkl_test_logf("failed joining %ld\n", tid);
+		return TEST_FAILURE;
+	}
+}
+
+LKL_TEST_CALL(start_kernel, lkl_start_kernel, 0, &lkl_host_ops,
+	     "mem=16M loglevel=8");
+LKL_TEST_CALL(stop_kernel, lkl_sys_halt, 0);
+
+struct lkl_test tests[] = {
+	LKL_TEST(mutex),
+	LKL_TEST(semaphore),
+	LKL_TEST(join),
+	LKL_TEST(start_kernel),
+	LKL_TEST(getpid),
+	LKL_TEST(syscall_latency),
+	LKL_TEST(umask),
+	LKL_TEST(umask2),
+	LKL_TEST(creat),
+	LKL_TEST(close),
+	LKL_TEST(failopen),
+	LKL_TEST(open),
+	LKL_TEST(write),
+	LKL_TEST(lseek_cur),
+	LKL_TEST(lseek_end),
+	LKL_TEST(lseek_set),
+	LKL_TEST(read),
+	LKL_TEST(fstat),
+	LKL_TEST(mkdir),
+	LKL_TEST(stat),
+#ifndef __MINGW32__
+	LKL_TEST(nanosleep),
+#endif
+	LKL_TEST(pipe2),
+	LKL_TEST(epoll),
+	LKL_TEST(mount_fs_proc),
+	LKL_TEST(chdir_proc),
+	LKL_TEST(open_cwd),
+	LKL_TEST(getdents64),
+	LKL_TEST(close_dir_fd),
+	LKL_TEST(chdir_root),
+	LKL_TEST(umount_fs_proc),
+	LKL_TEST(lo_ifup),
+	LKL_TEST(gettid),
+	LKL_TEST(syscall_thread),
+	/*
+	 * Wine has an issue where the FlsCallback is not called when
+	 * the thread terminates which makes testing the automatic
+	 * syscall threads cleanup impossible under wine.
+	 */
+#ifndef __MINGW32__
+	LKL_TEST(many_syscall_threads),
+#endif
+	LKL_TEST(stop_kernel),
+};
+
+int main(int argc, const char **argv)
+{
+	lkl_host_ops.print = lkl_test_log;
+
+	return lkl_test_run(tests, sizeof(tests)/sizeof(struct lkl_test),
+			    "boot");
+}
diff --git a/tools/lkl/tests/boot.sh b/tools/lkl/tests/boot.sh
new file mode 100755
index 000000000000..d985c04b0ac1
--- /dev/null
+++ b/tools/lkl/tests/boot.sh
@@ -0,0 +1,9 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+source $script_dir/test.sh
+
+lkl_test_plan 1 "boot"
+lkl_test_run 1
+lkl_test_exec $script_dir/boot
diff --git a/tools/lkl/tests/cla.c b/tools/lkl/tests/cla.c
new file mode 100644
index 000000000000..a34badeb5f06
--- /dev/null
+++ b/tools/lkl/tests/cla.c
@@ -0,0 +1,159 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <string.h>
+#include <errno.h>
+#include <stdlib.h>
+#ifdef __MINGW32__
+#include <winsock2.h>
+#else
+#include <sys/socket.h>
+#include <netinet/in.h>
+#include <arpa/inet.h>
+#endif
+
+#include "cla.h"
+
+static int cl_arg_parse_bool(struct cl_arg *arg, const char *value)
+{
+	*((int *)arg->store) = 1;
+	return 0;
+}
+
+static int cl_arg_parse_str(struct cl_arg *arg, const char *value)
+{
+	*((const char **)arg->store) = value;
+	return 0;
+}
+
+static int cl_arg_parse_int(struct cl_arg *arg, const char *value)
+{
+	errno = 0;
+	*((int *)arg->store) = strtol(value, NULL, 0);
+	return errno == 0 ? 0 : -1;
+}
+
+static int cl_arg_parse_str_set(struct cl_arg *arg, const char *value)
+{
+	const char **set = arg->set;
+	int i;
+
+	for (i = 0; set[i] != NULL; i++) {
+		if (strcmp(set[i], value) == 0) {
+			*((int *)arg->store) = i;
+			return 0;
+		}
+	}
+
+	return -1;
+}
+
+static int cl_arg_parse_ipv4(struct cl_arg *arg, const char *value)
+{
+	unsigned int addr;
+
+	if (!value)
+		return -1;
+
+	addr = inet_addr(value);
+	if (addr == INADDR_NONE)
+		return -1;
+	*((unsigned int *)arg->store) = addr;
+	return 0;
+}
+
+static cl_arg_parser_t parsers[] = {
+	[CL_ARG_BOOL] = cl_arg_parse_bool,
+	[CL_ARG_INT] = cl_arg_parse_int,
+	[CL_ARG_STR] = cl_arg_parse_str,
+	[CL_ARG_STR_SET] = cl_arg_parse_str_set,
+	[CL_ARG_IPV4] = cl_arg_parse_ipv4,
+};
+
+static struct cl_arg *find_short_arg(char name, struct cl_arg *args)
+{
+	struct cl_arg *arg;
+
+	for (arg = args; arg->short_name != 0; arg++) {
+		if (arg->short_name == name)
+			return arg;
+	}
+
+	return NULL;
+}
+
+static struct cl_arg *find_long_arg(const char *name, struct cl_arg *args)
+{
+	struct cl_arg *arg;
+
+	for (arg = args; arg->long_name; arg++) {
+		if (strcmp(arg->long_name, name) == 0)
+			return arg;
+	}
+
+	return NULL;
+}
+
+static void print_help(struct cl_arg *args)
+{
+	struct cl_arg *arg;
+
+	fprintf(stderr, "usage:\n");
+	for (arg = args; arg->long_name; arg++) {
+		fprintf(stderr, "-%c, --%-20s %s", arg->short_name,
+			arg->long_name, arg->help);
+		if (arg->type == CL_ARG_STR_SET) {
+			const char **set = arg->set;
+
+			fprintf(stderr, " [ ");
+			while (*set != NULL)
+				fprintf(stderr, "%s ", *(set++));
+			fprintf(stderr, "]");
+		}
+		fprintf(stderr, "\n");
+	}
+}
+
+int parse_args(int argc, const char **argv, struct cl_arg *args)
+{
+	int i;
+
+	for (i = 1; i < argc; i++) {
+		struct cl_arg *arg = NULL;
+		cl_arg_parser_t parser;
+
+		if (argv[i][0] == '-') {
+			if (argv[i][1] != '-')
+				arg = find_short_arg(argv[i][1], args);
+			else
+				arg = find_long_arg(&argv[i][2], args);
+		}
+
+		if (!arg) {
+			fprintf(stderr, "unknown option '%s'\n", argv[i]);
+			print_help(args);
+			return -1;
+		}
+
+		if (arg->type == CL_ARG_USER || arg->type >= CL_ARG_END)
+			parser = arg->parser;
+		else
+			parser = parsers[arg->type];
+
+		if (!parser) {
+			fprintf(stderr, "can't parse --'%s'/-'%c'\n",
+				arg->long_name, arg->short_name);
+			return -1;
+		}
+
+		if (parser(arg, argv[i + 1]) < 0) {
+			fprintf(stderr, "can't parse '%s'\n", argv[i]);
+			print_help(args);
+			return -1;
+		}
+
+		if (arg->has_arg)
+			i++;
+	}
+
+	return 0;
+}
diff --git a/tools/lkl/tests/cla.h b/tools/lkl/tests/cla.h
new file mode 100644
index 000000000000..f8369be02e5a
--- /dev/null
+++ b/tools/lkl/tests/cla.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_TEST_CLA_H
+#define _LKL_TEST_CLA_H
+
+enum cl_arg_type {
+	CL_ARG_USER = 0,
+	CL_ARG_BOOL,
+	CL_ARG_INT,
+	CL_ARG_STR,
+	CL_ARG_STR_SET,
+	CL_ARG_IPV4,
+	CL_ARG_END,
+};
+
+struct cl_arg;
+
+typedef int (*cl_arg_parser_t)(struct cl_arg *arg, const char *value);
+
+struct cl_arg {
+	const char *long_name;
+	char short_name;
+	const char *help;
+	int has_arg;
+	enum cl_arg_type type;
+	void *store;
+	void *set;
+	cl_arg_parser_t parser;
+};
+
+int parse_args(int argc, const char **argv, struct cl_arg *args);
+
+
+#endif /* _LKL_TEST_CLA_H */
diff --git a/tools/lkl/tests/disk.c b/tools/lkl/tests/disk.c
new file mode 100644
index 000000000000..0aa039876b54
--- /dev/null
+++ b/tools/lkl/tests/disk.c
@@ -0,0 +1,189 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <unistd.h>
+#include <string.h>
+#include <time.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <lkl.h>
+#include <lkl_host.h>
+#ifndef __MINGW32__
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#else
+#include <windows.h>
+#endif
+
+#include "test.h"
+#include "cla.h"
+
+static struct {
+	int printk;
+	const char *disk;
+	const char *fstype;
+	int partition;
+} cla;
+
+struct cl_arg args[] = {
+	{"disk", 'd', "disk file to use", 1, CL_ARG_STR, &cla.disk},
+	{"partition", 'P', "partition to mount", 1, CL_ARG_INT, &cla.partition},
+	{"type", 't', "filesystem type", 1, CL_ARG_STR, &cla.fstype},
+	{0},
+};
+
+
+static struct lkl_disk disk;
+static int disk_id = -1;
+
+int lkl_test_disk_add(void)
+{
+#ifdef __MINGW32__
+	disk.handle = CreateFile(cla.disk, GENERIC_READ | GENERIC_WRITE,
+			       0, NULL, OPEN_EXISTING, 0, NULL);
+	if (disk.handle == INVALID_HANDLE_VALUE)
+#else
+	disk.fd = open(cla.disk, O_RDWR);
+	if (disk.fd < 0)
+#endif
+		goto out_unlink;
+
+	disk.ops = NULL;
+
+	disk_id = lkl_disk_add(&disk);
+	if (disk_id < 0)
+		goto out_close;
+
+	goto out;
+
+out_close:
+#ifdef __MINGW32__
+	CloseHandle(disk.handle);
+#else
+	close(disk.fd);
+#endif
+
+out_unlink:
+#ifdef __MINGW32__
+	DeleteFile(cla.disk);
+#else
+	unlink(cla.disk);
+#endif
+
+out:
+	lkl_test_logf("disk fd/handle %x disk_id %d", disk.fd, disk_id);
+
+	if (disk_id >= 0)
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+int lkl_test_disk_remove(void)
+{
+	int ret;
+
+	ret = lkl_disk_remove(disk);
+
+#ifdef __MINGW32__
+	CloseHandle(disk.handle);
+#else
+	close(disk.fd);
+#endif
+
+	if (ret == 0)
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+
+static char mnt_point[32];
+
+LKL_TEST_CALL(mount_dev, lkl_mount_dev, 0, disk_id, cla.partition, cla.fstype,
+	      0, NULL, mnt_point, sizeof(mnt_point))
+
+static int lkl_test_umount_dev(void)
+{
+	long ret, ret2;
+
+	ret = lkl_sys_chdir("/");
+
+	ret2 = lkl_umount_dev(disk_id, cla.partition, 0, 1000);
+
+	lkl_test_logf("%ld %ld", ret, ret2);
+
+	if (!ret && !ret2)
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+struct lkl_dir *dir;
+
+static int lkl_test_opendir(void)
+{
+	int err;
+
+	dir = lkl_opendir(mnt_point, &err);
+
+	lkl_test_logf("lkl_opendir(%s) = %d %s\n", mnt_point, err,
+		      lkl_strerror(err));
+
+	if (err == 0)
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+static int lkl_test_readdir(void)
+{
+	struct lkl_linux_dirent64 *de = lkl_readdir(dir);
+	int wr = 0;
+
+	while (de) {
+		wr += lkl_test_logf("%s ", de->d_name);
+		if (wr >= 70) {
+			lkl_test_logf("\n");
+			wr = 0;
+			break;
+		}
+		de = lkl_readdir(dir);
+	}
+
+	if (lkl_errdir(dir) == 0)
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+LKL_TEST_CALL(closedir, lkl_closedir, 0, dir);
+LKL_TEST_CALL(chdir_mnt_point, lkl_sys_chdir, 0, mnt_point);
+LKL_TEST_CALL(start_kernel, lkl_start_kernel, 0, &lkl_host_ops,
+	     "mem=16M loglevel=8");
+LKL_TEST_CALL(stop_kernel, lkl_sys_halt, 0);
+
+struct lkl_test tests[] = {
+	LKL_TEST(disk_add),
+	LKL_TEST(start_kernel),
+	LKL_TEST(mount_dev),
+	LKL_TEST(chdir_mnt_point),
+	LKL_TEST(opendir),
+	LKL_TEST(readdir),
+	LKL_TEST(closedir),
+	LKL_TEST(umount_dev),
+	LKL_TEST(stop_kernel),
+	LKL_TEST(disk_remove),
+
+};
+
+int main(int argc, const char **argv)
+{
+	if (parse_args(argc, argv, args) < 0)
+		return -1;
+
+	lkl_host_ops.print = lkl_test_log;
+
+	return lkl_test_run(tests, sizeof(tests)/sizeof(struct lkl_test),
+			    "disk %s", cla.fstype);
+}
diff --git a/tools/lkl/tests/disk.sh b/tools/lkl/tests/disk.sh
new file mode 100755
index 000000000000..e2ec6cf69d4b
--- /dev/null
+++ b/tools/lkl/tests/disk.sh
@@ -0,0 +1,70 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+
+source $script_dir/test.sh
+
+function prepfs()
+{
+    set -e
+
+    file=`mktemp`
+
+    dd if=/dev/zero of=$file bs=1024 count=204800
+
+    yes | mkfs.$1 $file
+
+    if ! [ -z $ANDROID_WDIR ]; then
+        adb shell mkdir -p $ANDROID_WDIR
+        adb push $file $ANDROID_WDIR
+        rm $file
+        file=$ANDROID_WDIR/$(basename $file)
+    fi
+    if ! [ -z $BSD_WDIR ]; then
+        $MYSSH mkdir -p $BSD_WDIR
+        ssh_copy $file $BSD_WDIR
+        rm $file
+        file=$BSD_WDIR/$(basename $file)
+    fi
+
+    export_vars file
+}
+
+function cleanfs()
+{
+    set -e
+
+    if ! [ -z $ANDROID_WDIR ]; then
+        adb shell rm $1
+        adb shell rm $ANDROID_WDIR/disk
+    elif ! [ -z $BSD_WDIR ]; then
+        $MYSSH rm $1
+        $MYSSH rm $BSD_WDIR/disk
+    else
+        rm $1
+    fi
+}
+
+if [ "$1" = "-t" ]; then
+    shift
+    fstype=$1
+    shift
+fi
+
+if [ -z "$fstype" ]; then
+    fstype="ext4"
+fi
+
+if [ -z $(which mkfs.$fstype) ]; then
+    lkl_test_plan 0 "disk $fstype"
+    echo "no mkfs.$fstype command"
+    exit 0
+fi
+
+lkl_test_plan 1 "disk $fstype"
+lkl_test_run 1 prepfs $fstype
+lkl_test_exec $script_dir/disk -d $file -t $fstype $@
+lkl_test_plan 1 "disk $fstype"
+lkl_test_run 1 cleanfs $file
+
diff --git a/tools/lkl/tests/run.py b/tools/lkl/tests/run.py
new file mode 100755
index 000000000000..f21733339fe9
--- /dev/null
+++ b/tools/lkl/tests/run.py
@@ -0,0 +1,186 @@
+#!/usr/bin/env python
+# SPDX-License-Identifier: GPL-2.0
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; version 2 of the License
+#
+# Author: Octavian Purdila <tavi@cs.pub.ro>
+#
+
+from __future__ import print_function
+
+import argparse
+import os
+import subprocess
+import sys
+import tap13
+import xml.etree.ElementTree as ET
+
+from junit_xml import TestSuite, TestCase
+
+
+class Reporter(tap13.Reporter):
+    def start(self, obj):
+        if type(obj) is tap13.Test:
+            if obj.result == "*":
+                end='\r'
+            else:
+                end='\n'
+            print("  TEST       %-8s %.50s" %
+                  (obj.result, obj.description + " " + obj.comment), end=end)
+
+        elif type(obj) is tap13.Suite:
+            if obj.tests_planned == 0:
+                status = "skip"
+            else:
+                status = ""
+            print("  SUITE      %-8s %s" % (status, obj.name))
+
+    def end(self, obj):
+        if type(obj) is tap13.Test:
+            if obj.result != "ok":
+                try:
+                    print(obj.yaml["log"], end='')
+                except:
+                    None
+
+
+mydir=os.path.dirname(os.path.realpath(__file__))
+
+tests = [
+    'boot.sh',
+    'disk.sh -t ext4',
+    'disk.sh -t btrfs',
+    'disk.sh -t vfat',
+    'disk.sh -t xfs',
+    'net.sh -b loopback',
+    'net.sh -b tap',
+    'net.sh -b pipe',
+    'net.sh -b raw',
+    'net.sh -b macvtap',
+    'lklfuse.sh -t ext4',
+    'lklfuse.sh -t btrfs',
+    'lklfuse.sh -t vfat',
+    'lklfuse.sh -t xfs',
+    'hijack-test.sh'
+]
+
+parser = argparse.ArgumentParser(description='LKL test runner')
+parser.add_argument('tests', nargs='?', action='append',
+                    help='tests to run %s' % tests)
+parser.add_argument('--junit-dir',
+                    help='directory where to store the junit suites')
+parser.add_argument('--gdb', action='store_true', default=False,
+                    help='run simple tests under gdb; implies --pass-through')
+parser.add_argument('--pass-through', action='store_true',  default=False,
+                    help='run the test without interpreting the test output')
+parser.add_argument('--valgrind', action='store_true', default=False,
+                    help='run simple tests under valgrind')
+
+args = parser.parse_args()
+if args.tests == [None]:
+    args.tests = tests
+
+if args.gdb:
+    args.pass_through=True
+    os.environ['GDB']="yes"
+
+if args.valgrind:
+    os.environ['VALGRIND']="yes"
+
+tap = tap13.Parser(Reporter())
+
+os.environ['PATH'] += ":" + mydir
+
+exit_code = 0
+
+for t in args.tests:
+    if not t:
+        continue
+    if args.pass_through:
+        print(t)
+        if subprocess.call(t, shell=True) != 0:
+            exit_code = 1
+    else:
+        p = subprocess.Popen(t, shell=True, stdout=subprocess.PIPE)
+        tap.parse(p.stdout)
+
+if args.pass_through:
+    sys.exit(exit_code)
+
+suites_count = 0
+tests_total = 0
+tests_not_ok = 0
+tests_ok = 0
+tests_skip = 0
+val_errs = 0
+val_fails = 0
+val_skips = 0
+
+for s in tap.run.suites:
+
+    junit_tests = []
+    suites_count += 1
+
+    for t in s.tests:
+        try:
+            secs = t.yaml["time_us"] / 1000000.0
+        except:
+            secs = 0
+        try:
+            log = t.yaml['log']
+        except:
+            log = ""
+
+        jt = TestCase(t.description, elapsed_sec=secs, stdout=log)
+        if t.result == 'skip':
+            jt.add_skipped_info(output=log)
+        elif t.result == 'not ok':
+            jt.add_error_info(output=log)
+
+        junit_tests.append(jt)
+
+        tests_total += 1
+        if t.result == "ok":
+            tests_ok += 1
+        elif t.result == "not ok":
+            tests_not_ok += 1
+            exit_code = 1
+        elif t.result == "skip":
+            tests_skip += 1
+
+    if args.junit_dir:
+        js = TestSuite(s.name, junit_tests)
+        with open(os.path.join(args.junit_dir, os.path.basename(s.name) + '.xml'), 'w') as f:
+            js.to_file(f, [js])
+
+        if os.getenv('VALGRIND') is not None:
+            val_xml = 'valgrind-%s.xml' % os.path.basename(s.name).replace(' ','-')
+            # skipped tests don't generate xml file
+            if os.path.exists(val_xml) is False:
+                continue
+
+            cmd = 'mv %s %s' % (val_xml, args.junit_dir)
+            subprocess.call(cmd, shell=True, )
+
+            cmd = mydir + '/valgrind2xunit.py ' + val_xml
+            subprocess.call(cmd, shell=True, cwd=args.junit_dir)
+
+            # count valgrind results
+            doc = ET.parse(os.path.join(args.junit_dir, 'valgrind-%s_xunit.xml' \
+                                        % (os.path.basename(s.name).replace(' ','-'))))
+            ts = doc.getroot()
+            val_errs += int(ts.get('errors'))
+            val_fails += int(ts.get('failures'))
+            val_skips += int(ts.get('skip'))
+
+print("Summary: %d suites run, %d tests, %d ok, %d not ok, %d skipped" %
+      (suites_count, tests_total, tests_ok, tests_not_ok, tests_skip))
+
+if os.getenv('VALGRIND') is not None:
+    print(" valgrind (memcheck): %d failures, %d skipped" % (val_fails, val_skips))
+    if val_errs or val_fails:
+        exit_code = 1
+
+sys.exit(exit_code)
diff --git a/tools/lkl/tests/tap13.py b/tools/lkl/tests/tap13.py
new file mode 100644
index 000000000000..65c73cda7ca1
--- /dev/null
+++ b/tools/lkl/tests/tap13.py
@@ -0,0 +1,209 @@
+#!/usr/bin/env python
+# SPDX-License-Identifier: GPL-2.0
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; version 2 of the License
+#
+# Author: Octavian Purdila <tavi@cs.pub.ro>
+#
+# Based on TAP13:
+#
+# Copyright 2013, Red Hat, Inc.
+# Author: Josef Skladanka <jskladan@redhat.com>
+#
+from __future__ import print_function
+
+import re
+import sys
+import yamlish
+
+
+class Reporter(object):
+
+    def start(self, obj):
+        None
+
+    def end(self, obj):
+        None
+
+
+class Test(object):
+    def __init__(self, reporter, result, id, description=None, directive=None,
+                 comment=None):
+        self.reporter = reporter
+        self.result = result
+        if directive:
+            self.result = directive.lower()
+        if id:
+            self.id = int(id)
+        else:
+            self.id = None
+        if description:
+            self.description = description
+        else:
+            self.description = ""
+        if comment:
+            self.comment = "# " + comment
+        else:
+            self.comment = ""
+        self.yaml = None
+        self._yaml_buffer = None
+        self.diagnostics = []
+
+        self.reporter.start(self)
+
+    def end(self):
+        if not self.yaml:
+            self.yaml = yamlish.load(self._yaml_buffer)
+            self.reporter.end(self)
+
+
+class Suite(object):
+    def __init__(self, reporter, start, end, explanation):
+        self.reporter = reporter
+        self.tests = []
+        self.name = explanation
+        self.tests_planned = int(end)
+
+        self.__tests_counter = 0
+        self.__tests_base = 0
+
+        self.reporter.start(self)
+
+    def newTest(self, args):
+        try:
+            self.tests[-1].end()
+        except IndexError:
+            None
+
+        if 'id' not in args or not args['id']:
+            args['id'] = self.__tests_counter
+        else:
+            args['id'] = int(args['id']) + self.__tests_base
+
+        if args['id'] < self.__tests_counter:
+            print("error: bad test id %d, fixing it" % (args['id']))
+            args['id'] = self.__tests_counter
+        # according to TAP13 specs, missing tests must be handled as 'not ok'
+        # here we add the missing tests in sequence
+        while args['id'] > (self.__tests_counter + 1):
+            comment = 'test %d not present' % self.__tests_counter
+            self.tests.append(Test(self.reporter, 'not ok',
+                                   self.__tests_counter, comment=comment))
+            self.__tests_counter += 1
+
+        if args['id'] == self.__tests_counter:
+            if args['directive']:
+                self.test().result = args['directive'].lower()
+            else:
+                self.test().result = args['result']
+            self.reporter.start(self.test())
+        else:
+            self.tests.append(Test(self.reporter, **args))
+            self.__tests_counter += 1
+
+    def test(self):
+        return self.tests[-1]
+
+    def end(self, name, planned):
+        if name == self.name:
+            self.tests_planned += int(planned)
+            self.__tests_base = self.__tests_counter
+            return False
+        try:
+            self.test().end()
+        except IndexError:
+            None
+        if len(self.tests) != self.tests_planned:
+            for i in range(len(self.tests), self.tests_planned):
+                self.tests.append(Test(self.reporter, 'not ok', i+1,
+                                       comment='test not present'))
+        return True
+
+
+class Run(object):
+
+    def __init__(self, reporter):
+        self.reporter = reporter
+        self.suites = []
+
+    def suite(self):
+        return self.suites[-1]
+
+    def test(self):
+        return self.suites[-1].tests[-1]
+
+    def newSuite(self, args):
+        new = False
+        try:
+            if self.suite().end(args['explanation'], args['end']):
+                new = True
+        except IndexError:
+            new = True
+        if new:
+            self.suites.append(Suite(self.reporter, **args))
+
+    def newTest(self, args):
+        self.suite().newTest(args)
+
+
+class Parser(object):
+    RE_PLAN = re.compile(r"^\s*(?P<start>\d+)\.\.(?P<end>\d+)\s*(#\s*(?P<explanation>.*))?\s*$")
+    RE_TEST_LINE = re.compile(r"^\s*(?P<result>(not\s+)?ok|[*]+)\s*(?P<id>\d+)?\s*(?P<description>[^#]+)?\s*(#\s*(?P<directive>TODO|SKIP)?\s*(?P<comment>.+)?)?\s*$",  re.IGNORECASE)
+    RE_EXPLANATION = re.compile(r"^\s*#\s*(?P<explanation>.+)?\s*$")
+    RE_YAMLISH_START = re.compile(r"^\s*---.*$")
+    RE_YAMLISH_END = re.compile(r"^\s*\.\.\.\s*$")
+
+    def __init__(self, reporter):
+        self.seek_test = False
+        self.in_test = False
+        self.in_yaml = False
+        self.run = Run(reporter)
+
+    def parse(self, source):
+        # to avoid input buffering
+        while True:
+            line = source.readline()
+            if not line:
+                break
+
+            if self.in_yaml:
+                if Parser.RE_YAMLISH_END.match(line):
+                    self.run.test()._yaml_buffer.append(line.strip())
+                    self.in_yaml = False
+                else:
+                    self.run.test()._yaml_buffer.append(line.rstrip())
+                continue
+
+            line = line.strip()
+
+            if self.in_test:
+                if Parser.RE_EXPLANATION.match(line):
+                    self.run.test().diagnostics.append(line)
+                    continue
+                if Parser.RE_YAMLISH_START.match(line):
+                    self.run.test()._yaml_buffer = [line.strip()]
+                    self.in_yaml = True
+                    continue
+
+            m = Parser.RE_PLAN.match(line)
+            if m:
+                self.seek_test = True
+                args = m.groupdict()
+                self.run.newSuite(args)
+                continue
+
+            if self.seek_test:
+                m = Parser.RE_TEST_LINE.match(line)
+                if m:
+                    args = m.groupdict()
+                    self.run.newTest(args)
+                    self.in_test = True
+                    continue
+
+            print(line)
+        try:
+            self.run.suite().end(None, 0)
+        except IndexError:
+            None
diff --git a/tools/lkl/tests/test.c b/tools/lkl/tests/test.c
new file mode 100644
index 000000000000..3e334d106c48
--- /dev/null
+++ b/tools/lkl/tests/test.c
@@ -0,0 +1,126 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <stdarg.h>
+#include <time.h>
+
+#include "test.h"
+
+/* circular log buffer */
+
+static char log_buf[0x10000];
+static char *head = log_buf, *tail = log_buf;
+
+static inline void advance(char **ptr)
+{
+	if ((unsigned int)(*ptr - log_buf) >= sizeof(log_buf) - 1)
+		*ptr = log_buf;
+	else
+		*ptr = *ptr + 1;
+}
+
+static void log_char(char c)
+{
+	*tail = c;
+	advance(&tail);
+	if (tail == head)
+		advance(&head);
+}
+
+static void print_log(void)
+{
+	char last;
+
+	printf(" log: |\n");
+	last = '\n';
+	while (head != tail) {
+		if (last == '\n')
+			printf("  ");
+		last = *head;
+		putchar(last);
+		advance(&head);
+	}
+	if (last != '\n')
+		putchar('\n');
+}
+
+int lkl_test_run(const struct lkl_test *tests, int nr, const char *fmt, ...)
+{
+	int i, ret, status = TEST_SUCCESS;
+	clock_t start, stop;
+	char name[1024];
+	va_list args;
+
+	va_start(args, fmt);
+	vsnprintf(name, sizeof(name), fmt, args);
+	va_end(args);
+
+	printf("1..%d # %s\n", nr, name);
+	for (i = 1; i <= nr; i++) {
+		const struct lkl_test *t = &tests[i-1];
+		unsigned long delta_us;
+
+		printf("* %d %s\n", i, t->name);
+		fflush(stdout);
+
+		start = clock();
+
+		ret = t->fn(t->arg1, t->arg2, t->arg3);
+
+		stop = clock();
+
+		switch (ret) {
+		case TEST_SUCCESS:
+			printf("ok %d %s\n", i, t->name);
+			break;
+		case TEST_SKIP:
+			printf("ok %d %s # SKIP\n", i, t->name);
+			break;
+		case TEST_BAILOUT:
+			status = TEST_BAILOUT;
+			/* fall through */
+		case TEST_FAILURE:
+		default:
+			if (status != TEST_BAILOUT)
+				status = TEST_FAILURE;
+			printf("not ok %d %s\n", i, t->name);
+		}
+
+		printf(" ---\n");
+		delta_us = (stop - start) * 1000000 / CLOCKS_PER_SEC;
+		printf(" time_us: %lu\n", delta_us);
+		print_log();
+		printf(" ...\n");
+
+		if (status == TEST_BAILOUT) {
+			printf("Bail out!\n");
+			return TEST_FAILURE;
+		}
+
+		fflush(stdout);
+	}
+
+	return status;
+}
+
+
+void lkl_test_log(const char *str, int len)
+{
+	while (len--)
+		log_char(*(str++));
+}
+
+int lkl_test_logf(const char *fmt, ...)
+{
+	char tmp[1024], *c;
+	va_list args;
+	unsigned int n;
+
+	va_start(args, fmt);
+	n = vsnprintf(tmp, sizeof(tmp), fmt, args);
+	va_end(args);
+
+	for (c = tmp; *c != 0; c++)
+		log_char(*c);
+
+	return n > sizeof(tmp) ? sizeof(tmp) : n;
+}
diff --git a/tools/lkl/tests/test.h b/tools/lkl/tests/test.h
new file mode 100644
index 000000000000..f63ad6d419cb
--- /dev/null
+++ b/tools/lkl/tests/test.h
@@ -0,0 +1,72 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_TEST_H
+#define _LKL_TEST_H
+
+#define TEST_SUCCESS	0
+#define TEST_FAILURE	1
+#define TEST_SKIP	2
+#define TEST_TODO	3
+#define TEST_BAILOUT	4
+
+struct lkl_test {
+	const char *name;
+	int (*fn)();
+	void *arg1, *arg2, *arg3;
+};
+
+/**
+ * Simple wrapper to initialize a test entry.
+ * @name - test name; the test function is assumed to be named lkl_test_@name
+ * @vargs - arguments to be passed to the function
+ */
+#define LKL_TEST(name, ...) { #name, lkl_test_##name, __VA_ARGS__ }
+
+/**
+ * lkl_test_run - run a test suite
+ *
+ * @tests - the list of tests to run
+ * @nr - number of tests
+ * @fmt - format string to be used for suite name
+ */
+int lkl_test_run(const struct lkl_test *tests, int nr, const char *fmt, ...);
+
+/**
+ * lkl_test_log - store a string in the test log buffer
+ * @str - the string to log (need not be NUL-terminated)
+ * @len - the string length
+ */
+void lkl_test_log(const char *str, int len);
+
+/**
+ * lkl_test_logf - printf like function to store into the test log buffer
+ * @fmt - printf format string
+ * @vargs - arguments to the format string
+ */
+int lkl_test_logf(const char *fmt, ...) __attribute__((format(printf, 1, 2)));
+
+/**
+ * LKL_TEST_CALL - create a test function for an LKL call
+ *
+ * The test function will be named lkl_test_@name and will return
+ * TEST_SUCCESS if the called function returns @expect. Otherwise it
+ * will return TEST_FAILURE.
+ *
+ * @name - test name; must be unique because it is part of the name of
+ * the test function, which will be named lkl_test_@name
+ * @call - function to call
+ * @expect - expected return value for success
+ * @args - arguments to pass to the LKL call
+ */
+#define LKL_TEST_CALL(name, call, expect, ...)				\
+	static int lkl_test_##name(void)				\
+	{								\
+		long ret;						\
+									\
+		ret = call(__VA_ARGS__);				\
+		lkl_test_logf("%s(%s) = %ld %s\n", #call, #__VA_ARGS__, \
+			ret, ret < 0 ? lkl_strerror(ret) : "");		\
+		return (ret == expect) ? TEST_SUCCESS : TEST_FAILURE;	\
+	}
+
+
+#endif /* _LKL_TEST_H */
diff --git a/tools/lkl/tests/test.sh b/tools/lkl/tests/test.sh
new file mode 100644
index 000000000000..f1500d19de20
--- /dev/null
+++ b/tools/lkl/tests/test.sh
@@ -0,0 +1,240 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+basedir=$(cd $script_dir/..; pwd)
+source ${script_dir}/autoconf.sh
+
+TEST_SUCCESS=0
+TEST_FAILURE=1
+TEST_SKIP=113
+TEST_TODO=114
+TEST_BAILOUT=115
+
+print_log()
+{
+    echo " log: |"
+    while read line; do
+        echo "  $line"
+    done < $1
+}
+
+export_vars()
+{
+    if [ -z "$var_file" ]; then
+        return
+    fi
+
+    for i in $@; do
+        echo "$i=${!i}" >> $var_file
+    done
+}
+
+lkl_test_run()
+{
+    log_file=$(mktemp)
+    export var_file=$(mktemp)
+
+    tid=$1 && shift && tname=$@
+
+    echo "* $tid $tname"
+
+    start=$(date '+%s%9N')
+    # run in a separate shell to avoid -e terminating us
+    $@ 2>&1 | strings >$log_file
+    exit=${PIPESTATUS[0]}
+    stop=$(date '+%s%9N')
+
+    case $exit in
+    $TEST_SUCCESS)
+        echo "ok $tid $tname"
+        ;;
+    $TEST_SKIP)
+        echo "ok $tid $tname # SKIP"
+        ;;
+    $TEST_BAILOUT)
+        echo "not ok $tid $tname"
+        echo "Bail out!"
+        ;;
+    $TEST_FAILURE|*)
+        echo "not ok $tid $tname"
+        ;;
+    esac
+
+    delta=$(((stop-start)/1000))
+
+    echo " ---"
+    echo " time_us: $delta"
+    print_log $log_file
+    echo -e " ..."
+
+    rm $log_file
+    . $var_file
+    rm $var_file
+
+    return $exit
+}
+
+lkl_test_plan()
+{
+    echo "1..$1 # $2"
+    export suite_name="${2// /\-}"
+}
+
+lkl_test_exec()
+{
+    local SUDO=""
+    local WRAPPER=""
+
+    if [ "$1" = "sudo" ]; then
+        SUDO=sudo
+        shift
+    fi
+
+    local file=$1
+    shift
+
+    if [ -n "$LKL_HOST_CONFIG_NT" ]; then
+        file=$file.exe
+    fi
+
+    if file $file | grep "interpreter /system/bin/linker" ; then
+        adb push "$file" $ANDROID_WDIR
+        if [ -n "$SUDO" ]; then
+            ANDROID_USER=root
+            SUDO=""
+        fi
+        if [ -n "$ANDROID_USER" ]; then
+            SU="su $ANDROID_USER"
+        else
+            SU=""
+        fi
+        WRAPPER="adb shell $SU"
+        file=$ANDROID_WDIR/$(basename $file)
+    elif file $file | grep PE32; then
+        WRAPPER="wine"
+    elif file $file | grep ARM; then
+        WRAPPER="qemu-arm-static"
+    elif file $file | grep "FreeBSD" ; then
+        ssh_copy "$file" $BSD_WDIR
+        if [ -n "$SUDO" ]; then
+            SUDO=""
+        fi
+        WRAPPER="$MYSSH $SU"
+        # ssh will mess up with pipes ('|') so, escape the pipe char.
+        args="${@//\|/\\\|}"
+        set - $BSD_WDIR/$(basename $file) $args
+        file=""
+    elif [ -n "$GDB" ]; then
+        WRAPPER="gdb"
+        args="$@"
+        set - -ex "run $args" -ex quit $file
+        file=""
+    elif [ -n "$VALGRIND" ]; then
+        WRAPPER="valgrind --suppressions=$script_dir/valgrind.supp \
+                  --leak-check=full --show-leak-kinds=all --xml=yes \
+                  --xml-file=valgrind-$suite_name.xml"
+    fi
+
+    $SUDO $WRAPPER $file "$@"
+}
+
+lkl_test_cmd()
+{
+    local WRAPPER=""
+
+    if [ -z "$QUIET" ]; then
+        SHOPTS="-x"
+    fi
+
+    if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+        if [ "$1" = "sudo" ]; then
+            ANDROID_USER=root
+            shift
+        fi
+        if [ -n "$ANDROID_USER" ]; then
+            SU="su $ANDROID_USER"
+        else
+            SU=""
+        fi
+        WRAPPER="adb shell $SU"
+    elif [ -n "$LKL_HOST_CONFIG_BSD" ]; then
+        WRAPPER="$MYSSH $SU"
+    fi
+
+    echo "$@" | $WRAPPER sh $SHOPTS
+}
+
+adb_push()
+{
+    while [ -n "$1" ]; do
+        if [[ "$1" = *.sh ]]; then
+            type="script"
+        else
+            type="file"
+        fi
+
+        dir=$(dirname $1)
+        adb shell mkdir -p $ANDROID_WDIR/$dir
+
+        if [ "$type" = "script" ]; then
+            sed "s/\/usr\/bin\/env bash/\/system\/bin\/sh/" $basedir/$1 | \
+                adb shell cat \> $ANDROID_WDIR/$1
+            adb shell chmod a+x $ANDROID_WDIR/$1
+        else
+            adb push $basedir/$1 $ANDROID_WDIR/$dir
+        fi
+
+        shift
+    done
+}
+
+# XXX: $MYSSH and $MYSCP are defined in a circleci docker image.
+# see the definitions in lkl/lkl-docker:circleci/freebsd11/Dockerfile
+ssh_push()
+{
+    while [ -n "$1" ]; do
+        if [[ "$1" = *.sh ]]; then
+            type="script"
+        else
+            type="file"
+        fi
+
+        dir=$(dirname $1)
+        $MYSSH mkdir -p $BSD_WDIR/$dir
+
+        $MYSCP -P 7722 -r $basedir/$1 root@localhost:$BSD_WDIR/$dir
+        if [ "$type" = "script" ]; then
+            $MYSSH chmod a+x $BSD_WDIR/$1
+        fi
+
+        shift
+    done
+}
+
+ssh_copy()
+{
+    $MYSCP -P 7722 -r $1 root@localhost:$2
+}
+
+lkl_test_android_cleanup()
+{
+    adb shell rm -rf $ANDROID_WDIR
+}
+
+lkl_test_bsd_cleanup()
+{
+    $MYSSH rm -rf $BSD_WDIR
+}
+
+if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+    trap lkl_test_android_cleanup EXIT
+    export ANDROID_WDIR=/data/local/tmp/lkl
+    adb shell mkdir -p $ANDROID_WDIR
+fi
+
+if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
+    trap lkl_test_bsd_cleanup EXIT
+    export BSD_WDIR=/root/lkl
+    $MYSSH mkdir -p $BSD_WDIR
+fi
diff --git a/tools/lkl/tests/valgrind.supp b/tools/lkl/tests/valgrind.supp
new file mode 100644
index 000000000000..5ce717d759fc
--- /dev/null
+++ b/tools/lkl/tests/valgrind.supp
@@ -0,0 +1,85 @@
+{
+   <unfinished timer 1>
+   Memcheck:Leak
+   match-leak-kinds: possible
+   ...
+   fun:pthread_create@@GLIBC_2.2.5
+   fun:__start_helper_thread
+   fun:__pthread_once_slow
+   fun:timer_create@@GLIBC_2.3.3
+   fun:timer_alloc
+   fun:clockevent_set_state_oneshot
+   ...
+   fun:__clockevents_switch_state
+   fun:clockevents_switch_state
+   fun:tick_setup_periodic
+   ...
+}
+
+{
+   <pid1 kernel thread>
+   Memcheck:Leak
+   match-leak-kinds: possible
+   ...
+   fun:thread_create
+   fun:copy_thread
+   fun:copy_thread_tls
+   ...
+   fun:rest_init
+   fun:start_kernel
+   fun:lkl_run_kernel
+}
+
+{
+   <xfs uninitialized buf error: delete this once upstream is fixed>
+   Memcheck:Value8
+   fun:crc32_body
+   fun:crc32_le_generic
+   fun:__crc32c_le
+   fun:chksum_update
+   fun:crypto_shash_update
+   fun:crc32c
+   fun:xlog_cksum
+}
+
+{
+   <xfs pwrite64 issue: delete this once upstream is fixed>
+   Memcheck:Param
+   pwrite64(buf)
+   ...
+   fun:blk_request
+   fun:blk_enqueue
+   fun:virtio_process_one
+   fun:virtio_process_queue
+   fun:virtio_write
+   fun:__raw_writel
+   fun:writel
+   fun:vm_notify
+   fun:virtqueue_notify
+   fun:virtio_queue_rq
+   fun:blk_mq_dispatch_rq_list
+   fun:blk_mq_sched_dispatch_requests
+}
+
+{
+   <virtio_net_pipe xmits>
+   Memcheck:Param
+   writev(vector[...])
+   ...
+   fun:fd_net_tx
+   fun:net_enqueue
+   fun:virtio_process_one
+   fun:virtio_process_queue
+   fun:virtio_write
+   fun:__raw_writel
+   fun:writel
+   fun:vm_notify
+   fun:virtqueue_notify
+   fun:virtqueue_kick
+   fun:start_xmit
+   fun:__netdev_start_xmit
+   fun:netdev_start_xmit
+   fun:xmit_one
+   fun:dev_hard_start_xmit
+   fun:sch_direct_xmit
+}
\ No newline at end of file
diff --git a/tools/lkl/tests/valgrind2xunit.py b/tools/lkl/tests/valgrind2xunit.py
new file mode 100755
index 000000000000..ab7c12b83377
--- /dev/null
+++ b/tools/lkl/tests/valgrind2xunit.py
@@ -0,0 +1,69 @@
+#!/usr/bin/env python
+# SPDX-License-Identifier: GPL-2.0
+
+##
+## Downloaded from
+## http://humdi.net/wiki/tips/valgrind-to-xunit-xml-converter
+##
+
+import xml.etree.ElementTree as ET
+import sys
+import os
+
+fname = 'valgrind.xml'
+if len(sys.argv) > 1:
+    fname = sys.argv[1]
+
+doc = ET.parse(fname)
+errors = doc.findall('.//error')
+
+out = open(os.path.splitext(os.path.basename(fname))[0]+'_xunit.xml',"w")
+out.write('<?xml version="1.0" encoding="UTF-8"?>\n')
+out.write('<testsuite name="valgrind" tests="'+str(len(errors))+'" errors="0" failures="'+str(len(errors))+'" skip="0">\n')
+errorcount=0
+for error in errors:
+    errorcount=errorcount+1
+
+    kind = error.find('kind')
+    what = error.find('what')
+    if  what == None:
+        what = error.find('xwhat/text')
+
+    stack = error.find('stack')
+    frames = stack.findall('frame')
+
+    for frame in frames:
+        fi = frame.find('file')
+        li = frame.find('line')
+        if fi != None and li != None:
+            break
+
+    if fi != None and li != None:
+        out.write('    <testcase classname="ValgrindMemoryCheck" name="Memory check '+str(errorcount)+' ('+kind.text+', '+fi.text+':'+li.text+')" time="0">\n')
+    else:
+        out.write('    <testcase classname="ValgrindMemoryCheck" name="Memory check '+str(errorcount)+' ('+kind.text+')" time="0">\n')
+    out.write('        <error type="'+kind.text+'">\n')
+    out.write('  '+what.text+'\n\n')
+
+    for frame in frames:
+        ip = frame.find('ip')
+        fn = frame.find('fn')
+        fi = frame.find('file')
+        li = frame.find('line')
+
+        if fn is None:
+            bodytext = '(unresolved symbol)'
+        else:
+            bodytext = fn.text
+        bodytext = bodytext.replace("&","&amp;")
+        bodytext = bodytext.replace("<","&lt;")
+        bodytext = bodytext.replace(">","&gt;")
+        if fi != None and li != None:
+            out.write('  '+ip.text+': '+bodytext+' ('+fi.text+':'+li.text+')\n')
+        else:
+            out.write('  '+ip.text+': '+bodytext+'\n')
+
+    out.write('        </error>\n')
+    out.write('    </testcase>\n')
+out.write('</testsuite>\n')
+out.close()
-- 
2.20.1 (Apple Git-117)


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um



* [RFC PATCH 22/47] lkl tools: tool that converts a filesystem image to tar
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (20 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 21/47] lkl tools: "boot" test Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 23/47] lkl tools: tool that reads/writes to/from a filesystem image Hajime Tazaki
                   ` (27 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Petros Angelatos, Conrad Meyer, Hajime Tazaki,
	Akira Moroo

From: Octavian Purdila <tavi.purdila@gmail.com>

Add a simple utility that converts a filesystem image to a tar file,
preserving file permissions and extended attributes.

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/fs2tar.c | 410 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 410 insertions(+)
 create mode 100644 tools/lkl/fs2tar.c
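
Note for reviewers: copy_xattr() walks the buffer filled in by
lkl_sys_llistxattr(), which holds the attribute names as NUL-terminated
strings laid end to end. The walk can be sketched in isolation as below.
This is only an illustration under stated assumptions: xattr_list_count()
and xattr_list_get() are hypothetical helpers that are not part of this
patch, and the buffer is a plain char array rather than the result of an
LKL call. The sketch bounds each scan with memchr() instead of using
strlen() as the loop in fs2tar.c does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): count the complete names
 * in a packed xattr list, i.e. NUL-terminated strings laid end to end in
 * buf[0..size). An unterminated tail is not counted.
 */
static size_t xattr_list_count(const char *buf, size_t size)
{
	const char *p, *end;
	size_t n = 0;

	assert(buf != NULL || size == 0);
	if (size == 0)
		return 0;

	p = buf;
	end = buf + size;
	while (p < end) {
		/* look for the terminator without reading past the buffer */
		const char *nul = memchr(p, '\0', (size_t)(end - p));

		if (!nul)
			break;
		n++;
		p = nul + 1;	/* step over the name and its terminator */
	}

	return n;
}

/*
 * Hypothetical helper: return the idx-th name in the list, or NULL when
 * idx is out of range.
 */
static const char *xattr_list_get(const char *buf, size_t size, size_t idx)
{
	const char *p, *end;

	assert(buf != NULL || size == 0);
	if (size == 0)
		return NULL;

	p = buf;
	end = buf + size;
	while (p < end) {
		const char *nul = memchr(p, '\0', (size_t)(end - p));

		if (!nul)
			return NULL;	/* unterminated tail */
		if (idx-- == 0)
			return p;
		p = nul + 1;
	}

	return NULL;
}
```

For a buffer holding "user.mime", "security.selinux" and "user.a", each
followed by its NUL, xattr_list_count() yields 3 and
xattr_list_get(buf, size, 1) points at "security.selinux".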

diff --git a/tools/lkl/fs2tar.c b/tools/lkl/fs2tar.c
new file mode 100644
index 000000000000..d2834afcce93
--- /dev/null
+++ b/tools/lkl/fs2tar.c
@@ -0,0 +1,410 @@
+// SPDX-License-Identifier: GPL-2.0
+#ifdef __FreeBSD__
+#include <sys/param.h>
+#endif
+
+#include <stdio.h>
+#include <time.h>
+#include <argp.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdlib.h>
+#include <libgen.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <archive.h>
+#include <archive_entry.h>
+#include <lkl.h>
+#include <lkl_host.h>
+
+char doc[] = "";
+char args_doc[] = "-t fstype fsimage_path tar_path";
+static struct argp_option options[] = {
+	{"enable-printk", 'p', 0, 0, "show Linux printks"},
+	{"partition", 'P', "int", 0, "partition number"},
+	{"filesystem-type", 't', "string", 0,
+	 "select filesystem type - mandatory"},
+	{"selinux-contexts", 's', "file", 0,
+	 "export selinux contexts to file"},
+	{0},
+};
+
+static struct cl_args {
+	int printk;
+	int part;
+	const char *fsimg_type;
+	const char *fsimg_path;
+	const char *tar_path;
+	FILE *selinux;
+} cla;
+
+static error_t parse_opt(int key, char *arg, struct argp_state *state)
+{
+	struct cl_args *cla = state->input;
+
+	switch (key) {
+	case 'p':
+		cla->printk = 1;
+		break;
+	case 'P':
+		cla->part = atoi(arg);
+		break;
+	case 't':
+		cla->fsimg_type = arg;
+		break;
+	case 's':
+		cla->selinux = fopen(arg, "w");
+		if (!cla->selinux) {
+			fprintf(stderr,
+				"failed to open selinux contexts file: %s\n",
+				strerror(errno));
+			return -1;
+		}
+		break;
+	case ARGP_KEY_ARG:
+		if (!cla->fsimg_path)
+			cla->fsimg_path = arg;
+		else if (!cla->tar_path)
+			cla->tar_path = arg;
+		else
+			return -1;
+		break;
+	case ARGP_KEY_END:
+		if (state->arg_num < 2 || !cla->fsimg_type)
+			argp_usage(state);
+	default:
+		return ARGP_ERR_UNKNOWN;
+	}
+
+	return 0;
+}
+
+static struct argp argp = { options, parse_opt, args_doc, doc };
+
+static struct archive *tar;
+
+static int searchdir(const char *fsimg_path, const char *path);
+
+static int copy_file(const char *fsimg_path, const char *path)
+{
+	long fsimg_fd;
+	char buff[4096];
+	long len, wrote;
+	int ret = 0;
+
+	fsimg_fd = lkl_sys_open(fsimg_path, LKL_O_RDONLY, 0);
+	if (fsimg_fd < 0) {
+		fprintf(stderr, "fsimg error opening %s: %s\n", fsimg_path,
+			lkl_strerror(fsimg_fd));
+		return fsimg_fd;
+	}
+
+	do {
+		len = lkl_sys_read(fsimg_fd, buff, sizeof(buff));
+		if (len > 0) {
+			wrote = archive_write_data(tar, buff, len);
+			if (wrote != len) {
+				ret = -archive_errno(tar);
+				fprintf(stderr,
+					"error writing file %s to archive: %s [%d %ld]\n",
+					path, archive_error_string(tar), ret,
+					len);
+				break;
+			}
+		}
+
+		if (len < 0) {
+			fprintf(stderr, "error reading fsimg file %s: %s\n",
+				fsimg_path, lkl_strerror(len));
+			ret = len;
+		}
+
+	} while (len > 0);
+
+	lkl_sys_close(fsimg_fd);
+
+	return ret;
+}
+
+static int add_link(const char *fsimg_path, const char *path,
+		    struct archive_entry *entry)
+{
+	char buf[4096] = { 0, };
+	long len;
+
+	len = lkl_sys_readlink(fsimg_path, buf, sizeof(buf));
+	if (len < 0) {
+		fprintf(stderr, "fsimg readlink error %s: %s\n",
+			fsimg_path, lkl_strerror(len));
+		return len;
+	}
+
+	archive_entry_set_symlink(entry, buf);
+
+	return 0;
+}
+
+static inline void fsimg_copy_stat(struct stat *st, struct lkl_stat *fst)
+{
+	st->st_dev = fst->st_dev;
+	st->st_ino = fst->st_ino;
+	st->st_mode = fst->st_mode;
+	st->st_nlink = fst->st_nlink;
+	st->st_uid = fst->st_uid;
+	st->st_gid = fst->st_gid;
+	st->st_rdev = fst->st_rdev;
+	st->st_size = fst->st_size;
+	st->st_blksize = fst->st_blksize;
+	st->st_blocks = fst->st_blocks;
+	st->st_atim.tv_sec = fst->lkl_st_atime;
+	st->st_atim.tv_nsec = fst->st_atime_nsec;
+	st->st_mtim.tv_sec = fst->lkl_st_mtime;
+	st->st_mtim.tv_nsec = fst->st_mtime_nsec;
+	st->st_ctim.tv_sec = fst->lkl_st_ctime;
+	st->st_ctim.tv_nsec = fst->st_ctime_nsec;
+}
+
+static int copy_xattr(const char *fsimg_path, const char *path,
+		      struct archive_entry *entry)
+{
+	long ret;
+	char *xattr_list, *i;
+	long xattr_list_size;
+
+	ret = lkl_sys_llistxattr(fsimg_path, NULL, 0);
+	if (ret < 0) {
+		fprintf(stderr, "fsimg llistxattr(%s) error: %s\n",
+			path, lkl_strerror(ret));
+		return ret;
+	}
+
+	if (!ret)
+		return 0;
+
+	xattr_list = malloc(ret);
+
+	ret = lkl_sys_llistxattr(fsimg_path, xattr_list, ret);
+	if (ret < 0) {
+		fprintf(stderr, "fsimg llistxattr(%s) error: %s\n", path,
+			lkl_strerror(ret));
+		free(xattr_list);
+		return ret;
+	}
+
+	xattr_list_size = ret;
+
+	for (i = xattr_list; i - xattr_list < xattr_list_size;
+	     i += strlen(i) + 1) {
+		void *xattr_buf;
+
+		ret = lkl_sys_lgetxattr(fsimg_path, i, NULL, 0);
+		if (ret < 0) {
+			fprintf(stderr, "fsimg lgetxattr(%s) error: %s\n", path,
+				lkl_strerror(ret));
+			free(xattr_list);
+			return ret;
+		}
+
+		xattr_buf = malloc(ret);
+
+		ret = lkl_sys_lgetxattr(fsimg_path, i, xattr_buf, ret);
+		if (ret < 0) {
+			fprintf(stderr, "fsimg lgetxattr2(%s) error: %s\n",
+				path, lkl_strerror(ret));
+			free(xattr_list);
+			free(xattr_buf);
+			return ret;
+		}
+
+		if (cla.selinux && strcmp(i, "security.selinux") == 0)
+			fprintf(cla.selinux, "%s %.*s\n", path, (int)ret,
+				(char *)xattr_buf);
+
+		/* entry is new, so there are no stale xattrs to clear */
+		archive_entry_xattr_add_entry(entry, i, xattr_buf, ret);
+
+		free(xattr_buf);
+	}
+
+	free(xattr_list);
+
+	return 0;
+}
+
+static int do_entry(const char *fsimg_path, const char *path,
+		    const struct lkl_linux_dirent64 *de)
+{
+	char fsimg_new_path[PATH_MAX], new_path[PATH_MAX];
+	struct lkl_stat fsimg_stat;
+	struct stat stat;
+	struct archive_entry *entry;
+	int ftype;
+	long ret;
+
+	snprintf(new_path, sizeof(new_path), "%s/%s", path, de->d_name);
+	snprintf(fsimg_new_path, sizeof(fsimg_new_path), "%s/%s", fsimg_path,
+		 de->d_name);
+
+	ret = lkl_sys_lstat(fsimg_new_path, &fsimg_stat);
+	if (ret) {
+		fprintf(stderr, "fsimg lstat(%s) error: %s\n",
+			path, lkl_strerror(ret));
+		return ret;
+	}
+
+	entry = archive_entry_new();
+
+	archive_entry_set_pathname(entry, new_path);
+	fsimg_copy_stat(&stat, &fsimg_stat);
+	archive_entry_copy_stat(entry, &stat);
+	ret = copy_xattr(fsimg_new_path, new_path, entry);
+	if (ret)
+		return ret;
+	/* TODO: ACLs */
+
+	ftype = stat.st_mode & S_IFMT;
+
+	switch (ftype) {
+	case S_IFREG:
+		archive_write_header(tar, entry);
+		ret = copy_file(fsimg_new_path, new_path);
+		break;
+	case S_IFDIR:
+		archive_write_header(tar, entry);
+		ret = searchdir(fsimg_new_path, new_path);
+		break;
+	case S_IFLNK:
+		ret = add_link(fsimg_new_path, new_path, entry);
+		/* fall through */
+	case S_IFSOCK:
+	case S_IFBLK:
+	case S_IFCHR:
+	case S_IFIFO:
+		if (ret)
+			break;
+		archive_write_header(tar, entry);
+		break;
+	default:
+		printf("skipping %s: unsupported entry type %d\n", new_path,
+		       ftype);
+	}
+
+	archive_entry_free(entry);
+
+	if (ret)
+		printf("error processing entry %s, aborting\n", new_path);
+
+	return ret;
+}
+
+static int searchdir(const char *fsimg_path, const char *path)
+{
+	long ret = 0, fd;
+	char buf[1024], *pos;
+	long buf_len;
+
+	fd = lkl_sys_open(fsimg_path, LKL_O_RDONLY | LKL_O_DIRECTORY, 0);
+	if (fd < 0) {
+		fprintf(stderr, "failed to open dir %s: %s\n", fsimg_path,
+			lkl_strerror(fd));
+		return fd;
+	}
+
+	do {
+		struct lkl_linux_dirent64 *de;
+
+		de = (struct lkl_linux_dirent64 *) buf;
+		buf_len = lkl_sys_getdents64(fd, de, sizeof(buf));
+		if (buf_len < 0) {
+			fprintf(stderr, "getdents64 error: %s\n",
+				lkl_strerror(buf_len));
+			break;
+		}
+
+		for (pos = buf; pos - buf < buf_len; pos += de->d_reclen) {
+			de = (struct lkl_linux_dirent64 *)pos;
+
+			if (!strcmp(de->d_name, ".") ||
+			    !strcmp(de->d_name, ".."))
+				continue;
+
+			ret = do_entry(fsimg_path, path, de);
+			if (ret)
+				goto out;
+		}
+
+	} while (buf_len > 0);
+
+out:
+	lkl_sys_close(fd);
+	return ret;
+}
+
+int main(int argc, char **argv)
+{
+	struct lkl_disk disk;
+	long ret;
+	char mpoint[32];
+	unsigned int disk_id;
+
+	if (argp_parse(&argp, argc, argv, 0, 0, &cla) < 0)
+		return -1;
+
+	if (!cla.printk)
+		lkl_host_ops.print = NULL;
+
+	disk.fd = open(cla.fsimg_path, O_RDONLY);
+	if (disk.fd < 0) {
+		fprintf(stderr, "can't open fsimg %s: %s\n", cla.fsimg_path,
+			strerror(errno));
+		ret = 1;
+		goto out;
+	}
+
+	disk.ops = NULL;
+
+	ret = lkl_disk_add(&disk);
+	if (ret < 0) {
+		fprintf(stderr, "can't add disk: %s\n", lkl_strerror(ret));
+		goto out_close;
+	}
+	disk_id = ret;
+
+	lkl_start_kernel(&lkl_host_ops, "mem=10M");
+
+	ret = lkl_mount_dev(disk_id, cla.part, cla.fsimg_type, LKL_MS_RDONLY,
+			    NULL, mpoint, sizeof(mpoint));
+	if (ret) {
+		fprintf(stderr, "can't mount disk: %s\n", lkl_strerror(ret));
+		goto out_close;
+	}
+
+	ret = lkl_sys_chdir(mpoint);
+	if (ret) {
+		fprintf(stderr, "can't chdir to %s: %s\n", mpoint,
+			lkl_strerror(ret));
+		goto out_umount;
+	}
+
+	tar = archive_write_new();
+	archive_write_set_format_pax_restricted(tar);
+	archive_write_open_filename(tar, cla.tar_path);
+
+	ret = searchdir(mpoint, "");
+
+	archive_write_free(tar);
+
+	if (cla.selinux)
+		fclose(cla.selinux);
+
+out_umount:
+	lkl_umount_dev(disk_id, cla.part, 0, 1000);
+
+out_close:
+	close(disk.fd);
+
+out:
+	lkl_sys_halt();
+
+	return ret;
+}
-- 
2.20.1 (Apple Git-117)
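
Note on copy_xattr() in the patch above: llistxattr() returns the attribute
names as one buffer of NUL-terminated strings laid end to end, which is why
the loop steps by strlen(i) + 1. A minimal sketch of that walk over an
in-memory buffer follows. It makes no LKL or libarchive calls, and
xattr_list_has() is a hypothetical helper name, not something the patch
provides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): return 1 if name appears in
 * a llistxattr()-style buffer of NUL-terminated names, 0 otherwise.
 */
int xattr_list_has(const char *list, size_t size, const char *name)
{
	const char *i = list;
	const char *end;

	assert(list && name);
	end = list + size;

	while (i < end) {
		/* stop if the last name is not NUL-terminated */
		const char *nul = memchr(i, '\0', (size_t)(end - i));

		if (!nul)
			break;
		if (strcmp(i, name) == 0)
			return 1;
		i = nul + 1;	/* same step as i += strlen(i) + 1 */
	}

	return 0;
}
```

The memchr() bound is an extra guard that the patch's loop does not have; it
only matters if the buffer does not end in a NUL byte.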



* [RFC PATCH 23/47] lkl tools: tool that reads/writes to/from a filesystem image
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (21 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 22/47] lkl tools: tool that converts a filesystem image to tar Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 24/47] lkl tools: virtio: add network device support Hajime Tazaki
                   ` (26 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um
  Cc: Conrad Meyer, Octavian Purdila, Akira Moroo, Petros Angelatos,
	Dan Peebles, Yuriy Taraday, Tuomas Tynkkynen, Hajime Tazaki

From: Octavian Purdila <tavi.purdila@gmail.com>

cptofs will be built with.

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Dan Peebles <pumpkin@me.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Tuomas Tynkkynen <tuomas.tynkkynen@iki.fi>
Signed-off-by: Yuriy Taraday <yorik.sar@gmail.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
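
Note on copy_file(): when write_dst() reports a short write, the next call
has to resume at the first unwritten byte, so the buffer pointer advances by
the number of bytes written rather than by the size of the preceding read.
A minimal host-side sketch of that loop follows. It uses plain write(2)
instead of lkl_sys_write(), and write_all() is a hypothetical helper name,
not something this patch provides.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): write exactly len bytes to
 * fd, resuming after short writes. Returns 0 on success, -errno on failure.
 */
int write_all(int fd, const char *buf, size_t len)
{
	assert(buf || !len);

	while (len > 0) {
		ssize_t wrote = write(fd, buf, len);

		if (wrote < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -errno;
		}

		/* advance by what was actually written */
		buf += wrote;
		len -= (size_t)wrote;
	}

	return 0;
}
```

A caller would invoke write_all(fd_dst, buf, len) after each successful read
and stop copying on a negative return.
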
 tools/lkl/cptofs.c | 635 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 635 insertions(+)
 create mode 100644 tools/lkl/cptofs.c

diff --git a/tools/lkl/cptofs.c b/tools/lkl/cptofs.c
new file mode 100644
index 000000000000..dd490435d5b7
--- /dev/null
+++ b/tools/lkl/cptofs.c
@@ -0,0 +1,635 @@
+// SPDX-License-Identifier: GPL-2.0
+#ifdef __FreeBSD__
+#include <sys/param.h>
+#endif
+
+#include <stdio.h>
+#include <time.h>
+#include <argp.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdlib.h>
+#include <libgen.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <fnmatch.h>
+#include <dirent.h>
+#include <lkl.h>
+#include <lkl_host.h>
+
+static const char doc_cptofs[] = "Copy files to a filesystem image";
+static const char doc_cpfromfs[] = "Copy files from a filesystem image";
+static const char args_doc_cptofs[] = "-t fstype -i fsimage path... fs_path";
+static const char args_doc_cpfromfs[] = "-t fstype -i fsimage fs_path... path";
+
+static struct argp_option options[] = {
+	{"enable-printk", 'p', 0, 0, "show Linux printks"},
+	{"partition", 'P', "int", 0, "partition number"},
+	{"filesystem-type", 't', "string", 0,
+	 "select filesystem type - mandatory"},
+	{"filesystem-image", 'i', "string", 0,
+	 "path to the filesystem image - mandatory"},
+	{"selinux", 's', "string", 0, "selinux attributes for destination"},
+	{0},
+};
+
+static struct cl_args {
+	int printk;
+	int part;
+	const char *fsimg_type;
+	const char *fsimg_path;
+	int npaths;
+	char **paths;
+	const char *selinux;
+} cla;
+
+static int cptofs;
+
+static error_t parse_opt(int key, char *arg, struct argp_state *state)
+{
+	struct cl_args *cla = state->input;
+
+	switch (key) {
+	case 'p':
+		cla->printk = 1;
+		break;
+	case 'P':
+		cla->part = atoi(arg);
+		break;
+	case 't':
+		cla->fsimg_type = arg;
+		break;
+	case 'i':
+		cla->fsimg_path = arg;
+		break;
+	case 's':
+		cla->selinux = arg;
+		break;
+	case ARGP_KEY_ARG:
+		// Capture all remaining arguments in our paths array and stop
+		// parsing here. We treat the last one as the destination and
+		// everything before it as sources, just like cp does.
+		cla->paths = &state->argv[state->next - 1];
+		cla->npaths = state->argc - state->next + 1;
+		state->next = state->argc;
+		break;
+	default:
+		return ARGP_ERR_UNKNOWN;
+	}
+
+	return 0;
+}
+
+static struct argp argp_cptofs = {
+	.options = options,
+	.parser = parse_opt,
+	.args_doc = args_doc_cptofs,
+	.doc = doc_cptofs,
+};
+
+static struct argp argp_cpfromfs = {
+	.options = options,
+	.parser = parse_opt,
+	.args_doc = args_doc_cpfromfs,
+	.doc = doc_cpfromfs,
+};
+
+static int searchdir(const char *fs_path, const char *path, const char *match);
+
+static int open_src(const char *path)
+{
+	int fd;
+
+	if (cptofs)
+		fd = open(path, O_RDONLY, 0);
+	else
+		fd = lkl_sys_open(path, LKL_O_RDONLY, 0);
+
+	if (fd < 0)
+		fprintf(stderr, "unable to open file %s for reading: %s\n",
+			path, cptofs ? strerror(errno) : lkl_strerror(fd));
+
+	return fd;
+}
+
+static int open_dst(const char *path, int mode)
+{
+	int fd;
+
+	if (cptofs)
+		fd = lkl_sys_open(path, LKL_O_RDWR | LKL_O_TRUNC | LKL_O_CREAT,
+				  mode);
+	else
+		fd = open(path, O_RDWR | O_TRUNC | O_CREAT, mode);
+
+	if (fd < 0)
+		fprintf(stderr, "unable to open file %s for writing: %s\n",
+			path, cptofs ? lkl_strerror(fd) : strerror(errno));
+
+	if (fd >= 0 && cla.selinux && cptofs) {
+		int ret = lkl_sys_fsetxattr(fd, "security.selinux", cla.selinux,
+					    strlen(cla.selinux), 0);
+		if (ret)
+			fprintf(stderr,
+				"unable to set selinux attribute on %s: %s\n",
+				path, lkl_strerror(ret));
+	}
+
+	return fd;
+}
+
+static int read_src(int fd, char *buf, int len)
+{
+	int ret;
+
+	if (cptofs)
+		ret = read(fd, buf, len);
+	else
+		ret = lkl_sys_read(fd, buf, len);
+
+	if (ret < 0)
+		fprintf(stderr, "error reading file: %s\n",
+			cptofs ? strerror(errno) : lkl_strerror(ret));
+
+	return ret;
+}
+
+static int write_dst(int fd, char *buf, int len)
+{
+	int ret;
+
+	if (cptofs)
+		ret = lkl_sys_write(fd, buf, len);
+	else
+		ret = write(fd, buf, len);
+
+	if (ret < 0)
+		fprintf(stderr, "error writing file: %s\n",
+			cptofs ? lkl_strerror(ret) : strerror(errno));
+
+	return ret;
+}
+
+static void close_src(int fd)
+{
+	if (cptofs)
+		close(fd);
+	else
+		lkl_sys_close(fd);
+}
+
+static void close_dst(int fd)
+{
+	if (cptofs)
+		lkl_sys_close(fd);
+	else
+		close(fd);
+}
+
+static int copy_file(const char *src, const char *dst, int mode)
+{
+	long len, to_write, wrote;
+	char buf[4096], *ptr;
+	int ret = 0;
+	int fd_src, fd_dst;
+
+	fd_src = open_src(src);
+	if (fd_src < 0)
+		return fd_src;
+
+	fd_dst = open_dst(dst, mode);
+	if (fd_dst < 0)
+		return fd_dst;
+
+	do {
+		len = read_src(fd_src, buf, sizeof(buf));
+
+		if (len > 0) {
+			ptr = buf;
+			to_write = len;
+			do {
+				wrote = write_dst(fd_dst, ptr, to_write);
+
+				if (wrote < 0) {
+					ret = wrote;
+					goto out;
+				}
+
+				to_write -= wrote;
+				ptr += wrote;
+
+			} while (to_write > 0);
+		}
+
+		if (len < 0)
+			ret = len;
+
+	} while (len > 0);
+
+out:
+	close_src(fd_src);
+	close_dst(fd_dst);
+
+	return ret;
+}
+
+static int stat_src(const char *path, unsigned int *type, unsigned int *mode,
+		    long long *size, struct lkl_timespec *mtime,
+		    struct lkl_timespec *atime)
+{
+	struct stat stat;
+	struct lkl_stat lkl_stat;
+	int ret;
+
+	if (cptofs) {
+		ret = lstat(path, &stat);
+		if (type)
+			*type = stat.st_mode & S_IFMT;
+		if (mode)
+			*mode = stat.st_mode & ~S_IFMT;
+		if (size)
+			*size = stat.st_size;
+		if (mtime) {
+			mtime->tv_sec = stat.st_mtim.tv_sec;
+			mtime->tv_nsec = stat.st_mtim.tv_nsec;
+		}
+		if (atime) {
+			atime->tv_sec = stat.st_atim.tv_sec;
+			atime->tv_nsec = stat.st_atim.tv_nsec;
+		}
+	} else {
+		ret = lkl_sys_lstat(path, &lkl_stat);
+		if (type)
+			*type = lkl_stat.st_mode & S_IFMT;
+		if (mode)
+			*mode = lkl_stat.st_mode & ~S_IFMT;
+		if (size)
+			*size = lkl_stat.st_size;
+		if (mtime) {
+			mtime->tv_sec = lkl_stat.lkl_st_mtime;
+			mtime->tv_nsec = lkl_stat.st_mtime_nsec;
+		}
+		if (atime) {
+			atime->tv_sec = lkl_stat.lkl_st_atime;
+			atime->tv_nsec = lkl_stat.st_atime_nsec;
+		}
+	}
+
+	if (ret)
+		fprintf(stderr, "fsimg lstat(%s) error: %s\n",
+			path, cptofs ? strerror(errno) : lkl_strerror(ret));
+
+	return ret;
+}
+
+static int mkdir_dst(const char *path, unsigned int mode)
+{
+	int ret;
+
+	if (cptofs) {
+		ret = lkl_sys_mkdir(path, mode);
+		if (ret == -LKL_EEXIST)
+			ret = 0;
+	} else {
+		ret = mkdir(path, mode);
+		if (ret < 0 && errno == EEXIST)
+			ret = 0;
+	}
+
+	if (ret)
+		fprintf(stderr, "unable to create directory %s: %s\n",
+			path, cptofs ? lkl_strerror(ret) : strerror(errno));
+
+	return ret;
+}
+
+static int readlink_src(const char *src, char *out, int outsize)
+{
+	int ret;
+
+	if (cptofs)
+		ret = readlink(src, out, outsize);
+	else
+		ret = lkl_sys_readlink(src, out, outsize);
+
+	if (ret < 0)
+		fprintf(stderr, "unable to readlink '%s': %s\n", src,
+			cptofs ? strerror(errno) : lkl_strerror(ret));
+
+	return ret;
+}
+
+static int symlink_dst(const char *path, const char *target)
+{
+	int ret;
+
+	if (cptofs)
+		ret = lkl_sys_symlink(target, path);
+	else
+		ret = symlink(target, path);
+
+	if (ret)
+		fprintf(stderr, "unable to symlink '%s' with target '%s': %s\n",
+			path, target, cptofs ? lkl_strerror(ret) :
+			strerror(errno));
+
+	return ret;
+}
+
+static int copy_symlink(const char *src, const char *dst)
+{
+	int ret;
+	long long size, actual_size;
+	char *target = NULL;
+
+	ret = stat_src(src, NULL, NULL, &size, NULL, NULL);
+	if (ret) {
+		ret = -1;
+		goto out;
+	}
+
+	target = malloc(size + 1);
+	if (!target) {
+		fprintf(stderr, "Unable to allocate memory (%lld bytes)\n",
+			size + 1);
+		ret = -1;
+		goto out;
+	}
+
+	actual_size = readlink_src(src, target, size);
+	if (actual_size != size) {
+		fprintf(stderr,
+			"readlink(%s) bad size: got %lld, expected %lld\n",
+			src, actual_size, size);
+		ret = -1;
+		goto out;
+	}
+	target[size] = 0; // readlink doesn't append the trailing null byte
+
+	ret = symlink_dst(dst, target);
+	if (ret)
+		ret = -1;
+
+out:
+	if (target)
+		free(target);
+
+	return ret;
+}
+
+static int do_entry(const char *_src, const char *_dst, const char *name)
+{
+	char src[PATH_MAX], dst[PATH_MAX];
+	struct lkl_timespec mtime, atime;
+	unsigned int type, mode;
+	int ret;
+
+	snprintf(src, sizeof(src), "%s/%s", _src, name);
+	snprintf(dst, sizeof(dst), "%s/%s", _dst, name);
+
+	ret = stat_src(src, &type, &mode, NULL, &mtime, &atime);
+
+	switch (type) {
+	case S_IFREG:
+	{
+		ret = copy_file(src, dst, mode);
+		break;
+	}
+	case S_IFDIR:
+		ret = mkdir_dst(dst, mode);
+		if (ret)
+			break;
+		ret = searchdir(src, dst, NULL);
+		break;
+	case S_IFLNK:
+		ret = copy_symlink(src, dst);
+		break;
+	case S_IFSOCK:
+	case S_IFBLK:
+	case S_IFCHR:
+	case S_IFIFO:
+	default:
+		printf("skipping %s: unsupported entry type %d\n", src, type);
+	}
+
+	if (!ret) {
+		if (cptofs) {
+			struct lkl_timespec lkl_ts[] = { atime, mtime };
+
+			ret = lkl_sys_utimensat(-1, dst,
+						(struct __lkl__kernel_timespec
+						 *)lkl_ts,
+						LKL_AT_SYMLINK_NOFOLLOW);
+		} else {
+			struct timespec ts[] = {
+				{ .tv_sec = atime.tv_sec,
+				  .tv_nsec = atime.tv_nsec, },
+				{ .tv_sec = mtime.tv_sec,
+				  .tv_nsec = mtime.tv_nsec, },
+			};
+
+			ret = utimensat(-1, dst, ts, AT_SYMLINK_NOFOLLOW);
+		}
+	}
+
+	if (ret)
+		printf("error processing entry %s, aborting\n", src);
+
+	return ret;
+}
+
+static DIR *open_dir(const char *path)
+{
+	DIR *dir;
+	int err;
+
+	if (cptofs)
+		dir = opendir(path);
+	else
+		dir = (DIR *)lkl_opendir(path, &err);
+
+	if (!dir)
+		fprintf(stderr, "unable to open directory %s: %s\n",
+			path, cptofs ? strerror(errno) : lkl_strerror(err));
+	return dir;
+}
+
+static const char *read_dir(DIR *dir, const char *path)
+{
+	struct lkl_dir *lkl_dir = (struct lkl_dir *)dir;
+	const char *name = NULL;
+	const char *err = NULL;
+
+	if (cptofs) {
+		struct dirent *de = readdir(dir);
+
+		if (de)
+			name = de->d_name;
+	} else {
+		struct lkl_linux_dirent64 *de = lkl_readdir(lkl_dir);
+
+		if (de)
+			name = de->d_name;
+	}
+
+	if (!name) {
+		if (cptofs) {
+			if (errno)
+				err = strerror(errno);
+		} else {
+			if (lkl_errdir(lkl_dir))
+				err = lkl_strerror(lkl_errdir(lkl_dir));
+		}
+	}
+
+	if (err)
+		fprintf(stderr, "error while reading directory %s: %s\n",
+			path, err);
+	return name;
+}
+
+static void close_dir(DIR *dir)
+{
+	if (cptofs)
+		closedir(dir);
+	else
+		lkl_closedir((struct lkl_dir *)dir);
+}
+
+static int searchdir(const char *src, const char *dst, const char *match)
+{
+	DIR *dir;
+	const char *name;
+	int ret = 0;
+
+	dir = open_dir(src);
+	if (!dir)
+		return -1;
+
+	while ((name = read_dir(dir, src))) {
+		if (!strcmp(name, ".") || !strcmp(name, "..") ||
+		    (match && fnmatch(match, name, 0) != 0))
+			continue;
+
+		ret = do_entry(src, dst, name);
+		if (ret)
+			goto out;
+	}
+
+out:
+	close_dir(dir);
+
+	return ret;
+}
+
+static int match_root(const char *src)
+{
+	const char *c = src;
+
+	while (*c) {
+		switch (*c) {
+		case '.':
+			if (c > src && c[-1] == '.')
+				return 0;
+			break;
+		case '/':
+			break;
+		default:
+			return 0;
+		}
+		c++;
+	}
+
+	return 1;
+}
+
+int copy_one(const char *src, const char *mpoint, const char *dst)
+{
+	char *src_path_dir, *src_path_base;
+	char src_path[PATH_MAX], dst_path[PATH_MAX];
+
+	if (cptofs) {
+		snprintf(src_path, sizeof(src_path),  "%s", src);
+		snprintf(dst_path, sizeof(dst_path),  "%s/%s", mpoint, dst);
+	} else {
+		snprintf(src_path, sizeof(src_path),  "%s/%s", mpoint, src);
+		snprintf(dst_path, sizeof(dst_path),  "%s", dst);
+	}
+
+	if (match_root(src))
+		return searchdir(src_path, dst_path, NULL);
+
+	src_path_dir = dirname(strdup(src_path));
+	src_path_base = basename(strdup(src_path));
+
+	return searchdir(src_path_dir, dst_path, src_path_base);
+}
+
+int main(int argc, char **argv)
+{
+	struct lkl_disk disk;
+	long ret, umount_ret;
+	int i;
+	char mpoint[32];
+	unsigned int disk_id;
+
+	if (strstr(argv[0], "cptofs")) {
+		cptofs = 1;
+		ret = argp_parse(&argp_cptofs, argc, argv, 0, 0, &cla);
+	} else {
+		ret = argp_parse(&argp_cpfromfs, argc, argv, 0, 0, &cla);
+	}
+
+	if (ret < 0)
+		return -1;
+
+	if (!cla.printk)
+		lkl_host_ops.print = NULL;
+
+	disk.fd = open(cla.fsimg_path, cptofs ? O_RDWR : O_RDONLY);
+	if (disk.fd < 0) {
+		fprintf(stderr, "can't open fsimg %s: %s\n", cla.fsimg_path,
+			strerror(errno));
+		ret = 1;
+		goto out;
+	}
+
+	disk.ops = NULL;
+
+	ret = lkl_disk_add(&disk);
+	if (ret < 0) {
+		fprintf(stderr, "can't add disk: %s\n", lkl_strerror(ret));
+		goto out_close;
+	}
+	disk_id = ret;
+
+	lkl_start_kernel(&lkl_host_ops, "mem=100M");
+
+	ret = lkl_mount_dev(disk_id, cla.part, cla.fsimg_type,
+			    cptofs ? 0 : LKL_MS_RDONLY,
+			    NULL, mpoint, sizeof(mpoint));
+	if (ret) {
+		fprintf(stderr, "can't mount disk: %s\n", lkl_strerror(ret));
+		goto out_close;
+	}
+
+	lkl_sys_umask(0);
+
+	for (i = 0; i < cla.npaths - 1; i++) {
+		ret = copy_one(cla.paths[i], mpoint, cla.paths[cla.npaths - 1]);
+		if (ret)
+			break;
+	}
+
+	umount_ret = lkl_umount_dev(disk_id, cla.part, 0, 1000);
+	if (ret == 0)
+		ret = umount_ret;
+
+out_close:
+	close(disk.fd);
+
+out:
+	lkl_sys_halt();
+
+	return ret;
+}
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 24/47] lkl tools: virtio: add network device support
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (22 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 23/47] lkl tools: tool that reads/writes to/from a filesystem image Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 25/47] lkl: add support for Windows hosts Hajime Tazaki
                   ` (25 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um
  Cc: H . K . Jerry Chu, Xiao Jia, Octavian Purdila, Motomu Utsumi,
	Akira Moroo, Yuan Liu, Thomas Liebetraut, Patrick Collins,
	David Disseldorp, Hajime Tazaki

From: Octavian Purdila <tavi.purdila@gmail.com>

This also adds various virtio_net backends to be used as network devices.

Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Thomas Liebetraut <thomas@tommie-lie.de>
Signed-off-by: Xiao Jia <xiaoj@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
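
Note on lkl_inet_pton() (the __MINGW32__ fallback in lib/net.c):
getaddrinfo() returns ai_addr as a pointer to the socket address, so it is
cast directly rather than through its own address. A minimal sketch of the
same conversion follows. It assumes POSIX headers instead of the
<ws2tcpip.h> used in the patch, and numeric_pton() is a hypothetical name,
not part of the LKL API.

```c
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

/*
 * Hypothetical stand-in for the patch's lkl_inet_pton(): returns 1 and
 * fills dst on success, 0 if src is not a numeric address of family af.
 */
int numeric_pton(int af, const char *src, void *dst)
{
	struct addrinfo hint, *res = NULL;
	int ok = 1;

	assert(src && dst);

	memset(&hint, 0, sizeof(hint));
	hint.ai_family = af;
	hint.ai_flags = AI_NUMERICHOST;	/* parse only, no name lookup */

	if (getaddrinfo(src, NULL, &hint, &res))
		return 0;

	switch (af) {
	case AF_INET:
		/* ai_addr already points at the sockaddr */
		*(struct in_addr *)dst =
			((struct sockaddr_in *)res->ai_addr)->sin_addr;
		break;
	case AF_INET6:
		*(struct in6_addr *)dst =
			((struct sockaddr_in6 *)res->ai_addr)->sin6_addr;
		break;
	default:
		ok = 0;
	}

	freeaddrinfo(res);
	return ok;
}
```

The return convention (1 on success, 0 on failure) mirrors the function in
the patch.
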
 tools/lkl/lib/net.c                 | 818 ++++++++++++++++++++++++++++
 tools/lkl/lib/virtio_net.c          | 321 +++++++++++
 tools/lkl/lib/virtio_net_dpdk.c     | 480 ++++++++++++++++
 tools/lkl/lib/virtio_net_fd.c       | 217 ++++++++
 tools/lkl/lib/virtio_net_fd.h       |  28 +
 tools/lkl/lib/virtio_net_macvtap.c  |  32 ++
 tools/lkl/lib/virtio_net_pipe.c     |  76 +++
 tools/lkl/lib/virtio_net_raw.c      |  94 ++++
 tools/lkl/lib/virtio_net_tap.c      | 111 ++++
 tools/lkl/lib/virtio_net_vde.c      | 168 ++++++
 tools/lkl/scripts/dpdk-sdk-build.sh |  18 +
 tools/lkl/tests/net-setup.sh        | 134 +++++
 tools/lkl/tests/net-test.c          | 317 +++++++++++
 tools/lkl/tests/net.sh              | 186 +++++++
 14 files changed, 3000 insertions(+)
 create mode 100644 tools/lkl/lib/net.c
 create mode 100644 tools/lkl/lib/virtio_net.c
 create mode 100644 tools/lkl/lib/virtio_net_dpdk.c
 create mode 100644 tools/lkl/lib/virtio_net_fd.c
 create mode 100644 tools/lkl/lib/virtio_net_fd.h
 create mode 100644 tools/lkl/lib/virtio_net_macvtap.c
 create mode 100644 tools/lkl/lib/virtio_net_pipe.c
 create mode 100644 tools/lkl/lib/virtio_net_raw.c
 create mode 100644 tools/lkl/lib/virtio_net_tap.c
 create mode 100644 tools/lkl/lib/virtio_net_vde.c
 create mode 100755 tools/lkl/scripts/dpdk-sdk-build.sh
 create mode 100644 tools/lkl/tests/net-setup.sh
 create mode 100644 tools/lkl/tests/net-test.c
 create mode 100755 tools/lkl/tests/net.sh

diff --git a/tools/lkl/lib/net.c b/tools/lkl/lib/net.c
new file mode 100644
index 000000000000..316965ffd21e
--- /dev/null
+++ b/tools/lkl/lib/net.c
@@ -0,0 +1,818 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <string.h>
+#include <stdio.h>
+#include "endian.h"
+#include <lkl_host.h>
+
+#ifdef __MINGW32__
+#include <ws2tcpip.h>
+
+int lkl_inet_pton(int af, const char *src, void *dst)
+{
+	struct addrinfo hint, *res = NULL;
+	int err;
+
+	memset(&hint, 0, sizeof(struct addrinfo));
+
+	hint.ai_family = af;
+	hint.ai_flags = AI_NUMERICHOST;
+
+	err = getaddrinfo(src, NULL, &hint, &res);
+	if (err)
+		return 0;
+
+	switch (af) {
+	case AF_INET:
+		*(struct in_addr *)dst =
+			((struct sockaddr_in *)res->ai_addr)->sin_addr;
+		break;
+	case AF_INET6:
+		*(struct in6_addr *)dst =
+			((struct sockaddr_in6 *)res->ai_addr)->sin6_addr;
+		break;
+	default:
+		freeaddrinfo(res);
+		return 0;
+	}
+
+	freeaddrinfo(res);
+	return 1;
+}
+#endif
+
+static inline void set_sockaddr(struct lkl_sockaddr_in *sin, unsigned int addr,
+				unsigned short port)
+{
+	sin->sin_family = LKL_AF_INET;
+	sin->sin_addr.lkl_s_addr = addr;
+	sin->sin_port = port;
+}
+
+static inline int ifindex_to_name(int sock, struct lkl_ifreq *ifr, int ifindex)
+{
+	ifr->lkl_ifr_ifindex = ifindex;
+	return lkl_sys_ioctl(sock, LKL_SIOCGIFNAME, (long)ifr);
+}
+
+int lkl_ifname_to_ifindex(const char *name)
+{
+	struct lkl_ifreq ifr;
+	int fd, ret;
+
+	fd = lkl_sys_socket(LKL_AF_INET, LKL_SOCK_DGRAM, 0);
+	if (fd < 0)
+		return fd;
+
+	snprintf(ifr.lkl_ifr_name, sizeof(ifr.lkl_ifr_name), "%s", name);
+
+	ret = lkl_sys_ioctl(fd, LKL_SIOCGIFINDEX, (long)&ifr);
+	lkl_sys_close(fd);
+	if (ret < 0)
+		return ret;
+	return ifr.lkl_ifr_ifindex;
+}
+
+int lkl_if_up(int ifindex)
+{
+	struct lkl_ifreq ifr;
+	int err, sock = lkl_sys_socket(LKL_AF_INET, LKL_SOCK_DGRAM, 0);
+
+	if (sock < 0)
+		return sock;
+	err = ifindex_to_name(sock, &ifr, ifindex);
+	if (err < 0)
+		return err;
+
+	err = lkl_sys_ioctl(sock, LKL_SIOCGIFFLAGS, (long)&ifr);
+	if (!err) {
+		ifr.lkl_ifr_flags |= LKL_IFF_UP;
+		err = lkl_sys_ioctl(sock, LKL_SIOCSIFFLAGS, (long)&ifr);
+	}
+
+	lkl_sys_close(sock);
+
+	return err;
+}
+
+int lkl_if_down(int ifindex)
+{
+	struct lkl_ifreq ifr;
+	int err, sock;
+
+	sock = lkl_sys_socket(LKL_AF_INET, LKL_SOCK_DGRAM, 0);
+	if (sock < 0)
+		return sock;
+
+	err = ifindex_to_name(sock, &ifr, ifindex);
+	if (err < 0)
+		return err;
+
+	err = lkl_sys_ioctl(sock, LKL_SIOCGIFFLAGS, (long)&ifr);
+	if (!err) {
+		ifr.lkl_ifr_flags &= ~LKL_IFF_UP;
+		err = lkl_sys_ioctl(sock, LKL_SIOCSIFFLAGS, (long)&ifr);
+	}
+
+	lkl_sys_close(sock);
+
+	return err;
+}
+
+int lkl_if_set_mtu(int ifindex, int mtu)
+{
+	struct lkl_ifreq ifr;
+	int err, sock;
+
+	sock = lkl_sys_socket(LKL_AF_INET, LKL_SOCK_DGRAM, 0);
+	if (sock < 0)
+		return sock;
+
+	err = ifindex_to_name(sock, &ifr, ifindex);
+	if (err < 0)
+		return err;
+
+	ifr.lkl_ifr_mtu = mtu;
+
+	err = lkl_sys_ioctl(sock, LKL_SIOCSIFMTU, (long)&ifr);
+
+	lkl_sys_close(sock);
+
+	return err;
+}
+
+int lkl_if_set_ipv4(int ifindex, unsigned int addr, unsigned int netmask_len)
+{
+	return lkl_if_add_ip(ifindex, LKL_AF_INET, &addr, netmask_len);
+}
+
+int lkl_if_set_ipv4_gateway(int ifindex, unsigned int src_addr,
+		unsigned int src_masklen, unsigned int via_addr)
+{
+	int err;
+
+	err = lkl_if_add_rule_from_saddr(ifindex, LKL_AF_INET, &src_addr);
+	if (err)
+		return err;
+	err = lkl_if_add_linklocal(ifindex, LKL_AF_INET,
+					&src_addr, src_masklen);
+	if (err)
+		return err;
+	return lkl_if_add_gateway(ifindex, LKL_AF_INET, &via_addr);
+}
+
+int lkl_set_ipv4_gateway(unsigned int addr)
+{
+	return lkl_add_gateway(LKL_AF_INET, &addr);
+}
+
+int lkl_netdev_get_ifindex(int id)
+{
+	struct lkl_ifreq ifr;
+	int sock, ret;
+
+	sock = lkl_sys_socket(LKL_AF_INET, LKL_SOCK_DGRAM, 0);
+	if (sock < 0)
+		return sock;
+
+	snprintf(ifr.lkl_ifr_name, sizeof(ifr.lkl_ifr_name), "eth%d", id);
+	ret = lkl_sys_ioctl(sock, LKL_SIOCGIFINDEX, (long)&ifr);
+	lkl_sys_close(sock);
+
+	return ret < 0 ? ret : ifr.lkl_ifr_ifindex;
+}
+
+static int netlink_sock(unsigned int groups)
+{
+	struct lkl_sockaddr_nl la;
+	int fd, err;
+
+	fd = lkl_sys_socket(LKL_AF_NETLINK, LKL_SOCK_DGRAM, LKL_NETLINK_ROUTE);
+	if (fd < 0)
+		return fd;
+
+	memset(&la, 0, sizeof(la));
+	la.nl_family = LKL_AF_NETLINK;
+	la.nl_groups = groups;
+	err = lkl_sys_bind(fd, (struct lkl_sockaddr *)&la, sizeof(la));
+	if (err < 0)
+		return err;
+
+	return fd;
+}
+
+static int parse_rtattr(struct lkl_rtattr *tb[], int max,
+			struct lkl_rtattr *rta, int len)
+{
+	unsigned short type;
+
+	memset(tb, 0, sizeof(struct lkl_rtattr *) * (max + 1));
+	while (LKL_RTA_OK(rta, len)) {
+		type = rta->rta_type;
+		if ((type <= max) && (!tb[type]))
+			tb[type] = rta;
+		rta = LKL_RTA_NEXT(rta, len);
+	}
+	if (len)
+		lkl_printf("!!!Deficit %d, rta_len=%d\n", len,
+			rta->rta_len);
+	return 0;
+}
+
+struct addr_filter {
+	unsigned int ifindex;
+	void *addr;
+};
+
+static unsigned int get_ifa_flags(struct lkl_ifaddrmsg *ifa,
+				  struct lkl_rtattr *ifa_flags_attr)
+{
+	return ifa_flags_attr ? *(unsigned int *)LKL_RTA_DATA(ifa_flags_attr) :
+				ifa->ifa_flags;
+}
+
+/* returns:
+ * 0 - dad succeed.
+ * -1 - dad failed or other error.
+ * 1 - should wait for new msg.
+ */
+static int check_ipv6_dad(struct lkl_sockaddr_nl *nladdr,
+			  struct lkl_nlmsghdr *n, void *arg)
+{
+	struct addr_filter *filter = arg;
+	struct lkl_ifaddrmsg *ifa = LKL_NLMSG_DATA(n);
+	struct lkl_rtattr *rta_tb[LKL_IFA_MAX+1];
+	unsigned int ifa_flags;
+	int len = n->nlmsg_len;
+
+	if (n->nlmsg_type != LKL_RTM_NEWADDR)
+		return 1;
+
+	len -= LKL_NLMSG_LENGTH(sizeof(*ifa));
+	if (len < 0) {
+		lkl_printf("BUG: wrong nlmsg len %d\n", len);
+		return -1;
+	}
+
+	parse_rtattr(rta_tb, LKL_IFA_MAX, LKL_IFA_RTA(ifa),
+		     n->nlmsg_len - LKL_NLMSG_LENGTH(sizeof(*ifa)));
+
+	ifa_flags = get_ifa_flags(ifa, rta_tb[LKL_IFA_FLAGS]);
+
+	if (ifa->ifa_index != filter->ifindex)
+		return 1;
+	if (ifa->ifa_family != LKL_AF_INET6)
+		return 1;
+
+	if (!rta_tb[LKL_IFA_LOCAL])
+		rta_tb[LKL_IFA_LOCAL] = rta_tb[LKL_IFA_ADDRESS];
+
+	if (!rta_tb[LKL_IFA_LOCAL] ||
+	    (filter->addr && memcmp(LKL_RTA_DATA(rta_tb[LKL_IFA_LOCAL]),
+				    filter->addr, 16))) {
+		return 1;
+	}
+	if (ifa_flags & LKL_IFA_F_DADFAILED) {
+		lkl_printf("IPV6 DAD failed.\n");
+		return -1;
+	}
+	if (!(ifa_flags & LKL_IFA_F_TENTATIVE))
+		return 0;
+	return 1;
+}
+
+/* Copied from iproute2/lib/ */
+static int rtnl_listen(int fd, int (*handler)(struct lkl_sockaddr_nl *nladdr,
+					      struct lkl_nlmsghdr *, void *),
+		       void *arg)
+{
+	int status;
+	struct lkl_nlmsghdr *h;
+	struct lkl_sockaddr_nl nladdr = { .nl_family = LKL_AF_NETLINK };
+	struct lkl_iovec iov;
+	struct lkl_user_msghdr msg = {
+		.msg_name = &nladdr,
+		.msg_namelen = sizeof(nladdr),
+		.msg_iov = &iov,
+		.msg_iovlen = 1,
+	};
+	char   buf[16384];
+
+	iov.iov_base = buf;
+	while (1) {
+		iov.iov_len = sizeof(buf);
+		status = lkl_sys_recvmsg(fd, &msg, 0);
+
+		if (status < 0) {
+			if (status == -LKL_EINTR || status == -LKL_EAGAIN)
+				continue;
+			lkl_printf("netlink receive error %s (%d)\n",
+				lkl_strerror(status), status);
+			if (status == -LKL_ENOBUFS)
+				continue;
+			return status;
+		}
+		if (status == 0) {
+			lkl_printf("EOF on netlink\n");
+			return -1;
+		}
+		if (msg.msg_namelen != sizeof(nladdr)) {
+			lkl_printf("Sender address length == %d\n",
+				msg.msg_namelen);
+			return -1;
+		}
+
+		for (h = (struct lkl_nlmsghdr *)buf;
+		     (unsigned int)status >= sizeof(*h);) {
+			int err;
+			int len = h->nlmsg_len;
+			int l = len - sizeof(*h);
+
+			if (l < 0 || len > status) {
+				if (msg.msg_flags & LKL_MSG_TRUNC) {
+					lkl_printf("Truncated message\n");
+					return -1;
+				}
+				lkl_printf("!!!malformed message: len=%d\n",
+					len);
+				return -1;
+			}
+
+			err = handler(&nladdr, h, arg);
+			if (err <= 0)
+				return err;
+
+			status -= LKL_NLMSG_ALIGN(len);
+			h = (struct lkl_nlmsghdr *)((char *)h +
+						    LKL_NLMSG_ALIGN(len));
+		}
+		if (msg.msg_flags & LKL_MSG_TRUNC) {
+			lkl_printf("Message truncated\n");
+			continue;
+		}
+		if (status) {
+			lkl_printf("!!!Remnant of size %d\n", status);
+			return -1;
+		}
+	}
+}
+
+int lkl_if_wait_ipv6_dad(int ifindex, void *addr)
+{
+	struct addr_filter filter = {.ifindex = ifindex, .addr = addr};
+	int fd, ret;
+	struct {
+		struct lkl_nlmsghdr		nlmsg_info;
+		struct lkl_ifaddrmsg	ifaddrmsg_info;
+	} req;
+
+	fd = netlink_sock(1 << (LKL_RTNLGRP_IPV6_IFADDR - 1));
+	if (fd < 0)
+		return fd;
+
+	memset(&req, 0, sizeof(req));
+	req.nlmsg_info.nlmsg_len =
+			LKL_NLMSG_LENGTH(sizeof(struct lkl_ifaddrmsg));
+	req.nlmsg_info.nlmsg_flags = LKL_NLM_F_REQUEST | LKL_NLM_F_DUMP;
+	req.nlmsg_info.nlmsg_type = LKL_RTM_GETADDR;
+	req.ifaddrmsg_info.ifa_family = LKL_AF_INET6;
+	req.ifaddrmsg_info.ifa_index = ifindex;
+	ret = lkl_sys_send(fd, &req, req.nlmsg_info.nlmsg_len, 0);
+	if (ret < 0)
+		lkl_perror("lkl_sys_send", ret);
+	else
+		ret = rtnl_listen(fd, check_ipv6_dad, (void *)&filter);
+
+	lkl_sys_close(fd);
+	return ret;
+}
+
+int lkl_if_set_ipv6(int ifindex, void *addr, unsigned int netprefix_len)
+{
+	int err = lkl_if_add_ip(ifindex, LKL_AF_INET6, addr, netprefix_len);
+
+	if (err)
+		return err;
+	return lkl_if_wait_ipv6_dad(ifindex, addr);
+}
+
+int lkl_if_set_ipv6_gateway(int ifindex, void *src_addr,
+		unsigned int src_masklen, void *via_addr)
+{
+	int err;
+
+	err = lkl_if_add_rule_from_saddr(ifindex, LKL_AF_INET6, src_addr);
+	if (err)
+		return err;
+	err = lkl_if_add_linklocal(ifindex, LKL_AF_INET6,
+					src_addr, src_masklen);
+	if (err)
+		return err;
+	return lkl_if_add_gateway(ifindex, LKL_AF_INET6, via_addr);
+}
+
+int lkl_set_ipv6_gateway(void *addr)
+{
+	return lkl_add_gateway(LKL_AF_INET6, addr);
+}
+
+/* returns:
+ * 0 - success.
+ * < 0 - error number.
+ * 1 - should wait for new msg.
+ */
+static int check_error(struct lkl_sockaddr_nl *nladdr, struct lkl_nlmsghdr *n,
+		       void *arg)
+{
+	unsigned int s = *(unsigned int *)arg;
+
+	if (nladdr->nl_pid != 0 || n->nlmsg_seq != s) {
+		/* Don't forget to skip that message. */
+		return 1;
+	}
+
+	if (n->nlmsg_type == LKL_NLMSG_ERROR) {
+		struct lkl_nlmsgerr *err =
+			(struct lkl_nlmsgerr *)LKL_NLMSG_DATA(n);
+		int l = n->nlmsg_len - sizeof(*n);
+
+		if (l < (int)sizeof(struct lkl_nlmsgerr))
+			lkl_printf("ERROR truncated\n");
+		else if (!err->error)
+			return 0;
+
+		lkl_printf("RTNETLINK answers: %s\n",
+			lkl_strerror(-err->error));
+		return err->error;
+	}
+	lkl_printf("Unexpected reply!!!\n");
+	return -1;
+}
+
+static unsigned int seq;
+static int rtnl_talk(int fd, struct lkl_nlmsghdr *n)
+{
+	int status;
+	struct lkl_sockaddr_nl nladdr = {.nl_family = LKL_AF_NETLINK};
+	struct lkl_iovec iov = {.iov_base = (void *)n, .iov_len = n->nlmsg_len};
+	struct lkl_user_msghdr msg = {
+			.msg_name = &nladdr,
+			.msg_namelen = sizeof(nladdr),
+			.msg_iov = &iov,
+			.msg_iovlen = 1,
+	};
+
+	n->nlmsg_seq = seq;
+	n->nlmsg_flags |= LKL_NLM_F_ACK;
+
+	status = lkl_sys_sendmsg(fd, &msg, 0);
+	if (status < 0) {
+		lkl_perror("Cannot talk to rtnetlink", status);
+		return status;
+	}
+
+	status = rtnl_listen(fd, check_error, (void *)&seq);
+	seq++;
+	return status;
+}
+
+static int addattr_l(struct lkl_nlmsghdr *n, unsigned int maxlen,
+		     int type, const void *data, int alen)
+{
+	int len = LKL_RTA_LENGTH(alen);
+	struct lkl_rtattr *rta;
+
+	if (LKL_NLMSG_ALIGN(n->nlmsg_len) + LKL_RTA_ALIGN(len) > maxlen) {
+		lkl_printf("%s ERROR: message exceeded bound of %d\n", __func__,
+			   maxlen);
+		return -1;
+	}
+	rta = ((struct lkl_rtattr *) (((void *) (n)) +
+				      LKL_NLMSG_ALIGN(n->nlmsg_len)));
+	rta->rta_type = type;
+	rta->rta_len = len;
+	memcpy(LKL_RTA_DATA(rta), data, alen);
+	n->nlmsg_len = LKL_NLMSG_ALIGN(n->nlmsg_len) + LKL_RTA_ALIGN(len);
+	return 0;
+}
+
+int lkl_add_neighbor(int ifindex, int af, void *ip, void *mac)
+{
+	struct {
+		struct lkl_nlmsghdr n;
+		struct lkl_ndmsg r;
+		char buf[1024];
+	} req = {
+		.n.nlmsg_len = LKL_NLMSG_LENGTH(sizeof(struct lkl_ndmsg)),
+		.n.nlmsg_type = LKL_RTM_NEWNEIGH,
+		.n.nlmsg_flags = LKL_NLM_F_REQUEST |
+				 LKL_NLM_F_CREATE | LKL_NLM_F_REPLACE,
+		.r.ndm_family = af,
+		.r.ndm_ifindex = ifindex,
+		.r.ndm_state = LKL_NUD_PERMANENT,
+
+	};
+	int err, addr_sz;
+	int fd;
+
+	if (af == LKL_AF_INET)
+		addr_sz = 4;
+	else if (af == LKL_AF_INET6)
+		addr_sz = 16;
+	else {
+		lkl_printf("Bad address family: %d\n", af);
+		return -1;
+	}
+
+	fd = netlink_sock(0);
+	if (fd < 0)
+		return fd;
+
+	/* create the IP attribute */
+	addattr_l(&req.n, sizeof(req), LKL_NDA_DST, ip, addr_sz);
+
+	/* create the MAC attribute */
+	addattr_l(&req.n, sizeof(req), LKL_NDA_LLADDR, mac, 6);
+
+	err = rtnl_talk(fd, &req.n);
+	lkl_sys_close(fd);
+	return err;
+}
+
+static int ipaddr_modify(int cmd, int flags, int ifindex, int af, void *addr,
+			 unsigned int netprefix_len)
+{
+	struct {
+		struct lkl_nlmsghdr n;
+		struct lkl_ifaddrmsg ifa;
+		char buf[256];
+	} req = {
+		.n.nlmsg_len = LKL_NLMSG_LENGTH(sizeof(struct lkl_ifaddrmsg)),
+		.n.nlmsg_flags = LKL_NLM_F_REQUEST | flags,
+		.n.nlmsg_type = cmd,
+		.ifa.ifa_family = af,
+		.ifa.ifa_prefixlen = netprefix_len,
+		.ifa.ifa_index = ifindex,
+	};
+	int err, addr_sz;
+	int fd;
+
+	if (af == LKL_AF_INET)
+		addr_sz = 4;
+	else if (af == LKL_AF_INET6)
+		addr_sz = 16;
+	else {
+		lkl_printf("Bad address family: %d\n", af);
+		return -1;
+	}
+
+	fd = netlink_sock(0);
+	if (fd < 0)
+		return fd;
+
+	/* create the IP attribute */
+	addattr_l(&req.n, sizeof(req), LKL_IFA_LOCAL, addr, addr_sz);
+
+	err = rtnl_talk(fd, &req.n);
+
+	lkl_sys_close(fd);
+	return err;
+}
+
+int lkl_if_add_ip(int ifindex, int af, void *addr, unsigned int netprefix_len)
+{
+	return ipaddr_modify(LKL_RTM_NEWADDR, LKL_NLM_F_CREATE | LKL_NLM_F_EXCL,
+			     ifindex, af, addr, netprefix_len);
+}
+
+int lkl_if_del_ip(int ifindex, int af, void *addr, unsigned int netprefix_len)
+{
+	return ipaddr_modify(LKL_RTM_DELADDR, 0, ifindex, af,
+			     addr, netprefix_len);
+}
+
+static int iproute_modify(int cmd, unsigned int flags, int ifindex, int af,
+		void *route_addr, int route_masklen, void *gwaddr)
+{
+	struct {
+		struct lkl_nlmsghdr	n;
+		struct lkl_rtmsg	r;
+		char			buf[1024];
+	} req = {
+		.n.nlmsg_len = LKL_NLMSG_LENGTH(sizeof(struct lkl_rtmsg)),
+		.n.nlmsg_flags = LKL_NLM_F_REQUEST | flags,
+		.n.nlmsg_type = cmd,
+		.r.rtm_family = af,
+		.r.rtm_table = LKL_RT_TABLE_MAIN,
+		.r.rtm_scope = LKL_RT_SCOPE_UNIVERSE,
+	};
+	int err, addr_sz;
+	int i, fd;
+
+	if (af == LKL_AF_INET)
+		addr_sz = 4;
+	else if (af == LKL_AF_INET6)
+		addr_sz = 16;
+	else {
+		lkl_printf("Bad address family: %d\n", af);
+		return -1;
+	}
+
+	fd = netlink_sock(0);
+	if (fd < 0) {
+		lkl_printf("netlink_sock error: %d\n", fd);
+		return fd;
+	}
+
+	if (cmd != LKL_RTM_DELROUTE) {
+		req.r.rtm_protocol = LKL_RTPROT_BOOT;
+		req.r.rtm_scope = LKL_RT_SCOPE_UNIVERSE;
+		req.r.rtm_type = LKL_RTN_UNICAST;
+	}
+
+	if (gwaddr)
+		addattr_l(&req.n, sizeof(req),
+				LKL_RTA_GATEWAY, gwaddr, addr_sz);
+
+	if (af == LKL_AF_INET && route_addr) {
+		unsigned int netaddr = *(unsigned int *)route_addr;
+
+		netaddr = ntohl(netaddr);
+		if (route_masklen < 32)
+			netaddr &= ~(0xffffffffU >> route_masklen);
+		netaddr = htonl(netaddr);
+		*(unsigned int *)route_addr = netaddr;
+		req.r.rtm_dst_len = route_masklen;
+		addattr_l(&req.n, sizeof(req), LKL_RTA_DST,
+					route_addr, addr_sz);
+	}
+
+	if (af == LKL_AF_INET6 && route_addr) {
+		struct lkl_in6_addr netaddr =
+			*(struct lkl_in6_addr *)route_addr;
+		int rmbyte = (128 - route_masklen) / 8;
+		int rmbit = (128 - route_masklen) % 8;
+
+		/* clear the host part: the low (128 - route_masklen) bits */
+		for (i = 0; i < rmbyte; i++)
+			netaddr.in6_u.u6_addr8[15 - i] = 0;
+		if (rmbit)
+			netaddr.in6_u.u6_addr8[15 - rmbyte] &=
+				(unsigned char)(0xff << rmbit);
+		*(struct lkl_in6_addr *)route_addr = netaddr;
+		req.r.rtm_dst_len = route_masklen;
+		addattr_l(&req.n, sizeof(req), LKL_RTA_DST,
+					route_addr, addr_sz);
+	}
+	if (ifindex != LKL_RT_TABLE_MAIN) {
+		if (af == LKL_AF_INET)
+			req.r.rtm_table = ifindex * 2;
+		else if (af == LKL_AF_INET6)
+			req.r.rtm_table = ifindex * 2 + 1;
+		addattr_l(&req.n, sizeof(req), LKL_RTA_OIF,
+			  &ifindex, sizeof(ifindex));
+	}
+	err = rtnl_talk(fd, &req.n);
+	lkl_sys_close(fd);
+	return err;
+}
+
+int lkl_if_add_linklocal(int ifindex, int af,  void *addr, int netprefix_len)
+{
+	return iproute_modify(LKL_RTM_NEWROUTE, LKL_NLM_F_CREATE|LKL_NLM_F_EXCL,
+			ifindex, af, addr, netprefix_len, NULL);
+}
+
+int lkl_if_add_gateway(int ifindex, int af, void *gwaddr)
+{
+	return iproute_modify(LKL_RTM_NEWROUTE, LKL_NLM_F_CREATE|LKL_NLM_F_EXCL,
+			ifindex, af, NULL, 0, gwaddr);
+}
+
+int lkl_add_gateway(int af, void *gwaddr)
+{
+	return iproute_modify(LKL_RTM_NEWROUTE, LKL_NLM_F_CREATE|LKL_NLM_F_EXCL,
+			LKL_RT_TABLE_MAIN, af, NULL, 0, gwaddr);
+}
+
+static int iprule_modify(int cmd, int ifindex, int af, void *saddr)
+{
+	struct {
+		struct lkl_nlmsghdr	n;
+		struct lkl_rtmsg		r;
+		char			buf[1024];
+	} req = {
+		.n.nlmsg_type = cmd,
+		.n.nlmsg_len = LKL_NLMSG_LENGTH(sizeof(struct lkl_rtmsg)),
+		.n.nlmsg_flags = LKL_NLM_F_REQUEST,
+		.r.rtm_protocol = LKL_RTPROT_BOOT,
+		.r.rtm_scope = LKL_RT_SCOPE_UNIVERSE,
+		.r.rtm_family = af,
+		.r.rtm_type = LKL_RTN_UNSPEC,
+	};
+	int fd, err;
+	int addr_sz;
+
+	if (af == LKL_AF_INET)
+		addr_sz = 4;
+	else if (af == LKL_AF_INET6)
+		addr_sz = 16;
+	else {
+		lkl_printf("Bad address family: %d\n", af);
+		return -1;
+	}
+
+	fd = netlink_sock(0);
+	if (fd < 0)
+		return fd;
+
+	if (cmd == LKL_RTM_NEWRULE) {
+		req.n.nlmsg_flags |= LKL_NLM_F_CREATE|LKL_NLM_F_EXCL;
+		req.r.rtm_type = LKL_RTN_UNICAST;
+	}
+
+	/* set the "from" address */
+	req.r.rtm_src_len = 8 * addr_sz;
+	addattr_l(&req.n, sizeof(req), LKL_FRA_SRC, saddr, addr_sz);
+
+	/* use ifindex as table id */
+	if (af == LKL_AF_INET)
+		req.r.rtm_table = ifindex * 2;
+	else if (af == LKL_AF_INET6)
+		req.r.rtm_table = ifindex * 2 + 1;
+	err = rtnl_talk(fd, &req.n);
+
+	lkl_sys_close(fd);
+	return err;
+}
+
+int lkl_if_add_rule_from_saddr(int ifindex, int af, void *saddr)
+{
+	return iprule_modify(LKL_RTM_NEWRULE, ifindex, af, saddr);
+}
+
+static int qdisc_add(int cmd, int flags, int ifindex,
+		     const char *root, const char *type)
+{
+	struct {
+		struct lkl_nlmsghdr n;
+		struct lkl_tcmsg tc;
+		char buf[2*1024];
+	} req = {
+		.n.nlmsg_len = LKL_NLMSG_LENGTH(sizeof(struct lkl_tcmsg)),
+		.n.nlmsg_flags = LKL_NLM_F_REQUEST|flags,
+		.n.nlmsg_type = cmd,
+		.tc.tcm_family = LKL_AF_UNSPEC,
+	};
+	int err, fd;
+
+	if (!root || !type) {
+		lkl_printf("root and type arguments are required\n");
+		return -1;
+	}
+
+	if (strcmp(root, "root") == 0)
+		req.tc.tcm_parent = LKL_TC_H_ROOT;
+	req.tc.tcm_ifindex = ifindex;
+
+	fd = netlink_sock(0);
+	if (fd < 0)
+		return fd;
+
+	/* create the qdisc attribute */
+	addattr_l(&req.n, sizeof(req), LKL_TCA_KIND, type, strlen(type)+1);
+
+	err = rtnl_talk(fd, &req.n);
+	lkl_sys_close(fd);
+	return err;
+}
+
+int lkl_qdisc_add(int ifindex, const char *root, const char *type)
+{
+	return qdisc_add(LKL_RTM_NEWQDISC, LKL_NLM_F_CREATE | LKL_NLM_F_EXCL,
+			 ifindex, root, type);
+}
+
+/* Add a qdisc entry for an interface in the form of
+ * "root|type;root|type;..."
+ */
+void lkl_qdisc_parse_add(int ifindex, const char *entries)
+{
+	char *saveptr = NULL, *saveptr2 = NULL, *token = NULL;
+	char *root = NULL, *type = NULL;
+	char strings[256] = { 0 };
+	int ret = 0;
+
+	strncpy(strings, entries, sizeof(strings) - 1);
+
+	for (token = strtok_r(strings, ";", &saveptr); token;
+	     token = strtok_r(NULL, ";", &saveptr)) {
+		root = strtok_r(token, "|", &saveptr2);
+		type = strtok_r(NULL, "|", &saveptr2);
+		ret = lkl_qdisc_add(ifindex, root, type);
+		if (ret) {
+			lkl_printf("Failed to add qdisc entry: %s\n",
+				   lkl_strerror(ret));
+			return;
+		}
+	}
+}
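
The "root|type;root|type;..." format accepted by lkl_qdisc_parse_add() can
be illustrated in isolation.  The following is only a sketch, not part of the
patch: parse_qdisc_entries(), qdisc_cb and record_entry() are hypothetical
names, the callback stands in for lkl_qdisc_add(), and the 64-byte output
buffer is an assumption of the example.  It uses one strtok_r() save pointer
per nesting level and rejects input that does not fit the local copy.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type; it stands in for lkl_qdisc_add(). */
typedef int (*qdisc_cb)(int ifindex, const char *root, const char *type,
			void *ctx);

/*
 * Split "root|type;root|type;..." and call cb once per entry.
 * Returns the number of entries handled, or -1 on bad input.
 */
static int parse_qdisc_entries(int ifindex, const char *entries,
			       qdisc_cb cb, void *ctx)
{
	char copy[256];
	char *outer = NULL, *inner = NULL;
	char *token;
	int n = 0;

	/* Reject input that would not fit instead of overflowing. */
	if (!entries || strlen(entries) >= sizeof(copy))
		return -1;
	memcpy(copy, entries, strlen(entries) + 1);

	for (token = strtok_r(copy, ";", &outer); token;
	     token = strtok_r(NULL, ";", &outer)) {
		/* A second save pointer keeps the two tokenizers apart. */
		char *root = strtok_r(token, "|", &inner);
		char *type = strtok_r(NULL, "|", &inner);

		if (!root || !type)
			return -1;
		if (cb(ifindex, root, type, ctx))
			return -1;
		n++;
	}
	return n;
}

/* Example callback: append "root:type," to a 64-byte text buffer. */
static int record_entry(int ifindex, const char *root, const char *type,
			void *ctx)
{
	char *out = ctx;
	size_t used = strlen(out);

	(void)ifindex;
	assert(used < 64);
	snprintf(out + used, 64 - used, "%s:%s,", root, type);
	return 0;
}
```

Passing "root|fq;root|pfifo" with record_entry() visits two entries in order;
an entry without a "|type" part is reported as an error rather than being
forwarded with a NULL type.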
diff --git a/tools/lkl/lib/virtio_net.c b/tools/lkl/lib/virtio_net.c
new file mode 100644
index 000000000000..60743109215b
--- /dev/null
+++ b/tools/lkl/lib/virtio_net.c
@@ -0,0 +1,321 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <string.h>
+#include <lkl_host.h>
+#include "virtio.h"
+#include "endian.h"
+
+#include <lkl/linux/virtio_net.h>
+
+#define netdev_of(x) (container_of(x, struct virtio_net_dev, dev))
+#define BIT(x) (1ULL << (x))
+
+/* We always have 2 queues on a netdev: one for tx, one for rx. */
+#define RX_QUEUE_IDX 0
+#define TX_QUEUE_IDX 1
+
+#define NUM_QUEUES (TX_QUEUE_IDX + 1)
+#define QUEUE_DEPTH 128
+
+/* In fact, we'll hit the limit on the devs string below long before
+ * we hit this, but it's good enough for now.
+ */
+#define MAX_NET_DEVS 16
+
+#ifdef DEBUG
+#define bad_request(s) do {			\
+		lkl_printf("%s\n", s);		\
+		panic();			\
+	} while (0)
+#else
+#define bad_request(s) lkl_printf("virtio_net: %s\n", s)
+#endif /* DEBUG */
+
+struct virtio_net_dev {
+	struct virtio_dev dev;
+	struct lkl_virtio_net_config config;
+	struct lkl_netdev *nd;
+	struct lkl_mutex **queue_locks;
+	lkl_thread_t poll_tid;
+};
+
+static int net_check_features(struct virtio_dev *dev)
+{
+	if (dev->driver_features == dev->device_features)
+		return 0;
+
+	return -LKL_EINVAL;
+}
+
+static void net_acquire_queue(struct virtio_dev *dev, int queue_idx)
+{
+	lkl_host_ops.mutex_lock(netdev_of(dev)->queue_locks[queue_idx]);
+}
+
+static void net_release_queue(struct virtio_dev *dev, int queue_idx)
+{
+	lkl_host_ops.mutex_unlock(netdev_of(dev)->queue_locks[queue_idx]);
+}
+
+/*
+ * The buffers passed through "req" from the virtio_net driver always start
+ * with a vnet_hdr. We need to check whether the backend device expects a
+ * vnet_hdr and adjust the buffer offset accordingly.
+ */
+static int net_enqueue(struct virtio_dev *dev, int q, struct virtio_req *req)
+{
+	struct lkl_virtio_net_hdr_v1 *header;
+	struct virtio_net_dev *net_dev;
+	struct iovec *iov;
+	int ret;
+
+	header = req->buf[0].iov_base;
+	net_dev = netdev_of(dev);
+	/*
+	 * If the backend device does not expect a vnet_hdr, adjust buf
+	 * accordingly.  We adjust req->buf in place so it can be used directly
+	 * for the tx/rx call, but must remember to undo the change after the
+	 * call.  Note that it's ok to pass an iov entry with len == 0; the
+	 * backend's tx/rx handler will skip to the next entry correctly.
+	 */
+	if (!net_dev->nd->has_vnet_hdr) {
+		req->buf[0].iov_base += sizeof(*header);
+		req->buf[0].iov_len -= sizeof(*header);
+	}
+	iov = req->buf;
+
+	/* Pick which virtqueue to send the buffer(s) to */
+	if (q == TX_QUEUE_IDX) {
+		ret = net_dev->nd->ops->tx(net_dev->nd, iov, req->buf_count);
+		if (ret < 0)
+			return -1;
+	} else if (q == RX_QUEUE_IDX) {
+		int i, len;
+
+		ret = net_dev->nd->ops->rx(net_dev->nd, iov, req->buf_count);
+		if (ret < 0)
+			return -1;
+		if (net_dev->nd->has_vnet_hdr) {
+			/*
+			 * If the number of bytes returned exactly matches the
+			 * total space in the iov then there is a good chance we
+			 * did not supply a large enough buffer for the whole
+			 * pkt, i.e., pkt has been truncated.  This is only
+			 * likely to happen under mergeable RX buffer mode.
+			 */
+			if (req->total_len == (unsigned int)ret)
+				lkl_printf("PKT is likely truncated! len=%d\n",
+				    ret);
+		} else {
+			header->flags = 0;
+			header->gso_type = LKL_VIRTIO_NET_HDR_GSO_NONE;
+		}
+		/*
+		 * Have to compute how many descriptors we've consumed (really
+		 * only matters to the mergeable RX mode) and return it
+		 * through "num_buffers".
+		 */
+		for (i = 0, len = ret; len > 0; i++)
+			len -= req->buf[i].iov_len;
+		header->num_buffers = i;
+
+		if (dev->device_features & BIT(LKL_VIRTIO_NET_F_GUEST_CSUM))
+			header->flags |= LKL_VIRTIO_NET_HDR_F_DATA_VALID;
+	} else {
+		bad_request("tried to push on non-existent queue");
+		return -1;
+	}
+	if (!net_dev->nd->has_vnet_hdr) {
+		/* Undo the adjustment */
+		req->buf[0].iov_base -= sizeof(*header);
+		req->buf[0].iov_len += sizeof(*header);
+		ret += sizeof(struct lkl_virtio_net_hdr_v1);
+	}
+	virtio_req_complete(req, ret);
+	return 0;
+}
+
+static struct virtio_dev_ops net_ops = {
+	.check_features = net_check_features,
+	.enqueue = net_enqueue,
+	.acquire_queue = net_acquire_queue,
+	.release_queue = net_release_queue,
+};
+
+void poll_thread(void *arg)
+{
+	struct virtio_net_dev *dev = arg;
+
+	/* Synchronization is handled in virtio_process_queue */
+	do {
+		int ret = dev->nd->ops->poll(dev->nd);
+
+		if (ret < 0) {
+			lkl_printf("virtio net poll error: %d\n", ret);
+			continue;
+		}
+		if (ret & LKL_DEV_NET_POLL_HUP)
+			break;
+		if (ret & LKL_DEV_NET_POLL_RX)
+			virtio_process_queue(&dev->dev, RX_QUEUE_IDX);
+		if (ret & LKL_DEV_NET_POLL_TX)
+			virtio_process_queue(&dev->dev, TX_QUEUE_IDX);
+	} while (1);
+}
+
+struct virtio_net_dev *registered_devs[MAX_NET_DEVS];
+static int registered_dev_idx;
+
+static int dev_register(struct virtio_net_dev *dev)
+{
+	if (registered_dev_idx == MAX_NET_DEVS) {
+		lkl_printf("Too many virtio_net devices!\n");
+		/* This error code is a little bit of a lie */
+		return -LKL_ENOMEM;
+	} else {
+		/* registered_dev_idx is incremented by the caller */
+		registered_devs[registered_dev_idx] = dev;
+		return 0;
+	}
+}
+
+static void free_queue_locks(struct lkl_mutex **queues, int num_queues)
+{
+	int i = 0;
+
+	if (!queues)
+		return;
+
+	for (i = 0; i < num_queues; i++)
+		lkl_host_ops.mutex_free(queues[i]);
+
+	lkl_host_ops.mem_free(queues);
+}
+
+static struct lkl_mutex **init_queue_locks(int num_queues)
+{
+	int i;
+	struct lkl_mutex **ret = lkl_host_ops.mem_alloc(
+		sizeof(struct lkl_mutex *) * num_queues);
+	if (!ret)
+		return NULL;
+
+	memset(ret, 0, sizeof(struct lkl_mutex *) * num_queues);
+	for (i = 0; i < num_queues; i++) {
+		ret[i] = lkl_host_ops.mutex_alloc(1);
+		if (!ret[i]) {
+			free_queue_locks(ret, i);
+			return NULL;
+		}
+	}
+
+	return ret;
+}
+
+int lkl_netdev_add(struct lkl_netdev *nd, struct lkl_netdev_args *args)
+{
+	struct virtio_net_dev *dev;
+	int ret = -LKL_ENOMEM;
+
+	dev = lkl_host_ops.mem_alloc(sizeof(*dev));
+	if (!dev)
+		return -LKL_ENOMEM;
+
+	memset(dev, 0, sizeof(*dev));
+
+	dev->dev.device_id = LKL_VIRTIO_ID_NET;
+	if (args) {
+		if (args->mac) {
+			dev->dev.device_features |= BIT(LKL_VIRTIO_NET_F_MAC);
+			memcpy(dev->config.mac, args->mac, LKL_ETH_ALEN);
+		}
+		dev->dev.device_features |= args->offload;
+
+	}
+	dev->dev.config_data = &dev->config;
+	dev->dev.config_len = sizeof(dev->config);
+	dev->dev.ops = &net_ops;
+	dev->nd = nd;
+	dev->queue_locks = init_queue_locks(NUM_QUEUES);
+
+	if (!dev->queue_locks)
+		goto out_free;
+
+	/*
+	 * MUST match the number of queue locks we initialized. We could init
+	 * the queues in virtio_dev_setup to help enforce this, but netdevs are
+	 * the only flavor that need these locks, so it's better to do it
+	 * here.
+	 */
+	ret = virtio_dev_setup(&dev->dev, NUM_QUEUES, QUEUE_DEPTH);
+
+	if (ret)
+		goto out_free;
+
+	/*
+	 * We may receive TSO packets of up to 64KB, so collect as many
+	 * descriptors as are available, up to 64KB in total length.
+	 */
+	if (dev->dev.device_features & BIT(LKL_VIRTIO_NET_F_MRG_RXBUF))
+		virtio_set_queue_max_merge_len(&dev->dev, RX_QUEUE_IDX, 65536);
+
+	dev->poll_tid = lkl_host_ops.thread_create(poll_thread, dev);
+	ret = dev->poll_tid ? 0 : -LKL_ENOMEM;
+	if (ret)
+		goto out_cleanup_dev;
+	ret = dev_register(dev);
+	if (ret < 0)
+		goto out_cleanup_dev;
+
+	return registered_dev_idx++;
+
+out_cleanup_dev:
+	virtio_dev_cleanup(&dev->dev);
+
+out_free:
+	if (dev->queue_locks)
+		free_queue_locks(dev->queue_locks, NUM_QUEUES);
+	lkl_host_ops.mem_free(dev);
+
+	return ret;
+}
+
+/* Stop the poll thread, bring the interface down and free the device. */
+void lkl_netdev_remove(int id)
+{
+	struct virtio_net_dev *dev;
+	int ret;
+
+	if (id < 0 || id >= registered_dev_idx) {
+		lkl_printf("%s: invalid id: %d\n", __func__, id);
+		return;
+	}
+
+	dev = registered_devs[id];
+
+	dev->nd->ops->poll_hup(dev->nd);
+	lkl_host_ops.thread_join(dev->poll_tid);
+
+	ret = lkl_netdev_get_ifindex(id);
+	if (ret < 0) {
+		lkl_printf("%s: failed to get ifindex for id %d: %s\n",
+			   __func__, id, lkl_strerror(ret));
+		return;
+	}
+
+	ret = lkl_if_down(ret);
+	if (ret < 0) {
+		lkl_printf("%s: failed to put interface id %d down: %s\n",
+			   __func__, id, lkl_strerror(ret));
+		return;
+	}
+
+	virtio_dev_cleanup(&dev->dev);
+
+	free_queue_locks(dev->queue_locks, NUM_QUEUES);
+	lkl_host_ops.mem_free(dev);
+}
+
+void lkl_netdev_free(struct lkl_netdev *nd)
+{
+	nd->ops->free(nd);
+}
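
net_enqueue() above reports through "num_buffers" how many descriptors a
received packet consumed.  As a rough standalone illustration of that count
(not part of the patch; buffers_used() is a hypothetical helper), the sketch
below walks an iovec in order and, unlike the loop in net_enqueue(), also
stops at the end of the array and reports a packet that does not fit.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/*
 * Return how many iovec entries a packet of `len` bytes occupies when the
 * entries are filled in order, 0 for an empty packet, or -1 if the packet
 * does not fit into the `cnt` entries.
 */
static int buffers_used(const struct iovec *iov, int cnt, size_t len)
{
	int i;

	assert(iov && cnt >= 0);
	for (i = 0; i < cnt && len > 0; i++) {
		/* The packet ends inside (or exactly at the end of) entry i. */
		if (len <= iov[i].iov_len)
			return i + 1;
		len -= iov[i].iov_len;
	}
	return len ? -1 : 0;
}
```

With two 64-byte entries, a 100-byte packet uses both entries and a 64-byte
packet uses only the first one.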
diff --git a/tools/lkl/lib/virtio_net_dpdk.c b/tools/lkl/lib/virtio_net_dpdk.c
new file mode 100644
index 000000000000..9512769554a5
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_dpdk.c
@@ -0,0 +1,480 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Intel DPDK based virtual network interface feature for LKL
+ * Copyright (c) 2015,2016 Ryo Nakamura, Hajime Tazaki
+ *
+ * Author: Ryo Nakamura <upa@wide.ad.jp>
+ *         Hajime Tazaki <thehajime@gmail.com>
+ */
+
+//#define DEBUG
+
+#include <stdio.h>
+#include <string.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/queue.h>
+
+#include <rte_eal.h>
+#include <rte_ethdev.h>
+#include <rte_mempool.h>
+#include <rte_net.h>
+
+#include <lkl_host.h>
+
+static char *ealargs[4] = {
+	"lkl_vif_dpdk",
+	"-c 1",
+	"-n 1",
+	"--log-level=0",
+};
+
+#define MAX_PKT_BURST           16
+/* XXX: disable the cache because the mempool cache is not thread-safe. */
+#define MEMPOOL_CACHE_SZ        0
+/* for TSO pkt */
+#define MAX_PACKET_SZ           (65535 \
+	- (sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM))
+#define MBUF_NUM                (512*2) /* vmxnet3 requires 1024 */
+#define MBUF_SIZ        \
+	(MAX_PACKET_SZ + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
+#define NUMDESC         512	/* nb_min on vmxnet3 is 512 */
+#define NUMQUEUE        1
+
+#define BIT(x) (1ULL << (x))
+
+static int portid;
+
+struct lkl_netdev_dpdk {
+	struct lkl_netdev dev;
+	int portid;
+	struct rte_mempool *rxpool, *txpool; /* ring buffer pool */
+	/* burst receive context by rump dpdk code */
+	struct rte_mbuf *rcv_mbuf[MAX_PKT_BURST];
+	int npkts;
+	int bufidx;
+	int offload;
+	unsigned int close: 1;
+	unsigned int busy_poll: 1;
+};
+
+static int dpdk_net_tx_prep(struct rte_mbuf *rm,
+		struct lkl_virtio_net_hdr_v1 *header)
+{
+	struct rte_net_hdr_lens hdr_lens;
+	uint32_t ptype;
+
+#ifdef DEBUG
+	lkl_printf("dpdk-tx: gso_type=%d, gso=%d, hdrlen=%d validation=%d\n",
+		header->gso_type, header->gso_size, header->hdr_len,
+		rte_validate_tx_offload(rm));
+#endif
+
+	ptype = rte_net_get_ptype(rm, &hdr_lens, RTE_PTYPE_ALL_MASK);
+	rm->l2_len = hdr_lens.l2_len;
+	rm->l3_len = hdr_lens.l3_len;
+	rm->l4_len = hdr_lens.l4_len; /* including TCP options */
+
+	if ((ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP) {
+		if ((ptype & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_IPV4)
+			rm->ol_flags = PKT_TX_IPV4;
+		else if ((ptype & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_IPV6)
+			rm->ol_flags = PKT_TX_IPV6;
+
+		rm->ol_flags |= PKT_TX_TCP_CKSUM;
+		rm->tso_segsz = header->gso_size;
+		/* TSO case */
+		if (header->gso_type == LKL_VIRTIO_NET_HDR_GSO_TCPV4)
+			rm->ol_flags |= (PKT_TX_TCP_SEG | PKT_TX_IP_CKSUM);
+		else if (header->gso_type == LKL_VIRTIO_NET_HDR_GSO_TCPV6)
+			rm->ol_flags |= PKT_TX_TCP_SEG;
+	}
+
+	return sizeof(struct lkl_virtio_net_hdr_v1);
+
+}
+
+static int dpdk_net_tx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
+{
+	void *pkt;
+	struct rte_mbuf *rm;
+	struct lkl_netdev_dpdk *nd_dpdk;
+	struct lkl_virtio_net_hdr_v1 *header = NULL;
+	int i, len, sent = 0;
+	void *data = NULL;
+
+	nd_dpdk = (struct lkl_netdev_dpdk *) nd;
+
+	/*
+	 * XXX: DPDK's mempool cache is reportedly not thread-safe (e.g.,
+	 * http://www.dpdk.io/ml/archives/dev/2014-February/001401.html), so
+	 * rte_pktmbuf_alloc() may not be thread-safe here either; the cache is
+	 * tentatively disabled by setting MEMPOOL_CACHE_SZ to 0.
+	 */
+	rm = rte_pktmbuf_alloc(nd_dpdk->txpool);
+	if (!rm)
+		return -1;
+	for (i = 0; i < cnt; i++) {
+		data = iov[i].iov_base;
+		len = (int)iov[i].iov_len;
+
+		if (i == 0) {
+			header = data;
+			data += sizeof(*header);
+			len -= sizeof(*header);
+		}
+
+		if (len == 0)
+			continue;
+
+		pkt = rte_pktmbuf_append(rm, len);
+		if (pkt) {
+			/* XXX: I wanna have M_EXT flag !!! */
+			memcpy(pkt, data, len);
+			sent += len;
+		} else {
+			lkl_printf("dpdk-tx: failed to append: idx=%d len=%d\n",
+				   i, len);
+			rte_pktmbuf_free(rm);
+			return -1;
+		}
+#ifdef DEBUG
+		lkl_printf("dpdk-tx: pkt[%d]len=%d\n", i, len);
+#endif
+	}
+
+	/* preparation for TX offloads */
+	sent += dpdk_net_tx_prep(rm, header);
+
+	/* XXX: should be bulk-transmitted !! */
+	if (rte_eth_tx_prepare(nd_dpdk->portid, 0, &rm, 1) != 1)
+		lkl_printf("tx_prep failed\n");
+
+	/* the driver owns the mbuf once it is queued; free it only if not */
+	if (rte_eth_tx_burst(nd_dpdk->portid, 0, &rm, 1) != 1)
+		rte_pktmbuf_free(rm);
+	return sent;
+}
+
+static int __dpdk_net_rx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
+{
+	struct lkl_netdev_dpdk *nd_dpdk;
+	int i = 0;
+	struct rte_mbuf *rm, *first;
+	void *r_data;
+	size_t read = 0, r_size, copylen = 0, offset = 0;
+	struct lkl_virtio_net_hdr_v1 *header = iov[0].iov_base;
+	uint16_t mtu;
+
+	nd_dpdk = (struct lkl_netdev_dpdk *) nd;
+	memset(header, 0, sizeof(struct lkl_virtio_net_hdr_v1));
+
+	first = nd_dpdk->rcv_mbuf[nd_dpdk->bufidx];
+
+	for (rm = nd_dpdk->rcv_mbuf[nd_dpdk->bufidx]; rm; rm = rm->next) {
+		r_data = rte_pktmbuf_mtod(rm, void *);
+		r_size = rte_pktmbuf_data_len(rm);
+
+#ifdef DEBUG
+		lkl_printf("dpdk-rx: mbuf pktlen=%d orig_len=%lu\n",
+			   r_size, iov[i].iov_len);
+#endif
+		/* mergeable buffer starts data after vnet header at [0] */
+		if (nd_dpdk->offload & BIT(LKL_VIRTIO_NET_F_MRG_RXBUF) &&
+		    i == 0)
+			offset = sizeof(struct lkl_virtio_net_hdr_v1);
+		else if (nd_dpdk->offload & BIT(LKL_VIRTIO_NET_F_GUEST_TSO4) &&
+			 i == 0)
+			i++;
+		else
+			offset = sizeof(struct lkl_virtio_net_hdr_v1);
+
+		read += r_size;
+		while (r_size > 0) {
+			if (i >= cnt) {
+				fprintf(stderr,
+					"dpdk-rx: buffer full. skip it. ");
+				fprintf(stderr,
+					"(cnt=%d, buf[%d]=%zu, size=%zu)\n",
+					cnt, i - 1, iov[i - 1].iov_len, r_size);
+				goto end;
+			}
+
+			copylen = r_size < (iov[i].iov_len - offset) ? r_size
+				: iov[i].iov_len - offset;
+			memcpy(iov[i].iov_base + offset, r_data, copylen);
+			r_data += copylen;
+			r_size -= copylen;
+			offset = 0;
+			i++;
+		}
+	}
+
+end:
+	/* TSO (big_packet mode) */
+	header->flags = LKL_VIRTIO_NET_HDR_F_DATA_VALID;
+	rte_eth_dev_get_mtu(nd_dpdk->portid, &mtu);
+
+	if (read > (mtu + sizeof(struct ether_hdr)
+		    + sizeof(struct lkl_virtio_net_hdr_v1))) {
+		struct rte_net_hdr_lens hdr_lens;
+		uint32_t ptype;
+
+		ptype = rte_net_get_ptype(first, &hdr_lens, RTE_PTYPE_ALL_MASK);
+
+		if ((ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP) {
+			if ((ptype & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_IPV4 &&
+			    nd_dpdk->offload & BIT(LKL_VIRTIO_NET_F_GUEST_TSO4))
+				header->gso_type = LKL_VIRTIO_NET_HDR_GSO_TCPV4;
+			/* XXX: Intel X540 doesn't support LRO
+			 * with tcpv6 packets
+			 */
+			if ((ptype & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_IPV6 &&
+			    nd_dpdk->offload & BIT(LKL_VIRTIO_NET_F_GUEST_TSO6))
+				header->gso_type = LKL_VIRTIO_NET_HDR_GSO_TCPV6;
+		}
+
+		header->gso_size = mtu - hdr_lens.l3_len - hdr_lens.l4_len;
+		header->hdr_len = hdr_lens.l2_len + hdr_lens.l3_len
+			+ hdr_lens.l4_len;
+	}
+
+	read += sizeof(struct lkl_virtio_net_hdr_v1);
+
+#ifdef DEBUG
+	lkl_printf("dpdk-rx: len=%d mtu=%d type=%d, size=%d, hdrlen=%d\n",
+		   read, mtu, header->gso_type,
+		   header->gso_size, header->hdr_len);
+#endif
+
+	return read;
+}
+
+
+/*
+ * this function is not thread-safe.
+ *
+ * nd_dpdk->rcv_mbuf is specifically not safe in parallel access.  if future
+ * refactor allows us to read in parallel, the buffer (nd_dpdk->rcv_mbuf) shall
+ * be guarded.
+ */
+static int dpdk_net_rx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
+{
+	struct lkl_netdev_dpdk *nd_dpdk;
+	int read = 0;
+
+	nd_dpdk = (struct lkl_netdev_dpdk *) nd;
+
+	if (nd_dpdk->npkts == 0) {
+		nd_dpdk->npkts = rte_eth_rx_burst(nd_dpdk->portid, 0,
+						  nd_dpdk->rcv_mbuf,
+						  MAX_PKT_BURST);
+		if (nd_dpdk->npkts <= 0) {
+			/* XXX: need to implement proper poll()
+			 * or interrupt mode PMD of dpdk, which is only
+			 * available on ixgbe/igb/e1000 (as of Jan. 2016)
+			 */
+			if (!nd_dpdk->busy_poll)
+				usleep(1);
+			return -1;
+		}
+		nd_dpdk->bufidx = 0;
+	}
+
+	/* mergeable buffer */
+	read = __dpdk_net_rx(nd, iov, cnt);
+
+	rte_pktmbuf_free(nd_dpdk->rcv_mbuf[nd_dpdk->bufidx]);
+
+	nd_dpdk->bufidx++;
+	nd_dpdk->npkts--;
+
+	return read;
+}
+
+static int dpdk_net_poll(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_dpdk *nd_dpdk =
+		container_of(nd, struct lkl_netdev_dpdk, dev);
+
+	if (nd_dpdk->close)
+		return LKL_DEV_NET_POLL_HUP;
+	/*
+	 * dpdk's interrupt mode has an equivalent of epoll_wait(2),
+	 * which we could apply here, but AFAIK the mode is only available
+	 * on a limited set of NIC drivers such as ixgbe/igb/e1000 (with
+	 * dpdk v2.2.0); vmxnet3, for example, is not supported.
+	 */
+	return LKL_DEV_NET_POLL_RX | LKL_DEV_NET_POLL_TX;
+}
+
+static void dpdk_net_poll_hup(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_dpdk *nd_dpdk =
+		container_of(nd, struct lkl_netdev_dpdk, dev);
+
+	nd_dpdk->close = 1;
+}
+
+static void dpdk_net_free(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_dpdk *nd_dpdk =
+		container_of(nd, struct lkl_netdev_dpdk, dev);
+
+	free(nd_dpdk);
+}
+
+struct lkl_dev_net_ops dpdk_net_ops = {
+	.tx = dpdk_net_tx,
+	.rx = dpdk_net_rx,
+	.poll = dpdk_net_poll,
+	.poll_hup = dpdk_net_poll_hup,
+	.free = dpdk_net_free,
+};
+
+
+static int dpdk_init;
+struct lkl_netdev *lkl_netdev_dpdk_create(const char *ifparams, int offload,
+					 unsigned char *mac)
+{
+	int ret = 0;
+	struct rte_eth_conf portconf;
+	struct rte_eth_link link;
+	struct lkl_netdev_dpdk *nd;
+	struct rte_eth_dev_info dev_info;
+	char poolname[RTE_MEMZONE_NAMESIZE];
+	char *debug = getenv("LKL_HIJACK_DEBUG");
+	int lkl_debug = 0;
+
+	if (!dpdk_init) {
+		if (debug)
+			lkl_debug = strtol(debug, NULL, 0);
+		if (lkl_debug & 0x400)
+			ealargs[3] = "--log-level=100";
+
+		ret = rte_eal_init(sizeof(ealargs) / sizeof(ealargs[0]),
+				   ealargs);
+		if (ret < 0)
+			lkl_printf("dpdk: failed to initialize eal\n");
+
+		dpdk_init = 1;
+	}
+	nd = calloc(1, sizeof(struct lkl_netdev_dpdk));
+	if (!nd)
+		return NULL;
+	nd->dev.ops = &dpdk_net_ops;
+	nd->portid = portid++;
+	/* busy-poll mode is requested when 'ifparams' contains "busy" */
+	nd->busy_poll = strstr(ifparams, "busy") ? 1 : 0;
+	/* we always enable big_packet mode with dpdk. */
+	nd->offload = offload;
+
+	snprintf(poolname, RTE_MEMZONE_NAMESIZE, "%s%s", "tx-", ifparams);
+	nd->txpool =
+		rte_mempool_create(poolname,
+				   MBUF_NUM, MBUF_SIZ, MEMPOOL_CACHE_SZ,
+				   sizeof(struct rte_pktmbuf_pool_private),
+				   rte_pktmbuf_pool_init, NULL,
+				   rte_pktmbuf_init, NULL, 0, 0);
+
+	if (!nd->txpool) {
+		lkl_printf("dpdk: failed to allocate tx pool\n");
+		free(nd);
+		return NULL;
+	}
+
+
+	snprintf(poolname, RTE_MEMZONE_NAMESIZE, "%s%s", "rx-", ifparams);
+	nd->rxpool =
+		rte_mempool_create(poolname, MBUF_NUM, MBUF_SIZ, 0,
+				   sizeof(struct rte_pktmbuf_pool_private),
+				   rte_pktmbuf_pool_init, NULL,
+				   rte_pktmbuf_init, NULL, 0, 0);
+	if (!nd->rxpool) {
+		lkl_printf("dpdk: failed to allocate rx pool\n");
+		free(nd);
+		return NULL;
+	}
+
+	memset(&portconf, 0, sizeof(portconf));
+
+	/* offload bits */
+	/* we configure the NIC to use TSO only if the user requests it. */
+	if (offload & (BIT(LKL_VIRTIO_NET_F_GUEST_TSO4) |
+			BIT(LKL_VIRTIO_NET_F_GUEST_TSO6) |
+			BIT(LKL_VIRTIO_NET_F_MRG_RXBUF))) {
+		portconf.rxmode.enable_lro = 1;
+		portconf.rxmode.hw_strip_crc = 1;
+	}
+
+	ret = rte_eth_dev_configure(nd->portid, NUMQUEUE, NUMQUEUE,
+				    &portconf);
+	if (ret < 0) {
+		lkl_printf("dpdk: failed to configure port\n");
+		free(nd);
+		return NULL;
+	}
+
+	rte_eth_dev_info_get(nd->portid, &dev_info);
+
+	ret = rte_eth_rx_queue_setup(nd->portid, 0, NUMDESC, 0,
+				     &dev_info.default_rxconf, nd->rxpool);
+	if (ret < 0) {
+		lkl_printf("dpdk: failed to setup rx queue\n");
+		free(nd);
+		return NULL;
+	}
+
+	dev_info.default_txconf.txq_flags = 0;
+
+	dev_info.default_txconf.txq_flags |= ETH_TXQ_FLAGS_NOXSUMSCTP;
+	dev_info.default_txconf.txq_flags |= ETH_TXQ_FLAGS_NOVLANOFFL;
+
+
+	ret = rte_eth_tx_queue_setup(nd->portid, 0, NUMDESC, 0,
+				     &dev_info.default_txconf);
+	if (ret < 0) {
+		lkl_printf("dpdk: failed to setup tx queue\n");
+		free(nd);
+		return NULL;
+	}
+
+	ret = rte_eth_dev_start(nd->portid);
+	/* XXX: this function returns positive val (e.g., 12)
+	 * if there's an error
+	 */
+	if (ret != 0) {
+		lkl_printf("dpdk: failed to start device\n");
+		free(nd);
+		return NULL;
+	}
+
+	if (mac) {
+		rte_eth_macaddr_get(nd->portid, (struct ether_addr *)mac);
+		lkl_printf("Port %d: %02x:%02x:%02x:%02x:%02x:%02x\n",
+			   nd->portid,
+			   mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
+	}
+
+	rte_eth_dev_set_link_up(nd->portid);
+
+	rte_eth_link_get(nd->portid, &link);
+	if (!link.link_status) {
+		fprintf(stderr, "dpdk: interface state is down\n");
+		rte_eth_link_get(nd->portid, &link);
+		if (!link.link_status) {
+			fprintf(stderr,
+				"dpdk: interface state is down.. Giving up.\n");
+			return NULL;
+		}
+		lkl_printf("dpdk: interface state should be up now.\n");
+	}
+
+	/* XXX: should this be promiscuous? */
+	rte_eth_promiscuous_enable(nd->portid);
+
+	/* we always assume a vnet_hdr for the dpdk device */
+	nd->dev.has_vnet_hdr = 1;
+
+	return (struct lkl_netdev *) nd;
+}
diff --git a/tools/lkl/lib/virtio_net_fd.c b/tools/lkl/lib/virtio_net_fd.c
new file mode 100644
index 000000000000..f8664455e696
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_fd.c
@@ -0,0 +1,217 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * POSIX file descriptor based virtual network interface feature for LKL
+ * Copyright (c) 2015,2016 Ryo Nakamura, Hajime Tazaki
+ *
+ * Author: Ryo Nakamura <upa@wide.ad.jp>
+ *         Hajime Tazaki <thehajime@gmail.com>
+ *         Octavian Purdila <octavian.purdila@intel.com>
+ *
+ */
+#include <stdio.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <string.h>
+#include <unistd.h>
+#ifdef __FreeBSD__
+#include <sys/syslimits.h>
+#else
+#include <limits.h>
+#endif
+#include <fcntl.h>
+#include <sys/poll.h>
+#include <sys/uio.h>
+
+#include "virtio.h"
+#include "virtio_net_fd.h"
+
+struct lkl_netdev_fd {
+	struct lkl_netdev dev;
+	/* file-descriptor based device */
+	int fd_rx;
+	int fd_tx;
+	/*
+	 * Controls the poll mask for fd. Can be accessed concurrently from
+	 * poll, tx, or rx routines but there is no need for synchronization
+	 * because:
+	 *
+	 * (a) TX and RX routines set different variables so even if they update
+	 * at the same time there is no race condition
+	 *
+	 * (b) Even if poll and TX / RX update at the same time poll cannot
+	 * stall: when poll resets the poll variable we know that TX / RX will
+	 * run which means that eventually the poll variable will be set.
+	 */
+	int poll_tx, poll_rx;
+	/* control pipe */
+	int pipe[2];
+};
+
+static int fd_net_tx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
+{
+	int ret;
+	struct lkl_netdev_fd *nd_fd =
+		container_of(nd, struct lkl_netdev_fd, dev);
+
+	do {
+		ret = writev(nd_fd->fd_tx, iov, cnt);
+	} while (ret == -1 && errno == EINTR);
+
+	if (ret < 0) {
+		if (errno != EAGAIN) {
+			perror("write to fd netdev fails");
+		} else {
+			char tmp = 0;
+
+			nd_fd->poll_tx = 1;
+			if (write(nd_fd->pipe[1], &tmp, 1) <= 0)
+				perror("virtio net fd pipe write");
+		}
+	}
+	return ret;
+}
+
+static int fd_net_rx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
+{
+	int ret;
+	struct lkl_netdev_fd *nd_fd =
+		container_of(nd, struct lkl_netdev_fd, dev);
+
+	do {
+		ret = readv(nd_fd->fd_rx, (struct iovec *)iov, cnt);
+	} while (ret == -1 && errno == EINTR);
+
+	if (ret < 0) {
+		if (errno != EAGAIN) {
+			perror("virtio net fd read");
+		} else {
+			char tmp = 0;
+
+			nd_fd->poll_rx = 1;
+			if (write(nd_fd->pipe[1], &tmp, 1) < 0)
+				perror("virtio net fd pipe write");
+		}
+	}
+	return ret;
+}
+
+static int fd_net_poll(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_fd *nd_fd =
+		container_of(nd, struct lkl_netdev_fd, dev);
+	struct pollfd pfds[3] = {
+		{
+			.fd = nd_fd->fd_rx,
+		},
+		{
+			.fd = nd_fd->fd_tx,
+		},
+		{
+			.fd = nd_fd->pipe[0],
+			.events = POLLIN,
+		},
+	};
+	int ret;
+
+	if (nd_fd->poll_rx)
+		pfds[0].events |= POLLIN|POLLPRI;
+	if (nd_fd->poll_tx)
+		pfds[1].events |= POLLOUT;
+
+	do {
+		ret = poll(pfds, 3, -1);
+	} while (ret == -1 && errno == EINTR);
+
+	if (ret < 0) {
+		perror("virtio net fd poll");
+		return 0;
+	}
+
+	if (pfds[2].revents & (POLLHUP|POLLNVAL))
+		return LKL_DEV_NET_POLL_HUP;
+
+	if (pfds[2].revents & POLLIN) {
+		char tmp[PIPE_BUF];
+
+		ret = read(nd_fd->pipe[0], tmp, PIPE_BUF);
+		if (ret == 0)
+			return LKL_DEV_NET_POLL_HUP;
+		if (ret < 0)
+			perror("virtio net fd pipe read");
+	}
+
+	ret = 0;
+
+	if (pfds[0].revents & (POLLIN|POLLPRI)) {
+		nd_fd->poll_rx = 0;
+		ret |= LKL_DEV_NET_POLL_RX;
+	}
+
+	if (pfds[1].revents & POLLOUT) {
+		nd_fd->poll_tx = 0;
+		ret |= LKL_DEV_NET_POLL_TX;
+	}
+
+	return ret;
+}
+
+static void fd_net_poll_hup(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_fd *nd_fd =
+		container_of(nd, struct lkl_netdev_fd, dev);
+
+	/* this will cause a POLLHUP / POLLNVAL in the poll function */
+	close(nd_fd->pipe[0]);
+	close(nd_fd->pipe[1]);
+}
+
+static void fd_net_free(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_fd *nd_fd =
+		container_of(nd, struct lkl_netdev_fd, dev);
+
+	close(nd_fd->fd_rx);
+	close(nd_fd->fd_tx);
+	free(nd_fd);
+}
+
+struct lkl_dev_net_ops fd_net_ops =  {
+	.tx = fd_net_tx,
+	.rx = fd_net_rx,
+	.poll = fd_net_poll,
+	.poll_hup = fd_net_poll_hup,
+	.free = fd_net_free,
+};
+
+struct lkl_netdev *lkl_register_netdev_fd(int fd_rx, int fd_tx)
+{
+	struct lkl_netdev_fd *nd;
+
+	nd = malloc(sizeof(*nd));
+	if (!nd) {
+		fprintf(stderr, "fdnet: failed to allocate memory\n");
+		/* TODO: propagate the error state, maybe use errno for that? */
+		return NULL;
+	}
+
+	memset(nd, 0, sizeof(*nd));
+
+	nd->fd_rx = fd_rx;
+	nd->fd_tx = fd_tx;
+	if (pipe(nd->pipe) < 0) {
+		perror("pipe");
+		free(nd);
+		return NULL;
+	}
+
+	if (fcntl(nd->pipe[0], F_SETFL, O_NONBLOCK) < 0) {
+		perror("fcntl");
+		close(nd->pipe[0]);
+		close(nd->pipe[1]);
+		free(nd);
+		return NULL;
+	}
+
+	nd->dev.ops = &fd_net_ops;
+	return &nd->dev;
+}
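
Note on the control pipe used by fd_net_poll() above: it follows the
familiar self-pipe pattern, where a thread blocked in poll() is woken by
writing one byte to a pipe whose read end is part of the poll set. The
following is a minimal standalone sketch of that pattern, not the LKL code
itself. struct waker and the waker_*() helpers are hypothetical names
introduced only for this illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical illustration type: a pipe used only to wake a poller. */
struct waker {
	int pipe[2];	/* [0] is polled, [1] is written to wake */
};

/* Create the pipe; the read end is non-blocking so draining never stalls. */
static int waker_init(struct waker *w)
{
	int flags;

	if (pipe(w->pipe) < 0)
		return -1;
	flags = fcntl(w->pipe[0], F_GETFL);
	if (flags < 0 || fcntl(w->pipe[0], F_SETFL, flags | O_NONBLOCK) < 0) {
		close(w->pipe[0]);
		close(w->pipe[1]);
		return -1;
	}
	return 0;
}

/* Wake a thread blocked in waker_wait(); returns 0 on success. */
static int waker_kick(struct waker *w)
{
	char tmp = 0;
	ssize_t ret;

	do {
		ret = write(w->pipe[1], &tmp, 1);
	} while (ret < 0 && errno == EINTR);
	return ret == 1 ? 0 : -1;
}

/*
 * Wait up to timeout_ms for a kick. Returns 1 if woken, 0 on timeout,
 * -1 on error. Pending wake-up bytes are drained before returning.
 */
static int waker_wait(struct waker *w, int timeout_ms)
{
	struct pollfd pfd = { .fd = w->pipe[0], .events = POLLIN };
	char buf[64];
	int ret;

	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR);
	if (ret <= 0)
		return ret;
	if (!(pfd.revents & POLLIN))
		return -1;
	while (read(w->pipe[0], buf, sizeof(buf)) > 0)
		;
	return 1;
}
```

Several kicks issued before the poller runs collapse into a single
wake-up, which is usually what a "re-evaluate the poll mask" signal wants.
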
diff --git a/tools/lkl/lib/virtio_net_fd.h b/tools/lkl/lib/virtio_net_fd.h
new file mode 100644
index 000000000000..713ba13cca7c
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_fd.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _VIRTIO_NET_FD_H
+#define _VIRTIO_NET_FD_H
+
+struct ifreq;
+
+/**
+ * lkl_register_netdev_fd - register a file descriptor-based network
+ * device as a NIC
+ *
+ * @fd_rx - a POSIX file descriptor number for input
+ * @fd_tx - a POSIX file descriptor number for output
+ * @returns a struct lkl_netdev entry for virtio-net
+ */
+struct lkl_netdev *lkl_register_netdev_fd(int fd_rx, int fd_tx);
+
+
+/**
+ * lkl_netdev_tap_init - initialize tap-related structure for lkl_netdev.
+ *
+ * @path - the path to open the device.
+ * @offload - offload bits for the device
+ * @ifr - struct ifreq for ioctl.
+ */
+struct lkl_netdev *lkl_netdev_tap_init(const char *path, int offload,
+				       struct ifreq *ifr);
+
+#endif /* _VIRTIO_NET_FD_H */
diff --git a/tools/lkl/lib/virtio_net_macvtap.c b/tools/lkl/lib/virtio_net_macvtap.c
new file mode 100644
index 000000000000..5d6d2c822f2d
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_macvtap.c
@@ -0,0 +1,32 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * macvtap based virtual network interface feature for LKL
+ * Copyright (c) 2016 Hajime Tazaki
+ *
+ * Author: Hajime Tazaki <thehajime@gmail.com>
+ *
+ * Current implementation is linux-specific.
+ */
+
+/*
+ * You need to configure the host device in advance.
+ *
+ * sudo ip link add link eth0 name vtap0 type macvtap mode passthru
+ * sudo ip link set dev vtap0 up
+ * sudo chown thehajime /dev/tap22
+ */
+
+#include <net/if.h>
+#include <linux/if_tun.h>
+
+#include "virtio.h"
+#include "virtio_net_fd.h"
+
+struct lkl_netdev *lkl_netdev_macvtap_create(const char *path, int offload)
+{
+	struct ifreq ifr = {
+		.ifr_flags = IFF_TAP | IFF_NO_PI,
+	};
+
+	return lkl_netdev_tap_init(path, offload, &ifr);
+}
diff --git a/tools/lkl/lib/virtio_net_pipe.c b/tools/lkl/lib/virtio_net_pipe.c
new file mode 100644
index 000000000000..c68d4c855499
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_pipe.c
@@ -0,0 +1,76 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * pipe based virtual network interface feature for LKL
+ * Copyright (c) 2017,2016 Motomu Utsumi
+ *
+ * Author: Motomu Utsumi <motomuman@gmail.com>
+ *
+ * Current implementation is linux-specific.
+ */
+#include <stdio.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <string.h>
+#include <fcntl.h>
+
+#include "virtio.h"
+#include "virtio_net_fd.h"
+
+struct lkl_netdev *lkl_netdev_pipe_create(const char *_ifname, int offload)
+{
+	struct lkl_netdev *nd;
+	int fd_rx, fd_tx;
+	char *ifname = strdup(_ifname), *ifname_rx = NULL, *ifname_tx = NULL;
+
+	ifname_rx = strtok(ifname, "|");
+	if (ifname_rx == NULL) {
+		fprintf(stderr, "invalid ifname format: %s\n", ifname);
+		free(ifname);
+		return NULL;
+	}
+
+	ifname_tx = strtok(NULL, "|");
+	if (ifname_tx == NULL) {
+		fprintf(stderr, "invalid ifname format: %s\n", _ifname);
+		free(ifname);
+		return NULL;
+	}
+
+	if (strtok(NULL, "|") != NULL) {
+		fprintf(stderr, "invalid ifname format: %s\n", _ifname);
+		free(ifname);
+		return NULL;
+	}
+
+	fd_rx = open(ifname_rx, O_RDWR|O_NONBLOCK);
+	if (fd_rx < 0) {
+		perror("can not open ifname_rx pipe");
+		free(ifname);
+		return NULL;
+	}
+
+	fd_tx = open(ifname_tx, O_RDWR|O_NONBLOCK);
+	if (fd_tx < 0) {
+		perror("can not open ifname_tx pipe");
+		close(fd_rx);
+		free(ifname);
+		return NULL;
+	}
+
+	nd = lkl_register_netdev_fd(fd_rx, fd_tx);
+	if (!nd) {
+		fprintf(stderr, "pipe: failed to register netdev\n");
+		close(fd_rx);
+		close(fd_tx);
+		free(ifname);
+		return NULL;
+	}
+
+	free(ifname);
+	/*
+	 * To avoid a mismatch with the LKL instance on the other side,
+	 * we always enable the vnet hdr
+	 */
+	nd->has_vnet_hdr = 1;
+	return nd;
+}
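
Note on the "rx|tx" interface name parsed by lkl_netdev_pipe_create()
above: the same split can be done without strtok() and its hidden state.
The following is only a sketch under that assumption; split_pipe_spec() is
a hypothetical helper, not part of LKL. Unlike strtok(), which skips empty
fields, this variant deliberately rejects specs such as "|b" or "a|".

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: split "rx|tx" into two strings.
 * Returns 0 on success, -1 on a malformed spec or allocation failure.
 * On success the caller frees *rx only; *tx points into the same buffer.
 */
static int split_pipe_spec(const char *spec, char **rx, char **tx)
{
	char *copy, *sep;
	size_t len;

	if (!spec)
		return -1;
	len = strlen(spec);
	copy = malloc(len + 1);
	if (!copy)
		return -1;
	memcpy(copy, spec, len + 1);

	sep = strchr(copy, '|');
	/* reject: no separator, empty rx, empty tx, or a second separator */
	if (!sep || sep == copy || sep[1] == '\0' || strchr(sep + 1, '|')) {
		free(copy);
		return -1;
	}

	*sep = '\0';
	*rx = copy;
	*tx = sep + 1;
	return 0;
}
```

Because the input is never modified, an error message can always print
the caller's original string in full.
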
diff --git a/tools/lkl/lib/virtio_net_raw.c b/tools/lkl/lib/virtio_net_raw.c
new file mode 100644
index 000000000000..363ccf628569
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_raw.c
@@ -0,0 +1,94 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * raw socket based virtual network interface feature for LKL
+ * Copyright (c) 2015,2016 Ryo Nakamura, Hajime Tazaki
+ *
+ * Author: Ryo Nakamura <upa@wide.ad.jp>
+ *         Hajime Tazaki <thehajime@gmail.com>
+ *
+ * Current implementation is linux-specific.
+ */
+
+#include <stdio.h>
+#include <errno.h>
+#include <string.h>
+#include <unistd.h>
+#include <net/if.h>
+#include <arpa/inet.h>
+#ifdef __linux__
+#include <linux/if_ether.h>
+#include <linux/if_packet.h>
+#elif __FreeBSD__
+#include <netinet/in.h>
+#endif
+#include <fcntl.h>
+
+#include "virtio.h"
+#include "virtio_net_fd.h"
+
+/* since Linux 3.14 (man 7 packet) */
+#ifndef PACKET_QDISC_BYPASS
+#define PACKET_QDISC_BYPASS 20
+#endif
+
+struct lkl_netdev *lkl_netdev_raw_create(const char *ifname)
+{
+#ifdef __linux__
+	int ret;
+	int ifindex = if_nametoindex(ifname);
+	struct sockaddr_ll ll = {
+		.sll_family = PF_PACKET,
+		.sll_ifindex = ifindex,
+		.sll_protocol = htons(ETH_P_ALL),
+	};
+	struct packet_mreq mreq = {
+		.mr_type = PACKET_MR_PROMISC,
+		.mr_ifindex = ifindex,
+	};
+#endif
+	int fd, fd_flags;
+#ifdef __linux__
+	int val;
+
+	if (ifindex == 0) {
+		perror("if_nametoindex");
+		return NULL;
+	}
+
+	fd = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
+#elif __FreeBSD__
+	fd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
+#endif
+	if (fd < 0) {
+		perror("socket");
+		return NULL;
+	}
+
+#ifdef __linux__
+	ret = bind(fd, (struct sockaddr *)&ll, sizeof(ll));
+	if (ret) {
+		perror("bind");
+		close(fd);
+		return NULL;
+	}
+
+	ret = setsockopt(fd, SOL_PACKET, PACKET_ADD_MEMBERSHIP, &mreq,
+			sizeof(mreq));
+	if (ret) {
+		perror("PACKET_ADD_MEMBERSHIP PACKET_MR_PROMISC");
+		close(fd);
+		return NULL;
+	}
+
+	val = 1;
+	ret = setsockopt(fd, SOL_PACKET, PACKET_QDISC_BYPASS, &val,
+			 sizeof(val));
+	if (ret)
+		perror("PACKET_QDISC_BYPASS, ignoring");
+#endif
+
+	fd_flags = fcntl(fd, F_GETFL);
+	fcntl(fd, F_SETFL, fd_flags | O_NONBLOCK);
+
+	return lkl_register_netdev_fd(fd, fd);
+}
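
Note on the fcntl() sequence at the end of lkl_netdev_raw_create() above:
its purpose is to put the socket into non-blocking mode. File status
flags such as O_NONBLOCK are read with F_GETFL and written with F_SETFL,
whereas F_GETFD and F_SETFD operate on descriptor flags such as
FD_CLOEXEC. A minimal sketch of the idiom follows; set_nonblocking() is a
hypothetical helper name used only for illustration.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Add O_NONBLOCK to the file status flags of fd while preserving the
 * flags that are already set. Returns 0 on success, -1 on error.
 */
static int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags < 0)
		return -1;
	if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
		return -1;
	return 0;
}
```

Checking both return values lets the caller bail out before the
descriptor is handed to a poll loop that assumes non-blocking I/O.
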
diff --git a/tools/lkl/lib/virtio_net_tap.c b/tools/lkl/lib/virtio_net_tap.c
new file mode 100644
index 000000000000..f1f64cee9695
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_tap.c
@@ -0,0 +1,111 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * tun/tap based virtual network interface feature for LKL
+ * Copyright (c) 2015,2016 Ryo Nakamura, Hajime Tazaki
+ *
+ * Author: Ryo Nakamura <upa@wide.ad.jp>
+ *         Hajime Tazaki <thehajime@gmail.com>
+ *         Octavian Purdila <octavian.purdila@intel.com>
+ *
+ * Current implementation is linux-specific.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <string.h>
+#include <fcntl.h>
+#include <net/if.h>
+#ifdef __linux__
+#include <linux/if_tun.h>
+#elif __FreeBSD__
+#include <net/if_tun.h>
+#endif
+#include <sys/ioctl.h>
+
+#include "virtio.h"
+#include "virtio_net_fd.h"
+
+#define BIT(x) (1ULL << (x))
+
+struct lkl_netdev *lkl_netdev_tap_init(const char *path, int offload,
+				       struct ifreq *ifr)
+{
+	struct lkl_netdev *nd;
+	int fd, vnet_hdr_sz = 0;
+#ifdef __linux__
+	int ret, tap_arg = 0;
+
+	if (offload & BIT(LKL_VIRTIO_NET_F_GUEST_CSUM))
+		tap_arg |= TUN_F_CSUM;
+	if (offload & (BIT(LKL_VIRTIO_NET_F_GUEST_TSO4) |
+	    BIT(LKL_VIRTIO_NET_F_MRG_RXBUF)))
+		tap_arg |= TUN_F_TSO4 | TUN_F_CSUM;
+	if (offload & (BIT(LKL_VIRTIO_NET_F_GUEST_TSO6)))
+		tap_arg |= TUN_F_TSO6 | TUN_F_CSUM;
+
+	if (tap_arg || (offload & (BIT(LKL_VIRTIO_NET_F_CSUM) |
+				   BIT(LKL_VIRTIO_NET_F_HOST_TSO4) |
+				   BIT(LKL_VIRTIO_NET_F_HOST_TSO6)))) {
+		ifr->ifr_flags |= IFF_VNET_HDR;
+		vnet_hdr_sz = sizeof(struct lkl_virtio_net_hdr_v1);
+	}
+#endif
+	fd = open(path, O_RDWR|O_NONBLOCK);
+	if (fd < 0) {
+		perror("open");
+		return NULL;
+	}
+
+#ifdef __linux__
+	ret = ioctl(fd, TUNSETIFF, ifr);
+	if (ret < 0) {
+		fprintf(stderr, "%s: failed to attach to: %s\n",
+			path, strerror(errno));
+		close(fd);
+		return NULL;
+	}
+	if (vnet_hdr_sz && ioctl(fd, TUNSETVNETHDRSZ, &vnet_hdr_sz) != 0) {
+		fprintf(stderr, "%s: failed to TUNSETVNETHDRSZ to: %s\n",
+			path, strerror(errno));
+		close(fd);
+		return NULL;
+	}
+	if (ioctl(fd, TUNSETOFFLOAD, tap_arg) != 0) {
+		fprintf(stderr, "%s: failed to TUNSETOFFLOAD: %s\n",
+			path, strerror(errno));
+		close(fd);
+		return NULL;
+	}
+#endif
+	nd = lkl_register_netdev_fd(fd, fd);
+	if (!nd) {
+		fprintf(stderr, "tap: failed to register netdev\n");
+		close(fd);
+		return NULL;
+	}
+
+	nd->has_vnet_hdr = (vnet_hdr_sz != 0);
+	return nd;
+}
+
+struct lkl_netdev *lkl_netdev_tap_create(const char *ifname, int offload)
+{
+#ifdef __linux__
+	char *path = "/dev/net/tun";
+#elif __FreeBSD__
+	char path[32];
+
+	snprintf(path, sizeof(path), "/dev/%s", ifname);
+#endif
+
+	struct ifreq ifr = {
+#ifdef __linux__
+		.ifr_flags = IFF_TAP | IFF_NO_PI,
+#endif
+	};
+
+	strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
+
+	return lkl_netdev_tap_init(path, offload, &ifr);
+}
diff --git a/tools/lkl/lib/virtio_net_vde.c b/tools/lkl/lib/virtio_net_vde.c
new file mode 100644
index 000000000000..1d017aba91ae
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_vde.c
@@ -0,0 +1,168 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+#include <poll.h>
+#include <lkl.h>
+#include <lkl_host.h>
+
+#include "virtio.h"
+
+#include <libvdeplug.h>
+
+struct lkl_netdev_vde {
+	struct lkl_netdev dev;
+	VDECONN *conn;
+};
+
+struct lkl_netdev *lkl_netdev_vde_create(char const *switch_path);
+static int net_vde_tx(struct lkl_netdev *nd, struct iovec *iov, int cnt);
+static int net_vde_rx(struct lkl_netdev *nd, struct iovec *iov, int cnt);
+static int net_vde_poll_with_timeout(struct lkl_netdev *nd, int timeout);
+static int net_vde_poll(struct lkl_netdev *nd);
+static void net_vde_poll_hup(struct lkl_netdev *nd);
+static void net_vde_free(struct lkl_netdev *nd);
+
+struct lkl_dev_net_ops vde_net_ops = {
+	.tx = net_vde_tx,
+	.rx = net_vde_rx,
+	.poll = net_vde_poll,
+	.poll_hup = net_vde_poll_hup,
+	.free = net_vde_free,
+};
+
+int net_vde_tx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
+{
+	int ret;
+	struct lkl_netdev_vde *nd_vde =
+		container_of(nd, struct lkl_netdev_vde, dev);
+	void *data = iov[0].iov_base;
+	int len = (int)iov[0].iov_len;
+
+	ret = vde_send(nd_vde->conn, data, len, 0);
+	if (ret <= 0 && errno == EAGAIN)
+		return -1;
+	return ret;
+}
+
+int net_vde_rx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
+{
+	int ret;
+	struct lkl_netdev_vde *nd_vde =
+		container_of(nd, struct lkl_netdev_vde, dev);
+	void *data = iov[0].iov_base;
+	int len = (int)iov[0].iov_len;
+
+	/*
+	 * Due to a bug in libvdeplug we have to first poll to make sure
+	 * that there is data available.
+	 * The correct solution would be to just use
+	 *   ret = vde_recv(nd_vde->conn, data, len, MSG_DONTWAIT);
+	 * This should be changed once libvdeplug is fixed.
+	 */
+	ret = 0;
+	if (net_vde_poll_with_timeout(nd, 0) & LKL_DEV_NET_POLL_RX)
+		ret = vde_recv(nd_vde->conn, data, len, 0);
+	if (ret <= 0)
+		return -1;
+	return ret;
+}
+
+int net_vde_poll_with_timeout(struct lkl_netdev *nd, int timeout)
+{
+	int ret;
+	struct lkl_netdev_vde *nd_vde =
+		container_of(nd, struct lkl_netdev_vde, dev);
+	struct pollfd pollfds[] = {
+			{
+					.fd = vde_datafd(nd_vde->conn),
+					.events = POLLIN | POLLOUT,
+			},
+			{
+					.fd = vde_ctlfd(nd_vde->conn),
+					.events = POLLHUP | POLLIN
+			}
+	};
+
+	while (poll(pollfds, 2, timeout) < 0 && errno == EINTR)
+		;
+
+	ret = 0;
+
+	if (pollfds[1].revents & (POLLHUP | POLLNVAL | POLLIN))
+		return LKL_DEV_NET_POLL_HUP;
+	if (pollfds[0].revents & (POLLHUP | POLLNVAL))
+		return LKL_DEV_NET_POLL_HUP;
+
+	if (pollfds[0].revents & POLLIN)
+		ret |= LKL_DEV_NET_POLL_RX;
+	if (pollfds[0].revents & POLLOUT)
+		ret |= LKL_DEV_NET_POLL_TX;
+
+	return ret;
+}
+
+int net_vde_poll(struct lkl_netdev *nd)
+{
+	return net_vde_poll_with_timeout(nd, -1);
+}
+
+void net_vde_poll_hup(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_vde *nd_vde =
+		container_of(nd, struct lkl_netdev_vde, dev);
+
+	vde_close(nd_vde->conn);
+}
+
+void net_vde_free(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_vde *nd_vde =
+		container_of(nd, struct lkl_netdev_vde, dev);
+
+	free(nd_vde);
+}
+
+struct lkl_netdev *lkl_netdev_vde_create(char const *switch_path)
+{
+	struct lkl_netdev_vde *nd;
+	struct vde_open_args open_args = {.port = 0, .group = 0, .mode = 0700 };
+	char *switch_path_copy = 0;
+
+	nd = calloc(1, sizeof(*nd));
+	if (!nd) {
+		fprintf(stderr, "Failed to allocate memory.\n");
+		/* TODO: propagate the error state, maybe use errno? */
+		return 0;
+	}
+	nd->dev.ops = &vde_net_ops;
+
+	/* vde_open() allows the null pointer as path which means
+	 * "VDE default path"
+	 */
+	if (switch_path != 0) {
+		/* vde_open() takes a non-const char * which is a bug in their
+		 * function declaration. Even though the implementation does not
+		 * modify the string, we shouldn't just cast away the const.
+		 */
+		size_t switch_path_length = strlen(switch_path);
+
+		switch_path_copy = calloc(switch_path_length + 1, sizeof(char));
+		if (!switch_path_copy) {
+			fprintf(stderr, "Failed to allocate memory.\n");
+			/* TODO: propagate the error state, maybe use errno? */
+			return 0;
+		}
+		strncpy(switch_path_copy, switch_path, switch_path_length);
+	}
+	nd->conn = vde_open(switch_path_copy, "lkl-virtio-net", &open_args);
+	free(switch_path_copy);
+	if (nd->conn == 0) {
+		fprintf(stderr, "Failed to connect to vde switch.\n");
+		/* TODO: propagate the error state, maybe use errno? */
+		return 0;
+	}
+
+	return &nd->dev;
+}
diff --git a/tools/lkl/scripts/dpdk-sdk-build.sh b/tools/lkl/scripts/dpdk-sdk-build.sh
new file mode 100755
index 000000000000..59de4ad8db51
--- /dev/null
+++ b/tools/lkl/scripts/dpdk-sdk-build.sh
@@ -0,0 +1,18 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+
+dpdk_version="17.02"
+
+git clone -b v${dpdk_version} git://dpdk.org/dpdk dpdk-${dpdk_version}
+
+RTE_SDK=$(pwd)/dpdk-${dpdk_version}
+RTE_TARGET=$(uname -m)-native-linuxapp-gcc
+export RTE_SDK
+export RTE_TARGET
+export EXTRA_CFLAGS="-fPIC -O0 -g3"
+
+set -e
+cd dpdk-${dpdk_version}
+make -j1 T=${RTE_TARGET} config
+make -j3 \
+  || (echo "dpdk build failed" && exit 1)
diff --git a/tools/lkl/tests/net-setup.sh b/tools/lkl/tests/net-setup.sh
new file mode 100644
index 000000000000..cc260ed68a7b
--- /dev/null
+++ b/tools/lkl/tests/net-setup.sh
@@ -0,0 +1,134 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
+TEST_TAP_IFNAME=tap
+else
+TEST_TAP_IFNAME=lkl_test_tap
+fi
+TEST_IP_NETWORK=192.168.113.0
+TEST_IP_NETMASK=24
+TEST_IP6_NETWORK=fc03::0
+TEST_IP6_NETMASK=64
+TEST_MAC0="aa:bb:cc:dd:ee:ff"
+TEST_MAC1="aa:bb:cc:dd:ee:aa"
+TEST_NETSERVER_PORT=11223
+
+# $1 - count
+# $2 - netcount
+ip_add()
+{
+    IP_HEX=$(printf '%.2X%.2X%.2X%.2X\n' \
+         `echo $TEST_IP_NETWORK | sed -e 's/\./ /g'`)
+    NET_COUNT=$(( 1 << (32 - $TEST_IP_NETMASK) ))
+    NEXT_IP_HEX=$(printf %.8X `echo $((0x$IP_HEX + $1 + ${2:-0} * $NET_COUNT))`)
+    NEXT_IP=$(printf '%d.%d.%d.%d\n' \
+          `echo $NEXT_IP_HEX | sed -r 's/(..)/0x\1 /g'`)
+    echo -n "$NEXT_IP"
+}
+
+# $1 - count
+# $2 - netcount
+ip6_add()
+{
+    IP6_PREFIX=${TEST_IP6_NETWORK%*::*}
+    IP6_HOST=${TEST_IP6_NETWORK#*::*}
+    echo -n "$(printf "%x" $((0x$IP6_PREFIX+${2:-0})))::$(($IP6_HOST+$1))"
+}
+
+ip_host()
+{
+
+    ip_add 1 $1
+}
+
+ip_lkl()
+{
+    ip_add 2 $1
+}
+
+ip_host_mask()
+{
+    echo -n "$(ip_host $1)/$TEST_IP_NETMASK"
+}
+
+ip_net_mask()
+{
+    echo "$(ip_add 0 $1)/$TEST_IP_NETMASK"
+}
+
+ip6_host()
+{
+    ip6_add 1 $1
+}
+
+ip6_lkl()
+{
+    ip6_add 2 $1
+}
+
+ip6_host_mask()
+{
+    echo -n "$(ip6_host $1)/$TEST_IP6_NETMASK"
+}
+
+ip6_net_mask()
+{
+    echo "$(ip6_add 0 $1)/$TEST_IP6_NETMASK"
+}
+
+tap_ifname()
+{
+    echo -n "$TEST_TAP_IFNAME${1:-0}"
+}
+
+tap_prepare()
+{
+    if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+        if ! lkl_test_cmd test -d /dev/net &>/dev/null; then
+            lkl_test_cmd sudo mkdir /dev/net
+            lkl_test_cmd sudo ln -s /dev/tun /dev/net/tun
+        fi
+        TAP_USER="vpn"
+        ANDROID_USER="vpn,vpn,net_admin,inet"
+        export_vars ANDROID_USER
+    else
+        TAP_USER=$USER
+    fi
+}
+
+tap_setup()
+{
+    if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
+        lkl_test_cmd sudo ifconfig tap create
+        lkl_test_cmd sudo sysctl net.link.tap.up_on_open=1
+        lkl_test_cmd sudo sysctl net.link.tap.user_open=1
+        lkl_test_cmd sudo ifconfig $(tap_ifname) $(ip_host)
+        lkl_test_cmd sudo ifconfig $(tap_ifname) inet6 $(ip6_host)
+        return
+    fi
+
+    lkl_test_cmd sudo ip tuntap add dev $(tap_ifname $1) mode tap user $TAP_USER
+    lkl_test_cmd sudo ip link set dev $(tap_ifname $1) up
+    lkl_test_cmd sudo ip addr add dev $(tap_ifname $1) $(ip_host_mask $1)
+    lkl_test_cmd sudo ip -6 addr add dev $(tap_ifname $1) $(ip6_host_mask $1)
+
+    if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+        lkl_test_cmd sudo ip route add $(ip_net_mask $1) \
+                     dev $(tap_ifname $1) proto kernel scope link \
+                     src $(ip_host $1) table local
+        lkl_test_cmd sudo ip -6 route add $(ip6_net_mask $1) \
+                     dev $(tap_ifname $1) table local
+    fi
+}
+
+tap_cleanup()
+{
+    if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
+        lkl_test_cmd sudo ifconfig $(tap_ifname) destroy
+        return
+    fi
+
+    lkl_test_cmd sudo ip link set dev $(tap_ifname $1) down
+    lkl_test_cmd sudo ip tuntap del dev $(tap_ifname $1) mode tap
+}
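
Note on ip_add() in net-setup.sh above: the address it prints is the
network base plus `count`, plus `netcount` times the network size
2^(32 - netmask). The following C sketch restates that arithmetic under
the script's defaults (192.168.113.0/24). test_ip_add() and test_ip_str()
are hypothetical names that only mirror the shell variables; they are not
part of the test suite.

```c
#include <stdint.h>
#include <stdio.h>

#define TEST_IP_NETWORK	0xC0A87100u	/* 192.168.113.0 */
#define TEST_IP_NETMASK	24

/* base + count + netcount * 2^(32 - mask), as a host-order 32-bit value */
static uint32_t test_ip_add(uint32_t count, uint32_t netcount)
{
	uint32_t net_size = UINT32_C(1) << (32 - TEST_IP_NETMASK);

	return TEST_IP_NETWORK + count + netcount * net_size;
}

/* Format a host-order address as a dotted quad into buf (>= 16 bytes). */
static const char *test_ip_str(uint32_t ip, char *buf, size_t len)
{
	snprintf(buf, len, "%u.%u.%u.%u",
		 (unsigned int)(ip >> 24) & 0xff,
		 (unsigned int)(ip >> 16) & 0xff,
		 (unsigned int)(ip >> 8) & 0xff,
		 (unsigned int)ip & 0xff);
	return buf;
}
```

With these defaults, test_ip_add(1, 0) corresponds to ip_host
(192.168.113.1), test_ip_add(2, 0) to ip_lkl (192.168.113.2), and a
netcount of 1 moves both into 192.168.114.0/24.
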
diff --git a/tools/lkl/tests/net-test.c b/tools/lkl/tests/net-test.c
new file mode 100644
index 000000000000..d2fd19f1b995
--- /dev/null
+++ b/tools/lkl/tests/net-test.c
@@ -0,0 +1,317 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <errno.h>
+#ifdef __FreeBSD__
+#include <sys/types.h>
+#endif
+#ifdef __MINGW32__
+#include <winsock2.h>
+#else
+#include <sys/socket.h>
+#include <netinet/in.h>
+#include <arpa/inet.h>
+#endif
+
+#include <lkl.h>
+#include <lkl_host.h>
+
+#include "cla.h"
+#include "test.h"
+
+enum {
+	BACKEND_TAP,
+	BACKEND_MACVTAP,
+	BACKEND_RAW,
+	BACKEND_DPDK,
+	BACKEND_PIPE,
+	BACKEND_NONE,
+};
+
+const char *backends[] = { "tap", "macvtap", "raw", "dpdk", "pipe", "loopback",
+			   NULL };
+static struct {
+	int backend;
+	const char *ifname;
+	int dhcp, nmlen;
+	unsigned int ip, dst, gateway, sleep;
+} cla = {
+	.backend = BACKEND_NONE,
+	.ip = INADDR_NONE,
+	.gateway = INADDR_NONE,
+	.dst = INADDR_NONE,
+	.sleep = 0,
+};
+
+
+struct cl_arg args[] = {
+	{"backend", 'b', "network backend type", 1, CL_ARG_STR_SET,
+	 &cla.backend, backends},
+	{"ifname", 'i', "interface name", 1, CL_ARG_STR, &cla.ifname},
+	{"dhcp", 'd', "use dhcp to configure LKL", 0, CL_ARG_BOOL, &cla.dhcp},
+	{"ip", 'I', "IPv4 address to use", 1, CL_ARG_IPV4, &cla.ip},
+	{"netmask-len", 'n', "IPv4 netmask length", 1, CL_ARG_INT,
+	 &cla.nmlen},
+	{"gateway", 'g', "IPv4 gateway to use", 1, CL_ARG_IPV4, &cla.gateway},
+	{"dst", 'D', "IPv4 destination address", 1, CL_ARG_IPV4, &cla.dst},
+	{"sleep", 's', "sleep", 1, CL_ARG_INT, &cla.sleep},
+	{0},
+};
+
+u_short
+in_cksum(const u_short *addr, register int len, u_short csum)
+{
+	int nleft = len;
+	const u_short *w = addr;
+	u_short answer;
+	int sum = csum;
+
+	while (nleft > 1)  {
+		sum += *w++;
+		nleft -= 2;
+	}
+
+	if (nleft == 1)
+		sum += htons(*(u_char *)w << 8);
+
+	sum = (sum >> 16) + (sum & 0xffff);
+	sum += (sum >> 16);
+	answer = ~sum;
+	return answer;
+}
+
+static int lkl_test_sleep(void)
+{
+	struct lkl_timespec ts = {
+		.tv_sec = cla.sleep,
+	};
+	int ret;
+
+	ret = lkl_sys_nanosleep((struct __lkl__kernel_timespec *)&ts, NULL);
+	if (ret < 0) {
+		lkl_test_logf("nanosleep error: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static int lkl_test_icmp(void)
+{
+	int sock, ret;
+	struct lkl_iphdr *iph;
+	struct lkl_icmphdr *icmp;
+	struct lkl_sockaddr_in saddr;
+	struct lkl_pollfd pfd;
+	char buf[32];
+
+	if (cla.dst == INADDR_NONE)
+		return TEST_SKIP;
+
+	memset(&saddr, 0, sizeof(saddr));
+	saddr.sin_family = AF_INET;
+	saddr.sin_addr.lkl_s_addr = cla.dst;
+
+	lkl_test_logf("pinging %s\n",
+		      inet_ntoa(*(struct in_addr *)&saddr.sin_addr));
+
+	sock = lkl_sys_socket(LKL_AF_INET, LKL_SOCK_RAW, LKL_IPPROTO_ICMP);
+	if (sock < 0) {
+		lkl_test_logf("socket error (%s)\n", lkl_strerror(sock));
+		return TEST_FAILURE;
+	}
+
+	icmp = malloc(sizeof(*icmp));
+	icmp->type = LKL_ICMP_ECHO;
+	icmp->code = 0;
+	icmp->checksum = 0;
+	icmp->un.echo.sequence = 0;
+	icmp->un.echo.id = 0;
+	icmp->checksum = in_cksum((u_short *)icmp, sizeof(*icmp), 0);
+
+	ret = lkl_sys_sendto(sock, icmp, sizeof(*icmp), 0,
+			     (struct lkl_sockaddr *)&saddr,
+			     sizeof(saddr));
+	if (ret < 0) {
+		lkl_test_logf("sendto error (%s)\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	free(icmp);
+
+	pfd.fd = sock;
+	pfd.events = LKL_POLLIN;
+	pfd.revents = 0;
+
+	ret = lkl_sys_poll(&pfd, 1, 1000);
+	if (ret < 0) {
+		lkl_test_logf("poll error (%s)\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_recv(sock, buf, sizeof(buf), LKL_MSG_DONTWAIT);
+	if (ret < 0) {
+		lkl_test_logf("recv error (%s)\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	iph = (struct lkl_iphdr *)buf;
+	icmp = (struct lkl_icmphdr *)(buf + iph->ihl * 4);
+	/* DHCP server may issue an ICMP echo request to a dhcp client */
+	if ((icmp->type != LKL_ICMP_ECHOREPLY || icmp->code != 0) &&
+	    (icmp->type != LKL_ICMP_ECHO)) {
+		lkl_test_logf("no ICMP echo reply (type=%d, code=%d)\n",
+			      icmp->type, icmp->code);
+		return TEST_FAILURE;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static struct lkl_netdev *nd;
+
+static int lkl_test_nd_create(void)
+{
+	switch (cla.backend) {
+	case BACKEND_NONE:
+		return TEST_SKIP;
+	case BACKEND_TAP:
+		nd = lkl_netdev_tap_create(cla.ifname, 0);
+		break;
+	case BACKEND_DPDK:
+		nd = lkl_netdev_dpdk_create(cla.ifname, 0, NULL);
+		break;
+	case BACKEND_RAW:
+		nd = lkl_netdev_raw_create(cla.ifname);
+		break;
+	case BACKEND_MACVTAP:
+		nd = lkl_netdev_macvtap_create(cla.ifname, 0);
+		break;
+	case BACKEND_PIPE:
+		nd = lkl_netdev_pipe_create(cla.ifname, 0);
+		break;
+	}
+
+	if (!nd) {
+		lkl_test_logf("failed to create netdev\n");
+		return TEST_BAILOUT;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static int nd_id;
+
+static int lkl_test_nd_add(void)
+{
+	if (cla.backend == BACKEND_NONE)
+		return TEST_SKIP;
+
+	nd_id = lkl_netdev_add(nd, NULL);
+	if (nd_id < 0) {
+		lkl_test_logf("failed to add netdev: %s\n",
+			      lkl_strerror(nd_id));
+		return TEST_BAILOUT;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static int lkl_test_nd_remove(void)
+{
+	if (cla.backend == BACKEND_NONE)
+		return TEST_SKIP;
+
+	lkl_netdev_remove(nd_id);
+	lkl_netdev_free(nd);
+	return TEST_SUCCESS;
+}
+
+LKL_TEST_CALL(start_kernel, lkl_start_kernel, 0, &lkl_host_ops,
+	"mem=16M loglevel=8 %s", cla.dhcp ? "ip=dhcp" : "");
+LKL_TEST_CALL(stop_kernel, lkl_sys_halt, 0);
+
+static int nd_ifindex;
+
+static int lkl_test_nd_ifindex(void)
+{
+	if (cla.backend == BACKEND_NONE)
+		return TEST_SKIP;
+
+	nd_ifindex = lkl_netdev_get_ifindex(nd_id);
+	if (nd_ifindex < 0) {
+		lkl_test_logf("failed to get ifindex for netdev id %d: %s\n",
+			      nd_id, lkl_strerror(nd_ifindex));
+		return TEST_BAILOUT;
+	}
+
+	return TEST_SUCCESS;
+}
+
+LKL_TEST_CALL(if_up, lkl_if_up, 0,
+	      cla.backend == BACKEND_NONE ? 1 : nd_ifindex);
+
+static int lkl_test_set_ipv4(void)
+{
+	int ret;
+
+	if (cla.backend == BACKEND_NONE || cla.ip == LKL_INADDR_NONE)
+		return TEST_SKIP;
+
+	ret = lkl_if_set_ipv4(nd_ifindex, cla.ip, cla.nmlen);
+	if (ret < 0) {
+		lkl_test_logf("failed to set IPv4 address: %s\n",
+			      lkl_strerror(ret));
+		return TEST_BAILOUT;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static int lkl_test_set_gateway(void)
+{
+	int ret;
+
+	if (cla.backend == BACKEND_NONE || cla.gateway == LKL_INADDR_NONE)
+		return TEST_SKIP;
+
+	ret = lkl_set_ipv4_gateway(cla.gateway);
+	if (ret < 0) {
+		lkl_test_logf("failed to set IPv4 gateway: %s\n",
+			      lkl_strerror(ret));
+		return TEST_BAILOUT;
+	}
+
+	return TEST_SUCCESS;
+}
+
+struct lkl_test tests[] = {
+	LKL_TEST(nd_create),
+	LKL_TEST(nd_add),
+	LKL_TEST(start_kernel),
+	LKL_TEST(nd_ifindex),
+	LKL_TEST(if_up),
+	LKL_TEST(set_ipv4),
+	LKL_TEST(set_gateway),
+	LKL_TEST(sleep),
+	LKL_TEST(icmp),
+	LKL_TEST(nd_remove),
+	LKL_TEST(stop_kernel),
+};
+
+int main(int argc, const char **argv)
+{
+	if (parse_args(argc, argv, args) < 0)
+		return -1;
+
+	if (cla.ip != LKL_INADDR_NONE && (cla.nmlen < 0 || cla.nmlen > 32)) {
+		fprintf(stderr, "invalid netmask length %d\n", cla.nmlen);
+		return -1;
+	}
+
+	lkl_host_ops.print = lkl_test_log;
+
+	return lkl_test_run(tests, sizeof(tests)/sizeof(struct lkl_test),
+			    "net %s", backends[cla.backend]);
+}
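
Note on in_cksum() in net-test.c above: it is the one's-complement
Internet checksum described in RFC 1071. The following is a byte-oriented
sketch of the same algorithm that does not depend on host byte order or
on the alignment of the buffer. inet_cksum() is a hypothetical name, and
the value it returns is in host order, so a caller would still need
htons() before storing it in a header.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * One's-complement sum over 16-bit big-endian words, complemented.
 * Intended for short buffers such as packet headers: the 32-bit
 * accumulator is only folded once at the end.
 */
static uint16_t inet_cksum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	/* sum complete 16-bit words, high byte first */
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];

	/* an odd trailing byte is padded with a zero low byte */
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;

	/* fold carries back into the low 16 bits */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

A receiver can verify a packet by running the same function over the
data with the checksum field included; a result of zero means the
checksum matches.
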
diff --git a/tools/lkl/tests/net.sh b/tools/lkl/tests/net.sh
new file mode 100755
index 000000000000..cd8de53fe0fd
--- /dev/null
+++ b/tools/lkl/tests/net.sh
@@ -0,0 +1,186 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+
+source $script_dir/test.sh
+source $script_dir/net-setup.sh
+
+cleanup_backend()
+{
+    set -e
+
+    case "$1" in
+    "tap")
+        tap_cleanup
+        ;;
+    "pipe")
+        rm -rf $work_dir
+        ;;
+    "raw")
+        ;;
+    "macvtap")
+        sudo ip link del dev $(tap_ifname) type macvtap
+        ;;
+    "loopback")
+        ;;
+    esac
+}
+
+get_test_ip()
+{
+    # DHCP test parameters
+    TEST_HOST=8.8.8.8
+    HOST_IF=$(lkl_test_cmd ip route get $TEST_HOST | head -n1 |cut -d ' ' -f5)
+    HOST_GW=$(lkl_test_cmd ip route get $TEST_HOST | head -n1 | cut -d ' ' -f3)
+    if lkl_test_cmd ping -c1 -w1 $HOST_GW; then
+        TEST_IP_REMOTE=$HOST_GW
+    elif lkl_test_cmd ping -c1 -w1 $TEST_HOST; then
+        TEST_IP_REMOTE=$TEST_HOST
+    else
+        echo "could not find remote test ip"
+        return $TEST_SKIP
+    fi
+
+    export_vars HOST_IF TEST_IP_REMOTE
+}
+
+setup_backend()
+{
+    set -e
+
+    if [ "$LKL_HOST_CONFIG_POSIX" != "y" ] &&
+       [ "$1" != "loopback" ]; then
+        echo "not a posix environment"
+        return $TEST_SKIP
+    fi
+
+    case "$1" in
+    "loopback")
+        ;;
+    "pipe")
+        if [ -z $(lkl_test_cmd which mkfifo) ]; then
+            echo "no mkfifo command"
+            return $TEST_SKIP
+        else
+            work_dir=$(lkl_test_cmd mktemp -d)
+        fi
+        fifo1=$work_dir/fifo1
+        fifo2=$work_dir/fifo2
+        lkl_test_cmd mkfifo $fifo1
+        lkl_test_cmd mkfifo $fifo2
+        export_vars work_dir fifo1 fifo2
+        ;;
+    "tap")
+        tap_prepare
+        if ! lkl_test_cmd test -c /dev/net/tun; then
+            if [ -z "$LKL_HOST_CONFIG_BSD" ]; then
+                echo "missing /dev/net/tun"
+                return $TEST_SKIP
+            fi
+        fi
+        tap_setup
+        ;;
+    "raw")
+        if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
+            return $TEST_SKIP
+        fi
+        get_test_ip
+        ;;
+    "macvtap")
+        get_test_ip
+        if ! lkl_test_cmd sudo ip link add link $HOST_IF \
+             name $(tap_ifname) type macvtap mode passthru; then
+            echo "failed to create macvtap, skipping"
+            return $TEST_SKIP
+        fi
+        MACVTAP=/dev/tap$(lkl_test_cmd ip link show dev $(tap_ifname) | \
+                                 grep -o ^[0-9]*)
+        lkl_test_cmd sudo ip link set dev $(tap_ifname) up
+        lkl_test_cmd sudo chown $USER $MACVTAP
+        export_vars MACVTAP
+        ;;
+    "dpdk")
+        if [ -z "$LKL_TEST_NET_DPDK" ]; then
+            echo "DPDK needs user setup"
+            return $TEST_SKIP
+        fi
+        ;;
+    *)
+        echo "don't know how to setup backend $1"
+        return $TEST_FAILED
+        ;;
+    esac
+}
+
+run_tests()
+{
+    case "$1" in
+    "loopback")
+        lkl_test_exec $script_dir/net-test --dst 127.0.0.1
+        ;;
+    "pipe")
+        VALGRIND="" lkl_test_exec $script_dir/net-test --backend pipe \
+                      --ifname "$fifo1|$fifo2" \
+                      --ip $(ip_host) --netmask-len $TEST_IP_NETMASK \
+                      --sleep 1800 >/dev/null &
+        cp $script_dir/net-test $script_dir/net-test2
+
+        sleep 10
+        lkl_test_exec $script_dir/net-test2 --backend pipe \
+                      --ifname "$fifo2|$fifo1" \
+                      --ip $(ip_lkl) --netmask-len $TEST_IP_NETMASK \
+                      --dst $(ip_host)
+        rm -f $script_dir/net-test2
+        kill $!
+        wait $! 2>/dev/null
+        ;;
+    "tap")
+        lkl_test_exec $script_dir/net-test --backend tap \
+                      --ifname $(tap_ifname) \
+                      --ip $(ip_lkl) --netmask-len $TEST_IP_NETMASK \
+                      --dst $(ip_host)
+        ;;
+    "raw")
+        lkl_test_exec sudo $script_dir/net-test --backend raw \
+                      --ifname $HOST_IF --dhcp --dst $TEST_IP_REMOTE
+        ;;
+    "macvtap")
+        lkl_test_exec $script_dir/net-test --backend macvtap \
+                      --ifname $MACVTAP \
+                      --dhcp --dst $TEST_IP_REMOTE
+        ;;
+    "dpdk")
+        lkl_test_exec sudo $script_dir/net-test --backend dpdk \
+                      --ifname dpdk0 \
+                      --ip $(ip_lkl) --netmask-len $TEST_IP_NETMASK \
+                      --dst $(ip_host)
+        ;;
+    esac
+}
+
+if [ "$1" = "-b" ]; then
+    shift
+    backend=$1
+    shift
+fi
+
+if [ -z "$backend" ]; then
+    backend="loopback"
+fi
+
+lkl_test_plan 1 "net $backend"
+lkl_test_run 1 setup_backend $backend
+
+if [ $? = $TEST_SKIP ]; then
+    exit 0
+fi
+
+trap "cleanup_backend $backend" EXIT
+
+run_tests $backend
+
+trap : EXIT
+lkl_test_plan 1 "net $backend"
+lkl_test_run 1 cleanup_backend $backend
+
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 25/47] lkl: add support for Windows hosts
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (23 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 24/47] lkl tools: virtio: add network device support Hajime Tazaki
@ 2019-10-23  4:37 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 26/47] lkl tools: add support for Windows host Hajime Tazaki
                   ` (24 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:37 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Jens Staal, Hajime Tazaki, Akira Moroo

From: Octavian Purdila <tavi.purdila@gmail.com>

This patch allows LKL to be compiled for Windows hosts with the mingw
toolchain. Note that patches [1] that fix weak symbols linking are
required to successfully compile LKL with mingw.

The patch disables the modpost pass over vmlinux since modpost only
works with ELF objects.

It also adds a workaround for an #include_next <stdarg.h> error which
is apparently caused by using -nostdinc.

[1] https://sourceware.org/ml/binutils/2015-10/msg00234.html

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Jens Staal <staal1978@gmail.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
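Note on the __printf change in include/linux/compiler_attributes.h: per
the GCC documentation, on MinGW targets the plain printf archetype checks
format strings against the system C runtime's conventions, while
gnu_printf always checks against the GNU C library ones. A minimal
user-space sketch of the same selection, assuming GCC; the names
LOG_PRINTF and log_fmt are made up for illustration and are not part of
this patch:

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical macro mirroring the __printf() selection in the patch. */
#ifdef __MINGW32__
#define LOG_PRINTF(a, b) __attribute__((__format__(gnu_printf, a, b)))
#else
#define LOG_PRINTF(a, b) __attribute__((__format__(printf, a, b)))
#endif

/* Hypothetical helper: formats into buf and returns vsnprintf()'s result. */
static LOG_PRINTF(3, 4) int log_fmt(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(buf && len > 0);
	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return n;
}
```

With the attribute in place and -Wformat enabled, a mismatched call such
as log_fmt(buf, sizeof(buf), "%s", 42) should be flagged at compile time.
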
 arch/um/lkl/include/system/stdarg.h | 2 ++
 include/linux/compiler_attributes.h | 4 ++++
 lib/.gitignore                      | 2 ++
 lib/raid6/.gitignore                | 1 +
 scripts/.gitignore                  | 2 ++
 scripts/basic/.gitignore            | 1 +
 scripts/kconfig/.gitignore          | 1 +
 scripts/link-vmlinux.sh             | 2 ++
 scripts/mod/.gitignore              | 1 +
 9 files changed, 16 insertions(+)
 create mode 100644 arch/um/lkl/include/system/stdarg.h

diff --git a/arch/um/lkl/include/system/stdarg.h b/arch/um/lkl/include/system/stdarg.h
new file mode 100644
index 000000000000..12077a36828c
--- /dev/null
+++ b/arch/um/lkl/include/system/stdarg.h
@@ -0,0 +1,2 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* empty file to avoid #include_next<stdarg.h> error */
diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h
index 6b318efd8a74..1981b1c323c1 100644
--- a/include/linux/compiler_attributes.h
+++ b/include/linux/compiler_attributes.h
@@ -154,7 +154,11 @@
  *   gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-format-function-attribute
  * clang: https://clang.llvm.org/docs/AttributeReference.html#format
  */
+#ifdef __MINGW32__
+#define __printf(a, b)             __attribute__((__format__(gnu_printf, a, b)))
+#else
 #define __printf(a, b)                  __attribute__((__format__(printf, a, b)))
+#endif
 #define __scanf(a, b)                   __attribute__((__format__(scanf, a, b)))
 
 /*
diff --git a/lib/.gitignore b/lib/.gitignore
index f2a39c9e5485..eb9f11b81fe1 100644
--- a/lib/.gitignore
+++ b/lib/.gitignore
@@ -2,7 +2,9 @@
 # Generated files
 #
 gen_crc32table
+gen_crc32table.exe
 gen_crc64table
+gen_crc64table.exe
 crc32table.h
 crc64table.h
 oid_registry_data.c
diff --git a/lib/raid6/.gitignore b/lib/raid6/.gitignore
index 3de0d8921286..80e3566535aa 100644
--- a/lib/raid6/.gitignore
+++ b/lib/raid6/.gitignore
@@ -1,4 +1,5 @@
 mktables
+mktables.exe
 altivec*.c
 int*.c
 tables.c
diff --git a/scripts/.gitignore b/scripts/.gitignore
index 17f8cef88fa8..ec9138a39b25 100644
--- a/scripts/.gitignore
+++ b/scripts/.gitignore
@@ -4,8 +4,10 @@
 bin2c
 conmakehash
 kallsyms
+kallsyms.exe
 pnmtologo
 unifdef
+unifdef.exe
 recordmcount
 sortextable
 asn1_compiler
diff --git a/scripts/basic/.gitignore b/scripts/basic/.gitignore
index a776371a3502..77ce153243fa 100644
--- a/scripts/basic/.gitignore
+++ b/scripts/basic/.gitignore
@@ -1 +1,2 @@
 fixdep
+fixdep.exe
diff --git a/scripts/kconfig/.gitignore b/scripts/kconfig/.gitignore
index b5bf92f66d11..aa27000d896f 100644
--- a/scripts/kconfig/.gitignore
+++ b/scripts/kconfig/.gitignore
@@ -8,6 +8,7 @@
 # configuration programs
 #
 conf
+conf.exe
 mconf
 nconf
 qconf
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 915775eb2921..27d2066238c7 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -205,6 +205,7 @@ fi;
 # final build of init/
 ${MAKE} -f "${srctree}/scripts/Makefile.build" obj=init
 
+if [ -e scripts/mod/modpost ]; then
 #link vmlinux.o
 info LD vmlinux.o
 modpost_link vmlinux.o
@@ -214,6 +215,7 @@ ${MAKE} -f "${srctree}/scripts/Makefile.modpost" MODPOST_VMLINUX=1
 
 info MODINFO modules.builtin.modinfo
 ${OBJCOPY} -j .modinfo -O binary vmlinux.o modules.builtin.modinfo
+fi
 
 kallsymso=""
 kallsyms_vmlinux=""
diff --git a/scripts/mod/.gitignore b/scripts/mod/.gitignore
index 3bd11b603173..cd67845e326d 100644
--- a/scripts/mod/.gitignore
+++ b/scripts/mod/.gitignore
@@ -1,4 +1,5 @@
 elfconfig.h
 mk_elfconfig
 modpost
+modpost.exe
 devicetable-offsets.h
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 26/47] lkl tools: add support for Windows host
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (24 preceding siblings ...)
  2019-10-23  4:37 ` [RFC PATCH 25/47] lkl: add support for Windows hosts Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 27/47] lkl: Android ARM (arm/arm64) support Hajime Tazaki
                   ` (23 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Andreas Gnau, Hajime Tazaki, Patrick Collins,
	Akira Moroo, Yuan Liu

From: Octavian Purdila <tavi.purdila@gmail.com>

Add host operations for Windows host and virtio disk support.

Trivial changes to the generic virtio host code are made since the mingw
%p format is different from what the MMIO virtio driver expects.

The boot test is updated to support Windows hosts as well.

Signed-off-by: Andreas Gnau <andreas.gnau@intel.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
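On the %p remark in the commit message: the output of %p is
implementation-defined in C, so a string that another component parses
back cannot rely on it across C runtimes. glibc typically prints a 0x
prefix with lowercase hex digits, whereas the Microsoft runtime is
understood to print zero-padded uppercase digits without a prefix. A
minimal sketch of a runtime-independent alternative, assuming C99
<inttypes.h>; format_addr is a hypothetical name and not a function in
this patch:

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: prints an address as 0x-prefixed lowercase hex. */
static int format_addr(char *buf, size_t len, const void *p)
{
	assert(buf && len > 0);
	/* Going through uintptr_t avoids the implementation-defined %p. */
	return snprintf(buf, len, "0x%" PRIxPTR, (uintptr_t)p);
}
```
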
 tools/lkl/include/mingw32/sys/socket.h |   4 +
 tools/lkl/lib/nt-host.c                | 375 +++++++++++++++++++++++++
 2 files changed, 379 insertions(+)
 create mode 100644 tools/lkl/include/mingw32/sys/socket.h
 create mode 100644 tools/lkl/lib/nt-host.c
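
For reference, the time host operation in nt-host.c converts a FILETIME
value (100 ns ticks since 1601-01-01) into nanoseconds since the Unix
epoch by subtracting the 1601 to 1970 offset and multiplying by 100. The
same arithmetic as a portable sketch that builds without <windows.h>;
filetime_to_unix_ns and TICKS_1601_TO_1970 are illustrative names, not
part of this patch:

```c
#include <assert.h>
#include <stdint.h>

/* Seconds between 1601-01-01 and 1970-01-01, expressed in 100 ns ticks. */
#define TICKS_1601_TO_1970 (UINT64_C(11644473600) * UINT64_C(10000000))

/* Hypothetical helper: FILETIME ticks (100 ns since 1601) -> ns since 1970. */
static uint64_t filetime_to_unix_ns(uint64_t ticks)
{
	/* Dates before 1970 are not handled in this sketch. */
	assert(ticks >= TICKS_1601_TO_1970);
	return (ticks - TICKS_1601_TO_1970) * 100;
}
```

Counted from 1970 in unsigned 64-bit nanoseconds, the result wraps after
roughly 584 years.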

diff --git a/tools/lkl/include/mingw32/sys/socket.h b/tools/lkl/include/mingw32/sys/socket.h
new file mode 100644
index 000000000000..f9ede3170d03
--- /dev/null
+++ b/tools/lkl/include/mingw32/sys/socket.h
@@ -0,0 +1,4 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* fake file to avoid #include <sys/socket.h> error on non-posix
+ * host (e.g., mingw32)
+ */
diff --git a/tools/lkl/lib/nt-host.c b/tools/lkl/lib/nt-host.c
new file mode 100644
index 000000000000..c7613272be3b
--- /dev/null
+++ b/tools/lkl/lib/nt-host.c
@@ -0,0 +1,375 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <windows.h>
+#include <assert.h>
+#include <unistd.h>
+#undef s_addr
+#include <lkl_host.h>
+#include "iomem.h"
+#include "jmp_buf.h"
+
+#define DIFF_1601_TO_1970_IN_100NS (11644473600L * 10000000L)
+
+struct lkl_mutex {
+	int recursive;
+	HANDLE handle;
+};
+
+struct lkl_sem {
+	HANDLE sem;
+};
+
+struct lkl_tls_key {
+	DWORD key;
+};
+
+static struct lkl_sem *sem_alloc(int count)
+{
+	struct lkl_sem *sem = malloc(sizeof(struct lkl_sem));
+
+	sem->sem = CreateSemaphore(NULL, count, 100, NULL);
+	return sem;
+}
+
+static void sem_up(struct lkl_sem *sem)
+{
+	ReleaseSemaphore(sem->sem, 1, NULL);
+}
+
+static void sem_down(struct lkl_sem *sem)
+{
+	WaitForSingleObject(sem->sem, INFINITE);
+}
+
+static void sem_free(struct lkl_sem *sem)
+{
+	CloseHandle(sem->sem);
+	free(sem);
+}
+
+static struct lkl_mutex *mutex_alloc(int recursive)
+{
+	struct lkl_mutex *_mutex = malloc(sizeof(struct lkl_mutex));
+
+	if (!_mutex)
+		return NULL;
+
+	if (recursive)
+		_mutex->handle = CreateMutex(0, FALSE, 0);
+	else
+		_mutex->handle = CreateSemaphore(NULL, 1, 100, NULL);
+	_mutex->recursive = recursive;
+	return _mutex;
+}
+
+static void mutex_lock(struct lkl_mutex *mutex)
+{
+	WaitForSingleObject(mutex->handle, INFINITE);
+}
+
+static void mutex_unlock(struct lkl_mutex *_mutex)
+{
+	if (_mutex->recursive)
+		ReleaseMutex(_mutex->handle);
+	else
+		ReleaseSemaphore(_mutex->handle, 1, NULL);
+}
+
+static void mutex_free(struct lkl_mutex *_mutex)
+{
+	CloseHandle(_mutex->handle);
+	free(_mutex);
+}
+
+static lkl_thread_t thread_create(void (*fn)(void *), void *arg)
+{
+	DWORD WINAPI(*win_fn)(LPVOID arg) = (DWORD WINAPI(*)(LPVOID))fn;
+	HANDLE h = CreateThread(NULL, 0, win_fn, arg, 0, NULL);
+
+	if (!h)
+		return 0;
+
+	return GetThreadId(h);
+}
+
+static void thread_detach(void)
+{
+}
+
+static void thread_exit(void)
+{
+	ExitThread(0);
+}
+
+static int thread_join(lkl_thread_t tid)
+{
+	int ret;
+	HANDLE h;
+
+	h = OpenThread(SYNCHRONIZE, FALSE, tid);
+	if (!h)
+		lkl_printf("%s: can't get thread handle\n", __func__);
+
+	ret = WaitForSingleObject(h, INFINITE);
+	if (ret)
+		lkl_printf("%s: %d\n", __func__, ret);
+
+	CloseHandle(h);
+
+	return ret ? -1 : 0;
+}
+
+static lkl_thread_t thread_self(void)
+{
+	return GetThreadId(GetCurrentThread());
+}
+
+static int thread_equal(lkl_thread_t a, lkl_thread_t b)
+{
+	return a == b;
+}
+
+static struct lkl_tls_key *tls_alloc(void (*destructor)(void *))
+{
+	struct lkl_tls_key *ret = malloc(sizeof(struct lkl_tls_key));
+
+	ret->key = FlsAlloc((PFLS_CALLBACK_FUNCTION)destructor);
+	if (ret->key == FLS_OUT_OF_INDEXES) {
+		free(ret);
+		return NULL;
+	}
+	return ret;
+}
+
+static void tls_free(struct lkl_tls_key *key)
+{
+	/* setting to NULL first to prevent the callback from being called */
+	FlsSetValue(key->key, NULL);
+	FlsFree(key->key);
+	free(key);
+}
+
+static int tls_set(struct lkl_tls_key *key, void *data)
+{
+	return FlsSetValue(key->key, data) ? 0 : -1;
+}
+
+static void *tls_get(struct lkl_tls_key *key)
+{
+	return FlsGetValue(key->key);
+}
+
+
+/*
+ * With 64 bits, we can cover about 583 years at a nanosecond resolution.
+ * Windows counts time from 1601 so we do have about 100 years before we
+ * overflow.
+ */
+static unsigned long long time_ns(void)
+{
+	SYSTEMTIME st;
+	FILETIME ft;
+	ULARGE_INTEGER uli;
+
+	GetSystemTime(&st);
+	SystemTimeToFileTime(&st, &ft);
+	uli.LowPart = ft.dwLowDateTime;
+	uli.HighPart = ft.dwHighDateTime;
+
+	return (uli.QuadPart - DIFF_1601_TO_1970_IN_100NS) * 100;
+}
+
+struct timer {
+	HANDLE queue;
+	void (*callback)(void *arg);
+	void *arg;
+};
+
+static void *timer_alloc(void (*fn)(void *), void *arg)
+{
+	struct timer *t;
+
+	t = malloc(sizeof(*t));
+	if (!t)
+		return NULL;
+
+	t->queue = CreateTimerQueue();
+	if (!t->queue) {
+		free(t);
+		return NULL;
+	}
+
+	t->callback = fn;
+	t->arg = arg;
+
+	return t;
+}
+
+static void CALLBACK timer_callback(void *arg, BOOLEAN TimerOrWaitFired)
+{
+	struct timer *t = (struct timer *)arg;
+
+	if (TimerOrWaitFired)
+		t->callback(t->arg);
+}
+
+static int timer_set_oneshot(void *timer, unsigned long ns)
+{
+	struct timer *t = (struct timer *)timer;
+	HANDLE tmp;
+
+	return !CreateTimerQueueTimer(&tmp, t->queue, timer_callback, t,
+				      ns / 1000000, 0, 0);
+}
+
+static void timer_free(void *timer)
+{
+	struct timer *t = (struct timer *)timer;
+	HANDLE completion;
+
+	completion = CreateEvent(NULL, FALSE, FALSE, NULL);
+	DeleteTimerQueueEx(t->queue, completion);
+	WaitForSingleObject(completion, INFINITE);
+	free(t);
+}
+
+static void panic(void)
+{
+	int *x = NULL;
+
+	*x = 1;
+	assert(0);
+}
+
+static void print(const char *str, int len)
+{
+	write(1, str, len);
+}
+
+static long gettid(void)
+{
+	return GetCurrentThreadId();
+}
+
+static void *mem_alloc(unsigned long size)
+{
+	return malloc(size);
+}
+
+struct lkl_host_operations lkl_host_ops = {
+	.panic = panic,
+	.thread_create = thread_create,
+	.thread_detach = thread_detach,
+	.thread_exit = thread_exit,
+	.thread_join = thread_join,
+	.thread_self = thread_self,
+	.thread_equal = thread_equal,
+	.sem_alloc = sem_alloc,
+	.sem_free = sem_free,
+	.sem_up = sem_up,
+	.sem_down = sem_down,
+	.mutex_alloc = mutex_alloc,
+	.mutex_free = mutex_free,
+	.mutex_lock = mutex_lock,
+	.mutex_unlock = mutex_unlock,
+	.tls_alloc = tls_alloc,
+	.tls_free = tls_free,
+	.tls_set = tls_set,
+	.tls_get = tls_get,
+	.time = time_ns,
+	.timer_alloc = timer_alloc,
+	.timer_set_oneshot = timer_set_oneshot,
+	.timer_free = timer_free,
+	.print = print,
+	.mem_alloc = mem_alloc,
+	.mem_free = free,
+	.ioremap = lkl_ioremap,
+	.iomem_access = lkl_iomem_access,
+	.virtio_devices = lkl_virtio_devs,
+	.gettid = gettid,
+	.jmp_buf_set = jmp_buf_set,
+	.jmp_buf_longjmp = jmp_buf_longjmp,
+};
+
+int handle_get_capacity(struct lkl_disk disk, unsigned long long *res)
+{
+	LARGE_INTEGER tmp;
+
+	if (!GetFileSizeEx(disk.handle, &tmp))
+		return -1;
+
+	*res = tmp.QuadPart;
+	return 0;
+}
+
+static int blk_request(struct lkl_disk disk, struct lkl_blk_req *req)
+{
+	unsigned long long offset = req->sector * 512;
+	OVERLAPPED ov = { 0, };
+	int err = 0, ret;
+
+	switch (req->type) {
+	case LKL_DEV_BLK_TYPE_READ:
+	case LKL_DEV_BLK_TYPE_WRITE:
+	{
+		int i;
+
+		for (i = 0; i < req->count; i++) {
+			DWORD res;
+			struct iovec *buf = &req->buf[i];
+
+			ov.Offset = offset & 0xffffffff;
+			ov.OffsetHigh = offset >> 32;
+
+			if (req->type == LKL_DEV_BLK_TYPE_READ)
+				ret = ReadFile(disk.handle, buf->iov_base,
+					       buf->iov_len, &res, &ov);
+			else
+				ret = WriteFile(disk.handle, buf->iov_base,
+						buf->iov_len, &res, &ov);
+			if (!ret) {
+				lkl_printf("%s: I/O error: %d\n", __func__,
+					   GetLastError());
+				err = -1;
+				goto out;
+			}
+
+			if (res != buf->iov_len) {
+				lkl_printf("%s: I/O error: short: %d %d\n",
+					   __func__, res, buf->iov_len);
+				err = -1;
+				goto out;
+			}
+
+			offset += buf->iov_len;
+		}
+		break;
+	}
+	case LKL_DEV_BLK_TYPE_FLUSH:
+	case LKL_DEV_BLK_TYPE_FLUSH_OUT:
+		ret = FlushFileBuffers(disk.handle);
+		if (!ret)
+			err = -1;
+		break;
+	default:
+		return LKL_DEV_BLK_STATUS_UNSUP;
+	}
+
+out:
+	if (err < 0)
+		return LKL_DEV_BLK_STATUS_IOERR;
+
+	return LKL_DEV_BLK_STATUS_OK;
+}
+
+struct lkl_dev_blk_ops lkl_dev_blk_ops = {
+	.get_capacity = handle_get_capacity,
+	.request = blk_request,
+};
+
+/* Needed to resolve linker error on Win32. We don't really support
+ * any network IO on Windows, anyway, so there's no loss here.
+ */
+int lkl_netdevs_remove(void)
+{
+	return 0;
+}
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 27/47] lkl: Android ARM (arm/arm64) support
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (25 preceding siblings ...)
  2019-10-23  4:38 ` [RFC PATCH 26/47] lkl tools: add support for Windows host Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 28/47] lkl tools: add lklfuse Hajime Tazaki
                   ` (22 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Hajime Tazaki, Akira Moroo

Initial attempt to run an application with the hijack library on the
Android platform.  Tested mostly on Android 6.x and 7.x.

The build process assumes that the Android NDK toolchain is installed on
the host system, as circle.yml does in its test.  The arm32 build uses an
alternate linker, stored in the tools/lkl/bin directory, in order to
avoid the link issue (issue #59).

The CircleCI test infrastructure requires Ubuntu 14.04 for this
test.

* Limitations
- aarch64 isn't tested on CircleCI due to difficulties with the aarch64
emulator.
- bionic libc on the android-24 emulator (arm32) doesn't call
destructors, so some of the tests in hijack-test.sh fail.
- net.sh doesn't properly test network-related issues.

Fixes #59.

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
---
 tools/lkl/bin/arm-linux-androideabi-ld | 1 +
 1 file changed, 1 insertion(+)
 create mode 120000 tools/lkl/bin/arm-linux-androideabi-ld

diff --git a/tools/lkl/bin/arm-linux-androideabi-ld b/tools/lkl/bin/arm-linux-androideabi-ld
new file mode 120000
index 000000000000..4194d24c4b5c
--- /dev/null
+++ b/tools/lkl/bin/arm-linux-androideabi-ld
@@ -0,0 +1 @@
+arm-linux-androideabi-ld.gold
\ No newline at end of file
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 28/47] lkl tools: add lklfuse
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (26 preceding siblings ...)
  2019-10-23  4:38 ` [RFC PATCH 27/47] lkl: Android ARM (arm/arm64) support Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 29/47] lkl: add initial system call hijack support (a.k.a. NUSE of libos) Hajime Tazaki
                   ` (21 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um
  Cc: Rafael Gieschke, Conrad Meyer, Octavian Purdila, Akira Moroo,
	Quentin Anglade, Hajime Tazaki

From: Octavian Purdila <tavi.purdila@gmail.com>

Add a simple FUSE-based program that can mount a filesystem in userspace
using LKL.

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Quentin Anglade <quentin.anglade@objectif-libre.com>
Signed-off-by: Rafael Gieschke <rafael.gieschke@rz.uni-freiburg.de>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
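One detail in lklfuse_readdir() worth spelling out: it fills st_mode with
de->d_type << 12. This works because on Linux the DT_* directory entry
types are the S_IF* file type bits shifted right by 12 (glibc exposes the
same mapping as DTTOIF()). A minimal sketch that checks the relationship
against the host's headers, assuming a Linux/glibc-style <dirent.h>;
dtype_to_mode is a hypothetical name and not part of this patch:

```c
#define _DEFAULT_SOURCE	/* exposes DT_* and S_IF* under strict -std= modes */
#include <assert.h>
#include <dirent.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: directory entry type -> file type bits of st_mode. */
static mode_t dtype_to_mode(unsigned char d_type)
{
	mode_t mode = (mode_t)d_type << 12;

	/* Only the file type bits may be set by the shift. */
	assert((mode & ~(mode_t)S_IFMT) == 0);
	return mode;
}
```

On hosts whose DT_* values do not follow this layout, the shift would
need to be replaced by an explicit mapping.
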
 tools/lkl/lklfuse.c        | 658 +++++++++++++++++++++++++++++++++++++
 tools/lkl/tests/lklfuse.sh | 110 +++++++
 2 files changed, 768 insertions(+)
 create mode 100644 tools/lkl/lklfuse.c
 create mode 100755 tools/lkl/tests/lklfuse.sh

diff --git a/tools/lkl/lklfuse.c b/tools/lkl/lklfuse.c
new file mode 100644
index 000000000000..4e6c8fe250d0
--- /dev/null
+++ b/tools/lkl/lklfuse.c
@@ -0,0 +1,658 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdbool.h>
+#include <stddef.h>
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <sys/stat.h>
+#include <string.h>
+#include <errno.h>
+#include <unistd.h>
+#define FUSE_USE_VERSION 26
+#include <fuse.h>
+#include <fuse/fuse_opt.h>
+#include <fuse/fuse_lowlevel.h>
+#include <lkl.h>
+#include <lkl_host.h>
+
+#define LKLFUSE_VERSION "0.1"
+
+struct lklfuse {
+	const char *file;
+	const char *log;
+	const char *type;
+	const char *opts;
+	struct lkl_disk disk;
+	int disk_id;
+	int part;
+	int ro;
+	int mb;
+} lklfuse = {
+	.mb = 64,
+};
+
+#define LKLFUSE_OPT(t, p, v) { t, offsetof(struct lklfuse, p), v }
+
+enum {
+	KEY_HELP,
+	KEY_VERSION,
+};
+
+static struct fuse_opt lklfuse_opts[] = {
+	LKLFUSE_OPT("log=%s", log, 0),
+	LKLFUSE_OPT("type=%s", type, 0),
+	LKLFUSE_OPT("mb=%d", mb, 0),
+	LKLFUSE_OPT("opts=%s", opts, 0),
+	LKLFUSE_OPT("part=%d", part, 0),
+	FUSE_OPT_KEY("-h", KEY_HELP),
+	FUSE_OPT_KEY("--help", KEY_HELP),
+	FUSE_OPT_KEY("-V",             KEY_VERSION),
+	FUSE_OPT_KEY("--version",      KEY_VERSION),
+	FUSE_OPT_END
+};
+
+static void usage(void)
+{
+	printf(
+"usage: lklfuse file mountpoint [options]\n"
+"\n"
+"general options:\n"
+"    -o opt,[opt...]        mount options\n"
+"    -h   --help            print help\n"
+"    -V   --version         print version\n"
+"\n"
+"lklfuse options:\n"
+"    -o log=FILE            log file\n"
+"    -o type=fstype         filesystem type\n"
+"    -o mb=memory in mb     amount of memory to allocate\n"
+"    -o part=partition      partition to mount\n"
+"    -o ro                  open file read-only\n"
+"    -o opts=options        mount options (use \\ to escape , and =)\n"
+);
+}
+
+static int lklfuse_opt_proc(void *data, const char *arg, int key,
+			  struct fuse_args *args)
+{
+	switch (key) {
+	case FUSE_OPT_KEY_OPT:
+		if (strcmp(arg, "ro") == 0)
+			lklfuse.ro = 1;
+		return 1;
+
+	case FUSE_OPT_KEY_NONOPT:
+		if (!lklfuse.file) {
+			lklfuse.file = strdup(arg);
+			return 0;
+		}
+		return 1;
+
+	case KEY_HELP:
+		usage();
+		fuse_opt_add_arg(args, "-ho");
+		fuse_main(args->argc, args->argv, NULL, NULL);
+		exit(1);
+
+	case KEY_VERSION:
+		printf("lklfuse version %s\n", LKLFUSE_VERSION);
+		fuse_opt_add_arg(args, "--version");
+		fuse_main(args->argc, args->argv, NULL, NULL);
+		exit(0);
+
+	default:
+		fprintf(stderr, "internal error\n");
+		abort();
+	}
+}
+
+static void lklfuse_xlat_stat(const struct lkl_stat *in, struct stat *st)
+{
+	st->st_dev = in->st_dev;
+	st->st_ino = in->st_ino;
+	st->st_mode = in->st_mode;
+	st->st_nlink = in->st_nlink;
+	st->st_uid = in->st_uid;
+	st->st_gid = in->st_gid;
+	st->st_rdev = in->st_rdev;
+	st->st_size = in->st_size;
+	st->st_blksize = in->st_blksize;
+	st->st_blocks = in->st_blocks;
+	st->st_atim.tv_sec = in->lkl_st_atime;
+	st->st_atim.tv_nsec = in->st_atime_nsec;
+	st->st_mtim.tv_sec = in->lkl_st_mtime;
+	st->st_mtim.tv_nsec = in->st_mtime_nsec;
+	st->st_ctim.tv_sec = in->lkl_st_ctime;
+	st->st_ctim.tv_nsec = in->st_ctime_nsec;
+}
+
+static int lklfuse_fgetattr(const char *path, struct stat *st,
+			    struct fuse_file_info *fi)
+{
+	long ret;
+	struct lkl_stat lkl_stat;
+
+	ret = lkl_sys_fstat(fi->fh, &lkl_stat);
+	if (ret)
+		return ret;
+
+	lklfuse_xlat_stat(&lkl_stat, st);
+	return 0;
+}
+
+static int lklfuse_getattr(const char *path, struct stat *st)
+{
+	long ret;
+	struct lkl_stat lkl_stat;
+
+	ret = lkl_sys_lstat(path, &lkl_stat);
+	if (ret)
+		return ret;
+
+	lklfuse_xlat_stat(&lkl_stat, st);
+	return 0;
+}
+
+static int lklfuse_readlink(const char *path, char *buf, size_t len)
+{
+	long ret;
+
+	ret = lkl_sys_readlink(path, buf, len);
+	if (ret < 0)
+		return ret;
+
+	if ((size_t)ret == len)
+		ret = len - 1;
+
+	buf[ret] = 0;
+
+	return 0;
+}
+
+static int lklfuse_mknod(const char *path, mode_t mode, dev_t dev)
+{
+	return lkl_sys_mknod(path, mode, dev);
+}
+
+static int lklfuse_mkdir(const char *path, mode_t mode)
+{
+	return lkl_sys_mkdir(path, mode);
+}
+
+static int lklfuse_unlink(const char *path)
+{
+	return lkl_sys_unlink(path);
+}
+
+static int lklfuse_rmdir(const char *path)
+{
+	return lkl_sys_rmdir(path);
+}
+
+static int lklfuse_symlink(const char *oldname, const char *newname)
+{
+	return lkl_sys_symlink(oldname, newname);
+}
+
+
+static int lklfuse_rename(const char *oldname, const char *newname)
+{
+	return lkl_sys_rename(oldname, newname);
+}
+
+static int lklfuse_link(const char *oldname, const char *newname)
+{
+	return lkl_sys_link(oldname, newname);
+}
+
+static int lklfuse_chmod(const char *path, mode_t mode)
+{
+	return lkl_sys_chmod(path, mode);
+}
+
+
+static int lklfuse_chown(const char *path, uid_t uid, gid_t gid)
+{
+	return lkl_sys_fchownat(LKL_AT_FDCWD, path, uid, gid,
+				LKL_AT_SYMLINK_NOFOLLOW);
+}
+
+static int lklfuse_truncate(const char *path, off_t off)
+{
+	return lkl_sys_truncate(path, off);
+}
+
+static int lklfuse_open3(const char *path, bool create, mode_t mode,
+			 struct fuse_file_info *fi)
+{
+	long ret;
+	int flags;
+
+	if ((fi->flags & O_ACCMODE) == O_RDONLY)
+		flags = LKL_O_RDONLY;
+	else if ((fi->flags & O_ACCMODE) == O_WRONLY)
+		flags = LKL_O_WRONLY;
+	else if ((fi->flags & O_ACCMODE) == O_RDWR)
+		flags = LKL_O_RDWR;
+	else
+		return -EINVAL;
+
+	if (create)
+		flags |= LKL_O_CREAT;
+
+	ret = lkl_sys_open(path, flags, mode);
+	if (ret < 0)
+		return ret;
+
+	fi->fh = ret;
+
+	return 0;
+}
+
+static int lklfuse_create(const char *path, mode_t mode,
+			  struct fuse_file_info *fi)
+{
+	return lklfuse_open3(path, true, mode, fi);
+}
+
+static int lklfuse_open(const char *path, struct fuse_file_info *fi)
+{
+	return lklfuse_open3(path, false, 0, fi);
+}
+
+static int lklfuse_read(const char *path, char *buf, size_t size, off_t offset,
+		      struct fuse_file_info *fi)
+{
+	long ret;
+	ssize_t orig_size = size;
+
+	do {
+		ret = lkl_sys_pread64(fi->fh, buf, size, offset);
+		if (ret <= 0)
+			break;
+		size -= ret;
+		offset += ret;
+		buf += ret;
+	} while (size > 0);
+
+	return ret < 0 ? ret : orig_size - (ssize_t)size;
+
+}
+
+static int lklfuse_write(const char *path, const char *buf, size_t size,
+		       off_t offset, struct fuse_file_info *fi)
+{
+	long ret;
+	ssize_t orig_size = size;
+
+	do {
+		ret = lkl_sys_pwrite64(fi->fh, buf, size, offset);
+		if (ret <= 0)
+			break;
+		size -= ret;
+		offset += ret;
+		buf += ret;
+	} while (size > 0);
+
+	return ret < 0 ? ret : orig_size - (ssize_t)size;
+}
+
+
+static int lklfuse_statfs(const char *path, struct statvfs *stat)
+{
+	long ret;
+	struct lkl_statfs lkl_statfs;
+
+	ret = lkl_sys_statfs(path, &lkl_statfs);
+	if (ret < 0)
+		return ret;
+
+	stat->f_bsize = lkl_statfs.f_bsize;
+	stat->f_frsize = lkl_statfs.f_frsize;
+	stat->f_blocks = lkl_statfs.f_blocks;
+	stat->f_bfree = lkl_statfs.f_bfree;
+	stat->f_bavail = lkl_statfs.f_bavail;
+	stat->f_files = lkl_statfs.f_files;
+	stat->f_ffree = lkl_statfs.f_ffree;
+	stat->f_favail = stat->f_ffree;
+	stat->f_fsid = *(unsigned long *)&lkl_statfs.f_fsid.val[0];
+	stat->f_flag = lkl_statfs.f_flags;
+	stat->f_namemax = lkl_statfs.f_namelen;
+
+	return 0;
+}
+
+static int lklfuse_flush(const char *path, struct fuse_file_info *fi)
+{
+	return 0;
+}
+
+static int lklfuse_release(const char *path, struct fuse_file_info *fi)
+{
+	return lkl_sys_close(fi->fh);
+}
+
+static int lklfuse_fsync(const char *path, int datasync,
+		       struct fuse_file_info *fi)
+{
+	if (datasync)
+		return lkl_sys_fdatasync(fi->fh);
+	else
+		return lkl_sys_fsync(fi->fh);
+}
+
+static int lklfuse_setxattr(const char *path, const char *name, const char *val,
+		   size_t size, int flags)
+{
+	return lkl_sys_setxattr(path, name, val, size, flags);
+}
+
+static int lklfuse_getxattr(const char *path, const char *name, char *val,
+			  size_t size)
+{
+	return lkl_sys_getxattr(path, name, val, size);
+}
+
+static int lklfuse_listxattr(const char *path, char *list, size_t size)
+{
+	return lkl_sys_listxattr(path, list, size);
+}
+
+static int lklfuse_removexattr(const char *path, const char *name)
+{
+	return lkl_sys_removexattr(path, name);
+}
+
+static int lklfuse_opendir(const char *path, struct fuse_file_info *fi)
+{
+	struct lkl_dir *dir;
+	int err;
+
+	dir = lkl_opendir(path, &err);
+	if (!dir)
+		return err;
+
+	fi->fh = (uintptr_t)dir;
+
+	return 0;
+}
+
+/** Read directory
+ *
+ * This supersedes the old getdir() interface.  New applications
+ * should use this.
+ *
+ * The filesystem may choose between two modes of operation:
+ *
+ * 1) The readdir implementation ignores the offset parameter, and
+ * passes zero to the filler function's offset.  The filler
+ * function will not return '1' (unless an error happens), so the
+ * whole directory is read in a single readdir operation.  This
+ * works just like the old getdir() method.
+ *
+ * 2) The readdir implementation keeps track of the offsets of the
+ * directory entries.  It uses the offset parameter and always
+ * passes non-zero offset to the filler function.  When the buffer
+ * is full (or an error happens) the filler function will return
+ * '1'.
+ *
+ * Introduced in version 2.3
+ */
+static int lklfuse_readdir(const char *path, void *buf, fuse_fill_dir_t fill,
+			 off_t off, struct fuse_file_info *fi)
+{
+	struct lkl_dir *dir = (struct lkl_dir *)(uintptr_t)fi->fh;
+	struct lkl_linux_dirent64 *de;
+
+	while ((de = lkl_readdir(dir))) {
+		struct stat st = { 0, };
+
+		st.st_ino = de->d_ino;
+		st.st_mode = de->d_type << 12;
+
+		if (fill(buf, de->d_name, &st, 0))
+			break;
+	}
+
+	if (!de)
+		return lkl_errdir(dir);
+
+	return 0;
+}
+
+static int lklfuse_releasedir(const char *path, struct fuse_file_info *fi)
+{
+	struct lkl_dir *dir = (struct lkl_dir *)(uintptr_t)fi->fh;
+
+	return lkl_closedir(dir);
+}
+
+static int lklfuse_fsyncdir(const char *path, int datasync,
+			  struct fuse_file_info *fi)
+{
+	struct lkl_dir *dir = (struct lkl_dir *)(uintptr_t)fi->fh;
+	int fd = lkl_dirfd(dir);
+
+	if (datasync)
+		return lkl_sys_fdatasync(fd);
+	else
+		return lkl_sys_fsync(fd);
+}
+
+static int lklfuse_access(const char *path, int mode)
+{
+	return lkl_sys_access(path, mode);
+}
+
+static int lklfuse_utimens(const char *path, const struct timespec tv[2])
+{
+	struct lkl_timespec ts[2];
+
+	ts[0].tv_sec = tv[0].tv_sec;
+	ts[0].tv_nsec = tv[0].tv_nsec;
+	ts[1].tv_sec = tv[1].tv_sec;
+	ts[1].tv_nsec = tv[1].tv_nsec;
+
+	return lkl_sys_utimensat(-1, path, (struct __lkl__kernel_timespec *)ts,
+				 LKL_AT_SYMLINK_NOFOLLOW);
+}
+
+static int lklfuse_fallocate(const char *path, int mode, off_t offset,
+			     off_t len, struct fuse_file_info *fi)
+{
+	return lkl_sys_fallocate(fi->fh, mode, offset, len);
+}
+
+const struct fuse_operations lklfuse_ops = {
+	.flag_nullpath_ok = 1,
+	.flag_nopath = 1,
+	.flag_utime_omit_ok = 1,
+
+	.getattr = lklfuse_getattr,
+	.readlink = lklfuse_readlink,
+	.mknod = lklfuse_mknod,
+	.mkdir = lklfuse_mkdir,
+	.unlink = lklfuse_unlink,
+	.rmdir = lklfuse_rmdir,
+	.symlink = lklfuse_symlink,
+	.rename = lklfuse_rename,
+	.link = lklfuse_link,
+	.chmod = lklfuse_chmod,
+	.chown = lklfuse_chown,
+	.truncate = lklfuse_truncate,
+	.open = lklfuse_open,
+	.read = lklfuse_read,
+	.write = lklfuse_write,
+	.statfs = lklfuse_statfs,
+	.flush = lklfuse_flush,
+	.release = lklfuse_release,
+	.fsync = lklfuse_fsync,
+	.setxattr = lklfuse_setxattr,
+	.getxattr = lklfuse_getxattr,
+	.listxattr = lklfuse_listxattr,
+	.removexattr = lklfuse_removexattr,
+	.opendir = lklfuse_opendir,
+	.readdir = lklfuse_readdir,
+	.releasedir = lklfuse_releasedir,
+	.fsyncdir = lklfuse_fsyncdir,
+	.access = lklfuse_access,
+	.create = lklfuse_create,
+	.fgetattr = lklfuse_fgetattr,
+	/* .lock, */
+	.utimens = lklfuse_utimens,
+	/* .bmap, */
+	/* .ioctl, */
+	/* .poll, */
+	/* .write_buf, (SG io) */
+	/* .read_buf, (SG io) */
+	/* .flock, */
+	.fallocate = lklfuse_fallocate,
+};
+
+static int start_lkl(void)
+{
+	long ret;
+	char mpoint[32], cmdline[16];
+
+	snprintf(cmdline, sizeof(cmdline), "mem=%dM", lklfuse.mb);
+	ret = lkl_start_kernel(&lkl_host_ops, cmdline);
+	if (ret) {
+		fprintf(stderr, "can't start kernel: %s\n", lkl_strerror(ret));
+		goto out;
+	}
+
+	ret = lkl_mount_dev(lklfuse.disk_id, lklfuse.part, lklfuse.type,
+			    lklfuse.ro ? LKL_MS_RDONLY : 0, lklfuse.opts,
+			    mpoint, sizeof(mpoint));
+
+	if (ret) {
+		fprintf(stderr, "can't mount disk: %s\n", lkl_strerror(ret));
+		goto out_halt;
+	}
+
+	ret = lkl_sys_chroot(mpoint);
+	if (ret) {
+		fprintf(stderr, "can't chroot to %s: %s\n", mpoint,
+			lkl_strerror(ret));
+		goto out_umount;
+	}
+
+	return 0;
+
+out_umount:
+	lkl_umount_dev(lklfuse.disk_id, lklfuse.part, 0, 1000);
+
+out_halt:
+	lkl_sys_halt();
+
+out:
+	return ret;
+}
+
+static void stop_lkl(void)
+{
+	int ret;
+
+	ret = lkl_sys_chdir("/");
+	if (ret)
+		fprintf(stderr, "can't chdir to /: %s\n", lkl_strerror(ret));
+	ret = lkl_sys_umount("/", 0);
+	if (ret)
+		fprintf(stderr, "failed to umount disk: %d: %s\n",
+			lklfuse.disk_id, lkl_strerror(ret));
+	lkl_sys_halt();
+}
+
+int main(int argc, char **argv)
+{
+	struct fuse_args args = FUSE_ARGS_INIT(argc, argv);
+	struct fuse_chan *ch;
+	struct fuse *fuse;
+	struct stat st;
+	char *mnt;
+	int fg, mt, ret;
+
+	if (fuse_opt_parse(&args, &lklfuse, lklfuse_opts, lklfuse_opt_proc))
+		return 1;
+
+	if (!lklfuse.file || !lklfuse.type) {
+		fprintf(stderr, "no file or filesystem type specified\n");
+		return 1;
+	}
+
+	if (fuse_parse_cmdline(&args, &mnt, &mt, &fg))
+		return 1;
+
+	ret = stat(mnt, &st);
+	if (ret) {
+		perror(mnt);
+		goto out_free;
+	}
+
+	ret = open(lklfuse.file, lklfuse.ro ? O_RDONLY : O_RDWR);
+	if (ret < 0) {
+		perror(lklfuse.file);
+		goto out_free;
+	}
+
+	lklfuse.disk.fd = ret;
+
+	ret = lkl_disk_add(&lklfuse.disk);
+	if (ret < 0) {
+		fprintf(stderr, "can't add disk: %s\n", lkl_strerror(ret));
+		goto out_close_disk;
+	}
+
+	lklfuse.disk_id = ret;
+
+	ch = fuse_mount(mnt, &args);
+	if (!ch) {
+		ret = -1;
+		goto out_close_disk;
+	}
+
+	fuse = fuse_new(ch, &args, &lklfuse_ops, sizeof(lklfuse_ops), NULL);
+	if (!fuse) {
+		ret = -1;
+		goto out_fuse_unmount;
+	}
+
+	fuse_opt_free_args(&args);
+
+	if (fuse_daemonize(fg) ||
+	    fuse_set_signal_handlers(fuse_get_session(fuse))) {
+		ret = -1;
+		goto out_fuse_destroy;
+	}
+
+	ret = start_lkl();
+	if (ret) {
+		ret = -1;
+		goto out_remove_signals;
+	}
+
+	if (mt)
+		fprintf(stderr, "warning: multithreaded mode not supported\n");
+
+	ret = fuse_loop(fuse);
+
+	stop_lkl();
+
+out_remove_signals:
+	fuse_remove_signal_handlers(fuse_get_session(fuse));
+
+out_fuse_unmount:
+	if (ch)
+		fuse_unmount(mnt, ch);
+
+out_fuse_destroy:
+	if (fuse)
+		fuse_destroy(fuse);
+
+out_close_disk:
+	close(lklfuse.disk.fd);
+
+out_free:
+	free(mnt);
+
+	return ret < 0 ? 1 : 0;
+}
diff --git a/tools/lkl/tests/lklfuse.sh b/tools/lkl/tests/lklfuse.sh
new file mode 100755
index 000000000000..7f35dd53fc4e
--- /dev/null
+++ b/tools/lkl/tests/lklfuse.sh
@@ -0,0 +1,110 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+
+source $script_dir/test.sh
+
+cleanup()
+{
+    set -e
+
+    sleep 1
+    fusermount -u $dir
+    rm $file
+    rmdir $dir
+}
+
+
+# $1 - disk image
+# $2 - fstype
+function prepfs()
+{
+    set -e
+
+    dd if=/dev/zero of=$1 bs=1024 count=102400
+
+    yes | mkfs.$2 $1
+}
+
+# $1 - disk image
+# $2 - mount point
+# $3 - filesystem type
+lklfuse_mount()
+{
+    ${script_dir}/../lklfuse $1 $2 -o type=$3
+}
+
+# $1 - mount point
+lklfuse_basic()
+{
+    set -e
+
+    cd $1
+    touch a
+    if ! [ -e a ]; then exit 1; fi
+    rm a
+    mkdir a
+    if ! [ -d a ]; then exit 1; fi
+    rmdir a
+}
+
+# $1 - dir
+# $2 - filesystem type
+lklfuse_stressng()
+{
+    set -e
+
+    if [ -z "$(which stress-ng)" ]; then
+        echo "missing stress-ng"
+        return $TEST_SKIP
+    fi
+
+    cd $1
+
+    if [ "$2" = "vfat" ]; then
+        exclude="chmod,filename,link,mknod,symlink,xattr"
+    fi
+
+    stress-ng --class filesystem --all 0 --timeout 10 \
+	      --exclude fiemap,$exclude --fallocate-bytes 10m \
+	      --sync-file-bytes 10m
+}
+
+if [ "$1" = "-t" ]; then
+    shift
+    fstype=$1
+    shift
+fi
+
+if [ -z "$fstype" ]; then
+    fstype="ext4"
+fi
+
+if ! [ -x $script_dir/../lklfuse ]; then
+    lkl_test_plan 0 "lklfuse.sh $fstype"
+    echo "lklfuse not available"
+    exit 0
+fi
+
+if ! [ -e /dev/fuse ]; then
+    lkl_test_plan 0 "lklfuse.sh $fstype"
+    echo "/dev/fuse not available"
+    exit 0
+fi
+
+
+file=`mktemp`
+dir=`mktemp -d`
+
+trap cleanup EXIT
+
+lkl_test_plan 4 "lklfuse $fstype"
+
+lkl_test_run 1 prepfs $file $fstype
+lkl_test_run 2 lklfuse_mount $file $dir $fstype
+lkl_test_run 3 lklfuse_basic $dir
+# stress-ng returns 2 with no apparent failures so skip it for now
+#lkl_test_run 4 lklfuse_stressng $dir $fstype
+trap : EXIT
+lkl_test_run 4 cleanup
-- 
2.20.1 (Apple Git-117)





* [RFC PATCH 29/47] lkl: add initial system call hijack support (a.k.a. NUSE of libos)
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (27 preceding siblings ...)
  2019-10-23  4:38 ` [RFC PATCH 28/47] lkl tools: add lklfuse Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 30/47] lkl: add documentation Hajime Tazaki
                   ` (20 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um
  Cc: H . K . Jerry Chu, Xiao Jia, Octavian Purdila, Motomu Utsumi,
	Akira Moroo, Thomas Liebetraut, Hajime Tazaki, Patrick Collins,
	Yuan Liu

From: Octavian Purdila <tavi.purdila@gmail.com>

This commit introduces initial support for system call hijacking, which
uses LD_PRELOAD to redirect the calls of POSIX applications on a host.

Note that hijacking system calls by overriding symbols via LD_PRELOAD is
not a complete solution: several issues have to be worked around with
dirty tricks.

Those tricks/issues are:
- a file descriptor offset (i.e., fd + offset) to tell LKL descriptors
  apart from host ones
- path name isolation (i.e., chroot)
- handling a mixture of host fds and LKL fds
- un-hijackable symbols (e.g., __socket inside if_nametoindex() of Linux
  glibc) must be hijacked at the calling function (i.e., if_nametoindex)
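
As a rough illustration of the fd offset trick (a minimal sketch only, not
the code in this patch: EXAMPLE_FD_OFFSET and classify_fd() are
hypothetical names, and the value 512 is an assumption standing in for
LKL_FD_OFFSET), a wrapper can decide where to send a call by comparing
the descriptor with the offset. In this patch, init.c keeps the low
numbers free for the host by dup()ing a /dev_null descriptor inside LKL
up to the offset, so LKL only hands out descriptors above it.

```c
/* Hypothetical stand-in for LKL_FD_OFFSET; the real value comes from the
 * LKL headers and is not assumed here.
 */
#define EXAMPLE_FD_OFFSET 512

enum fd_owner {
	FD_HOST,	/* descriptor belongs to the host kernel */
	FD_LKL,		/* descriptor belongs to the LKL kernel */
};

/* Descriptors at or above the offset are treated as LKL-owned,
 * everything below it is passed through to the host.
 */
static enum fd_owner classify_fd(int fd)
{
	return fd >= EXAMPLE_FD_OFFSET ? FD_LKL : FD_HOST;
}
```

With these assumptions, classify_fd(3) yields FD_HOST and
classify_fd(EXAMPLE_FD_OFFSET) yields FD_LKL; a hooked call would invoke
the host symbol in the first case and the LKL system call in the second.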

Nevertheless, it is powerful in some cases, such as replacing only the
network stack of an application.

It has been tested with socket(AF_INET/AF_INET6/AF_NETLINK) without any
external netdevices, i.e., it only works with localhost (127.0.0.1/::1).
It may need more work on non-Linux hosts.

select(2)/poll(2)/epoll_create(2) need more work.
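
The open problem is the mixed case: the poll() and select() wrappers in
this patch return EOPNOTSUPP when one set contains both host and LKL
descriptors. The detection step can be sketched as below (a sketch under
assumptions: EXAMPLE_FD_OFFSET, SET_HAS_HOST, SET_HAS_LKL and
scan_pollfds() are hypothetical names, and unlike the patch this version
skips negative descriptors, which poll(2) ignores).

```c
#include <poll.h>

/* Hypothetical stand-in for LKL_FD_OFFSET. */
#define EXAMPLE_FD_OFFSET 512

/* Bit flags describing which kinds of descriptors a poll set contains. */
#define SET_HAS_HOST 0x1
#define SET_HAS_LKL  0x2

/* Walk the set once and report which kernels own its descriptors. */
static int scan_pollfds(const struct pollfd *fds, nfds_t nfds)
{
	int kinds = 0;
	nfds_t i;

	for (i = 0; i < nfds; i++) {
		if (fds[i].fd < 0)
			continue;	/* ignored by poll(2) */
		if (fds[i].fd >= EXAMPLE_FD_OFFSET)
			kinds |= SET_HAS_LKL;
		else
			kinds |= SET_HAS_HOST;
	}

	return kinds;
}
```

A caller would forward the whole set to the host when only SET_HAS_HOST
is reported, to LKL when only SET_HAS_LKL is reported, and would have to
wait on both kernels at once (the part that still needs work) when both
bits are set.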

The following should work on Linux:
% ./tools/lkl/bin/lkl-hijack.sh ip ad

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Thomas Liebetraut <thomas@tommie-lie.de>
Signed-off-by: Xiao Jia <xiaoj@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
[Octavian: use lkl_sys_* calls instead of lkl_sys_wrapper_* calls]
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/bin/lkl-hijack.sh    |  23 +
 tools/lkl/lib/hijack/Build     |   4 +
 tools/lkl/lib/hijack/hijack.c  | 618 +++++++++++++++++++++++++++
 tools/lkl/lib/hijack/init.c    | 252 +++++++++++
 tools/lkl/lib/hijack/init.h    |   8 +
 tools/lkl/lib/hijack/xlate.c   | 613 ++++++++++++++++++++++++++
 tools/lkl/lib/hijack/xlate.h   |  13 +
 tools/lkl/tests/hijack-test.sh | 760 +++++++++++++++++++++++++++++++++
 tools/lkl/tests/run_netperf.sh |  98 +++++
 9 files changed, 2389 insertions(+)
 create mode 100755 tools/lkl/bin/lkl-hijack.sh
 create mode 100644 tools/lkl/lib/hijack/Build
 create mode 100644 tools/lkl/lib/hijack/hijack.c
 create mode 100644 tools/lkl/lib/hijack/init.c
 create mode 100644 tools/lkl/lib/hijack/init.h
 create mode 100644 tools/lkl/lib/hijack/xlate.c
 create mode 100644 tools/lkl/lib/hijack/xlate.h
 create mode 100755 tools/lkl/tests/hijack-test.sh
 create mode 100755 tools/lkl/tests/run_netperf.sh

diff --git a/tools/lkl/bin/lkl-hijack.sh b/tools/lkl/bin/lkl-hijack.sh
new file mode 100755
index 000000000000..7cf92856dfad
--- /dev/null
+++ b/tools/lkl/bin/lkl-hijack.sh
@@ -0,0 +1,23 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+##
+## This wrapper script replaces system call symbols such as socket(2)
+## and recvmsg(2) so that they are redirected to LKL. Ideally it works
+## with any application, but in practice (tm) it depends on the maturity
+## of the hijack library (liblkl-hijack.so).
+##
+## Since LD_PRELOAD is tricky with setuid/setgid binaries (e.g., ping),
+## you may need to run the script via sudo (or an equivalent):
+##
+## % sudo lkl-hijack.sh ping 127.0.0.1
+##
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+
+export LD_LIBRARY_PATH=${script_dir}/../lib/hijack
+if [ -n "${LKL_HIJACK_DEBUG+x}" ]
+then
+  trap '' TSTP
+fi
+LD_PRELOAD=liblkl-hijack.so "$@"
diff --git a/tools/lkl/lib/hijack/Build b/tools/lkl/lib/hijack/Build
new file mode 100644
index 000000000000..e68e93a3328a
--- /dev/null
+++ b/tools/lkl/lib/hijack/Build
@@ -0,0 +1,4 @@
+liblkl-hijack-y += hijack.o
+liblkl-hijack-y += init.o
+liblkl-hijack-y += xlate.o
+
diff --git a/tools/lkl/lib/hijack/hijack.c b/tools/lkl/lib/hijack/hijack.c
new file mode 100644
index 000000000000..774f74258669
--- /dev/null
+++ b/tools/lkl/lib/hijack/hijack.c
@@ -0,0 +1,618 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * system calls hijack code
+ * Copyright (c) 2015 Hajime Tazaki
+ *
+ * Author: Hajime Tazaki <tazaki@sfc.wide.ad.jp>
+ *
+ * Note: some of the code is picked from rumpkernel, written by Antti Kantee.
+ */
+
+#include <unistd.h>
+#include <stdio.h>
+#include <stdarg.h>
+#include <stdlib.h>
+#include <sys/types.h>
+#include <sys/mman.h>
+#define __USE_GNU
+#include <dlfcn.h>
+#include <sys/socket.h>
+#include <sys/select.h>
+#include <sys/epoll.h>
+#include <stdint.h>
+#include <string.h>
+#include <fcntl.h>
+#include <errno.h>
+#include <poll.h>
+#include <sys/ioctl.h>
+#include <assert.h>
+#include <pthread.h>
+#include <lkl.h>
+#include <lkl_host.h>
+
+#include "xlate.h"
+#include "init.h"
+
+static int is_lklfd(int fd)
+{
+	if (fd < LKL_FD_OFFSET)
+		return 0;
+
+	return 1;
+}
+
+static void *resolve_sym(const char *sym)
+{
+	void *resolv;
+
+	resolv = dlsym(RTLD_NEXT, sym);
+	if (!resolv) {
+		fprintf(stderr, "dlsym fail %s (%s)\n", sym, dlerror());
+		assert(0);
+	}
+	return resolv;
+}
+
+typedef long (*host_call)(long p1, long p2, long p3, long p4, long p5, long p6);
+
+static host_call host_calls[__lkl__NR_syscalls];
+/* internally managed fd list for epoll */
+int dual_fds[LKL_FD_OFFSET];
+
+#define HOOK_FD_CALL(name)						\
+	static void __attribute__((constructor(101)))			\
+	init_host_##name(void)						\
+	{								\
+		host_calls[__lkl__NR_##name] = resolve_sym(#name);	\
+	}								\
+									\
+	long name##_hook(long p1, long p2, long p3, long p4, long p5,	\
+			 long p6)					\
+	{								\
+		long p[6] = {p1, p2, p3, p4, p5, p6 };			\
+									\
+		if (!host_calls[__lkl__NR_##name])			\
+			host_calls[__lkl__NR_##name] = resolve_sym(#name); \
+		if (!is_lklfd(p1))					\
+			return host_calls[__lkl__NR_##name](p1, p2, p3,	\
+							    p4, p5, p6); \
+									\
+		return lkl_set_errno(lkl_syscall(__lkl__NR_##name, p));	\
+	}								\
+	asm(".global " #name);						\
+	asm(".set " #name "," #name "_hook")
+
+#define HOOK_CALL_USE_HOST_BEFORE_START(name)				\
+	static void __attribute__((constructor(101)))			\
+	init_host_##name(void)						\
+	{								\
+		host_calls[__lkl__NR_##name] = resolve_sym(#name);	\
+	}								\
+									\
+	long name##_hook(long p1, long p2, long p3, long p4, long p5,	\
+			 long p6)					\
+	{								\
+		long p[6] = {p1, p2, p3, p4, p5, p6 };			\
+									\
+		if (!host_calls[__lkl__NR_##name])			\
+			host_calls[__lkl__NR_##name] = resolve_sym(#name); \
+		if (!lkl_running)					\
+			return host_calls[__lkl__NR_##name](p1, p2, p3,	\
+							    p4, p5, p6); \
+									\
+		return lkl_set_errno(lkl_syscall(__lkl__NR_##name, p));	\
+	}								\
+	asm(".global " #name);						\
+	asm(".set " #name "," #name "_hook")
+
+#define HOST_CALL(name)							\
+	static long (*host_##name)();					\
+	static void __attribute__((constructor(101)))			\
+	init2_host_##name(void)						\
+	{								\
+		host_##name = resolve_sym(#name);			\
+	}
+
+#define HOOK_CALL(name)							\
+	long name##_hook(long p1, long p2, long p3, long p4, long p5,	\
+			 long p6)					\
+	{								\
+		long p[6] = {p1, p2, p3, p4, p5, p6};			\
+									\
+		return lkl_set_errno(lkl_syscall(__lkl__NR_##name, p));	\
+	}								\
+	asm(".global " #name);						\
+	asm(".set " #name "," #name "_hook")
+
+#define CHECK_HOST_CALL(name)				\
+	if (!host_##name)				\
+		host_##name = resolve_sym(#name)
+
+static int lkl_call(int nr, int args, ...)
+{
+	long params[6];
+	va_list vl;
+	int i;
+
+	va_start(vl, args);
+	for (i = 0; i < args; i++)
+		params[i] = va_arg(vl, long);
+	va_end(vl);
+
+	return lkl_set_errno(lkl_syscall(nr, params));
+}
+
+HOOK_FD_CALL(recvmsg);
+HOOK_FD_CALL(sendmsg);
+HOOK_FD_CALL(sendmmsg);
+HOOK_FD_CALL(getsockname);
+HOOK_FD_CALL(getpeername);
+HOOK_FD_CALL(bind);
+HOOK_FD_CALL(connect);
+HOOK_FD_CALL(listen);
+HOOK_FD_CALL(shutdown);
+HOOK_FD_CALL(accept);
+HOOK_FD_CALL(write);
+HOOK_FD_CALL(writev);
+HOOK_FD_CALL(sendto);
+HOOK_FD_CALL(read);
+HOOK_FD_CALL(readv);
+HOOK_FD_CALL(recvfrom);
+HOOK_FD_CALL(splice);
+HOOK_FD_CALL(vmsplice);
+
+HOOK_CALL_USE_HOST_BEFORE_START(accept4);
+HOOK_CALL_USE_HOST_BEFORE_START(pipe2);
+
+HOST_CALL(write);
+HOST_CALL(pipe2);
+
+HOST_CALL(setsockopt);
+int setsockopt(int fd, int level, int optname, const void *optval,
+	       socklen_t optlen)
+{
+	CHECK_HOST_CALL(setsockopt);
+	if (!is_lklfd(fd))
+		return host_setsockopt(fd, level, optname, optval, optlen);
+	return lkl_call(__lkl__NR_setsockopt, 5, fd, lkl_solevel_xlate(level),
+			lkl_soname_xlate(optname), (void *)optval, optlen);
+}
+
+HOST_CALL(getsockopt);
+int getsockopt(int fd, int level, int optname, void *optval, socklen_t *optlen)
+{
+	CHECK_HOST_CALL(getsockopt);
+	if (!is_lklfd(fd))
+		return host_getsockopt(fd, level, optname, optval, optlen);
+	return lkl_call(__lkl__NR_getsockopt, 5, fd, lkl_solevel_xlate(level),
+			lkl_soname_xlate(optname), optval, (int *)optlen);
+}
+
+HOST_CALL(socket);
+int socket(int domain, int type, int protocol)
+{
+	CHECK_HOST_CALL(socket);
+	if (domain == AF_UNIX || domain == PF_PACKET)
+		return host_socket(domain, type, protocol);
+
+	if (!lkl_running)
+		return host_socket(domain, type, protocol);
+
+	return lkl_call(__lkl__NR_socket, 3, domain, type, protocol);
+}
+
+HOST_CALL(ioctl);
+#ifdef __ANDROID__
+int ioctl(int fd, int req, ...)
+#else
+int ioctl(int fd, unsigned long req, ...)
+#endif
+{
+	va_list vl;
+	long arg;
+
+	va_start(vl, req);
+	arg = va_arg(vl, long);
+	va_end(vl);
+
+	CHECK_HOST_CALL(ioctl);
+
+	if (!is_lklfd(fd))
+		return host_ioctl(fd, req, arg);
+	return lkl_call(__lkl__NR_ioctl, 3, fd, lkl_ioctl_req_xlate(req), arg);
+}
+
+
+HOST_CALL(fcntl);
+int fcntl(int fd, int cmd, ...)
+{
+	va_list vl;
+	long arg;
+
+	va_start(vl, cmd);
+	arg = va_arg(vl, long);
+	va_end(vl);
+
+	CHECK_HOST_CALL(fcntl);
+
+	if (!is_lklfd(fd))
+		return host_fcntl(fd, cmd, arg);
+	return lkl_call(__lkl__NR_fcntl, 3, fd, lkl_fcntl_cmd_xlate(cmd), arg);
+}
+
+HOST_CALL(poll);
+int poll(struct pollfd *fds, nfds_t nfds, int timeout)
+{
+	unsigned int i, lklfds = 0, hostfds = 0;
+
+	CHECK_HOST_CALL(poll);
+
+	for (i = 0; i < nfds; i++) {
+		if (is_lklfd(fds[i].fd))
+			lklfds = 1;
+		else
+			hostfds = 1;
+	}
+
+	/* FIXME: need to handle mixed case of hostfd and lklfd. */
+	if (lklfds && hostfds)
+		return lkl_set_errno(-LKL_EOPNOTSUPP);
+
+
+	if (hostfds)
+		return host_poll(fds, nfds, timeout);
+
+	return lkl_sys_poll((struct lkl_pollfd *)fds, nfds, timeout);
+}
+
+int __poll(struct pollfd *, nfds_t, int) __attribute__((alias("poll")));
+
+HOST_CALL(select);
+int select(int nfds, fd_set *r, fd_set *w, fd_set *e, struct timeval *t)
+{
+	int fd, hostfds = 0, lklfds = 0;
+
+	CHECK_HOST_CALL(select);
+
+	for (fd = 0; fd < nfds; fd++) {
+		if (r != 0 && FD_ISSET(fd, r)) {
+			if (is_lklfd(fd))
+				lklfds = 1;
+			else
+				hostfds = 1;
+		}
+		if (w != 0 && FD_ISSET(fd, w)) {
+			if (is_lklfd(fd))
+				lklfds = 1;
+			else
+				hostfds = 1;
+		}
+		if (e != 0 && FD_ISSET(fd, e)) {
+			if (is_lklfd(fd))
+				lklfds = 1;
+			else
+				hostfds = 1;
+		}
+	}
+
+	/* FIXME: handle mixed case of hostfd and lklfd */
+	if (lklfds && hostfds)
+		return lkl_set_errno(-LKL_EOPNOTSUPP);
+
+	if (hostfds)
+		return host_select(nfds, r, w, e, t);
+
+	return lkl_sys_select(nfds, (lkl_fd_set *)r, (lkl_fd_set *)w,
+			      (lkl_fd_set *)e, (struct lkl_timeval *)t);
+}
+
+HOST_CALL(close);
+int close(int fd)
+{
+	CHECK_HOST_CALL(close);
+
+	if (!is_lklfd(fd)) {
+		/* handle epoll's dual_fd */
+		if ((dual_fds[fd] != -1) && lkl_running) {
+			lkl_call(__lkl__NR_close, 1, dual_fds[fd]);
+			dual_fds[fd] = -1;
+		}
+
+		return host_close(fd);
+	}
+
+	return lkl_call(__lkl__NR_close, 1, fd);
+}
+
+HOST_CALL(epoll_create);
+int epoll_create(int size)
+{
+	int host_fd;
+
+	CHECK_HOST_CALL(epoll_create);
+
+	host_fd = host_epoll_create(size);
+	if (host_fd < 0) {
+		fprintf(stderr, "%s fail (%d)\n", __func__, errno);
+		return -1;
+	}
+
+	if (!lkl_running)
+		return host_fd;
+
+	dual_fds[host_fd] = lkl_sys_epoll_create(size);
+
+	/* always returns the host fd */
+	return host_fd;
+}
+
+HOST_CALL(epoll_create1);
+int epoll_create1(int flags)
+{
+	int host_fd;
+
+	CHECK_HOST_CALL(epoll_create1);
+
+	host_fd = host_epoll_create1(flags);
+	if (host_fd < 0) {
+		fprintf(stderr, "%s fail (%d)\n", __func__, errno);
+		return -1;
+	}
+
+	if (!lkl_running)
+		return host_fd;
+
+	dual_fds[host_fd] = lkl_sys_epoll_create1(flags);
+
+	/* always returns the host fd */
+	return host_fd;
+}
+
+
+HOST_CALL(epoll_ctl);
+int epoll_ctl(int epollfd, int op, int fd, struct epoll_event *event)
+{
+	CHECK_HOST_CALL(epoll_ctl);
+
+	if (!is_lklfd(fd))
+		return host_epoll_ctl(epollfd, op, fd, event);
+
+	return lkl_call(__lkl__NR_epoll_ctl, 4, dual_fds[epollfd],
+			op, fd, event);
+}
+
+struct epollarg {
+	int epfd;
+	struct epoll_event *events;
+	int maxevents;
+	int timeout;
+	int pipefd;
+	int errnum;
+};
+
+HOST_CALL(epoll_wait)
+static void *host_epollwait(void *arg)
+{
+	struct epollarg *earg = arg;
+	int ret;
+
+	ret = host_epoll_wait(earg->epfd, earg->events,
+			      earg->maxevents, earg->timeout);
+	if (ret == -1)
+		earg->errnum = errno;
+	lkl_call(__lkl__NR_write, 3, earg->pipefd, &ret, sizeof(ret));
+
+	return (void *)(intptr_t)ret;
+}
+
+int epoll_wait(int epfd, struct epoll_event *events,
+	       int maxevents, int timeout)
+{
+	CHECK_HOST_CALL(epoll_wait);
+	CHECK_HOST_CALL(pipe2);
+
+	int l_pipe[2] = {-1, -1}, h_pipe[2] = {-1, -1};
+	struct epoll_event host_ev, lkl_ev;
+	int ret_events = maxevents;
+	struct epoll_event h_events[ret_events], l_events[ret_events];
+	struct epollarg earg;
+	pthread_t thread;
+	void *trv_val;
+	int i, ret, ret_lkl, ret_host;
+
+	ret = lkl_sys_pipe(l_pipe);
+	if (ret < 0) {
+		fprintf(stderr, "lkl pipe error(%s)\n", lkl_strerror(ret));
+		return -1;
+	}
+
+	ret = host_pipe2(h_pipe, 0);
+	if (ret == -1) {
+		fprintf(stderr, "host pipe error(errno=%d)\n", errno);
+		return -1;
+	}
+
+	if (dual_fds[epfd] == -1) {
+		fprintf(stderr, "epollfd isn't available (%d)\n", epfd);
+		abort();
+	}
+
+	/* wait pipe at host/lkl epoll_fd */
+	memset(&lkl_ev, 0, sizeof(lkl_ev));
+	lkl_ev.events = EPOLLIN;
+	lkl_ev.data.fd = l_pipe[0];
+	ret = lkl_call(__lkl__NR_epoll_ctl, 4, dual_fds[epfd], EPOLL_CTL_ADD,
+		       l_pipe[0], &lkl_ev);
+	if (ret == -1) {
+		fprintf(stderr, "epoll_ctl error(epfd=%d:%d, fd=%d, err=%d)\n",
+			epfd, dual_fds[epfd], l_pipe[0], errno);
+		return -1;
+	}
+
+	memset(&host_ev, 0, sizeof(host_ev));
+	host_ev.events = EPOLLIN;
+	host_ev.data.fd = h_pipe[0];
+	ret = host_epoll_ctl(epfd, EPOLL_CTL_ADD, h_pipe[0], &host_ev);
+	if (ret == -1) {
+		fprintf(stderr, "host epoll_ctl error(%d, %d, %d, %d)\n",
+			epfd, h_pipe[0], h_pipe[1], errno);
+		return -1;
+	}
+
+
+	/* now wait by epoll_wait on 2 threads */
+	memset(h_events, 0, sizeof(struct epoll_event) * ret_events);
+	memset(l_events, 0, sizeof(struct epoll_event) * ret_events);
+	earg.epfd = epfd;
+	earg.events = h_events;
+	earg.maxevents = maxevents;
+	earg.timeout = timeout;
+	earg.pipefd = l_pipe[1];
+	pthread_create(&thread, NULL, host_epollwait, &earg);
+
+	ret_lkl = lkl_sys_epoll_wait(dual_fds[epfd],
+				     (struct lkl_epoll_event *)l_events,
+				     maxevents, timeout);
+	if (ret_lkl < 0) {
+		fprintf(stderr,
+			"lkl_%s error(epfd=%d:%d, fd=%d, err=%d)\n",
+			__func__, epfd, dual_fds[epfd], l_pipe[0], ret_lkl);
+		return -1;
+	}
+	host_write(h_pipe[1], &ret, sizeof(ret));
+	pthread_join(thread, &trv_val);
+	ret_host = (int)(intptr_t)trv_val;
+	if (ret_host == -1) {
+		fprintf(stderr,
+			"host epoll_wait error(%d, %d, %d, %d)\n", epfd,
+			h_pipe[0], h_pipe[1], earg.errnum);
+		return -1;
+	}
+
+	ret = lkl_call(__lkl__NR_epoll_ctl, 4, dual_fds[epfd], EPOLL_CTL_DEL,
+		       l_pipe[0], &lkl_ev);
+	if (ret == -1) {
+		fprintf(stderr,
+			"lkl epoll_ctl error(epfd=%d:%d, fd=%d, err=%d)\n",
+			epfd, dual_fds[epfd], l_pipe[0], errno);
+		return -1;
+	}
+
+	ret = host_epoll_ctl(epfd, EPOLL_CTL_DEL, h_pipe[0], &host_ev);
+	if (ret == -1) {
+		fprintf(stderr, "host epoll_ctl error(%d, %d, %d, %d)\n",
+			epfd, h_pipe[0], h_pipe[1], errno);
+		return -1;
+	}
+
+	memset(events, 0, sizeof(struct epoll_event) * maxevents);
+	ret = 0;
+	if (ret_host > 0) {
+		for (i = 0; i < ret_host; i++) {
+			if (h_events[i].data.fd == h_pipe[0])
+				continue;
+			if (is_lklfd(h_events[i].data.fd))
+				continue;
+
+			memcpy(events, &(h_events[i]),
+			       sizeof(struct epoll_event));
+			events++;
+			ret++;
+		}
+	}
+	if (ret_lkl > 0) {
+		for (i = 0; i < ret_lkl; i++) {
+			if (l_events[i].data.fd == l_pipe[0])
+				continue;
+			if (!is_lklfd(l_events[i].data.fd))
+				continue;
+
+			memcpy(events, &(l_events[i]),
+			       sizeof(struct epoll_event));
+			events++;
+			ret++;
+		}
+	}
+
+	lkl_call(__lkl__NR_close, 1, l_pipe[0]);
+	lkl_call(__lkl__NR_close, 1, l_pipe[1]);
+	host_close(h_pipe[0]);
+	host_close(h_pipe[1]);
+
+	return ret;
+}
+
+int eventfd(unsigned int count, int flags)
+{
+	if (!lkl_running) {
+		int (*f)(unsigned int a, int b) = resolve_sym("eventfd");
+
+		return f(count, flags);
+	}
+
+	return lkl_sys_eventfd2(count, flags);
+}
+
+HOST_CALL(eventfd_read);
+int eventfd_read(int fd, uint64_t *value)
+{
+	CHECK_HOST_CALL(eventfd_read);
+
+	if (!is_lklfd(fd))
+		return host_eventfd_read(fd, value);
+
+	return lkl_sys_read(fd, (void *) value,
+			    sizeof(*value)) != sizeof(*value) ? -1 : 0;
+}
+
+HOST_CALL(eventfd_write);
+int eventfd_write(int fd, uint64_t value)
+{
+	CHECK_HOST_CALL(eventfd_write);
+
+	if (!is_lklfd(fd))
+		return host_eventfd_write(fd, value);
+
+	return lkl_sys_write(fd, (void *) &value,
+			     sizeof(value)) != sizeof(value) ? -1 : 0;
+}
+
+HOST_CALL(mmap)
+void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset)
+{
+	CHECK_HOST_CALL(mmap);
+
+	if (addr != NULL || flags != (MAP_ANONYMOUS|MAP_PRIVATE) ||
+	    prot != (PROT_READ|PROT_WRITE) || fd != -2 || offset != 0)
+		return (void *)host_mmap(addr, length, prot, flags, fd, offset);
+	return lkl_sys_mmap(addr, length, prot, flags, fd, offset);
+}
+
+#ifndef __ANDROID__
+HOST_CALL(__xstat64)
+int stat(const char *pathname, struct stat *buf)
+{
+	CHECK_HOST_CALL(__xstat64);
+	return host___xstat64(0, pathname, buf);
+}
+#endif
+
+ssize_t send(int fd, const void *buf, size_t len, int flags)
+{
+	return sendto(fd, buf, len, flags, 0, 0);
+}
+
+ssize_t recv(int fd, void *buf, size_t len, int flags)
+{
+	return recvfrom(fd, buf, len, flags, 0, 0);
+}
+
+extern int pipe2(int fd[2], int flag);
+int pipe(int fd[2])
+{
+	if (!lkl_running)
+		return host_calls[__lkl__NR_pipe2]((long)fd, 0, 0, 0, 0, 0);
+
+	return pipe2(fd, 0);
+
+}
diff --git a/tools/lkl/lib/hijack/init.c b/tools/lkl/lib/hijack/init.c
new file mode 100644
index 000000000000..de00f2018e59
--- /dev/null
+++ b/tools/lkl/lib/hijack/init.c
@@ -0,0 +1,252 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * system calls hijack code
+ * Copyright (c) 2015 Hajime Tazaki
+ *
+ * Author: Hajime Tazaki <tazaki@sfc.wide.ad.jp>
+ *
+ * Note: some of the code is picked from rumpkernel, written by Antti Kantee.
+ */
+
+#include <stdio.h>
+#include <net/if.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/types.h>
+#include <fcntl.h>
+#include <errno.h>
+#include <signal.h>
+#include <lkl.h>
+#include <lkl_host.h>
+#include <lkl_config.h>
+
+#include "xlate.h"
+#include "init.h"
+
+#define __USE_GNU
+#include <dlfcn.h>
+
+#define _GNU_SOURCE
+#include <sched.h>
+
+/* Mount points are named after filesystem types so they should never
+ * be longer than ~6 characters.
+ */
+#define MAX_FSTYPE_LEN 50
+
+static void PinToCpus(const cpu_set_t *cpus)
+{
+	if (sched_setaffinity(0, sizeof(cpu_set_t), cpus))
+		perror("sched_setaffinity");
+}
+
+static void PinToFirstCpu(const cpu_set_t *cpus)
+{
+	int j;
+	cpu_set_t pinto;
+
+	CPU_ZERO(&pinto);
+	for (j = 0; j < CPU_SETSIZE; j++) {
+		if (CPU_ISSET(j, cpus)) {
+			lkl_printf("LKL: Pin To CPU %d\n", j);
+			CPU_SET(j, &pinto);
+			PinToCpus(&pinto);
+			return;
+		}
+	}
+}
+
+int lkl_debug, lkl_running;
+
+static struct lkl_config *cfg;
+
+static int config_load(void)
+{
+	int len, ret = -1;
+	char *buf;
+	int fd;
+	char *path = getenv("LKL_HIJACK_CONFIG_FILE");
+
+	cfg = (struct lkl_config *)malloc(sizeof(struct lkl_config));
+	if (!cfg) {
+		perror("config malloc");
+		return -1;
+	}
+	memset(cfg, 0, sizeof(struct lkl_config));
+
+	ret = lkl_load_config_env(cfg);
+	if (ret < 0)
+		return ret;
+
+	if (path)
+		fd = open(path, O_RDONLY, 0);
+	else if (access("lkl-hijack.json", R_OK) == 0)
+		fd = open("lkl-hijack.json", O_RDONLY, 0);
+	else
+		return 0;
+	if (fd < 0) {
+		fprintf(stderr, "config_file open %s: %s\n",
+			path ? path : "lkl-hijack.json", strerror(errno));
+		return -1;
+	}
+	len = lseek(fd, 0, SEEK_END);
+	lseek(fd, 0, SEEK_SET);
+	if (len < 0) {
+		perror("config size check (lseek)");
+		return -1;
+	} else if (len == 0) {
+		return 0;
+	}
+	buf = (char *)calloc(len + 1, sizeof(char));
+	if (!buf) {
+		perror("config buf malloc");
+		return -1;
+	}
+	ret = read(fd, buf, len);
+	if (ret < 0) {
+		perror("config file read");
+		free(buf);
+		return -1;
+	}
+	ret = lkl_load_config_json(cfg, buf);
+	free(buf);
+	return ret;
+}
+
+void __attribute__((constructor))
+hijack_init(void)
+{
+	int ret, i, dev_null;
+	int single_cpu_mode = 0;
+	cpu_set_t ori_cpu;
+
+	ret = config_load();
+	if (ret < 0)
+		return;
+
+	/* reflect pre-configuration */
+	lkl_load_config_pre(cfg);
+
+	/* hijack library specific configurations */
+	if (cfg->debug)
+		lkl_register_dbg_handler();
+
+	if (lkl_debug & 0x200) {
+		char c;
+
+		printf("press 'enter' to continue\n");
+		if (scanf("%c", &c) <= 0) {
+			fprintf(stderr, "scanf() fails\n");
+			return;
+		}
+	}
+	if (cfg->single_cpu) {
+		single_cpu_mode = atoi(cfg->single_cpu);
+		switch (single_cpu_mode) {
+		case 0:
+		case 1:
+		case 2:
+			break;
+		default:
+			fprintf(stderr, "single cpu mode must be 0~2.\n");
+			single_cpu_mode = 0;
+			break;
+		}
+	}
+
+	if (single_cpu_mode) {
+		if (sched_getaffinity(0, sizeof(cpu_set_t), &ori_cpu)) {
+			perror("sched_getaffinity");
+			single_cpu_mode = 0;
+		}
+	}
+
+	/* Pin to a single cpu.
+	 * Any child threads created afterwards are pinned to the same CPU.
+	 */
+	if (single_cpu_mode == 2)
+		PinToFirstCpu(&ori_cpu);
+
+	if (single_cpu_mode == 1)
+		PinToFirstCpu(&ori_cpu);
+
+#ifdef __ANDROID__
+	struct sigaction sa;
+
+	sa.sa_handler = SIG_IGN;
+	sa.sa_flags = 0;
+	if (sigaction(32, &sa, 0) == -1) {
+		perror("sigaction");
+		exit(1);
+	}
+#endif
+
+	ret = lkl_start_kernel(&lkl_host_ops, cfg->boot_cmdline);
+	if (ret) {
+		fprintf(stderr, "can't start kernel: %s\n", lkl_strerror(ret));
+		return;
+	}
+
+	lkl_running = 1;
+
+	/* initialize epoll manage list */
+	memset(dual_fds, -1, sizeof(int) * LKL_FD_OFFSET);
+
+	/* restore cpu affinity */
+	if (single_cpu_mode)
+		PinToCpus(&ori_cpu);
+
+	ret = lkl_set_fd_limit(65535);
+	if (ret)
+		fprintf(stderr, "lkl_set_fd_limit failed: %s\n",
+			lkl_strerror(ret));
+
+	/* fill up the LKL FDs below LKL_FD_OFFSET */
+	ret = lkl_sys_mknod("/dev_null", LKL_S_IFCHR | 0600, LKL_MKDEV(1, 3));
+	dev_null = lkl_sys_open("/dev_null", LKL_O_RDONLY, 0);
+	if (dev_null < 0) {
+		fprintf(stderr, "failed to open /dev_null: %s\n",
+				lkl_strerror(dev_null));
+		return;
+	}
+
+	for (i = 1; i < LKL_FD_OFFSET; i++)
+		lkl_sys_dup(dev_null);
+
+	/* bring up the loopback interface (ifindex 1) */
+	lkl_if_up(1);
+
+	/* reflect post-configuration */
+	lkl_load_config_post(cfg);
+}
+
+void __attribute__((destructor))
+hijack_fini(void)
+{
+	int i;
+	int err;
+
+	/* The following pauses the kernel before exiting allowing one
+	 * to debug or collect statistics/diagnostic info from it.
+	 */
+	if (lkl_debug & 0x100) {
+		while (1)
+			pause();
+	}
+
+	if (!cfg)
+		return;
+
+	lkl_unload_config(cfg);
+	free(cfg);
+
+	if (!lkl_running)
+		return;
+
+	for (i = 0; i < LKL_FD_OFFSET; i++)
+		lkl_sys_close(i);
+
+	err = lkl_sys_halt();
+	if (err)
+		fprintf(stderr, "lkl_sys_halt: %s\n", lkl_strerror(err));
+}
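A note on the config loading in init.c above: the buffer handed to
lkl_load_config_json() has to be a NUL-terminated string, and read() may
return fewer bytes than requested. The following is only a minimal sketch of
one way to cover both points, not part of the patch. The helper name
read_file_nul_terminated() is hypothetical, and it uses fstat() instead of
lseek() to get the size.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not in this patch): read the whole file at `path`
 * into a freshly allocated, NUL-terminated buffer.
 * Returns the buffer (caller frees) or NULL on error; if out_len is
 * non-NULL it receives the number of bytes read.
 */
static char *read_file_nul_terminated(const char *path, size_t *out_len)
{
	struct stat st;
	size_t done = 0;
	char *buf;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return NULL;

	if (fstat(fd, &st) < 0 || st.st_size < 0) {
		close(fd);
		return NULL;
	}

	/* one extra byte for the terminating NUL */
	buf = malloc((size_t)st.st_size + 1);
	if (!buf) {
		close(fd);
		return NULL;
	}

	/* read() may return a short count, so loop until done */
	while (done < (size_t)st.st_size) {
		ssize_t n = read(fd, buf + done, (size_t)st.st_size - done);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			free(buf);
			close(fd);
			return NULL;
		}
		if (n == 0)	/* file shrank while reading */
			break;
		done += (size_t)n;
	}
	close(fd);

	buf[done] = '\0';
	if (out_len)
		*out_len = done;
	return buf;
}
```

With such a helper, config_load() could pass the returned buffer straight to
lkl_load_config_json() and free it afterwards; the descriptor is closed on
every path.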
diff --git a/tools/lkl/lib/hijack/init.h b/tools/lkl/lib/hijack/init.h
new file mode 100644
index 000000000000..c4039e018b2b
--- /dev/null
+++ b/tools/lkl/lib/hijack/init.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_HIJACK_INIT_H
+#define _LKL_HIJACK_INIT_H
+
+extern int lkl_running;
+extern int dual_fds[];
+
+#endif /*_LKL_HIJACK_INIT_H */
diff --git a/tools/lkl/lib/hijack/xlate.c b/tools/lkl/lib/hijack/xlate.c
new file mode 100644
index 000000000000..b96a0107116a
--- /dev/null
+++ b/tools/lkl/lib/hijack/xlate.c
@@ -0,0 +1,613 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <errno.h>
+#define __USE_GNU
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#include <sys/socket.h>
+#undef st_atime
+#undef st_mtime
+#undef st_ctime
+#include <lkl.h>
+
+#include "xlate.h"
+
+long lkl_set_errno(long err)
+{
+	if (err >= 0)
+		return err;
+
+	switch (err) {
+	case -LKL_EPERM:
+		errno = EPERM;
+		break;
+	case -LKL_ENOENT:
+		errno = ENOENT;
+		break;
+	case -LKL_ESRCH:
+		errno = ESRCH;
+		break;
+	case -LKL_EINTR:
+		errno = EINTR;
+		break;
+	case -LKL_EIO:
+		errno = EIO;
+		break;
+	case -LKL_ENXIO:
+		errno = ENXIO;
+		break;
+	case -LKL_E2BIG:
+		errno = E2BIG;
+		break;
+	case -LKL_ENOEXEC:
+		errno = ENOEXEC;
+		break;
+	case -LKL_EBADF:
+		errno = EBADF;
+		break;
+	case -LKL_ECHILD:
+		errno = ECHILD;
+		break;
+	case -LKL_EAGAIN:
+		errno = EAGAIN;
+		break;
+	case -LKL_ENOMEM:
+		errno = ENOMEM;
+		break;
+	case -LKL_EACCES:
+		errno = EACCES;
+		break;
+	case -LKL_EFAULT:
+		errno = EFAULT;
+		break;
+	case -LKL_ENOTBLK:
+		errno = ENOTBLK;
+		break;
+	case -LKL_EBUSY:
+		errno = EBUSY;
+		break;
+	case -LKL_EEXIST:
+		errno = EEXIST;
+		break;
+	case -LKL_EXDEV:
+		errno = EXDEV;
+		break;
+	case -LKL_ENODEV:
+		errno = ENODEV;
+		break;
+	case -LKL_ENOTDIR:
+		errno = ENOTDIR;
+		break;
+	case -LKL_EISDIR:
+		errno = EISDIR;
+		break;
+	case -LKL_EINVAL:
+		errno = EINVAL;
+		break;
+	case -LKL_ENFILE:
+		errno = ENFILE;
+		break;
+	case -LKL_EMFILE:
+		errno = EMFILE;
+		break;
+	case -LKL_ENOTTY:
+		errno = ENOTTY;
+		break;
+	case -LKL_ETXTBSY:
+		errno = ETXTBSY;
+		break;
+	case -LKL_EFBIG:
+		errno = EFBIG;
+		break;
+	case -LKL_ENOSPC:
+		errno = ENOSPC;
+		break;
+	case -LKL_ESPIPE:
+		errno = ESPIPE;
+		break;
+	case -LKL_EROFS:
+		errno = EROFS;
+		break;
+	case -LKL_EMLINK:
+		errno = EMLINK;
+		break;
+	case -LKL_EPIPE:
+		errno = EPIPE;
+		break;
+	case -LKL_EDOM:
+		errno = EDOM;
+		break;
+	case -LKL_ERANGE:
+		errno = ERANGE;
+		break;
+	case -LKL_EDEADLK:
+		errno = EDEADLK;
+		break;
+	case -LKL_ENAMETOOLONG:
+		errno = ENAMETOOLONG;
+		break;
+	case -LKL_ENOLCK:
+		errno = ENOLCK;
+		break;
+	case -LKL_ENOSYS:
+		errno = ENOSYS;
+		break;
+	case -LKL_ENOTEMPTY:
+		errno = ENOTEMPTY;
+		break;
+	case -LKL_ELOOP:
+		errno = ELOOP;
+		break;
+	case -LKL_ENOMSG:
+		errno = ENOMSG;
+		break;
+	case -LKL_EIDRM:
+		errno = EIDRM;
+		break;
+	case -LKL_ECHRNG:
+		errno = ECHRNG;
+		break;
+	case -LKL_EL2NSYNC:
+		errno = EL2NSYNC;
+		break;
+	case -LKL_EL3HLT:
+		errno = EL3HLT;
+		break;
+	case -LKL_EL3RST:
+		errno = EL3RST;
+		break;
+	case -LKL_ELNRNG:
+		errno = ELNRNG;
+		break;
+	case -LKL_EUNATCH:
+		errno = EUNATCH;
+		break;
+	case -LKL_ENOCSI:
+		errno = ENOCSI;
+		break;
+	case -LKL_EL2HLT:
+		errno = EL2HLT;
+		break;
+	case -LKL_EBADE:
+		errno = EBADE;
+		break;
+	case -LKL_EBADR:
+		errno = EBADR;
+		break;
+	case -LKL_EXFULL:
+		errno = EXFULL;
+		break;
+	case -LKL_ENOANO:
+		errno = ENOANO;
+		break;
+	case -LKL_EBADRQC:
+		errno = EBADRQC;
+		break;
+	case -LKL_EBADSLT:
+		errno = EBADSLT;
+		break;
+	case -LKL_EBFONT:
+		errno = EBFONT;
+		break;
+	case -LKL_ENOSTR:
+		errno = ENOSTR;
+		break;
+	case -LKL_ENODATA:
+		errno = ENODATA;
+		break;
+	case -LKL_ETIME:
+		errno = ETIME;
+		break;
+	case -LKL_ENOSR:
+		errno = ENOSR;
+		break;
+	case -LKL_ENONET:
+		errno = ENONET;
+		break;
+	case -LKL_ENOPKG:
+		errno = ENOPKG;
+		break;
+	case -LKL_EREMOTE:
+		errno = EREMOTE;
+		break;
+	case -LKL_ENOLINK:
+		errno = ENOLINK;
+		break;
+	case -LKL_EADV:
+		errno = EADV;
+		break;
+	case -LKL_ESRMNT:
+		errno = ESRMNT;
+		break;
+	case -LKL_ECOMM:
+		errno = ECOMM;
+		break;
+	case -LKL_EPROTO:
+		errno = EPROTO;
+		break;
+	case -LKL_EMULTIHOP:
+		errno = EMULTIHOP;
+		break;
+	case -LKL_EDOTDOT:
+		errno = EDOTDOT;
+		break;
+	case -LKL_EBADMSG:
+		errno = EBADMSG;
+		break;
+	case -LKL_EOVERFLOW:
+		errno = EOVERFLOW;
+		break;
+	case -LKL_ENOTUNIQ:
+		errno = ENOTUNIQ;
+		break;
+	case -LKL_EBADFD:
+		errno = EBADFD;
+		break;
+	case -LKL_EREMCHG:
+		errno = EREMCHG;
+		break;
+	case -LKL_ELIBACC:
+		errno = ELIBACC;
+		break;
+	case -LKL_ELIBBAD:
+		errno = ELIBBAD;
+		break;
+	case -LKL_ELIBSCN:
+		errno = ELIBSCN;
+		break;
+	case -LKL_ELIBMAX:
+		errno = ELIBMAX;
+		break;
+	case -LKL_ELIBEXEC:
+		errno = ELIBEXEC;
+		break;
+	case -LKL_EILSEQ:
+		errno = EILSEQ;
+		break;
+	case -LKL_ERESTART:
+		errno = ERESTART;
+		break;
+	case -LKL_ESTRPIPE:
+		errno = ESTRPIPE;
+		break;
+	case -LKL_EUSERS:
+		errno = EUSERS;
+		break;
+	case -LKL_ENOTSOCK:
+		errno = ENOTSOCK;
+		break;
+	case -LKL_EDESTADDRREQ:
+		errno = EDESTADDRREQ;
+		break;
+	case -LKL_EMSGSIZE:
+		errno = EMSGSIZE;
+		break;
+	case -LKL_EPROTOTYPE:
+		errno = EPROTOTYPE;
+		break;
+	case -LKL_ENOPROTOOPT:
+		errno = ENOPROTOOPT;
+		break;
+	case -LKL_EPROTONOSUPPORT:
+		errno = EPROTONOSUPPORT;
+		break;
+	case -LKL_ESOCKTNOSUPPORT:
+		errno = ESOCKTNOSUPPORT;
+		break;
+	case -LKL_EOPNOTSUPP:
+		errno = EOPNOTSUPP;
+		break;
+	case -LKL_EPFNOSUPPORT:
+		errno = EPFNOSUPPORT;
+		break;
+	case -LKL_EAFNOSUPPORT:
+		errno = EAFNOSUPPORT;
+		break;
+	case -LKL_EADDRINUSE:
+		errno = EADDRINUSE;
+		break;
+	case -LKL_EADDRNOTAVAIL:
+		errno = EADDRNOTAVAIL;
+		break;
+	case -LKL_ENETDOWN:
+		errno = ENETDOWN;
+		break;
+	case -LKL_ENETUNREACH:
+		errno = ENETUNREACH;
+		break;
+	case -LKL_ENETRESET:
+		errno = ENETRESET;
+		break;
+	case -LKL_ECONNABORTED:
+		errno = ECONNABORTED;
+		break;
+	case -LKL_ECONNRESET:
+		errno = ECONNRESET;
+		break;
+	case -LKL_ENOBUFS:
+		errno = ENOBUFS;
+		break;
+	case -LKL_EISCONN:
+		errno = EISCONN;
+		break;
+	case -LKL_ENOTCONN:
+		errno = ENOTCONN;
+		break;
+	case -LKL_ESHUTDOWN:
+		errno = ESHUTDOWN;
+		break;
+	case -LKL_ETOOMANYREFS:
+		errno = ETOOMANYREFS;
+		break;
+	case -LKL_ETIMEDOUT:
+		errno = ETIMEDOUT;
+		break;
+	case -LKL_ECONNREFUSED:
+		errno = ECONNREFUSED;
+		break;
+	case -LKL_EHOSTDOWN:
+		errno = EHOSTDOWN;
+		break;
+	case -LKL_EHOSTUNREACH:
+		errno = EHOSTUNREACH;
+		break;
+	case -LKL_EALREADY:
+		errno = EALREADY;
+		break;
+	case -LKL_EINPROGRESS:
+		errno = EINPROGRESS;
+		break;
+	case -LKL_ESTALE:
+		errno = ESTALE;
+		break;
+	case -LKL_EUCLEAN:
+		errno = EUCLEAN;
+		break;
+	case -LKL_ENOTNAM:
+		errno = ENOTNAM;
+		break;
+	case -LKL_ENAVAIL:
+		errno = ENAVAIL;
+		break;
+	case -LKL_EISNAM:
+		errno = EISNAM;
+		break;
+	case -LKL_EREMOTEIO:
+		errno = EREMOTEIO;
+		break;
+	case -LKL_EDQUOT:
+		errno = EDQUOT;
+		break;
+	case -LKL_ENOMEDIUM:
+		errno = ENOMEDIUM;
+		break;
+	case -LKL_EMEDIUMTYPE:
+		errno = EMEDIUMTYPE;
+		break;
+	case -LKL_ECANCELED:
+		errno = ECANCELED;
+		break;
+	case -LKL_ENOKEY:
+		errno = ENOKEY;
+		break;
+	case -LKL_EKEYEXPIRED:
+		errno = EKEYEXPIRED;
+		break;
+	case -LKL_EKEYREVOKED:
+		errno = EKEYREVOKED;
+		break;
+	case -LKL_EKEYREJECTED:
+		errno = EKEYREJECTED;
+		break;
+	case -LKL_EOWNERDEAD:
+		errno = EOWNERDEAD;
+		break;
+	case -LKL_ENOTRECOVERABLE:
+		errno = ENOTRECOVERABLE;
+		break;
+	case -LKL_ERFKILL:
+		errno = ERFKILL;
+		break;
+	case -LKL_EHWPOISON:
+		errno = EHWPOISON;
+		break;
+	}
+
+	return -1;
+}
+
+int lkl_soname_xlate(int soname)
+{
+	switch (soname) {
+	case SO_DEBUG:
+		return LKL_SO_DEBUG;
+	case SO_REUSEADDR:
+		return LKL_SO_REUSEADDR;
+	case SO_TYPE:
+		return LKL_SO_TYPE;
+	case SO_ERROR:
+		return LKL_SO_ERROR;
+	case SO_DONTROUTE:
+		return LKL_SO_DONTROUTE;
+	case SO_BROADCAST:
+		return LKL_SO_BROADCAST;
+	case SO_SNDBUF:
+		return LKL_SO_SNDBUF;
+	case SO_RCVBUF:
+		return LKL_SO_RCVBUF;
+	case SO_SNDBUFFORCE:
+		return LKL_SO_SNDBUFFORCE;
+	case SO_RCVBUFFORCE:
+		return LKL_SO_RCVBUFFORCE;
+	case SO_KEEPALIVE:
+		return LKL_SO_KEEPALIVE;
+	case SO_OOBINLINE:
+		return LKL_SO_OOBINLINE;
+	case SO_NO_CHECK:
+		return LKL_SO_NO_CHECK;
+	case SO_PRIORITY:
+		return LKL_SO_PRIORITY;
+	case SO_LINGER:
+		return LKL_SO_LINGER;
+	case SO_BSDCOMPAT:
+		return LKL_SO_BSDCOMPAT;
+#ifdef SO_REUSEPORT
+	case SO_REUSEPORT:
+		return LKL_SO_REUSEPORT;
+#endif
+	case SO_PASSCRED:
+		return LKL_SO_PASSCRED;
+	case SO_PEERCRED:
+		return LKL_SO_PEERCRED;
+	case SO_RCVLOWAT:
+		return LKL_SO_RCVLOWAT;
+	case SO_SNDLOWAT:
+		return LKL_SO_SNDLOWAT;
+	case SO_RCVTIMEO:
+		return LKL_SO_RCVTIMEO;
+	case SO_SNDTIMEO:
+		return LKL_SO_SNDTIMEO;
+	case SO_SECURITY_AUTHENTICATION:
+		return LKL_SO_SECURITY_AUTHENTICATION;
+	case SO_SECURITY_ENCRYPTION_TRANSPORT:
+		return LKL_SO_SECURITY_ENCRYPTION_TRANSPORT;
+	case SO_SECURITY_ENCRYPTION_NETWORK:
+		return LKL_SO_SECURITY_ENCRYPTION_NETWORK;
+	case SO_BINDTODEVICE:
+		return LKL_SO_BINDTODEVICE;
+	case SO_ATTACH_FILTER:
+		return LKL_SO_ATTACH_FILTER;
+	case SO_DETACH_FILTER:
+		return LKL_SO_DETACH_FILTER;
+	case SO_PEERNAME:
+		return LKL_SO_PEERNAME;
+	case SO_TIMESTAMP:
+		return LKL_SO_TIMESTAMP;
+	case SO_ACCEPTCONN:
+		return LKL_SO_ACCEPTCONN;
+	case SO_PEERSEC:
+		return LKL_SO_PEERSEC;
+	case SO_PASSSEC:
+		return LKL_SO_PASSSEC;
+	case SO_TIMESTAMPNS:
+		return LKL_SO_TIMESTAMPNS;
+	case SO_MARK:
+		return LKL_SO_MARK;
+	case SO_TIMESTAMPING:
+		return LKL_SO_TIMESTAMPING;
+	case SO_PROTOCOL:
+		return LKL_SO_PROTOCOL;
+	case SO_DOMAIN:
+		return LKL_SO_DOMAIN;
+	case SO_RXQ_OVFL:
+		return LKL_SO_RXQ_OVFL;
+#ifdef SO_WIFI_STATUS
+	case SO_WIFI_STATUS:
+		return LKL_SO_WIFI_STATUS;
+#endif
+#ifdef SO_PEEK_OFF
+	case SO_PEEK_OFF:
+		return LKL_SO_PEEK_OFF;
+#endif
+#ifdef SO_NOFCS
+	case SO_NOFCS:
+		return LKL_SO_NOFCS;
+#endif
+#ifdef SO_LOCK_FILTER
+	case SO_LOCK_FILTER:
+		return LKL_SO_LOCK_FILTER;
+#endif
+#ifdef SO_SELECT_ERR_QUEUE
+	case SO_SELECT_ERR_QUEUE:
+		return LKL_SO_SELECT_ERR_QUEUE;
+#endif
+#ifdef SO_BUSY_POLL
+	case SO_BUSY_POLL:
+		return LKL_SO_BUSY_POLL;
+#endif
+#ifdef SO_MAX_PACING_RATE
+	case SO_MAX_PACING_RATE:
+		return LKL_SO_MAX_PACING_RATE;
+#endif
+	}
+
+	return soname;
+}
+
+int lkl_solevel_xlate(int solevel)
+{
+	switch (solevel) {
+	case SOL_SOCKET:
+		return LKL_SOL_SOCKET;
+	}
+
+	return solevel;
+}
+
+unsigned long lkl_ioctl_req_xlate(unsigned long req)
+{
+	switch (req) {
+	case FIOSETOWN:
+		return LKL_FIOSETOWN;
+	case SIOCSPGRP:
+		return LKL_SIOCSPGRP;
+	case FIOGETOWN:
+		return LKL_FIOGETOWN;
+	case SIOCGPGRP:
+		return LKL_SIOCGPGRP;
+	case SIOCATMARK:
+		return LKL_SIOCATMARK;
+	case SIOCGSTAMP:
+		return LKL_SIOCGSTAMP;
+	case SIOCGSTAMPNS:
+		return LKL_SIOCGSTAMPNS;
+	}
+
+	/* TODO: asm/termios.h translations */
+
+	return req;
+}
+
+int lkl_fcntl_cmd_xlate(int cmd)
+{
+	switch (cmd) {
+	case F_DUPFD:
+		return LKL_F_DUPFD;
+	case F_GETFD:
+		return LKL_F_GETFD;
+	case F_SETFD:
+		return LKL_F_SETFD;
+	case F_GETFL:
+		return LKL_F_GETFL;
+	case F_SETFL:
+		return LKL_F_SETFL;
+	case F_GETLK:
+		return LKL_F_GETLK;
+	case F_SETLK:
+		return LKL_F_SETLK;
+	case F_SETLKW:
+		return LKL_F_SETLKW;
+	case F_SETOWN:
+		return LKL_F_SETOWN;
+	case F_GETOWN:
+		return LKL_F_GETOWN;
+	case F_SETSIG:
+		return LKL_F_SETSIG;
+	case F_GETSIG:
+		return LKL_F_GETSIG;
+#ifndef LKL_CONFIG_64BIT
+	case F_GETLK64:
+		return LKL_F_GETLK64;
+	case F_SETLK64:
+		return LKL_F_SETLK64;
+	case F_SETLKW64:
+		return LKL_F_SETLKW64;
+#endif
+	case F_SETOWN_EX:
+		return LKL_F_SETOWN_EX;
+	case F_GETOWN_EX:
+		return LKL_F_GETOWN_EX;
+	}
+
+	return cmd;
+}
+
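A note on lkl_set_errno() above: the switch has no default case, so an error
code that is not listed returns -1 without touching errno. Below is a minimal
sketch of a table-driven variant with a fallback, offered only as an
illustration. The GUEST_E* constants are hypothetical stand-ins for the
LKL_E* values (so the sketch builds without lkl.h), the function name is made
up, and falling back to EINVAL is an assumption of this sketch rather than
something the patch does.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for LKL_E* (Linux asm-generic numbering). */
enum {
	GUEST_ENOENT = 2,
	GUEST_EAGAIN = 11,
	GUEST_ECONNREFUSED = 111,
};

struct errno_map {
	long guest;	/* positive guest (LKL) errno */
	int host;	/* matching host errno */
};

static const struct errno_map errno_table[] = {
	{ GUEST_ENOENT, ENOENT },
	{ GUEST_EAGAIN, EAGAIN },
	{ GUEST_ECONNREFUSED, ECONNREFUSED },
};

/* Same contract as lkl_set_errno(): non-negative values pass through,
 * negative values set errno and return -1. Unknown codes fall back to
 * EINVAL (assumption) so that errno is never left stale.
 */
static long set_errno_from_table(long err)
{
	size_t i;

	if (err >= 0)
		return err;

	errno = EINVAL;
	for (i = 0; i < sizeof(errno_table) / sizeof(errno_table[0]); i++) {
		if (errno_table[i].guest == -err) {
			errno = errno_table[i].host;
			break;
		}
	}
	return -1;
}
```

The table keeps each mapping on one line, at the cost of a linear scan on the
error path only.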
diff --git a/tools/lkl/lib/hijack/xlate.h b/tools/lkl/lib/hijack/xlate.h
new file mode 100644
index 000000000000..0c0281f241a6
--- /dev/null
+++ b/tools/lkl/lib/hijack/xlate.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_HIJACK_XLATE_H
+#define _LKL_HIJACK_XLATE_H
+
+long lkl_set_errno(long err);
+int lkl_soname_xlate(int soname);
+int lkl_solevel_xlate(int solevel);
+unsigned long lkl_ioctl_req_xlate(unsigned long req);
+int lkl_fcntl_cmd_xlate(int cmd);
+
+#define LKL_FD_OFFSET (FD_SETSIZE/2)
+
+#endif /* _LKL_HIJACK_XLATE_H */
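A note on LKL_FD_OFFSET above: hijack_init() opens /dev_null inside LKL and
dup()s it until the LKL descriptors below LKL_FD_OFFSET are used up, so any
descriptor LKL hands out later is at or above that value. The sketch below
shows how a wrapper could use that split to decide where a call goes. It is
an assumption about the dispatch logic, which is not part of this section;
the names FD_OFFSET, kernel_running and fd_belongs_to_lkl() are hypothetical.

```c
#include <sys/select.h>	/* FD_SETSIZE */

/* Mirrors LKL_FD_OFFSET from xlate.h. */
#define FD_OFFSET (FD_SETSIZE / 2)

/* Stand-in for lkl_running from init.h. */
static int kernel_running;

/* Returns nonzero if `fd` should be served by LKL rather than the host:
 * host descriptors stay below FD_OFFSET, LKL ones start at FD_OFFSET.
 */
static int fd_belongs_to_lkl(int fd)
{
	return kernel_running && fd >= FD_OFFSET;
}
```

This only works while the host process keeps fewer than FD_OFFSET
descriptors open, which is the trade-off of partitioning one number space.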
diff --git a/tools/lkl/tests/hijack-test.sh b/tools/lkl/tests/hijack-test.sh
new file mode 100755
index 000000000000..a62aa5b251e0
--- /dev/null
+++ b/tools/lkl/tests/hijack-test.sh
@@ -0,0 +1,760 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+
+clear_wdir()
+{
+    test -f ${VDESWITCH}.pid && kill $(cat ${VDESWITCH}.pid)
+    rm -rf ${wdir}
+    tap_cleanup
+    tap_cleanup 1
+}
+
+set_cfgjson()
+{
+    cfgjson=${wdir}/hijack-test$1.conf
+
+    if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+        adb shell cat \> ${cfgjson}
+    else
+        cat > ${cfgjson}
+    fi
+
+    export_vars cfgjson
+}
+
+run_hijack_cfg()
+{
+    lkl_test_cmd LKL_HIJACK_CONFIG_FILE=$cfgjson $hijack $@
+}
+
+run_hijack()
+{
+    lkl_test_cmd $hijack $@
+}
+
+run_netperf()
+{
+    lkl_test_cmd TEST_NETSERVER_PORT=$TEST_NETSERVER_PORT \
+                 LKL_HIJACK_CONFIG_FILE=$cfgjson $netperf $@
+}
+
+test_ping()
+{
+    set -e
+
+    run_hijack ${ping} -c 1 127.0.0.1
+}
+
+test_ping6()
+{
+    set -e
+
+    run_hijack ${ping6} -c 1 ::1
+}
+
+test_mount_and_dump()
+{
+    set -e
+
+    if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+        echo "TODO: android-23 doesn't call destructor..."
+        return $TEST_SKIP
+    fi
+
+    set_cfgjson << EOF
+    {
+        "mount":"proc,sysfs",
+        "dump":"/sysfs/class/net/lo/mtu,/sysfs/class/net/lo/dev_id",
+        "debug": "1"
+    }
+EOF
+
+    ans=$(run_hijack_cfg $(lkl_test_cmd which true))
+    echo "$ans"
+    echo "$ans" | grep "^65536" # lo's MTU
+    echo "$ans" | grep "0x0" # lo's dev_id
+}
+
+test_boot_cmdline()
+{
+    set -e
+
+    set_cfgjson << EOF
+    {
+        "debug":"1",
+        "boot_cmdline":"loglevel=1"
+    }
+EOF
+
+    ans=$(run_hijack_cfg $(lkl_test_cmd which true))
+    echo "$ans"
+    [ $(echo "$ans" | wc -l) = 1 ]
+}
+
+
+test_pipe_setup()
+{
+    set -e
+
+    mkfifo ${fifo1}
+    mkfifo ${fifo2}
+
+    set_cfgjson << EOF
+    {
+        "interfaces":
+        [
+            {
+                "type":"pipe",
+                "param":"${fifo1}|${fifo2}",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "mac":"$TEST_MAC0"
+            }
+        ]
+    }
+EOF
+
+    # Make sure our device has the addresses we expect
+    addr=$(run_hijack_cfg ip addr)
+    echo "$addr" | grep eth0
+    echo "$addr" | grep $(ip_lkl)
+    echo "$addr" | grep "$TEST_MAC0"
+}
+
+test_pipe_ping()
+{
+    set -e
+
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_lkl)",
+        "gateway6":"$(ip6_lkl)",
+        "interfaces":
+        [
+            {
+                "type":"pipe",
+                "param":"${fifo1}|${fifo2}",
+                "ip":"$(ip_host)",
+                "masklen":"$TEST_IP_NETMASK",
+                "mac":"$TEST_MAC0",
+                "ipv6":"$(ip6_host)",
+                "masklen6":"$TEST_IP6_NETMASK"
+            }
+        ]
+    }
+EOF
+
+    run_hijack_cfg $(lkl_test_cmd which sleep) 10 &
+
+    set_cfgjson 2 << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"pipe",
+                "param":"${fifo2}|${fifo1}",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "mac":"$TEST_MAC0",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK"
+            }
+        ]
+    }
+EOF
+
+    # Ping under LKL
+    run_hijack_cfg ${ping} -c 1 -w 10 $(ip_host)
+
+    # Ping 6 under LKL
+    run_hijack_cfg ${ping6} -c 1 -w 10 $(ip6_host)
+
+    wait
+}
+
+test_tap_setup()
+{
+    set -e
+
+    # Set up the TAP device we'd like to use
+    tap_setup
+
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "debug":"1",
+        "interfaces": [
+            {
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "mac": "$TEST_MAC0"
+            }
+        ]
+    }
+EOF
+
+    # Make sure our device has the addresses we expect
+    addr=$(run_hijack_cfg ip addr)
+    echo "$addr" | grep eth0
+    echo "$addr" | grep $(ip_lkl)
+    echo "$addr" | grep "$TEST_MAC0"
+    echo "$addr" | grep "$(ip6_lkl)"
+    ! echo "$addr" | grep "WARN: failed to free"
+}
+
+test_tap_cleanup()
+{
+    tap_cleanup
+    tap_cleanup 1
+}
+
+test_tap_ping_host()
+{
+    set -e
+
+    # Make sure we can ping the host from inside LKL
+    run_hijack_cfg ${ping} -c 1 $(ip_host)
+    run_hijack_cfg ${ping6} -c 1 $(ip6_host)
+}
+
+test_tap_ping_lkl()
+{
+    set -e
+
+    # Now let's check that the host can see LKL.
+    lkl_test_cmd sudo ip -6 neigh del $(ip6_lkl) dev $(tap_ifname)
+    lkl_test_cmd sudo ip neigh del $(ip_lkl) dev $(tap_ifname)
+    run_hijack_cfg $(lkl_test_cmd which sleep) 3 &
+    sleep 2
+    lkl_test_cmd sudo ping -i 0.01 -c 65 $(ip_lkl)
+    lkl_test_cmd sudo ping6 -i 0.01 -c 65 $(ip6_lkl)
+}
+
+test_tap_neighbours()
+{
+    set -e
+
+    neigh1="$(ip_add 100)|12:34:56:78:9a:bc"
+    neigh2="$(ip6_add 100)|12:34:56:78:9a:be"
+
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "neigh":"${neigh1};${neigh2}"
+            }
+        ]
+    }
+EOF
+
+    # add neighbor entries
+    ans=$(run_hijack_cfg ip neighbor show) || true
+    echo "$ans" | tail -n 15 | grep "12:34:56:78:9a:bc"
+    echo "$ans" | tail -n 15 | grep "12:34:56:78:9a:be"
+
+    # gateway
+    ans=$(run_hijack_cfg ip route show) || true
+    echo "$ans" | tail -n 15 | grep "$(ip_host)"
+
+    # gateway v6
+    ans=$(run_hijack_cfg ip -6 route show) || true
+    echo "$ans" | tail -n 15 | grep "$(ip6_host)"
+}
+
+test_tap_netperf_stream_tso_csum()
+{
+    set -e
+
+    # offload
+    # LKL_VIRTIO_NET_F_HOST_TSO4 && LKL_VIRTIO_NET_F_GUEST_TSO4
+    # LKL_VIRTIO_NET_F_CSUM && LKL_VIRTIO_NET_F_GUEST_CSUM
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "offload":"0x883",
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK"
+            }
+        ]
+    }
+EOF
+
+    run_netperf $(ip_host) TCP_STREAM
+}
+
+test_tap_netperf_maerts_csum_tso()
+{
+    run_netperf $(ip_host) TCP_MAERTS
+}
+
+test_tap_netperf_stream_csum_tso_mrgrxbuf()
+{
+    set -e
+
+    # offload
+    # LKL_VIRTIO_NET_F_HOST_TSO4 && LKL_VIRTIO_NET_F_MRG_RXBUF
+    # LKL_VIRTIO_NET_F_CSUM && LKL_VIRTIO_NET_F_GUEST_CSUM
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "offload":"0x8803",
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK"
+            }
+        ]
+    }
+EOF
+
+    run_netperf $(ip_host) TCP_MAERTS
+}
+
+test_tap_netperf_tcp_rr()
+{
+    set -e
+
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK"
+            }
+        ]
+    }
+EOF
+
+    run_netperf $(ip_host) TCP_RR
+}
+
+test_tap_netperf_tcp_stream()
+{
+    set -e
+
+    run_netperf $(ip_host) TCP_STREAM
+}
+
+test_tap_netperf_tcp_maerts()
+{
+    set -e
+
+    run_netperf $(ip_host) TCP_MAERTS
+}
+
+
+test_tap_qdisc()
+{
+    set -e
+
+    if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+        return $TEST_SKIP
+    fi
+
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "mac":"$TEST_MAC0",
+                "qdisc":"root|fq"
+            }
+        ]
+    }
+EOF
+
+    qdisc=$(run_hijack_cfg tc -s -d qdisc show)
+    echo "$qdisc"
+    echo "$qdisc" | grep "qdisc fq" > /dev/null
+    echo "$qdisc" | grep throttled > /dev/null
+}
+
+test_tap_multi_if_setup()
+{
+    set -e
+
+    # Set up 2nd TAP device we'd like to use
+    tap_setup 1
+
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "mac":"$TEST_MAC0"
+            },
+            {
+                "type":"tap",
+                "param":"$(tap_ifname 1)",
+                "ip":"$(ip_lkl 1)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl 1)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "mac":"$TEST_MAC1"
+            }
+        ]
+    }
+EOF
+
+    # Make sure our device has the addresses we expect
+    addr=$(run_hijack_cfg ip addr)
+    echo "$addr" | grep eth0
+    echo "$addr" | grep $(ip_lkl)
+    echo "$addr" | grep "$TEST_MAC0"
+    echo "$addr" | grep "$(ip6_lkl)"
+    echo "$addr" | grep eth1
+    echo "$addr" | grep $(ip_lkl 1)
+    echo "$addr" | grep "$TEST_MAC1"
+    echo "$addr" | grep "$(ip6_lkl 1)"
+    ! echo "$addr" | grep "WARN: failed to free"
+}
+
+test_tap_multi_if_ping()
+{
+    run_hijack_cfg ${ping} -c 1 $(ip_host)
+    run_hijack_cfg ${ping6} -c 1 $(ip6_host)
+    run_hijack_cfg ${ping} -c 1 $(ip_host 1)
+    run_hijack_cfg ${ping6} -c 1 $(ip6_host 1)
+}
+
+test_tap_multi_if_neigh()
+{
+
+    neigh1="$(ip_host)00|12:34:56:78:9a:bc"
+    neigh2="$(ip6_host)00|12:34:56:78:9a:be"
+    neigh3="$(ip_host 1)00|12:34:56:78:9a:bd"
+    neigh4="$(ip6_host 1)00|12:34:56:78:9a:bf"
+
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "mac":"$TEST_MAC0",
+                "neigh":"${neigh1};${neigh2}"
+            },
+            {
+                "type":"tap",
+                "param":"$(tap_ifname 1)",
+                "ip":"$(ip_lkl 1)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl 1)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "mac":"$TEST_MAC1",
+                "neigh":"${neigh3};${neigh4}"
+            }
+        ]
+    }
+EOF
+
+    # add neighbor entries
+    ans=$(run_hijack_cfg ip neighbor show) || true
+    echo "$ans" | tail -n 15 | grep "12:34:56:78:9a:bc"
+    echo "$ans" | tail -n 15 | grep "12:34:56:78:9a:be"
+    echo "$ans" | tail -n 15 | grep "12:34:56:78:9a:bd"
+    echo "$ans" | tail -n 15 | grep "12:34:56:78:9a:bf"
+}
+
+test_tap_multi_if_gateway()
+{
+    ans=$(run_hijack_cfg ip route show) || true
+    echo "$ans" | tail -n 15 | grep "$(ip_host)"
+}
+
+test_tap_multi_if_gateway_v6()
+{
+    ans=$(run_hijack_cfg ip -6 route show) || true
+    echo "$ans" | tail -n 15 | grep "$(ip6_host)"
+}
+
+
+test_tap_multitable_setup()
+{
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ifgateway":"$(ip_host)",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "ifgateway6":"$(ip6_host)",
+                "mac":"$TEST_MAC0",
+                "neigh":"${neigh1};${neigh2}"
+            },
+            {
+                "type":"tap",
+                "param":"$(tap_ifname 1)",
+                "ip":"$(ip_lkl 1)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ifgateway":"$(ip_host 1)",
+                "ipv6":"$(ip6_lkl 1)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "ifgateway6":"$(ip6_host 1)",
+                "mac":"$TEST_MAC1",
+                "neigh":"${neigh3};${neigh4}"
+            }
+        ]
+    }
+EOF
+}
+
+test_tap_multitable_ipv4_rule()
+{
+    addr=$(run_hijack_cfg ip rule show)
+    echo "$addr" | grep $(ip_lkl)
+    echo "$addr" | grep $(ip_lkl 1)
+}
+
+test_tap_multitable_ipv6_rule()
+{
+    addr=$(run_hijack_cfg ip -6 rule show)
+    echo "$addr" | grep $(ip6_lkl)
+    echo "$addr" | grep $(ip6_lkl 1)
+}
+
+test_tap_multitable_ipv4_rule_table_4()
+{
+    addr=$(run_hijack_cfg ip route show table 4)
+    echo "$addr" | grep $(ip_host)
+}
+
+test_tap_multitable_ipv6_rule_table_5()
+{
+    addr=$(run_hijack_cfg ip -6 route show table 5)
+    echo "$addr" | grep fc03::
+    echo "$addr" | grep $(ip6_host)
+}
+
+test_tap_multitable_ipv6_rule_table_6()
+{
+    addr=$(run_hijack_cfg ip route show table 6)
+    echo "$addr" | grep $(ip_host 1)
+}
+
+test_tap_multitable_ipv6_rule_table_7()
+{
+    addr=$(run_hijack_cfg ip -6 route show table 7)
+    echo "$addr" | grep fc04::
+    echo "$addr" | grep $(ip6_host 1)
+}
+
+test_vde_setup()
+{
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"vde",
+                "param":"${VDESWITCH}",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "mac":"$TEST_MAC0",
+                "neigh":"${neigh1};${neigh2}"
+            }
+        ]
+    }
+EOF
+
+    tap_setup
+
+    sleep 2
+    vde_switch -d -t $(tap_ifname) -s ${VDESWITCH} -p ${VDESWITCH}.pid
+
+    # Make sure our device has the addresses we expect
+    addr=$(run_hijack_cfg ip addr)
+    echo "$addr" | grep eth0
+    echo "$addr" | grep $(ip_lkl)
+    echo "$addr" | grep "$TEST_MAC0"
+}
+
+test_vde_cleanup()
+{
+    tap_cleanup
+}
+
+test_vde_ping_host()
+{
+    run_hijack_cfg ${ping} -c 1 $(ip_host)
+}
+
+test_vde_ping_lkl()
+{
+    lkl_test_cmd sudo arp -d $(ip_lkl)
+    lkl_test_cmd sudo ping -i 0.01 -c 65 $(ip_lkl) &
+    run_hijack_cfg sleep 3
+}
+
+source ${script_dir}/test.sh
+source ${script_dir}/net-setup.sh
+
+if [[ ! -e ${basedir}/lib/hijack/liblkl-hijack.so ]]; then
+    lkl_test_plan 0 "hijack tests"
+    echo "missing liblkl-hijack.so"
+    exit 0
+fi
+
+if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+    wdir=$ANDROID_WDIR
+    adb_push lib/hijack/liblkl-hijack.so bin/lkl-hijack.sh tests/net-setup.sh \
+             tests/run_netperf.sh tests/hijack-test.sh
+    ping="ping"
+    ping6="ping6"
+    hijack="$wdir/bin/lkl-hijack.sh"
+    netperf="$wdir/tests/run_netperf.sh"
+else
+    # Make a temporary directory to run tests in, since we'll be copying
+    # things there.
+    wdir=$(mktemp -d)
+    cp "$(which ping)" ${wdir}
+    cp "$(which ping6)" ${wdir}
+    ping=${wdir}/ping
+    ping6=${wdir}/ping6
+    hijack=$basedir/bin/lkl-hijack.sh
+    netperf=$basedir/tests/run_netperf.sh
+fi
+
+fifo1=${wdir}/fifo1
+fifo2=${wdir}/fifo2
+VDESWITCH=${wdir}/vde_switch
+
+# And make sure we clean up when we're done
+trap "clear_wdir &>/dev/null" EXIT
+
+lkl_test_plan 6 "hijack basic tests"
+lkl_test_run 1 run_hijack ip addr
+lkl_test_run 2 run_hijack ip route
+lkl_test_run 3 test_ping
+lkl_test_run 4 test_ping6
+lkl_test_run 5 test_mount_and_dump
+lkl_test_run 6 test_boot_cmdline
+
+if [ -z "$(QUIET=1 lkl_test_cmd which mkfifo)" ]; then
+    lkl_test_plan 0 "hijack pipe backend tests"
+    echo "no mkfifo command"
+else
+    lkl_test_plan 2 "hijack pipe backend tests"
+    lkl_test_run 1 test_pipe_setup
+    lkl_test_run 2 test_pipe_ping
+fi
+
+tap_prepare
+
+if ! lkl_test_cmd test -c /dev/net/tun &>/dev/null; then
+    lkl_test_plan 0 "hijack tap backend tests"
+    echo "missing /dev/net/tun"
+else
+    lkl_test_plan 24 "hijack tap backend tests"
+    lkl_test_run 1 test_tap_setup
+    lkl_test_run 2 test_tap_ping_host
+    lkl_test_run 3 test_tap_ping_lkl
+    lkl_test_run 4 test_tap_neighbours
+    lkl_test_run 5 test_tap_netperf_tcp_rr
+    lkl_test_run 6 test_tap_netperf_tcp_stream
+    lkl_test_run 7 test_tap_netperf_tcp_maerts
+    lkl_test_run 8 test_tap_netperf_stream_tso_csum
+    lkl_test_run 9 test_tap_netperf_maerts_csum_tso
+    lkl_test_run 10 test_tap_netperf_stream_csum_tso_mrgrxbuf
+    lkl_test_run 11 test_tap_qdisc
+    lkl_test_run 12 test_tap_multi_if_setup
+    lkl_test_run 13 test_tap_multi_if_ping
+    lkl_test_run 14 test_tap_multi_if_neigh
+    lkl_test_run 15 test_tap_multi_if_gateway
+    lkl_test_run 16 test_tap_multi_if_gateway_v6
+    lkl_test_run 17 test_tap_multitable_setup
+    lkl_test_run 18 test_tap_multitable_ipv4_rule
+    lkl_test_run 19 test_tap_multitable_ipv6_rule
+    lkl_test_run 20 test_tap_multitable_ipv4_rule_table_4
+    lkl_test_run 21 test_tap_multitable_ipv6_rule_table_5
+    lkl_test_run 22 test_tap_multitable_ipv6_rule_table_6
+    lkl_test_run 23 test_tap_multitable_ipv6_rule_table_7
+    lkl_test_run 24 test_tap_cleanup
+fi
+
+if [ -z "$LKL_HOST_CONFIG_VIRTIO_NET_VDE" ]; then
+    lkl_test_plan 0 "vde tests"
+    echo "vde not supported"
+elif [ ! -x "$(which vde_switch)" ]; then
+    lkl_test_plan 0 "hijack vde tests"
+    echo "could not find a vde_switch executable"
+else
+    lkl_test_plan 4 "hijack vde tests"
+    lkl_test_run 1 test_vde_setup
+    lkl_test_run 2 test_vde_ping_host
+    lkl_test_run 3 test_vde_ping_lkl
+    lkl_test_run 4 test_vde_cleanup
+fi
diff --git a/tools/lkl/tests/run_netperf.sh b/tools/lkl/tests/run_netperf.sh
new file mode 100755
index 000000000000..08c4337b7830
--- /dev/null
+++ b/tools/lkl/tests/run_netperf.sh
@@ -0,0 +1,98 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+# Usage
+#  ./run_netperf.sh [ip] [test_name] [use_taskset] [num_runs]
+
+set -e
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+hijack_script=${script_dir}/../bin/lkl-hijack.sh
+
+num_runs="1"
+test_name="TCP_STREAM"
+use_taskset="0"
+host_ip="localhost"
+taskset_cmd="taskset -c 1"
+test_len=10  # second
+
+if [ ! -x "$(which netperf)" ]; then
+    echo "WARNING: Cannot find a netperf executable, skipping netperf tests."
+    exit $TEST_SKIP
+fi
+
+if [ $# -ge 1 ]; then
+    host_ip=$1
+fi
+if [ $# -ge 2 ]; then
+    test_name=$2
+fi
+if [ $# -ge 3 ]; then
+    use_taskset=$3
+fi
+if [ $# -ge 4 ]; then
+    num_runs=$4
+fi
+if [ $# -ge 5 ]; then
+    echo "BAD NUMBER of INPUTS."
+    exit 1
+fi
+
+if [ $use_taskset = "0" ]; then
+  taskset_cmd=""
+fi
+
+clean() {
+    kill %1 || true
+}
+
+clean_with_tap() {
+    tap_cleanup &> /dev/null || true
+    clean
+    rm -rf ${work_dir}
+}
+
+# LKL_HIJACK_CONFIG_FILE is not set, which means it's not called from
+# hijack-test.sh. Needs to set up things first.
+if [ -z ${LKL_HIJACK_CONFIG_FILE+x} ]; then
+
+    # Setting up environmental vars and TAP
+    work_dir=$(mktemp -d)
+    cfgjson=${work_dir}/hijack-test.conf
+    export LKL_HIJACK_CONFIG_FILE=$cfgjson
+
+    cat <<EOF > ${cfgjson}
+    {
+         "interfaces": [
+               {
+                    "type": "tap",
+                    "param": "$(tap_ifname)",
+                    "ip": "$(ip_lkl)",
+                    "masklen":"$TEST_IP_NETMASK",
+                    "ipv6":"$(ip6_lkl)",
+                    "masklen6":"$TEST_IP6_NETMASK"
+               }
+         ]
+    }
+EOF
+
+    . $script_dir/net-setup.sh
+    host_ip=$(ip_host)
+
+    tap_prepare
+    tap_setup
+    trap clean_with_tap EXIT
+fi
+
+netserver -D -N -p $TEST_NETSERVER_PORT &
+
+trap clean EXIT
+
+echo NUM=$num_runs, TEST=$test_name, TASKSET=$use_taskset
+for i in `seq $num_runs`; do
+    echo Test: $i
+    set -x
+    $taskset_cmd ${hijack_script} netperf -p $TEST_NETSERVER_PORT -H $host_ip \
+		         -t $test_name -l $test_len
+    set +x
+done
-- 
2.20.1 (Apple Git-117)


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um



* [RFC PATCH 30/47] lkl: add documentation
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (28 preceding siblings ...)
  2019-10-23  4:38 ` [RFC PATCH 29/47] lkl: add initial system call hijack support (a.k.a. NUSE of libos) Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 31/47] cpu: add cpu_yield_to_irqs Hajime Tazaki
                   ` (19 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um
  Cc: H . K . Jerry Chu, Conrad Meyer, Octavian Purdila, Motomu Utsumi,
	Akira Moroo, Thomas Liebetraut, Patrick Collins, Chenyang Zhong,
	Yuan Liu, Gustavo Bittencourt

From: Octavian Purdila <tavi.purdila@gmail.com>

Also add README.md as a symlink to Documentation/lkl.txt so that it is
displayed on GitHub.

Signed-off-by: Chenyang Zhong <zhongcy95@gmail.com>
Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Gustavo Bittencourt <gbitten@gmail.com>
Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Thomas Liebetraut <thomas@tommie-lie.de>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
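For readers checking the arithmetic in the documentation below, here is a
minimal C sketch of the "offload" bit mask and of the policy routing table
ids. The bit positions are the standard virtio-net feature bits; the demo_*
names are invented for this note and are not part of LKL.

```c
#include <stdint.h>

/* Standard virtio-net feature bit positions (see virtio_net.h). */
#define DEMO_VIRTIO_NET_F_GUEST_CSUM  1
#define DEMO_VIRTIO_NET_F_MRG_RXBUF  15

/* Build the "offload" mask for guest csum + mergeable RX buffers. */
static inline uint32_t demo_offload_mask(void)
{
	return (UINT32_C(1) << DEMO_VIRTIO_NET_F_MRG_RXBUF) |
	       (UINT32_C(1) << DEMO_VIRTIO_NET_F_GUEST_CSUM);
}

/* Policy routing table ids as documented for "ifgateway"/"ifgateway6". */
static inline int demo_table_id_v4(int ifindex) { return 2 * ifindex; }
static inline int demo_table_id_v6(int ifindex) { return 2 * ifindex + 1; }
```

demo_offload_mask() yields 0x8002, the value used in the "offload" example of
the documentation.
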
 Documentation/lkl.txt | 470 ++++++++++++++++++++++++++++++++++++++++++
 README.md             |   1 +
 2 files changed, 471 insertions(+)
 create mode 100644 Documentation/lkl.txt
 create mode 120000 README.md

diff --git a/Documentation/lkl.txt b/Documentation/lkl.txt
new file mode 100644
index 000000000000..97fe407b0bc6
--- /dev/null
+++ b/Documentation/lkl.txt
@@ -0,0 +1,470 @@
+
+Introduction
+============
+
+LKL (Linux Kernel Library) is aiming to allow reusing the Linux kernel code as
+extensively as possible with minimal effort and reduced maintenance overhead.
+
+Examples of how LKL can be used are: creating userspace applications (running on
+Linux and other operating systems) that can read or write Linux filesystems or
+can use the Linux networking stack, creating kernel drivers for other operating
+systems that can read Linux filesystems, bootloaders support for reading/writing
+Linux filesystems, etc.
+
+With LKL, the kernel code is compiled into an object file that can be directly
+linked by applications. The API offered by LKL is based on the Linux system call
+interface.
+
+LKL is implemented as an architecture port in arch/lkl. It uses host operations
+defined by the application or a host library (tools/lkl/lib).
+
+
+Supported hosts
+===============
+
+The supported hosts for now are POSIX and Windows userspace applications.
+
+
+Building LKL, the host library and LKL based tools
+==================================================
+
+    $ make -C tools/lkl
+
+will build LKL as an object file, install it in tools/lkl/lib together with the
+header files in tools/lkl/include, and then build the host library, tests and
+a few example applications:
+
+* tests/boot - a simple application that uses LKL and exercises the basic LKL
+APIs
+
+* fs2tar - a tool that converts a filesystem image to a tar archive
+
+* cptofs/cpfromfs - a tool that copies files to/from a filesystem image
+
+* lklfuse - a tool that can mount a filesystem image in userspace,
+  without root privileges, using FUSE
+
+
+Building LKL on FreeBSD
+-----------------------
+
+    $ pkg install binutils gcc gnubc gmake gsed coreutils bison flex python argp-standalone
+
+    #Prefer ports binutils and GNU bc(1):
+    $ export PATH=/sbin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/lib64/ccache
+
+    $ gmake -C tools/lkl
+
+Building LKL on Ubuntu
+----------------------
+
+    $ sudo apt-get install libfuse-dev libarchive-dev xfsprogs
+
+    # Optional, if you would like to be able to run tests
+    $ sudo apt-get install btrfs-tools
+    $ pip install yamlish junit_xml
+
+    $ make -C tools/lkl
+
+    # To check that everything works:
+    $ cd tools/lkl
+    $ make run-tests
+
+
+Building LKL for Windows
+------------------------
+
+In order to build LKL for Win32 the mingw cross compiler needs to be installed
+on the host (e.g. on Ubuntu the following packages are required:
+binutils-mingw-w64-i686, gcc-mingw-w64-base, gcc-mingw-w64-i686,
+mingw-w64-common, mingw-w64-i686-dev).
+
+Due to a bug in mingw regarding weak symbols, the following patch needs to be
+applied to mingw-binutils:
+
+https://sourceware.org/ml/binutils/2015-10/msg00234.html
+
+and i686-w64-mingw32-gas, i686-w64-mingw32-ld and i686-w64-mingw32-objcopy need
+to be recompiled.
+
+With those prerequisites fulfilled you can now build LKL for Win32 with the
+following command:
+
+    $ make CROSS_COMPILE=i686-w64-mingw32- -C tools/lkl
+
+
+
+Building LKL on Windows
+-----------------------
+
+To build on Windows, certain GNU tools need to be installed. These tools can come
+from several different projects, such as cygwin, unxutils, gnu-win32 or busybox-w32.
+Below is one minimal/modular set-up based on msys2.
+
+### Common build dependencies:
+* [MSYS2](https://sourceforge.net/projects/msys2/) (provides GNU bash and many other utilities)
+* Extra utilities from MSYS2/pacman: bc, base-devel
+
+### General considerations:
+* No spaces in pathnames (source, prefix, destination,...)!
+* Make sure that all utilities are in the PATH.
+* Win64 (and the MinGW 64-bit CRT) is LLP64, which conflicts with the size of
+"long" assumed by the Linux source. Linux (and LKL) currently cannot be built
+on LLP64.
+* Cygwin (and MSYS2) are LP64, like Linux.
+
+### For MSYS2 (and Cygwin):
+Msys2 will install a gcc tool chain as part of the base-devel bundle. Binutils (2.26) is already
+patched for NT weak externals. Using the msys2 shell, cd to the lkl sources and run:
+
+    $ make -C tools/lkl
+
+### For MinGW:
+Install mingw-w64-i686-toolchain via pacman, mingw-w64-i686-binutils (2.26) is already patched
+for NT weak externals. Start a MinGW Win32 shell (64-bit will not work, see above)
+and run:
+
+    $ make -C tools/lkl
+
+
+LKL hijack library
+==================
+
+The LKL hijack library (liblkl-hijack.so) replaces the system calls used by an
+application on the fly so that the application uses LKL instead of the kernel
+of the host operating system. LD_PRELOAD is used to dynamically override system
+calls with this library when you execute a program.
+
+You can usually use this library via a wrapper script.
+
+    $ cd tools/lkl
+    $ ./bin/lkl-hijack.sh ip address show
+
+In order to configure the behavior of LKL, a JSON file can be used. You can
+specify the JSON file with the LKL_HIJACK_CONFIG_FILE environment variable. If
+nothing is specified, LKL looks for a configuration file named
+'lkl-hijack.json'.  You can also use the old-style configuration with
+environment variables (e.g., LKL_HIJACK_NET_IFTYPE), but those are overridden
+if a JSON file is specified.
+
+```
+     $ cat conf.json
+     {
+       "gateway":"192.168.0.1",
+       "gateway6":"2001:db8:0:f101::1",
+       "debug":"1",
+       "singlecpu":"1",
+       "sysctl":"net.ipv4.tcp_wmem=4096 87380 2147483647",
+       "boot_cmdline":"ip=dhcp",
+       "interfaces":[
+               {
+                       "mac":"12:34:56:78:9a:bc",
+                       "type":"tap",
+                       "param":"tap7",
+                       "ip":"192.168.0.2",
+                       "masklen":"24",
+                       "ifgateway":"192.168.0.1",
+                       "ipv6":"2001:db8:0:f101::2",
+                       "masklen6":"64",
+                       "ifgateway6":"2001:db8:0:f101::1",
+                       "offload":"0xc803"
+               },
+               {
+                       "mac":"12:34:56:78:9a:bd",
+                       "type":"tap",
+                       "param":"tap77",
+                       "ip":"192.168.1.2",
+                       "masklen":"24",
+                       "ifgateway":"192.168.1.1",
+                       "ipv6":"2001:db8:0:f102::2",
+                       "masklen6":"64",
+                       "ifgateway6":"2001:db8:0:f102::1",
+                       "offload":"0xc803"
+               }
+       ]
+     }
+     $ LKL_HIJACK_CONFIG_FILE="conf.json" lkl-hijack.sh ip addr s
+```
+
+The following keys can be used in the JSON file.
+
+* IPv4 gateway address
+
+  key: "gateway"
+  value type: string
+
+  The IPv4 gateway address of the LKL network stack.
+```
+     "gateway":"192.168.0.1"
+```
+
+* IPv6 gateway address
+
+  key: "gateway6"
+  value type: string
+
+  The IPv6 gateway address of the LKL network stack.
+```
+     "gateway6":"2001:db8:0:f101::1"
+```
+
+* Debug
+
+  key: "debug"
+  value type: string
+
+  Setting it causes some debug information (both from the kernel and the
+  LKL library) to be enabled.  If "0" is specified it is disabled.
+  It is also used as a bit mask to turn on specific debugging facilities.
+  E.g., setting it to "0x100" will cause the LKL kernel to pause after
+  the hijack'ed app exits. This allows one to debug or collect info from
+  the LKL kernel before it quits.
+```
+     "debug":"1"
+```
+
+* Single CPU pinning
+
+  key: "singlecpu"
+  value type: string
+
+  Pin LKL kernel threads to a single host CPU. The value "1" pins
+  only LKL kernel threads, while the value "2" also pins polling
+  threads.
+```
+     "singlecpu":"1"
+```
+
+* SYSCTL
+
+  key: "sysctl"
+  value type: string
+
+  Configure sysctl values of the booted kernel via the hijack library. Multiple
+  entries can be specified.
+```
+     "sysctl":"net.ipv4.tcp_wmem=4096 87380 2147483647"
+```
+
+* Boot command line
+
+  key: "boot_cmdline"
+  value type: string
+
+  Specify the kernel boot command line in order to change the configuration
+  of a kernel instance.  For instance, you can change the memory size as
+  shown below.
+```
+     "boot_cmdline": "mem=1G"
+```
+
+* Mount
+
+  key: "mount"
+  value type: string
+
+```
+     "mount": "proc,sysfs"
+```
+
+* Network Interface Configuration
+
+  key: "interfaces"
+  value type: array of objects
+
+  This key takes an array of objects, each of which configures a single interface with the sub-keys defined below.
+  ```
+       "interfaces":[{....},{....}]
+  ```
+
+
+	* Interface type
+
+	  key: "type"
+	  value type: string
+
+	  The interface type in host operating system to connect to LKL.
+	  The following example specifies a tap interface.
+	```
+	     "type":"tap"
+	```
+
+	* Interface parameter
+
+	  key: "param"
+	  value type: string
+
+	  Additional configuration parameters for the interface specified by Interface type (type).
+	  The parameters depend on the interface type.
+	```
+	     "type":"tap",
+	     "param":"tap0"
+	```
+
+	* Interface MTU size
+
+	  key: "mtu"
+	  value type: string
+
+	  the MTU size of the interface.
+	```
+	     "mtu":"1280"
+	```
+
+	* Interface IPv4 address
+
+	  key: "ip"
+	  value type: string
+
+	  the IPv4 address of the interface.
+	  If you want to use DHCP for the IP address assignment,
+	  use "boot_cmdline" with "ip=dhcp" option.
+	```
+	     "ip":"192.168.0.2"
+	```
+	```
+	     "boot_cmdline":"ip=dhcp"
+	```
+
+	* Interface IPv4 netmask length
+
+	  key: "masklen"
+	  value type: string
+
+	  the network mask length of the interface.
+	```
+	     "ip":"192.168.0.2",
+	     "masklen":"24"
+	```
+
+	* Interface IPv4 gateway on routing policy table
+
+	  key: "ifgateway"
+	  value type: string
+
+	  If you specify this parameter, LKL adds a routing policy table
+	  and creates link-local and gateway routes in this table.
+	  The rule SELECTOR is "from" and the PREFIX is the address you assigned to this interface.
+	  Table id is 2 * (interface index).
+	  This parameter could be used to configure LKL for mptcp, for example.
+
+	```
+	     "ip":"192.168.0.2",
+	     "masklen":"24",
+	     "ifgateway":"192.168.0.1"
+	```
+
+	* Interface IPv6 address
+
+	  key: "ipv6"
+	  value type: string
+
+	  the IPv6 address of the interface.
+	```
+	     "ipv6":"2001:db8:0:f101::2"
+	```
+
+	* Interface IPv6 netmask length
+
+	  key: "masklen6"
+	  value type: string
+
+	  the IPv6 network prefix length of the interface.
+	```
+	     "ipv6":"2001:db8:0:f101::2",
+	     "masklen6":"64"
+	```
+
+	* Interface IPv6 gateway on routing policy table
+
+	  key: "ifgateway6"
+	  value type: string
+
+	  If you specify this parameter, LKL adds a routing policy table
+	  and creates link-local and gateway routes in this table.
+	  The rule SELECTOR is "from" and the PREFIX is the address you assigned to this interface.
+	  Table id is 2 * (interface index) + 1.
+	  This parameter could be used to configure LKL for mptcp, for example.
+	```
+	     "ipv6":"2001:db8:0:f101::2",
+	     "masklen6":"64",
+	     "ifgateway6":"2001:db8:0:f101::1"
+	```
+
+	* Interface MAC address
+
+	  key: "mac"
+	  value type: string
+
+	  the MAC address of the interface.
+	```
+	     "mac":"12:34:56:78:9a:bc"
+	```
+
+	* Interface neighbor entries
+
+	  key: "neigh"
+	  value type: string
+
+	  Add a list of permanent neighbor entries in the form of "ip|mac;ip|mac;...". IPv6 addresses are supported.
+	```
+	     "neigh":"192.168.0.1|12:34:56:78:9a:bc;2001:db8:0:f101::1|12:34:56:78:9a:be"
+	```
+
+	* Interface qdisc entries
+
+	  key: "qdisc"
+	  value type: string
+
+	  Add a qdisc entry in the form of "root|type;root|type;...".
+	```
+	     "qdisc":"root|fq"
+	```
+
+	* Interface offload
+
+	  key: "offload"
+	  value type: string
+
+	  Work as a bit mask to enable selective device offload features. E.g.,
+	  to enable "mergeable RX buffer" (LKL_VIRTIO_NET_F_MRG_RXBUF) +
+	  "guest csum" (LKL_VIRTIO_NET_F_GUEST_CSUM) device features, simply set
+	  it to 0x8002.
+	  See virtio_net.h for a list of offload features and their bit masks.
+	```
+	     "offload":"0x8002"
+	```
+
+* Delay
+
+  key: "delay_main"
+  value type: string
+
+  The delay before calling the main() function of the application after the
+  initialization of LKL.  Some subsystems in the Linux tree require a certain
+  amount of time before accepting a request from an application, such as
+  delivery of address assignment to a network interface.  This parameter
+  is used in such cases.  The value is given in microseconds.
+```
+     "delay_main":"500000"
+```
+
+FAQ
+===
+
+Q: How is LKL different from UML?
+
+A: UML provides a full OS environment (e.g. user/kernel separation, user
+processes) and also has requirements (a filesystem, processes, etc.) that make
+it hard to use for standalone applications. UML also relies heavily on Linux
+hosts. On the other hand LKL is designed to be linked directly with the
+application and hence does not have user/kernel separation which makes it easier
+to use it in standalone applications.
+
+
+Q: How is LKL different from LibOS?
+
+A: LibOS re-implements high-level kernel APIs for timers, softirqs, scheduling,
+sysctl, SLAB/SLUB, etc. LKL behaves like any arch port, implementing the arch
+level operations requested by the Linux kernel. LKL also offers a host interface
+so that support for multiple hosts can be implemented.
diff --git a/README.md b/README.md
new file mode 120000
index 000000000000..35c47a921f67
--- /dev/null
+++ b/README.md
@@ -0,0 +1 @@
+Documentation/lkl.txt
\ No newline at end of file
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 31/47] cpu: add cpu_yield_to_irqs
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (29 preceding siblings ...)
  2019-10-23  4:38 ` [RFC PATCH 30/47] lkl: add documentation Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 32/47] tools: Add the lkl host library to the common tools Makefile Hajime Tazaki
                   ` (18 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Akira Moroo

From: Octavian Purdila <tavi.purdila@gmail.com>

Add a new architecture function that should be called in loops that rely
on interrupts to exit the loop (e.g. loops that use a jiffies expression
for the exit condition).

This is needed for architectures where interrupts can not preempt the
currently running thread (e.g. lkl).

Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
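To illustrate why the hook is needed, here is a small userspace sketch (not
kernel code). It assumes an environment where the timer "interrupt" only runs
when the loop explicitly yields; all demo_* names are hypothetical, and the
plain comparison stands in for time_before().

```c
#include <stdbool.h>

/* Userspace stand-ins for kernel state; all demo_* names are hypothetical. */
static unsigned long demo_jiffies;
static bool demo_timer_irq_pending = true;

/* Stand-in for the timer interrupt handler: it advances the tick counter. */
static void demo_timer_irq(void)
{
	demo_jiffies++;
}

/*
 * What an architecture without preemptive interrupts might do in its
 * override: run any pending interrupt handlers synchronously.
 */
static void demo_cpu_yield_to_irqs(void)
{
	if (demo_timer_irq_pending)
		demo_timer_irq();
}

/*
 * A calibration-style loop: count iterations until `ticks` jiffies pass.
 * Without the yield call, demo_jiffies would never change here and the
 * loop would spin forever.
 */
unsigned long demo_count_for(unsigned long ticks)
{
	unsigned long deadline = demo_jiffies + ticks;
	unsigned long count = 0;

	while (demo_jiffies < deadline) {
		count++;
		demo_cpu_yield_to_irqs();
	}
	return count;
}
```

On architectures where interrupts do preempt the running thread, the empty
__weak default added by this patch leaves such loops unchanged.
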
 crypto/xor.c        | 2 ++
 include/linux/cpu.h | 1 +
 kernel/cpu.c        | 5 +++++
 lib/raid6/algos.c   | 9 ++++++---
 4 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/crypto/xor.c b/crypto/xor.c
index ea7349e6ed23..c55a89a9e659 100644
--- a/crypto/xor.c
+++ b/crypto/xor.c
@@ -14,6 +14,7 @@
 #include <linux/raid/xor.h>
 #include <linux/jiffies.h>
 #include <linux/preempt.h>
+#include <linux/cpu.h>
 #include <asm/xor.h>
 
 #ifndef XOR_SELECT_TEMPLATE
@@ -85,6 +86,7 @@ do_xor_speed(struct xor_block_template *tmpl, void *b1, void *b2)
 			mb();
 			count++;
 			mb();
+			cpu_yield_to_irqs();
 		}
 		if (count > max)
 			max = count;
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index fcb1386bb0d4..887702d29498 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -180,6 +180,7 @@ int cpu_report_state(int cpu);
 int cpu_check_up_prepare(int cpu);
 void cpu_set_state_online(int cpu);
 void play_idle(unsigned long duration_ms);
+void cpu_yield_to_irqs(void);
 
 #ifdef CONFIG_HOTPLUG_CPU
 bool cpu_wait_death(unsigned int cpu, int seconds);
diff --git a/kernel/cpu.c b/kernel/cpu.c
index e84c0873559e..9ca61a55ed0c 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2339,6 +2339,11 @@ void __init boot_cpu_hotplug_init(void)
 	this_cpu_write(cpuhp_state.state, CPUHP_ONLINE);
 }
 
+void __weak cpu_yield_to_irqs(void)
+{
+}
+EXPORT_SYMBOL(cpu_yield_to_irqs);
+
 enum cpu_mitigations cpu_mitigations __ro_after_init = CPU_MITIGATIONS_AUTO;
 
 static int __init mitigations_parse_cmdline(char *arg)
diff --git a/lib/raid6/algos.c b/lib/raid6/algos.c
index 17417eee0866..7e6121443ebc 100644
--- a/lib/raid6/algos.c
+++ b/lib/raid6/algos.c
@@ -18,6 +18,7 @@
 #else
 #include <linux/module.h>
 #include <linux/gfp.h>
+#include <linux/cpu.h>
 #if !RAID6_USE_EMPTY_ZERO_PAGE
 /* In .bss so it's zeroed */
 const char raid6_empty_zero_page[PAGE_SIZE] __attribute__((aligned(256)));
@@ -29,7 +30,7 @@ struct raid6_calls raid6_call;
 EXPORT_SYMBOL_GPL(raid6_call);
 
 const struct raid6_calls * const raid6_algos[] = {
-#if defined(__i386__) && !defined(__arch_um__)
+#ifdef CONFIG_X86_32
 #ifdef CONFIG_AS_AVX512
 	&raid6_avx512x2,
 	&raid6_avx512x1,
@@ -45,7 +46,7 @@ const struct raid6_calls * const raid6_algos[] = {
 	&raid6_mmxx2,
 	&raid6_mmxx1,
 #endif
-#if defined(__x86_64__) && !defined(__arch_um__)
+#ifdef CONFIG_X86_64
 #ifdef CONFIG_AS_AVX512
 	&raid6_avx512x4,
 	&raid6_avx512x2,
@@ -79,7 +80,7 @@ const struct raid6_calls * const raid6_algos[] = {
 	&raid6_neonx2,
 	&raid6_neonx1,
 #endif
-#if defined(__ia64__)
+#ifdef CONFIG_IA64
 	&raid6_intx32,
 	&raid6_intx16,
 #endif
@@ -173,6 +174,7 @@ static inline const struct raid6_calls *raid6_choose_gen(
 					    j1 + (1<<RAID6_TIME_JIFFIES_LG2))) {
 				(*algo)->gen_syndrome(disks, PAGE_SIZE, *dptrs);
 				perf++;
+				cpu_yield_to_irqs();
 			}
 			preempt_enable();
 
@@ -197,6 +199,7 @@ static inline const struct raid6_calls *raid6_choose_gen(
 				(*algo)->xor_syndrome(disks, start, stop,
 						      PAGE_SIZE, *dptrs);
 				perf++;
+				cpu_yield_to_irqs();
 			}
 			preempt_enable();
 
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 32/47] tools: Add the lkl host library to the common tools Makefile
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (30 preceding siblings ...)
  2019-10-23  4:38 ` [RFC PATCH 31/47] cpu: add cpu_yield_to_irqs Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 33/47] signal: use CONFIG_X86_32 instead of __i386__ Hajime Tazaki
                   ` (17 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Thomas Liebetraut, Akira Moroo

From: Thomas Liebetraut <thomas@tommie-lie.de>

This patch adds the lkl host library to the kernel tools build system.
This also means that lkl can now be compiled like any other "tool" using:

  $ make tools/lkl

Signed-off-by: Thomas Liebetraut <thomas@tommie-lie.de>
[Octavian: remove make ARCH=lkl defconfig as it is not (yet) necessary]
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/Makefile | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/tools/Makefile b/tools/Makefile
index 68defd7ecf5d..0506d7dde63f 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -23,6 +23,7 @@ help:
 	@echo '  kvm_stat               - top-like utility for displaying kvm statistics'
 	@echo '  leds                   - LEDs  tools'
 	@echo '  liblockdep             - user-space wrapper for kernel locking-validator'
+	@echo '  lkl                    - The Linux Kernel Library host libraries and tools'
 	@echo '  bpf                    - misc BPF tools'
 	@echo '  pci                    - PCI tools'
 	@echo '  perf                   - Linux performance measurement and analysis tool'
@@ -63,7 +64,7 @@ acpi: FORCE
 cpupower: FORCE
 	$(call descend,power/$@)
 
-cgroup firewire hv guest spi usb virtio vm bpf iio gpio objtool leds wmi pci firmware debugging: FORCE
+cgroup firewire hv guest lkl spi usb virtio vm bpf iio gpio objtool leds wmi pci firmware debugging: FORCE
 	$(call descend,$@)
 
 liblockdep: FORCE
@@ -107,7 +108,7 @@ acpi_install:
 cpupower_install:
 	$(call descend,power/$(@:_install=),install)
 
-cgroup_install firewire_install gpio_install hv_install iio_install perf_install spi_install usb_install virtio_install vm_install bpf_install objtool_install wmi_install pci_install debugging_install:
+cgroup_install firewire_install gpio_install hv_install iio_install lkl_install perf_install spi_install usb_install virtio_install vm_install bpf_install objtool_install wmi_install pci_install debugging_install:
 	$(call descend,$(@:_install=),install)
 
 liblockdep_install:
@@ -133,7 +134,7 @@ install: acpi_install cgroup_install cpupower_install gpio_install \
 		perf_install selftests_install turbostat_install usb_install \
 		virtio_install vm_install bpf_install x86_energy_perf_policy_install \
 		tmon_install freefall_install objtool_install kvm_stat_install \
-		wmi_install pci_install debugging_install intel-speed-select_install
+		wmi_install lkl_install pci_install debugging_install intel-speed-select_install
 
 acpi_clean:
 	$(call descend,power/acpi,clean)
@@ -141,7 +142,7 @@ acpi_clean:
 cpupower_clean:
 	$(call descend,power/cpupower,clean)
 
-cgroup_clean hv_clean firewire_clean spi_clean usb_clean virtio_clean vm_clean wmi_clean bpf_clean iio_clean gpio_clean objtool_clean leds_clean pci_clean firmware_clean debugging_clean:
+cgroup_clean hv_clean firewire_clean lkl_clean spi_clean usb_clean virtio_clean vm_clean wmi_clean bpf_clean iio_clean gpio_clean objtool_clean leds_clean pci_clean firmware_clean debugging_clean:
 	$(call descend,$(@:_clean=),clean)
 
 liblockdep_clean:
@@ -179,7 +180,7 @@ clean: acpi_clean cgroup_clean cpupower_clean hv_clean firewire_clean \
 		perf_clean selftests_clean turbostat_clean spi_clean usb_clean virtio_clean \
 		vm_clean bpf_clean iio_clean x86_energy_perf_policy_clean tmon_clean \
 		freefall_clean build_clean libbpf_clean libsubcmd_clean liblockdep_clean \
-		gpio_clean objtool_clean leds_clean wmi_clean pci_clean firmware_clean debugging_clean \
+		gpio_clean objtool_clean leds_clean wmi_clean lkl_clean pci_clean firmware_clean debugging_clean \
 		intel-speed-select_clean
 
 .PHONY: FORCE
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 33/47] signal: use CONFIG_X86_32 instead of __i386__
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (31 preceding siblings ...)
  2019-10-23  4:38 ` [RFC PATCH 32/47] tools: Add the lkl host library to the common tools Makefile Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 34/47] arch: add __SYSCALL_DEFINE_ARCH Hajime Tazaki
                   ` (16 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Akira Moroo

From: Octavian Purdila <tavi.purdila@gmail.com>

This allows um/lkl to build/run ?
[XXX: need to check whether this is required or not]

Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 kernel/signal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 534fec266a33..561de0e1e66a 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1241,7 +1241,7 @@ static void print_fatal_signal(int signr)
 	struct pt_regs *regs = signal_pt_regs();
 	pr_info("potentially unexpected fatal signal %d.\n", signr);
 
-#if defined(__i386__) && !defined(__arch_um__)
+#ifdef CONFIG_X86_32
 	pr_info("code at %08lx: ", regs->ip);
 	{
 		int i;
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 34/47] arch: add __SYSCALL_DEFINE_ARCH
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (32 preceding siblings ...)
  2019-10-23  4:38 ` [RFC PATCH 33/47] signal: use CONFIG_X86_32 instead of __i386__ Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 35/47] xfs: support for non-mmu architectures Hajime Tazaki
                   ` (15 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Akira Moroo

From: Octavian Purdila <tavi.purdila@gmail.com>

This allows the architecture code to process the system call
definitions. It is used by LKL to create strongly typed function
definitions for system calls.

Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
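As a rough illustration of what an architecture could do with the hook, the
sketch below turns the (type, name) pairs of a syscall definition into a typed
prototype. The DEMO_* helpers are simplified stand-ins for the kernel's
__MAP()/__SC_DECL() macros and demo_sys_add is a made-up call; this is not the
LKL definition.

```c
/* Simplified stand-ins for the kernel's __MAP/__SC_DECL helpers (up to 2 args). */
#define DEMO_MAP1(m, t, a)      m(t, a)
#define DEMO_MAP2(m, t, a, ...) m(t, a), DEMO_MAP1(m, __VA_ARGS__)
#define DEMO_MAP(n, ...)        DEMO_MAP##n(__VA_ARGS__)
#define DEMO_SC_DECL(t, a)      t a

/*
 * Hypothetical architecture definition of the new hook: turn the
 * (type, name) pairs of a syscall definition into a typed prototype.
 */
#define __SYSCALL_DEFINE_ARCH(x, sname, ...) \
	long demo_sys##sname(DEMO_MAP(x, DEMO_SC_DECL, __VA_ARGS__));

/* Expands to: long demo_sys_add(int a, int b); */
__SYSCALL_DEFINE_ARCH(2, _add, int, a, int, b)

/* A toy body so the generated prototype can be exercised. */
long demo_sys_add(int a, int b)
{
	return (long)a + b;
}
```

Because the default definition added to syscalls.h expands to nothing,
architectures that do not define the hook see no change.
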
 include/linux/syscalls.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 88145da7d140..77e52fe19923 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -203,9 +203,14 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
 }
 #endif
 
+#ifndef __SYSCALL_DEFINE_ARCH
+#define __SYSCALL_DEFINE_ARCH(x, sname, ...)
+#endif
+
 #ifndef SYSCALL_DEFINE0
 #define SYSCALL_DEFINE0(sname)					\
 	SYSCALL_METADATA(_##sname, 0);				\
+	__SYSCALL_DEFINE_ARCH(0, _##sname);			\
 	asmlinkage long sys_##sname(void);			\
 	ALLOW_ERROR_INJECTION(sys_##sname, ERRNO);		\
 	asmlinkage long sys_##sname(void)
@@ -222,6 +227,7 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
 
 #define SYSCALL_DEFINEx(x, sname, ...)				\
 	SYSCALL_METADATA(sname, x, __VA_ARGS__)			\
+	__SYSCALL_DEFINE_ARCH(x, sname, __VA_ARGS__)		\
 	__SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
 
 #define __PROTECT(...) asmlinkage_protect(__VA_ARGS__)
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 35/47] xfs: support for non-mmu architectures
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (33 preceding siblings ...)
  2019-10-23  4:38 ` [RFC PATCH 34/47] arch: add __SYSCALL_DEFINE_ARCH Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 36/47] checkpatch: avoid showing BIT_ULL warnings for tools/ files Hajime Tazaki
                   ` (14 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Akira Moroo

From: Octavian Purdila <tavi.purdila@gmail.com>

Naive implementation for non-MMU architectures: allocate physically
contiguous XFS buffers with alloc_pages(). This wastes memory and is
prone to fragmentation under high I/O load, but it may be good enough
for basic usage (which is what most non-MMU architectures will need).
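
The memory cost of rounding every buffer up to a power-of-two block can
be sketched in plain userspace C. This is only an illustration, not part
of the patch: order_for_pages() is a hypothetical stand-in for the
kernel's order_base_2(), and wasted_pages() is a made-up helper name.

```c
#include <assert.h>

/*
 * Userspace stand-in (hypothetical) for the kernel's order_base_2():
 * the smallest order such that (1 << order) >= n. Assumes n >= 1.
 */
unsigned int order_for_pages(unsigned long n)
{
	unsigned int order = 0;

	assert(n >= 1);
	while ((1UL << order) < n)
		order++;
	return order;
}

/*
 * Pages that are allocated but never used when a buffer of n pages is
 * rounded up to a single power-of-two block.
 */
unsigned long wasted_pages(unsigned long n)
{
	return (1UL << order_for_pages(n)) - n;
}
```

Under these assumptions a 5-page buffer occupies an order-3 block (8
pages) and leaves 3 pages unused, while a 9-page buffer leaves 7 unused.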

Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 fs/xfs/xfs_buf.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index ca0849043f54..c4bb390cc9b0 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -313,6 +313,7 @@ xfs_buf_free(
 	ASSERT(list_empty(&bp->b_lru));
 
 	if (bp->b_flags & _XBF_PAGES) {
+#ifdef CONFIG_MMU
 		uint		i;
 
 		if (xfs_buf_is_vmapped(bp))
@@ -324,6 +325,10 @@ xfs_buf_free(
 
 			__free_page(page);
 		}
+#else
+		free_pages((unsigned long)page_to_virt(bp->b_pages[0]),
+			   order_base_2(bp->b_page_count));
+#endif
 	} else if (bp->b_flags & _XBF_KMEM)
 		kmem_free(bp->b_addr);
 	_xfs_buf_free_pages(bp);
@@ -390,7 +395,14 @@ xfs_buf_allocate_memory(
 		struct page	*page;
 		uint		retries = 0;
 retry:
+#ifdef CONFIG_MMU
 		page = alloc_page(gfp_mask);
+#else
+		if (i == 0)
+			page = alloc_pages(gfp_mask, order_base_2(page_count));
+		else
+			page = bp->b_pages[0] + i;
+#endif
 		if (unlikely(page == NULL)) {
 			if (flags & XBF_READ_AHEAD) {
 				bp->b_page_count = i;
@@ -425,8 +437,10 @@ xfs_buf_allocate_memory(
 	return 0;
 
 out_free_pages:
+#ifdef CONFIG_MMU
 	for (i = 0; i < bp->b_page_count; i++)
 		__free_page(bp->b_pages[i]);
+#endif
 	bp->b_flags &= ~_XBF_PAGES;
 	return error;
 }
@@ -446,6 +460,7 @@ _xfs_buf_map_pages(
 	} else if (flags & XBF_UNMAPPED) {
 		bp->b_addr = NULL;
 	} else {
+#ifdef CONFIG_MMU
 		int retried = 0;
 		unsigned nofs_flag;
 
@@ -466,6 +481,9 @@ _xfs_buf_map_pages(
 			vm_unmap_aliases();
 		} while (retried++ <= 1);
 		memalloc_nofs_restore(nofs_flag);
+#else
+		bp->b_addr = page_to_virt(bp->b_pages[0]);
+#endif
 
 		if (!bp->b_addr)
 			return -ENOMEM;
@@ -915,11 +933,19 @@ xfs_buf_get_uncached(
 	if (error)
 		goto fail_free_buf;
 
+#ifdef CONFIG_MMU
 	for (i = 0; i < page_count; i++) {
 		bp->b_pages[i] = alloc_page(xb_to_gfp(flags));
 		if (!bp->b_pages[i])
 			goto fail_free_mem;
 	}
+#else
+	bp->b_pages[0] = alloc_pages(xb_to_gfp(flags), order_base_2(page_count));
+	if (!bp->b_pages[0])
+		goto fail_free_buf;
+	for (i = 1; i < page_count; i++)
+		bp->b_pages[i] = bp->b_pages[i-1] + 1;
+#endif
 	bp->b_flags |= _XBF_PAGES;
 
 	error = _xfs_buf_map_pages(bp, 0);
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 36/47] checkpatch: avoid showing BIT_ULL warnings for tools/ files
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (34 preceding siblings ...)
  2019-10-23  4:38 ` [RFC PATCH 35/47] xfs: support for non-mmu architectures Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 37/47] Revert "vmlinux.lds.h: remove stale <linux/export.h> include" Hajime Tazaki
                   ` (13 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Akira Moroo

From: Octavian Purdila <tavi.purdila@gmail.com>

Directly using shift operations in userspace compiled code should not
trigger warnings as BIT_ULL macros are not available outside the
kernel.
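
As a rough illustration of the kind of code this is about, a userspace
file might define 64-bit flags with plain shifts. The flag and function
names below are hypothetical and do not come from the LKL tools.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flags for a userspace tool; names are illustrative only. */
#define EXAMPLE_FLAG_RO		(1ULL << 0)
#define EXAMPLE_FLAG_SYNC	(1ULL << 40)	/* needs a 64-bit constant */

/* The ULL suffix keeps the shift from being done in 32-bit int. */
static_assert(EXAMPLE_FLAG_SYNC > 0xffffffffULL,
	      "flag must not fit in 32 bits");

/* Build a flag word from two boolean options. */
uint64_t example_flags(int read_only, int sync)
{
	uint64_t flags = 0;

	if (read_only)
		flags |= EXAMPLE_FLAG_RO;
	if (sync)
		flags |= EXAMPLE_FLAG_SYNC;
	return flags;
}
```

Definitions of this shape match the `#define NAME (1ULL << n)` pattern
that the BIT_MACRO check looks for.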

Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 scripts/checkpatch.pl | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 93a7edfe0f05..e739f565497e 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -6313,7 +6313,8 @@ sub process {
 		    $line =~ /#\s*define\s+\w+\s+\(?\s*1\s*([ulUL]*)\s*\<\<\s*(?:\d+|$Ident)\s*\)?/) {
 			my $ull = "";
 			$ull = "_ULL" if (defined($1) && $1 =~ /ll/i);
-			if (CHK("BIT_MACRO",
+			if ($realfile !~ m@\btools/@ &&
+			    CHK("BIT_MACRO",
 				"Prefer using the BIT$ull macro\n" . $herecurr) &&
 			    $fix) {
 				$fixed[$fixlinenr] =~ s/\(?\s*1\s*[ulUL]*\s*<<\s*(\d+|$Ident)\s*\)?/BIT${ull}($1)/;
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 37/47] Revert "vmlinux.lds.h: remove stale <linux/export.h> include"
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (35 preceding siblings ...)
  2019-10-23  4:38 ` [RFC PATCH 36/47] checkpatch: avoid showing BIT_ULL warnings for tools/ files Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 38/47] Revert "export.h: remove code for prefixing symbols with underscore" Hajime Tazaki
                   ` (12 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Akira Moroo

From: Akira Moroo <retrage01@gmail.com>

This reverts commit 7953002a7c6561c93defd19c81737012ef5a10dc.

Signed-off-by: Akira Moroo <retrage01@gmail.com>
---
 include/asm-generic/vmlinux.lds.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index cd28f63bfbc7..8c923ca77d56 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -54,6 +54,8 @@
 #define LOAD_OFFSET 0
 #endif
 
+#include <linux/export.h>
+
 /* Align . to a 8 byte boundary equals to maximum function alignment. */
 #define ALIGN_FUNCTION()  . = ALIGN(8)
 
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 38/47] Revert "export.h: remove code for prefixing symbols with underscore"
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (36 preceding siblings ...)
  2019-10-23  4:38 ` [RFC PATCH 37/47] Revert "vmlinux.lds.h: remove stale <linux/export.h> include" Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 39/47] Revert "linux/linkage.h: replace VMLINUX_SYMBOL_STR() with __stringify()" Hajime Tazaki
                   ` (11 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Hajime Tazaki, Akira Moroo

For LKL, mingw32 requires underscore-prefixed symbols.

This reverts commit 94e58e0ac31284fa26597c0e00a9b1d87a691d02.
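
The prefixing scheme can be sketched in standalone C. This is only a
sketch under assumptions: EXAMPLE_UNDERSCORE_PREFIX is a hypothetical
stand-in for CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX, and the EX_* macros
mirror the shape of VMLINUX_SYMBOL_STR() rather than reuse it.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical switch standing in for the Kconfig option. */
#define EXAMPLE_UNDERSCORE_PREFIX 1

#if EXAMPLE_UNDERSCORE_PREFIX
#define EX_SYMBOL_STR_(x) "_" #x
#else
#define EX_SYMBOL_STR_(x) #x
#endif

/* Indirect, so that a macro argument is expanded before stringifying. */
#define EX_SYMBOL_STR(x) EX_SYMBOL_STR_(x)

/* A macro that names another symbol, to show why the indirection matters. */
#define EX_ALIAS sys_ni_syscall

/* Name of a symbol passed directly. */
const char *example_direct(void)
{
	return EX_SYMBOL_STR(sys_open);
}

/* Name of a symbol passed through another macro. */
const char *example_via_macro(void)
{
	return EX_SYMBOL_STR(EX_ALIAS);
}
```

With the switch set to 1, both helpers return the underscore-prefixed
name, which is the form a toolchain with that symbol convention expects
to see in assembler directives.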

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
---
 include/asm-generic/export.h | 34 ++++++++++++++++++++++------------
 include/linux/export.h       | 23 ++++++++++++++++++-----
 2 files changed, 40 insertions(+), 17 deletions(-)

diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h
index 294d6ae785d4..69ce0914b025 100644
--- a/include/asm-generic/export.h
+++ b/include/asm-generic/export.h
@@ -27,32 +27,42 @@
 #endif
 .endm
 
+#ifdef CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX
+#define KSYM(name) _##name
+#else
+#define KSYM(name) name
+#endif
+
 /*
  * note on .section use: @progbits vs %progbits nastiness doesn't matter,
  * since we immediately emit into those sections anyway.
  */
 .macro ___EXPORT_SYMBOL name,val,sec
 #ifdef CONFIG_MODULES
-	.globl __ksymtab_\name
+	.globl KSYM(__ksymtab_\name)
 	.section ___ksymtab\sec+\name,"a"
 	.balign KSYM_ALIGN
-__ksymtab_\name:
-	__put \val, __kstrtab_\name
+KSYM(__ksymtab_\name):
+	__put \val, KSYM(__kstrtab_\name)
 	.previous
 	.section __ksymtab_strings,"a"
-__kstrtab_\name:
+KSYM(__kstrtab_\name):
+#ifdef CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX
+	.asciz "_\name"
+#else
 	.asciz "\name"
+#endif
 	.previous
 #ifdef CONFIG_MODVERSIONS
 	.section ___kcrctab\sec+\name,"a"
 	.balign KCRC_ALIGN
-__kcrctab_\name:
+KSYM(__kcrctab_\name):
 #if defined(CONFIG_MODULE_REL_CRCS)
-	.long __crc_\name - .
+	.long KSYM(__crc_\name) - .
 #else
-	.long __crc_\name
+	.long KSYM(__crc_\name)
 #endif
-	.weak __crc_\name
+	.weak KSYM(__crc_\name)
 	.previous
 #endif
 #endif
@@ -85,12 +95,12 @@ __ksym_marker_\sym:
 #endif
 
 #define EXPORT_SYMBOL(name)					\
-	__EXPORT_SYMBOL(name, KSYM_FUNC(name),)
+	__EXPORT_SYMBOL(name, KSYM_FUNC(KSYM(name)),)
 #define EXPORT_SYMBOL_GPL(name) 				\
-	__EXPORT_SYMBOL(name, KSYM_FUNC(name), _gpl)
+	__EXPORT_SYMBOL(name, KSYM_FUNC(KSYM(name)), _gpl)
 #define EXPORT_DATA_SYMBOL(name)				\
-	__EXPORT_SYMBOL(name, name,)
+	__EXPORT_SYMBOL(name, KSYM(name),)
 #define EXPORT_DATA_SYMBOL_GPL(name)				\
-	__EXPORT_SYMBOL(name, name,_gpl)
+	__EXPORT_SYMBOL(name, KSYM(name),_gpl)
 
 #endif
diff --git a/include/linux/export.h b/include/linux/export.h
index fd8711ed9ac4..34c34d09103c 100644
--- a/include/linux/export.h
+++ b/include/linux/export.h
@@ -10,6 +10,19 @@
  * hackers place grumpy comments in header files.
  */
 
+/* Some toolchains use a `_' prefix for all user symbols. */
+#ifdef CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX
+#define __VMLINUX_SYMBOL(x) _##x
+#define __VMLINUX_SYMBOL_STR(x) "_" #x
+#else
+#define __VMLINUX_SYMBOL(x) x
+#define __VMLINUX_SYMBOL_STR(x) #x
+#endif
+
+/* Indirect, so macros are expanded before pasting. */
+#define VMLINUX_SYMBOL(x) __VMLINUX_SYMBOL(x)
+#define VMLINUX_SYMBOL_STR(x) __VMLINUX_SYMBOL_STR(x)
+
 #ifndef __ASSEMBLY__
 #ifdef MODULE
 extern struct module __this_module;
@@ -27,14 +40,14 @@ extern struct module __this_module;
 #if defined(CONFIG_MODULE_REL_CRCS)
 #define __CRC_SYMBOL(sym, sec)						\
 	asm("	.section \"___kcrctab" sec "+" #sym "\", \"a\"	\n"	\
-	    "	.weak	__crc_" #sym "				\n"	\
-	    "	.long	__crc_" #sym " - .			\n"	\
+	    "	.weak	" VMLINUX_SYMBOL_STR(__crc_##sym) "	\n"	\
+	    "	.long	" VMLINUX_SYMBOL_STR(__crc_##sym) " - .	\n"	\
 	    "	.previous					\n");
 #else
 #define __CRC_SYMBOL(sym, sec)						\
 	asm("	.section \"___kcrctab" sec "+" #sym "\", \"a\"	\n"	\
-	    "	.weak	__crc_" #sym "				\n"	\
-	    "	.long	__crc_" #sym "				\n"	\
+	    "	.weak	" VMLINUX_SYMBOL_STR(__crc_##sym) "	\n"	\
+	    "	.long	" VMLINUX_SYMBOL_STR(__crc_##sym) "	\n"	\
 	    "	.previous					\n");
 #endif
 #else
@@ -80,7 +93,7 @@ struct kernel_symbol {
 	__CRC_SYMBOL(sym, sec)						\
 	static const char __kstrtab_##sym[]				\
 	__attribute__((section("__ksymtab_strings"), used, aligned(1)))	\
-	= #sym;								\
+	= VMLINUX_SYMBOL_STR(sym);					\
 	__KSYMTAB_ENTRY(sym, sec)
 
 #if defined(__DISABLE_EXPORTS)
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 39/47] Revert "linux/linkage.h: replace VMLINUX_SYMBOL_STR() with __stringify()"
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (37 preceding siblings ...)
  2019-10-23  4:38 ` [RFC PATCH 38/47] Revert "export.h: remove code for prefixing symbols with underscore" Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 40/47] Revert "vmlinux.lds.h: remove no-op macro VMLINUX_SYMBOL()" Hajime Tazaki
                   ` (10 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Hajime Tazaki, Akira Moroo

For LKL, mingw32 requires underscore-prefixed symbols.

This reverts commit 00979ce4fcc90d488c7f27f750097adc6b11bd07.

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
---
 include/linux/linkage.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/linux/linkage.h b/include/linux/linkage.h
index 7e020782ade2..d287823ee947 100644
--- a/include/linux/linkage.h
+++ b/include/linux/linkage.h
@@ -24,16 +24,16 @@
 
 #ifndef cond_syscall
 #define cond_syscall(x)	asm(				\
-	".weak " __stringify(x) "\n\t"			\
-	".set  " __stringify(x) ","			\
-		 __stringify(sys_ni_syscall))
+	".weak " VMLINUX_SYMBOL_STR(x) "\n\t"		\
+	".set  " VMLINUX_SYMBOL_STR(x) ","		\
+		 VMLINUX_SYMBOL_STR(sys_ni_syscall))
 #endif
 
 #ifndef SYSCALL_ALIAS
 #define SYSCALL_ALIAS(alias, name) asm(			\
-	".globl " __stringify(alias) "\n\t"		\
-	".set   " __stringify(alias) ","		\
-		  __stringify(name))
+	".globl " VMLINUX_SYMBOL_STR(alias) "\n\t"	\
+	".set   " VMLINUX_SYMBOL_STR(alias) ","		\
+		  VMLINUX_SYMBOL_STR(name))
 #endif
 
 #define __page_aligned_data	__section(.data..page_aligned) __aligned(PAGE_SIZE)
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 40/47] Revert "vmlinux.lds.h: remove no-op macro VMLINUX_SYMBOL()"
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (38 preceding siblings ...)
  2019-10-23  4:38 ` [RFC PATCH 39/47] Revert "linux/linkage.h: replace VMLINUX_SYMBOL_STR() with __stringify()" Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 41/47] Revert "kbuild: remove CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX" Hajime Tazaki
                   ` (9 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Hajime Tazaki, Akira Moroo

For LKL, mingw32 requires underscore-prefixed symbols.

This reverts commit a6214385005333202c8cc1744c7075a9e1a26b9a.

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
---
 include/asm-generic/vmlinux.lds.h | 291 +++++++++++++++---------------
 1 file changed, 146 insertions(+), 145 deletions(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 8c923ca77d56..e5cbf009bc31 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -119,67 +119,67 @@
 			__stop_mcount_loc = .;
 #else
 #define MCOUNT_REC()	. = ALIGN(8);				\
-			__start_mcount_loc = .;			\
+			VMLINUX_SYMBOL(__start_mcount_loc) = .; \
 			KEEP(*(__mcount_loc))			\
-			__stop_mcount_loc = .;
+			VMLINUX_SYMBOL(__stop_mcount_loc) = .;
 #endif
 #else
 #define MCOUNT_REC()
 #endif
 
 #ifdef CONFIG_TRACE_BRANCH_PROFILING
-#define LIKELY_PROFILE()	__start_annotated_branch_profile = .;	\
-				KEEP(*(_ftrace_annotated_branch))	\
-				__stop_annotated_branch_profile = .;
+#define LIKELY_PROFILE()	VMLINUX_SYMBOL(__start_annotated_branch_profile) = .; \
+				*(_ftrace_annotated_branch)			      \
+				VMLINUX_SYMBOL(__stop_annotated_branch_profile) = .;
 #else
 #define LIKELY_PROFILE()
 #endif
 
 #ifdef CONFIG_PROFILE_ALL_BRANCHES
-#define BRANCH_PROFILE()	__start_branch_profile = .;		\
-				KEEP(*(_ftrace_branch))			\
-				__stop_branch_profile = .;
+#define BRANCH_PROFILE()	VMLINUX_SYMBOL(__start_branch_profile) = .;   \
+				*(_ftrace_branch)			      \
+				VMLINUX_SYMBOL(__stop_branch_profile) = .;
 #else
 #define BRANCH_PROFILE()
 #endif
 
 #ifdef CONFIG_KPROBES
 #define KPROBE_BLACKLIST()	. = ALIGN(8);				      \
-				__start_kprobe_blacklist = .;		      \
+				VMLINUX_SYMBOL(__start_kprobe_blacklist) = .; \
 				KEEP(*(_kprobe_blacklist))		      \
-				__stop_kprobe_blacklist = .;
+				VMLINUX_SYMBOL(__stop_kprobe_blacklist) = .;
 #else
 #define KPROBE_BLACKLIST()
 #endif
 
 #ifdef CONFIG_FUNCTION_ERROR_INJECTION
 #define ERROR_INJECT_WHITELIST()	STRUCT_ALIGN();			      \
-			__start_error_injection_whitelist = .;		      \
+			VMLINUX_SYMBOL(__start_error_injection_whitelist) = .;\
 			KEEP(*(_error_injection_whitelist))		      \
-			__stop_error_injection_whitelist = .;
+			VMLINUX_SYMBOL(__stop_error_injection_whitelist) = .;
 #else
 #define ERROR_INJECT_WHITELIST()
 #endif
 
 #ifdef CONFIG_EVENT_TRACING
 #define FTRACE_EVENTS()	. = ALIGN(8);					\
-			__start_ftrace_events = .;			\
+			VMLINUX_SYMBOL(__start_ftrace_events) = .;	\
 			KEEP(*(_ftrace_events))				\
-			__stop_ftrace_events = .;			\
-			__start_ftrace_eval_maps = .;			\
+			VMLINUX_SYMBOL(__stop_ftrace_events) = .;	\
+			VMLINUX_SYMBOL(__start_ftrace_eval_maps) = .;	\
 			KEEP(*(_ftrace_eval_map))			\
-			__stop_ftrace_eval_maps = .;
+			VMLINUX_SYMBOL(__stop_ftrace_eval_maps) = .;
 #else
 #define FTRACE_EVENTS()
 #endif
 
 #ifdef CONFIG_TRACING
-#define TRACE_PRINTKS()	 __start___trace_bprintk_fmt = .;      \
+#define TRACE_PRINTKS() VMLINUX_SYMBOL(__start___trace_bprintk_fmt) = .;      \
 			 KEEP(*(__trace_printk_fmt)) /* Trace_printk fmt' pointer */ \
-			 __stop___trace_bprintk_fmt = .;
-#define TRACEPOINT_STR() __start___tracepoint_str = .;	\
+			 VMLINUX_SYMBOL(__stop___trace_bprintk_fmt) = .;
+#define TRACEPOINT_STR() VMLINUX_SYMBOL(__start___tracepoint_str) = .;	\
 			 KEEP(*(__tracepoint_str)) /* Trace_printk fmt' pointer */ \
-			 __stop___tracepoint_str = .;
+			 VMLINUX_SYMBOL(__stop___tracepoint_str) = .;
 #else
 #define TRACE_PRINTKS()
 #define TRACEPOINT_STR()
@@ -187,27 +187,27 @@
 
 #ifdef CONFIG_FTRACE_SYSCALLS
 #define TRACE_SYSCALLS() . = ALIGN(8);					\
-			 __start_syscalls_metadata = .;			\
+			 VMLINUX_SYMBOL(__start_syscalls_metadata) = .;	\
 			 KEEP(*(__syscalls_metadata))			\
-			 __stop_syscalls_metadata = .;
+			 VMLINUX_SYMBOL(__stop_syscalls_metadata) = .;
 #else
 #define TRACE_SYSCALLS()
 #endif
 
 #ifdef CONFIG_BPF_EVENTS
 #define BPF_RAW_TP() STRUCT_ALIGN();					\
-			 __start__bpf_raw_tp = .;			\
+			 VMLINUX_SYMBOL(__start__bpf_raw_tp) = .;	\
 			 KEEP(*(__bpf_raw_tp_map))			\
-			 __stop__bpf_raw_tp = .;
+			 VMLINUX_SYMBOL(__stop__bpf_raw_tp) = .;
 #else
 #define BPF_RAW_TP()
 #endif
 
 #ifdef CONFIG_SERIAL_EARLYCON
 #define EARLYCON_TABLE() . = ALIGN(8);				\
-			 __earlycon_table = .;			\
+			 VMLINUX_SYMBOL(__earlycon_table) = .;	\
 			 KEEP(*(__earlycon_table))		\
-			 __earlycon_table_end = .;
+			 VMLINUX_SYMBOL(__earlycon_table_end) = .;
 #else
 #define EARLYCON_TABLE()
 #endif
@@ -227,7 +227,7 @@
 #define _OF_TABLE_0(name)
 #define _OF_TABLE_1(name)						\
 	. = ALIGN(8);							\
-	__##name##_of_table = .;					\
+	VMLINUX_SYMBOL(__##name##_of_table) = .;			\
 	KEEP(*(__##name##_of_table))					\
 	KEEP(*(__##name##_of_table_end))
 
@@ -241,9 +241,9 @@
 #ifdef CONFIG_ACPI
 #define ACPI_PROBE_TABLE(name)						\
 	. = ALIGN(8);							\
-	__##name##_acpi_probe_table = .;				\
+	VMLINUX_SYMBOL(__##name##_acpi_probe_table) = .;		\
 	KEEP(*(__##name##_acpi_probe_table))				\
-	__##name##_acpi_probe_table_end = .;
+	VMLINUX_SYMBOL(__##name##_acpi_probe_table_end) = .;
 #else
 #define ACPI_PROBE_TABLE(name)
 #endif
@@ -260,9 +260,9 @@
 
 #define KERNEL_DTB()							\
 	STRUCT_ALIGN();							\
-	__dtb_start = .;						\
+	VMLINUX_SYMBOL(__dtb_start) = .;				\
 	KEEP(*(.dtb.init.rodata))					\
-	__dtb_end = .;
+	VMLINUX_SYMBOL(__dtb_end) = .;
 
 /*
  * .data section
@@ -275,16 +275,16 @@
 	MEM_KEEP(init.data*)						\
 	MEM_KEEP(exit.data*)						\
 	*(.data.unlikely)						\
-	__start_once = .;						\
+	VMLINUX_SYMBOL(__start_once) = .;				\
 	*(.data.once)							\
-	__end_once = .;							\
+	VMLINUX_SYMBOL(__end_once) = .;					\
 	STRUCT_ALIGN();							\
 	*(__tracepoints)						\
 	/* implement dynamic printk debug */				\
 	. = ALIGN(8);							\
-	__start___verbose = .;						\
+	VMLINUX_SYMBOL(__start___verbose) = .;                          \
 	KEEP(*(__verbose))                                              \
-	__stop___verbose = .;						\
+	VMLINUX_SYMBOL(__stop___verbose) = .;				\
 	LIKELY_PROFILE()		       				\
 	BRANCH_PROFILE()						\
 	TRACE_PRINTKS()							\
@@ -296,10 +296,10 @@
  */
 #define NOSAVE_DATA							\
 	. = ALIGN(PAGE_SIZE);						\
-	__nosave_begin = .;						\
+	VMLINUX_SYMBOL(__nosave_begin) = .;				\
 	*(.data..nosave)						\
 	. = ALIGN(PAGE_SIZE);						\
-	__nosave_end = .;
+	VMLINUX_SYMBOL(__nosave_end) = .;
 
 #define PAGE_ALIGNED_DATA(page_align)					\
 	. = ALIGN(page_align);						\
@@ -316,19 +316,19 @@
 
 #define INIT_TASK_DATA(align)						\
 	. = ALIGN(align);						\
-	__start_init_task = .;						\
-	init_thread_union = .;						\
-	init_stack = .;							\
-	KEEP(*(.data..init_task))					\
-	KEEP(*(.data..init_thread_info))				\
-	. = __start_init_task + THREAD_SIZE;				\
-	__end_init_task = .;
+	VMLINUX_SYMBOL(__start_init_task) = .;				\
+	VMLINUX_SYMBOL(init_thread_union) = .;				\
+	VMLINUX_SYMBOL(init_stack) = .;					\
+	*(.data..init_task)						\
+	*(.data..init_thread_info)					\
+	. = VMLINUX_SYMBOL(__start_init_task) + THREAD_SIZE;		\
+	VMLINUX_SYMBOL(__end_init_task) = .;
 
 #define JUMP_TABLE_DATA							\
 	. = ALIGN(8);							\
-	__start___jump_table = .;					\
+	VMLINUX_SYMBOL(__start___jump_table) = .;                       \
 	KEEP(*(__jump_table))						\
-	__stop___jump_table = .;
+	VMLINUX_SYMBOL(__stop___jump_table) = .;
 
 /*
  * Allow architectures to handle ro_after_init data on their
@@ -336,10 +336,10 @@
  */
 #ifndef RO_AFTER_INIT_DATA
 #define RO_AFTER_INIT_DATA						\
-	__start_ro_after_init = .;					\
+	VMLINUX_SYMBOL(__start_ro_after_init) = .;			\
 	*(.data..ro_after_init)						\
 	JUMP_TABLE_DATA							\
-	__end_ro_after_init = .;
+	VMLINUX_SYMBOL(__end_ro_after_init) = .;
 #endif
 
 /*
@@ -347,14 +347,14 @@
  */
 #define RO_DATA_SECTION(align)						\
 	. = ALIGN((align));						\
-	.rodata           : AT(ADDR(.rodata) - LOAD_OFFSET) {		\
-		__start_rodata = .;					\
-		*(.rodata) *(.rodata.*)					\
+	RODATA_SECTION    : AT(ADDR(RODATA_SECTION) - LOAD_OFFSET) {	\
+		VMLINUX_SYMBOL(__start_rodata) = .;			\
+		*(RODATA_SECTION) *(RODATA_SECTION.*)			\
 		RO_AFTER_INIT_DATA	/* Read only after init */	\
 		. = ALIGN(8);						\
-		__start___tracepoints_ptrs = .;				\
+		VMLINUX_SYMBOL(__start___tracepoints_ptrs) = .;		\
 		KEEP(*(__tracepoints_ptrs)) /* Tracepoints: pointer array */ \
-		__stop___tracepoints_ptrs = .;				\
+		VMLINUX_SYMBOL(__stop___tracepoints_ptrs) = .;		\
 		*(__tracepoints_strings)/* Tracepoints: strings */	\
 	}								\
 									\
@@ -364,109 +364,109 @@
 									\
 	/* PCI quirks */						\
 	.pci_fixup        : AT(ADDR(.pci_fixup) - LOAD_OFFSET) {	\
-		__start_pci_fixups_early = .;				\
+		VMLINUX_SYMBOL(__start_pci_fixups_early) = .;		\
 		KEEP(*(.pci_fixup_early))				\
-		__end_pci_fixups_early = .;				\
-		__start_pci_fixups_header = .;				\
+		VMLINUX_SYMBOL(__end_pci_fixups_early) = .;		\
+		VMLINUX_SYMBOL(__start_pci_fixups_header) = .;		\
 		KEEP(*(.pci_fixup_header))				\
-		__end_pci_fixups_header = .;				\
-		__start_pci_fixups_final = .;				\
+		VMLINUX_SYMBOL(__end_pci_fixups_header) = .;		\
+		VMLINUX_SYMBOL(__start_pci_fixups_final) = .;		\
 		KEEP(*(.pci_fixup_final))				\
-		__end_pci_fixups_final = .;				\
-		__start_pci_fixups_enable = .;				\
+		VMLINUX_SYMBOL(__end_pci_fixups_final) = .;		\
+		VMLINUX_SYMBOL(__start_pci_fixups_enable) = .;		\
 		KEEP(*(.pci_fixup_enable))				\
-		__end_pci_fixups_enable = .;				\
-		__start_pci_fixups_resume = .;				\
+		VMLINUX_SYMBOL(__end_pci_fixups_enable) = .;		\
+		VMLINUX_SYMBOL(__start_pci_fixups_resume) = .;		\
 		KEEP(*(.pci_fixup_resume))				\
-		__end_pci_fixups_resume = .;				\
-		__start_pci_fixups_resume_early = .;			\
+		VMLINUX_SYMBOL(__end_pci_fixups_resume) = .;		\
+		VMLINUX_SYMBOL(__start_pci_fixups_resume_early) = .;	\
 		KEEP(*(.pci_fixup_resume_early))			\
-		__end_pci_fixups_resume_early = .;			\
-		__start_pci_fixups_suspend = .;				\
+		VMLINUX_SYMBOL(__end_pci_fixups_resume_early) = .;	\
+		VMLINUX_SYMBOL(__start_pci_fixups_suspend) = .;		\
 		KEEP(*(.pci_fixup_suspend))				\
-		__end_pci_fixups_suspend = .;				\
-		__start_pci_fixups_suspend_late = .;			\
+		VMLINUX_SYMBOL(__end_pci_fixups_suspend) = .;		\
+		VMLINUX_SYMBOL(__start_pci_fixups_suspend_late) = .;	\
 		KEEP(*(.pci_fixup_suspend_late))			\
-		__end_pci_fixups_suspend_late = .;			\
+		VMLINUX_SYMBOL(__end_pci_fixups_suspend_late) = .;	\
 	}								\
 									\
 	/* Built-in firmware blobs */					\
 	.builtin_fw        : AT(ADDR(.builtin_fw) - LOAD_OFFSET) {	\
-		__start_builtin_fw = .;					\
+		VMLINUX_SYMBOL(__start_builtin_fw) = .;			\
 		KEEP(*(.builtin_fw))					\
-		__end_builtin_fw = .;					\
+		VMLINUX_SYMBOL(__end_builtin_fw) = .;			\
 	}								\
 									\
 	TRACEDATA							\
 									\
 	/* Kernel symbol table: Normal symbols */			\
 	__ksymtab         : AT(ADDR(__ksymtab) - LOAD_OFFSET) {		\
-		__start___ksymtab = .;					\
+		VMLINUX_SYMBOL(__start___ksymtab) = .;			\
 		KEEP(*(SORT(___ksymtab+*)))				\
-		__stop___ksymtab = .;					\
+		VMLINUX_SYMBOL(__stop___ksymtab) = .;			\
 	}								\
 									\
 	/* Kernel symbol table: GPL-only symbols */			\
 	__ksymtab_gpl     : AT(ADDR(__ksymtab_gpl) - LOAD_OFFSET) {	\
-		__start___ksymtab_gpl = .;				\
+		VMLINUX_SYMBOL(__start___ksymtab_gpl) = .;		\
 		KEEP(*(SORT(___ksymtab_gpl+*)))				\
-		__stop___ksymtab_gpl = .;				\
+		VMLINUX_SYMBOL(__stop___ksymtab_gpl) = .;		\
 	}								\
 									\
 	/* Kernel symbol table: Normal unused symbols */		\
 	__ksymtab_unused  : AT(ADDR(__ksymtab_unused) - LOAD_OFFSET) {	\
-		__start___ksymtab_unused = .;				\
+		VMLINUX_SYMBOL(__start___ksymtab_unused) = .;		\
 		KEEP(*(SORT(___ksymtab_unused+*)))			\
-		__stop___ksymtab_unused = .;				\
+		VMLINUX_SYMBOL(__stop___ksymtab_unused) = .;		\
 	}								\
 									\
 	/* Kernel symbol table: GPL-only unused symbols */		\
 	__ksymtab_unused_gpl : AT(ADDR(__ksymtab_unused_gpl) - LOAD_OFFSET) { \
-		__start___ksymtab_unused_gpl = .;			\
+		VMLINUX_SYMBOL(__start___ksymtab_unused_gpl) = .;	\
 		KEEP(*(SORT(___ksymtab_unused_gpl+*)))			\
-		__stop___ksymtab_unused_gpl = .;			\
+		VMLINUX_SYMBOL(__stop___ksymtab_unused_gpl) = .;	\
 	}								\
 									\
 	/* Kernel symbol table: GPL-future-only symbols */		\
 	__ksymtab_gpl_future : AT(ADDR(__ksymtab_gpl_future) - LOAD_OFFSET) { \
-		__start___ksymtab_gpl_future = .;			\
+		VMLINUX_SYMBOL(__start___ksymtab_gpl_future) = .;	\
 		KEEP(*(SORT(___ksymtab_gpl_future+*)))			\
-		__stop___ksymtab_gpl_future = .;			\
+		VMLINUX_SYMBOL(__stop___ksymtab_gpl_future) = .;	\
 	}								\
 									\
 	/* Kernel symbol table: Normal symbols */			\
 	__kcrctab         : AT(ADDR(__kcrctab) - LOAD_OFFSET) {		\
-		__start___kcrctab = .;					\
+		VMLINUX_SYMBOL(__start___kcrctab) = .;			\
 		KEEP(*(SORT(___kcrctab+*)))				\
-		__stop___kcrctab = .;					\
+		VMLINUX_SYMBOL(__stop___kcrctab) = .;			\
 	}								\
 									\
 	/* Kernel symbol table: GPL-only symbols */			\
 	__kcrctab_gpl     : AT(ADDR(__kcrctab_gpl) - LOAD_OFFSET) {	\
-		__start___kcrctab_gpl = .;				\
+		VMLINUX_SYMBOL(__start___kcrctab_gpl) = .;		\
 		KEEP(*(SORT(___kcrctab_gpl+*)))				\
-		__stop___kcrctab_gpl = .;				\
+		VMLINUX_SYMBOL(__stop___kcrctab_gpl) = .;		\
 	}								\
 									\
 	/* Kernel symbol table: Normal unused symbols */		\
 	__kcrctab_unused  : AT(ADDR(__kcrctab_unused) - LOAD_OFFSET) {	\
-		__start___kcrctab_unused = .;				\
+		VMLINUX_SYMBOL(__start___kcrctab_unused) = .;		\
 		KEEP(*(SORT(___kcrctab_unused+*)))			\
-		__stop___kcrctab_unused = .;				\
+		VMLINUX_SYMBOL(__stop___kcrctab_unused) = .;		\
 	}								\
 									\
 	/* Kernel symbol table: GPL-only unused symbols */		\
 	__kcrctab_unused_gpl : AT(ADDR(__kcrctab_unused_gpl) - LOAD_OFFSET) { \
-		__start___kcrctab_unused_gpl = .;			\
+		VMLINUX_SYMBOL(__start___kcrctab_unused_gpl) = .;	\
 		KEEP(*(SORT(___kcrctab_unused_gpl+*)))			\
-		__stop___kcrctab_unused_gpl = .;			\
+		VMLINUX_SYMBOL(__stop___kcrctab_unused_gpl) = .;	\
 	}								\
 									\
 	/* Kernel symbol table: GPL-future-only symbols */		\
 	__kcrctab_gpl_future : AT(ADDR(__kcrctab_gpl_future) - LOAD_OFFSET) { \
-		__start___kcrctab_gpl_future = .;			\
+		VMLINUX_SYMBOL(__start___kcrctab_gpl_future) = .;	\
 		KEEP(*(SORT(___kcrctab_gpl_future+*)))			\
-		__stop___kcrctab_gpl_future = .;			\
+		VMLINUX_SYMBOL(__stop___kcrctab_gpl_future) = .;	\
 	}								\
 									\
 	/* Kernel symbol table: strings */				\
@@ -483,18 +483,18 @@
 									\
 	/* Built-in module parameters. */				\
 	__param : AT(ADDR(__param) - LOAD_OFFSET) {			\
-		__start___param = .;					\
+		VMLINUX_SYMBOL(__start___param) = .;			\
 		KEEP(*(__param))					\
-		__stop___param = .;					\
+		VMLINUX_SYMBOL(__stop___param) = .;			\
 	}								\
 									\
 	/* Built-in module versions. */					\
 	__modver : AT(ADDR(__modver) - LOAD_OFFSET) {			\
-		__start___modver = .;					\
+		VMLINUX_SYMBOL(__start___modver) = .;			\
 		KEEP(*(__modver))					\
-		__stop___modver = .;					\
+		VMLINUX_SYMBOL(__stop___modver) = .;			\
 		. = ALIGN((align));					\
-		__end_rodata = .;					\
+		VMLINUX_SYMBOL(__end_rodata) = .;			\
 	}								\
 	. = ALIGN((align));
 
@@ -524,47 +524,47 @@
  * address even at second ld pass when generating System.map */
 #define SCHED_TEXT							\
 		ALIGN_FUNCTION();					\
-		__sched_text_start = .;					\
+		VMLINUX_SYMBOL(__sched_text_start) = .;			\
 		*(.sched.text)						\
-		__sched_text_end = .;
+		VMLINUX_SYMBOL(__sched_text_end) = .;
 
 /* spinlock.text is aling to function alignment to secure we have same
  * address even at second ld pass when generating System.map */
 #define LOCK_TEXT							\
 		ALIGN_FUNCTION();					\
-		__lock_text_start = .;					\
+		VMLINUX_SYMBOL(__lock_text_start) = .;			\
 		*(.spinlock.text)					\
-		__lock_text_end = .;
+		VMLINUX_SYMBOL(__lock_text_end) = .;
 
 #define CPUIDLE_TEXT							\
 		ALIGN_FUNCTION();					\
-		__cpuidle_text_start = .;				\
+		VMLINUX_SYMBOL(__cpuidle_text_start) = .;		\
 		*(.cpuidle.text)					\
-		__cpuidle_text_end = .;
+		VMLINUX_SYMBOL(__cpuidle_text_end) = .;
 
 #define KPROBES_TEXT							\
 		ALIGN_FUNCTION();					\
-		__kprobes_text_start = .;				\
+		VMLINUX_SYMBOL(__kprobes_text_start) = .;		\
 		*(.kprobes.text)					\
-		__kprobes_text_end = .;
+		VMLINUX_SYMBOL(__kprobes_text_end) = .;
 
 #define ENTRY_TEXT							\
 		ALIGN_FUNCTION();					\
-		__entry_text_start = .;					\
+		VMLINUX_SYMBOL(__entry_text_start) = .;			\
 		*(.entry.text)						\
-		__entry_text_end = .;
+		VMLINUX_SYMBOL(__entry_text_end) = .;
 
 #define IRQENTRY_TEXT							\
 		ALIGN_FUNCTION();					\
-		__irqentry_text_start = .;				\
+		VMLINUX_SYMBOL(__irqentry_text_start) = .;		\
 		*(.irqentry.text)					\
-		__irqentry_text_end = .;
+		VMLINUX_SYMBOL(__irqentry_text_end) = .;
 
 #define SOFTIRQENTRY_TEXT						\
 		ALIGN_FUNCTION();					\
-		__softirqentry_text_start = .;				\
+		VMLINUX_SYMBOL(__softirqentry_text_start) = .;		\
 		*(.softirqentry.text)					\
-		__softirqentry_text_end = .;
+		VMLINUX_SYMBOL(__softirqentry_text_end) = .;
 
 /* Section used for early init (in .S files) */
 #define HEAD_TEXT  KEEP(*(.head.text))
@@ -580,9 +580,9 @@
 #define EXCEPTION_TABLE(align)						\
 	. = ALIGN(align);						\
 	__ex_table : AT(ADDR(__ex_table) - LOAD_OFFSET) {		\
-		__start___ex_table = .;					\
+		VMLINUX_SYMBOL(__start___ex_table) = .;			\
 		KEEP(*(__ex_table))					\
-		__stop___ex_table = .;					\
+		VMLINUX_SYMBOL(__stop___ex_table) = .;			\
 	}
 
 /*
@@ -596,11 +596,11 @@
 
 #ifdef CONFIG_CONSTRUCTORS
 #define KERNEL_CTORS()	. = ALIGN(8);			   \
-			__ctors_start = .;		   \
+			VMLINUX_SYMBOL(__ctors_start) = .; \
 			KEEP(*(.ctors))			   \
 			KEEP(*(SORT(.init_array.*)))	   \
 			KEEP(*(.init_array))		   \
-			__ctors_end = .;
+			VMLINUX_SYMBOL(__ctors_end) = .;
 #else
 #define KERNEL_CTORS()
 #endif
@@ -736,9 +736,9 @@
 #define BUG_TABLE							\
 	. = ALIGN(8);							\
 	__bug_table : AT(ADDR(__bug_table) - LOAD_OFFSET) {		\
-		__start___bug_table = .;				\
+		VMLINUX_SYMBOL(__start___bug_table) = .;		\
 		KEEP(*(__bug_table))					\
-		__stop___bug_table = .;					\
+		VMLINUX_SYMBOL(__stop___bug_table) = .;			\
 	}
 #else
 #define BUG_TABLE
@@ -748,22 +748,22 @@
 #define ORC_UNWIND_TABLE						\
 	. = ALIGN(4);							\
 	.orc_unwind_ip : AT(ADDR(.orc_unwind_ip) - LOAD_OFFSET) {	\
-		__start_orc_unwind_ip = .;				\
+		VMLINUX_SYMBOL(__start_orc_unwind_ip) = .;		\
 		KEEP(*(.orc_unwind_ip))					\
-		__stop_orc_unwind_ip = .;				\
+		VMLINUX_SYMBOL(__stop_orc_unwind_ip) = .;		\
 	}								\
 	. = ALIGN(2);							\
 	.orc_unwind : AT(ADDR(.orc_unwind) - LOAD_OFFSET) {		\
-		__start_orc_unwind = .;					\
+		VMLINUX_SYMBOL(__start_orc_unwind) = .;			\
 		KEEP(*(.orc_unwind))					\
-		__stop_orc_unwind = .;					\
+		VMLINUX_SYMBOL(__stop_orc_unwind) = .;			\
 	}								\
 	. = ALIGN(4);							\
 	.orc_lookup : AT(ADDR(.orc_lookup) - LOAD_OFFSET) {		\
-		orc_lookup = .;						\
+		VMLINUX_SYMBOL(orc_lookup) = .;				\
 		. += (((SIZEOF(.text) + LOOKUP_BLOCK_SIZE - 1) /	\
 			LOOKUP_BLOCK_SIZE) + 1) * 4;			\
-		orc_lookup_end = .;					\
+		VMLINUX_SYMBOL(orc_lookup_end) = .;			\
 	}
 #else
 #define ORC_UNWIND_TABLE
@@ -773,9 +773,9 @@
 #define TRACEDATA							\
 	. = ALIGN(4);							\
 	.tracedata : AT(ADDR(.tracedata) - LOAD_OFFSET) {		\
-		__tracedata_start = .;					\
+		VMLINUX_SYMBOL(__tracedata_start) = .;			\
 		KEEP(*(.tracedata))					\
-		__tracedata_end = .;					\
+		VMLINUX_SYMBOL(__tracedata_end) = .;			\
 	}
 #else
 #define TRACEDATA
@@ -783,24 +783,24 @@
 
 #define NOTES								\
 	.notes : AT(ADDR(.notes) - LOAD_OFFSET) {			\
-		__start_notes = .;					\
-		KEEP(*(.note.*))					\
-		__stop_notes = .;					\
+		VMLINUX_SYMBOL(__start_notes) = .;			\
+		*(.note.*)						\
+		VMLINUX_SYMBOL(__stop_notes) = .;			\
 	}
 
 #define INIT_SETUP(initsetup_align)					\
 		. = ALIGN(initsetup_align);				\
-		__setup_start = .;					\
+		VMLINUX_SYMBOL(__setup_start) = .;			\
 		KEEP(*(.init.setup))					\
-		__setup_end = .;
+		VMLINUX_SYMBOL(__setup_end) = .;
 
 #define INIT_CALLS_LEVEL(level)						\
-		__initcall##level##_start = .;				\
+		VMLINUX_SYMBOL(__initcall##level##_start) = .;		\
 		KEEP(*(.initcall##level##.init))			\
 		KEEP(*(.initcall##level##s.init))			\
 
 #define INIT_CALLS							\
-		__initcall_start = .;					\
+		VMLINUX_SYMBOL(__initcall_start) = .;			\
 		KEEP(*(.initcallearly.init))				\
 		INIT_CALLS_LEVEL(0)					\
 		INIT_CALLS_LEVEL(1)					\
@@ -811,17 +811,17 @@
 		INIT_CALLS_LEVEL(rootfs)				\
 		INIT_CALLS_LEVEL(6)					\
 		INIT_CALLS_LEVEL(7)					\
-		__initcall_end = .;
+		VMLINUX_SYMBOL(__initcall_end) = .;
 
 #define CON_INITCALL							\
-		__con_initcall_start = .;				\
+		VMLINUX_SYMBOL(__con_initcall_start) = .;		\
 		KEEP(*(.con_initcall.init))				\
-		__con_initcall_end = .;
+		VMLINUX_SYMBOL(__con_initcall_end) = .;
 
 #ifdef CONFIG_BLK_DEV_INITRD
 #define INIT_RAM_FS							\
 	. = ALIGN(4);							\
-	__initramfs_start = .;						\
+	VMLINUX_SYMBOL(__initramfs_start) = .;				\
 	KEEP(*(.init.ramfs))						\
 	. = ALIGN(8);							\
 	KEEP(*(.init.ramfs.info))
@@ -877,7 +877,7 @@
  * sharing between subsections for different purposes.
  */
 #define PERCPU_INPUT(cacheline)						\
-	__per_cpu_start = .;						\
+	VMLINUX_SYMBOL(__per_cpu_start) = .;				\
 	*(.data..percpu..first)						\
 	. = ALIGN(PAGE_SIZE);						\
 	*(.data..percpu..page_aligned)					\
@@ -887,7 +887,7 @@
 	*(.data..percpu)						\
 	*(.data..percpu..shared_aligned)				\
 	PERCPU_DECRYPTED_SECTION					\
-	__per_cpu_end = .;
+	VMLINUX_SYMBOL(__per_cpu_end) = .;
 
 /**
  * PERCPU_VADDR - define output section for percpu area
@@ -914,11 +914,12 @@
  * address, use PERCPU_SECTION.
  */
 #define PERCPU_VADDR(cacheline, vaddr, phdr)				\
-	__per_cpu_load = .;						\
-	.data..percpu vaddr : AT(__per_cpu_load - LOAD_OFFSET) {	\
+	VMLINUX_SYMBOL(__per_cpu_load) = .;				\
+	.data..percpu vaddr : AT(VMLINUX_SYMBOL(__per_cpu_load)		\
+				- LOAD_OFFSET) {			\
 		PERCPU_INPUT(cacheline)					\
 	} phdr								\
-	. = __per_cpu_load + SIZEOF(.data..percpu);
+	. = VMLINUX_SYMBOL(__per_cpu_load) + SIZEOF(.data..percpu);
 
 /**
  * PERCPU_SECTION - define output section for percpu area, simple version
@@ -935,7 +936,7 @@
 #define PERCPU_SECTION(cacheline)					\
 	. = ALIGN(PAGE_SIZE);						\
 	.data..percpu	: AT(ADDR(.data..percpu) - LOAD_OFFSET) {	\
-		__per_cpu_load = .;					\
+		VMLINUX_SYMBOL(__per_cpu_load) = .;			\
 		PERCPU_INPUT(cacheline)					\
 	}
 
@@ -974,9 +975,9 @@
 #define INIT_TEXT_SECTION(inittext_align)				\
 	. = ALIGN(inittext_align);					\
 	.init.text : AT(ADDR(.init.text) - LOAD_OFFSET) {		\
-		_sinittext = .;						\
+		VMLINUX_SYMBOL(_sinittext) = .;				\
 		INIT_TEXT						\
-		_einittext = .;						\
+		VMLINUX_SYMBOL(_einittext) = .;				\
 	}
 
 #define INIT_DATA_SECTION(initsetup_align)				\
@@ -990,8 +991,8 @@
 
 #define BSS_SECTION(sbss_align, bss_align, stop_align)			\
 	. = ALIGN(sbss_align);						\
-	__bss_start = .;						\
+	VMLINUX_SYMBOL(__bss_start) = .;				\
 	SBSS(sbss_align)						\
 	BSS(bss_align)							\
 	. = ALIGN(stop_align);						\
-	__bss_stop = .;
+	VMLINUX_SYMBOL(__bss_stop) = .;
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 41/47] Revert "kbuild: remove CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX"
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (39 preceding siblings ...)
  2019-10-23  4:38 ` [RFC PATCH 40/47] Revert "vmlinux.lds.h: remove no-op macro VMLINUX_SYMBOL()" Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 42/47] Revert "kallsyms: remove symbol prefix support" Hajime Tazaki
                   ` (8 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Hajime Tazaki, Akira Moroo

This reverts commit 704db5433fb43acbf1486303721bd0cbb65af251.

LKL needs this because the mingw32 toolchain prefixes C symbols with an
underscore.
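
As a rough illustration of what a config-gated symbol prefix means at the
source level, here is a minimal sketch. HAVE_UNDERSCORE_PREFIX, SYM_STR() and
linker_name_of_bss_start() are hypothetical names chosen for this example,
loosely modelled on how VMLINUX_SYMBOL() is used in the previous patch; they
are not the kernel's actual definitions.

```c
/* Hypothetical stand-in for CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX. */
#define HAVE_UNDERSCORE_PREFIX 1

#if HAVE_UNDERSCORE_PREFIX
#define SYM_STR_(x) "_" #x	/* toolchain prepends '_' to C symbols */
#else
#define SYM_STR_(x) #x		/* C name and linker name are identical */
#endif
/* Extra level so a macro argument is expanded before being stringified. */
#define SYM_STR(x) SYM_STR_(x)

/* Return the linker-level spelling of the C identifier __bss_start. */
static const char *linker_name_of_bss_start(void)
{
	return SYM_STR(__bss_start);
}
```

With the prefix enabled, the C identifier __bss_start is spelled with one
extra leading underscore at the linker level, which is why scripts that match
symbol names as text need to know about the prefix.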

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
---
 arch/Kconfig                | 6 ++++++
 scripts/Makefile.build      | 7 ++++++-
 scripts/adjust_autoksyms.sh | 7 ++++++-
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index a7b57dd42c26..a01df2ae6a1b 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -594,6 +594,12 @@ config MODULES_USE_ELF_REL
 	  Modules only use ELF REL relocations.  Modules with ELF RELA
 	  relocations will give an error.
 
+config HAVE_UNDERSCORE_SYMBOL_PREFIX
+	bool
+	help
+	  Some architectures generate an _ in front of C symbols; things like
+	  module loading and assembly files need to know about this.
+
 config HAVE_IRQ_EXIT_ON_IRQ_STACK
 	bool
 	help
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 2f66ed388d1c..c6fe3e092ae0 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -444,10 +444,15 @@ targets += $(lib-target)
 
 dummy-object = $(obj)/.lib_exports.o
 ksyms-lds = $(dot-target).lds
+ifdef CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX
+ref_prefix = EXTERN(_
+else
+ref_prefix = EXTERN(
+endif
 
 quiet_cmd_export_list = EXPORTS $@
 cmd_export_list = $(OBJDUMP) -h $< | \
-	sed -ne '/___ksymtab/s/.*+\([^ ]*\).*/EXTERN(\1)/p' >$(ksyms-lds);\
+	sed -ne '/___ksymtab/s/.*+\([^ ]*\).*/$(ref_prefix)\1)/p' >$(ksyms-lds);\
 	rm -f $(dummy-object);\
 	echo | $(CC) $(a_flags) -c -o $(dummy-object) -x assembler -;\
 	$(LD) $(ld_flags) -r -o $@ -T $(ksyms-lds) $(dummy-object);\
diff --git a/scripts/adjust_autoksyms.sh b/scripts/adjust_autoksyms.sh
index a904bf1f5e67..10e49e00a1f6 100755
--- a/scripts/adjust_autoksyms.sh
+++ b/scripts/adjust_autoksyms.sh
@@ -49,7 +49,12 @@ EOT
 sed 's/ko$/mod/' modules.order |
 xargs -n1 sed -n -e '2{s/ /\n/g;/^$/!p;}' -- |
 sort -u |
-sed -e 's/\(.*\)/#define __KSYM_\1 1/' >> "$new_ksyms_file"
+while read sym; do
+	if [ -n "$CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX" ]; then
+		sym="${sym#_}"
+	fi
+	echo "#define __KSYM_${sym} 1"
+done >> "$new_ksyms_file"
 
 # Special case for modversions (see modpost.c)
 if [ -n "$CONFIG_MODVERSIONS" ]; then
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 42/47] Revert "kallsyms: remove symbol prefix support"
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (40 preceding siblings ...)
  2019-10-23  4:38 ` [RFC PATCH 41/47] Revert "kbuild: remove CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX" Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 43/47] kallsyms: Add a config option to select section for kallsyms Hajime Tazaki
                   ` (7 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Hajime Tazaki, Akira Moroo

This reverts commit 534c9f2ec4c92adbe8791125e7ba66d5023ad51f.

LKL needs this because the mingw32 toolchain prefixes C symbols with an
underscore.
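
The core of the restored behaviour is to skip one leading prefix character
before comparing symbol names. A minimal sketch of that step follows;
skip_symbol_prefix() is a hypothetical helper written for this note, not a
function in scripts/kallsyms.c.

```c
#include <stddef.h>

/*
 * Return a pointer past the prefix character if the name starts with it.
 * A prefix of '\0' means "no prefix configured" and leaves the name as-is.
 * Hypothetical helper for illustration only.
 */
static const char *skip_symbol_prefix(const char *name, char prefix)
{
	if (prefix != '\0' && name[0] == prefix)
		return name + 1;
	return name;
}
```

On a toolchain that adds the prefix, the C symbol _text would appear in the
map as __text; skipping one underscore yields _text again, so a comparison
such as strcmp(sym, "_text") can still match.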

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
---
 scripts/kallsyms.c      | 49 +++++++++++++++++++++++++++++++----------
 scripts/link-vmlinux.sh |  4 ++++
 2 files changed, 41 insertions(+), 12 deletions(-)

diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index ae6504d07fd6..8a62e1b6cf22 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -60,6 +60,7 @@ static struct sym_entry *table;
 static unsigned int table_size, table_cnt;
 static int all_symbols = 0;
 static int absolute_percpu = 0;
+static char symbol_prefix_char = '\0';
 static int base_relative = 0;
 
 static int token_profit[0x10000];
@@ -72,6 +73,7 @@ static unsigned char best_table_len[256];
 static void usage(void)
 {
 	fprintf(stderr, "Usage: kallsyms [--all-symbols] "
+			"[--symbol-prefix=<prefix char>] "
 			"[--base-relative] < in.map > out.S\n");
 	exit(1);
 }
@@ -109,22 +111,28 @@ static int check_symbol_range(const char *sym, unsigned long long addr,
 
 static int read_symbol(FILE *in, struct sym_entry *s)
 {
-	char sym[500], stype;
+	char str[500];
+	char *sym, stype;
 	int rc;
 
-	rc = fscanf(in, "%llx %c %499s\n", &s->addr, &stype, sym);
+	rc = fscanf(in, "%llx %c %499s\n", &s->addr, &stype, str);
 	if (rc != 3) {
-		if (rc != EOF && fgets(sym, 500, in) == NULL)
+		if (rc != EOF && fgets(str, 500, in) == NULL)
 			fprintf(stderr, "Read error or end of file.\n");
 		return -1;
 	}
-	if (strlen(sym) >= KSYM_NAME_LEN) {
-		fprintf(stderr, "Symbol %s too long for kallsyms (%zu >= %d).\n"
+	if (strlen(str) > KSYM_NAME_LEN) {
+		fprintf(stderr, "Symbol %s too long for kallsyms (%zu vs %d).\n"
 				"Please increase KSYM_NAME_LEN both in kernel and kallsyms.c\n",
-			sym, strlen(sym), KSYM_NAME_LEN);
+			str, strlen(str), KSYM_NAME_LEN);
 		return -1;
 	}
 
+	sym = str;
+	/* skip prefix char */
+	if (symbol_prefix_char && str[0] == symbol_prefix_char)
+		sym++;
+
 	/* Ignore most absolute/undefined (?) symbols. */
 	if (strcmp(sym, "_text") == 0)
 		_text = s->addr;
@@ -145,7 +153,7 @@ static int read_symbol(FILE *in, struct sym_entry *s)
 		 is_arm_mapping_symbol(sym))
 		return -1;
 	/* exclude also MIPS ELF local symbols ($L123 instead of .L123) */
-	else if (sym[0] == '$')
+	else if (str[0] == '$')
 		return -1;
 	/* exclude debugging symbols */
 	else if (stype == 'N' || stype == 'n')
@@ -156,14 +164,14 @@ static int read_symbol(FILE *in, struct sym_entry *s)
 
 	/* include the type field in the symbol name, so that it gets
 	 * compressed together */
-	s->len = strlen(sym) + 1;
+	s->len = strlen(str) + 1;
 	s->sym = malloc(s->len + 1);
 	if (!s->sym) {
 		fprintf(stderr, "kallsyms failure: "
 			"unable to allocate required amount of memory\n");
 		exit(EXIT_FAILURE);
 	}
-	strcpy((char *)s->sym + 1, sym);
+	strcpy((char *)s->sym + 1, str);
 	s->sym[0] = stype;
 
 	s->percpu_absolute = 0;
@@ -226,6 +234,11 @@ static int symbol_valid(struct sym_entry *s)
 	int i;
 	char *sym_name = (char *)s->sym + 1;
 
+	/* skip prefix char */
+	if (symbol_prefix_char && *sym_name == symbol_prefix_char)
+		sym_name++;
+
+
 	/* if --all-symbols is not specified, then symbols outside the text
 	 * and inittext sections are discarded */
 	if (!all_symbols) {
@@ -290,9 +303,15 @@ static void read_map(FILE *in)
 
 static void output_label(char *label)
 {
-	printf(".globl %s\n", label);
+	if (symbol_prefix_char)
+		printf(".globl %c%s\n", symbol_prefix_char, label);
+	else
+		printf(".globl %s\n", label);
 	printf("\tALGN\n");
-	printf("%s:\n", label);
+	if (symbol_prefix_char)
+		printf("%c%s:\n", symbol_prefix_char, label);
+	else
+		printf("%s:\n", label);
 }
 
 /* uncompress a compressed symbol. When this function is called, the best table
@@ -749,7 +768,13 @@ int main(int argc, char **argv)
 				absolute_percpu = 1;
 			else if (strcmp(argv[i], "--base-relative") == 0)
 				base_relative = 1;
-			else
+			else if (strncmp(argv[i], "--symbol-prefix=", 16) == 0) {
+				char *p = &argv[i][16];
+				/* skip quote */
+				if ((*p == '"' && *(p+2) == '"') || (*p == '\'' && *(p+2) == '\''))
+					p++;
+				symbol_prefix_char = *p;
+			} else
 				usage();
 		}
 	} else if (argc != 1)
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 27d2066238c7..553d966a1986 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -117,6 +117,10 @@ kallsyms()
 	info KSYM ${2}
 	local kallsymopt;
 
+	if [ -n "${CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX}" ]; then
+		kallsymopt="${kallsymopt} --symbol-prefix=_"
+	fi
+
 	if [ -n "${CONFIG_KALLSYMS_ALL}" ]; then
 		kallsymopt="${kallsymopt} --all-symbols"
 	fi
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 43/47] kallsyms: Add a config option to select section for kallsyms
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (41 preceding siblings ...)
  2019-10-23  4:38 ` [RFC PATCH 42/47] Revert "kallsyms: remove symbol prefix support" Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 44/47] um lkl: use ARCH=um SUBARCH=lkl for tools/lkl Hajime Tazaki
                   ` (6 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Andreas Abel, Akira Moroo

From: Andreas Abel <aabel@google.com>

This commit adds a kernel config option to select whether the
kallsyms data should be in the .rodata section (the default for
non-LKL builds), or in the .data section (the default for LKL).

This is to avoid relocations in the text segment (TEXTRELs) that
would otherwise occur with LKL when the .rodata and the .text
section end up in the same segment.

Having TEXTRELs can lead to a number of issues:

1. If a shared library contains a TEXTREL, the corresponding memory
pages cannot be shared.

2. Android >=6 and SELinux do not support binaries with TEXTRELs
(http://android-developers.blogspot.com/2016/06/android-changes-for-ndk-developers.html).

3. If a program has a TEXTREL, uses an ifunc, and is compiled with
early binding, this can lead to a segmentation fault when processing
the relocation for the ifunc during dynamic linking because the text
segment is made temporarily non-executable to process the TEXTREL
(line 248 in dl_reloc.c).
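
The option described at the top of this message comes down to choosing which
section directive is emitted in front of the generated tables. A minimal
sketch, assuming a hypothetical helper kallsyms_section_directive() that
formats the directive into a caller-supplied buffer (the patch itself prints
the directive directly):

```c
#include <stdio.h>

/*
 * Format the assembler section directive for the kallsyms tables.
 * use_data_section != 0 selects the writable .data section, otherwise the
 * read-only .rodata section is used. Hypothetical helper for illustration.
 */
static int kallsyms_section_directive(char *buf, size_t len,
				      int use_data_section)
{
	if (use_data_section)
		return snprintf(buf, len, "\t.section .data\n");
	return snprintf(buf, len, "\t.section .rodata, \"a\"\n");
}
```

Placing the tables in .data keeps their relocations out of the segment that
holds .text, which is the situation the list above describes.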

Signed-off-by: Andreas Abel <aabel@google.com>
---
 init/Kconfig            | 12 ++++++++++++
 scripts/kallsyms.c      |  9 ++++++++-
 scripts/link-vmlinux.sh |  4 ++++
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/init/Kconfig b/init/Kconfig
index 81293d78a6ad..bd1a846e0ee0 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1385,6 +1385,18 @@ config POSIX_TIMERS
 
 	  If unsure say y.
 
+config KALLSYMS_USE_DATA_SECTION
+	bool "Use .data instead of .rodata section for kallsyms"
+	depends on KALLSYMS
+	default n
+	help
+	  Enabling this option will put the kallsyms data in the .data section
+	  instead of the .rodata section.
+
+	  This is useful when building the kernel as a library, as it avoids
+	  relocations in the text segment that could otherwise occur if the
+	  .rodata section is in the same segment as the .text section.
+
 config PRINTK
 	default y
 	bool "Enable support for printk" if EXPERT
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 8a62e1b6cf22..11d01516e1c8 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -59,6 +59,7 @@ static struct addr_range percpu_range = {
 static struct sym_entry *table;
 static unsigned int table_size, table_cnt;
 static int all_symbols = 0;
+static int use_data_section;
 static int absolute_percpu = 0;
 static char symbol_prefix_char = '\0';
 static int base_relative = 0;
@@ -73,6 +74,7 @@ static unsigned char best_table_len[256];
 static void usage(void)
 {
 	fprintf(stderr, "Usage: kallsyms [--all-symbols] "
+			"[--use-data-section] "
 			"[--symbol-prefix=<prefix char>] "
 			"[--base-relative] < in.map > out.S\n");
 	exit(1);
@@ -362,7 +364,10 @@ static void write_src(void)
 	printf("#define ALGN .balign 4\n");
 	printf("#endif\n");
 
-	printf("\t.section .rodata, \"a\"\n");
+	if (use_data_section)
+		printf("\t.section .data\n");
+	else
+		printf("\t.section .rodata, \"a\"\n");
 
 	/* Provide proper symbols relocatability by their relativeness
 	 * to a fixed anchor point in the runtime image, either '_text'
@@ -768,6 +773,8 @@ int main(int argc, char **argv)
 				absolute_percpu = 1;
 			else if (strcmp(argv[i], "--base-relative") == 0)
 				base_relative = 1;
+			else if (strcmp(argv[i], "--use-data-section") == 0)
+				use_data_section = 1;
 			else if (strncmp(argv[i], "--symbol-prefix=", 16) == 0) {
 				char *p = &argv[i][16];
 				/* skip quote */
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 553d966a1986..3fc1fc406b38 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -133,6 +133,10 @@ kallsyms()
 		kallsymopt="${kallsymopt} --base-relative"
 	fi
 
+	if [ -n "${CONFIG_KALLSYMS_USE_DATA_SECTION}" ]; then
+		kallsymopt="${kallsymopt} --use-data-section"
+	fi
+
 	local aflags="${KBUILD_AFLAGS} ${KBUILD_AFLAGS_KERNEL}               \
 		      ${NOSTDINC_FLAGS} ${LINUXINCLUDE} ${KBUILD_CPPFLAGS}"
 
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 44/47] um lkl: use ARCH=um SUBARCH=lkl for tools/lkl
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (42 preceding siblings ...)
  2019-10-23  4:38 ` [RFC PATCH 43/47] kallsyms: Add a config option to select section for kallsyms Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 45/47] um lkl: add CI tests Hajime Tazaki
                   ` (5 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Akira Moroo

From: Akira Moroo <retrage01@gmail.com>

This unifies the LKL code under arch/um so that we can treat LKL as one of
the modes of UML.

Signed-off-by: Akira Moroo <retrage01@gmail.com>
---
 arch/um/Kconfig                               |  50 ++++----
 arch/um/Makefile                              | 115 ++----------------
 arch/um/auto.conf                             |   0
 arch/um/include/asm/Kbuild                    |   5 +
 arch/um/lkl/um/Makefile                       |   1 +
 .../um/lkl/um/include/sysdep/kernel-offsets.h |   4 +
 6 files changed, 44 insertions(+), 131 deletions(-)
 create mode 100644 arch/um/auto.conf
 create mode 100644 arch/um/lkl/um/Makefile
 create mode 100644 arch/um/lkl/um/include/sysdep/kernel-offsets.h

diff --git a/arch/um/Kconfig b/arch/um/Kconfig
index 3c3adfc486f2..d7e9af63cf8f 100644
--- a/arch/um/Kconfig
+++ b/arch/um/Kconfig
@@ -5,23 +5,23 @@ menu "UML-specific options"
 config UML
 	bool
 	default y
-	select ARCH_HAS_KCOV
-	select ARCH_NO_PREEMPT
-	select HAVE_ARCH_AUDITSYSCALL
-	select HAVE_ARCH_SECCOMP_FILTER
-	select HAVE_UID16
-	select HAVE_FUTEX_CMPXCHG if FUTEX
-	select HAVE_DEBUG_KMEMLEAK
-	select HAVE_DEBUG_BUGVERBOSE
-	select GENERIC_IRQ_SHOW
-	select GENERIC_CPU_DEVICES
-	select GENERIC_CLOCKEVENTS
-	select HAVE_GCC_PLUGINS
-	select TTY # Needed for line.c
+	select ARCH_HAS_KCOV if !UML_LKL
+	select ARCH_NO_PREEMPT if !UML_LKL
+	select HAVE_ARCH_AUDITSYSCALL if !UML_LKL
+	select HAVE_ARCH_SECCOMP_FILTER if !UML_LKL
+	select HAVE_UID16 if !UML_LKL
+	select HAVE_FUTEX_CMPXCHG if (FUTEX && !UML_LKL)
+	select HAVE_DEBUG_KMEMLEAK if !UML_LKL
+	select HAVE_DEBUG_BUGVERBOSE if !UML_LKL
+	select GENERIC_IRQ_SHOW if !UML_LKL
+	select GENERIC_CPU_DEVICES if !UML_LKL
+	select GENERIC_CLOCKEVENTS if !UML_LKL
+	select HAVE_GCC_PLUGINS if !UML_LKL
+	select TTY if !UML_LKL # Needed for line.c
 
 config MMU
 	bool
-	default y
+	default y if !UML_LKL
 
 config NO_IOMEM
 	def_bool y
@@ -34,20 +34,20 @@ config SBUS
 
 config TRACE_IRQFLAGS_SUPPORT
 	bool
-	default y
+	default y if !UML_LKL
 
 config LOCKDEP_SUPPORT
 	bool
-	default y
+	default y if !UML_LKL
 
 config STACKTRACE_SUPPORT
 	bool
-	default y
-	select STACKTRACE
+	default y if !UML_LKL
+	select STACKTRACE if !UML_LKL
 
 config GENERIC_CALIBRATE_DELAY
 	bool
-	default y
+	default y if !UML_LKL
 
 config HZ
 	int
@@ -73,12 +73,12 @@ config STATIC_LINK
 
 config LD_SCRIPT_STATIC
 	bool
-	default y
+	default y if !UML_LKL
 	depends on STATIC_LINK
 
 config LD_SCRIPT_DYN
 	bool
-	default y
+	default y if !UML_LKL
 	depends on !LD_SCRIPT_STATIC
 	select MODULE_REL_CRCS if MODVERSIONS
 
@@ -106,7 +106,7 @@ config HOSTFS
 config MCONSOLE
 	bool "Management console"
 	depends on PROC_FS
-	default y
+	default y if !UML_LKL
 	help
 	  The user mode linux management console is a low-level interface to
 	  the kernel, somewhat like the i386 SysRq interface.  Since there is
@@ -169,7 +169,7 @@ config PGTABLE_LEVELS
 	default 2
 
 config SECCOMP
-	def_bool y
+	def_bool y if !UML_LKL
 	prompt "Enable seccomp to safely compute untrusted bytecode"
 	---help---
 	  This kernel feature is useful for number crunching applications
@@ -198,4 +198,6 @@ config UML_TIME_TRAVEL_SUPPORT
 
 endmenu
 
-source "arch/um/drivers/Kconfig"
+if !UML_LKL
+	source "arch/um/drivers/Kconfig"
+endif
diff --git a/arch/um/Makefile b/arch/um/Makefile
index d2daa206872d..21fff60d63ea 100644
--- a/arch/um/Makefile
+++ b/arch/um/Makefile
@@ -23,10 +23,6 @@ OS := $(shell uname -s)
 # features.
 SHELL := /bin/bash
 
-core-y			+= $(ARCH_DIR)/kernel/		\
-			   $(ARCH_DIR)/drivers/		\
-			   $(ARCH_DIR)/os-$(OS)/
-
 MODE_INCLUDE	+= -I$(srctree)/$(ARCH_DIR)/include/shared/skas
 
 HEADER_ARCH 	:= $(SUBARCH)
@@ -35,114 +31,19 @@ ifneq ($(filter $(SUBARCH),x86 x86_64 i386),)
 	HEADER_ARCH := x86
 endif
 
-ifdef CONFIG_64BIT
-	KBUILD_CFLAGS += -mcmodel=large
-endif
-
 HOST_DIR := arch/$(HEADER_ARCH)
 
+ifneq ($(filter $(HEADER_ARCH),lkl),)
+	HOST_DIR := $(ARCH_DIR)/$(HEADER_ARCH)
+endif
+
 include $(ARCH_DIR)/Makefile-skas
 include $(HOST_DIR)/Makefile.um
 
 core-y += $(HOST_DIR)/um/
 
-SHARED_HEADERS	:= $(ARCH_DIR)/include/shared
-ARCH_INCLUDE	:= -I$(srctree)/$(SHARED_HEADERS)
-ARCH_INCLUDE	+= -I$(srctree)/$(HOST_DIR)/um/shared
-KBUILD_CPPFLAGS += -I$(srctree)/$(HOST_DIR)/um
-
-# -Dvmap=kernel_vmap prevents anything from referencing the libpcap.o symbol so
-# named - it's a common symbol in libpcap, so we get a binary which crashes.
-#
-# Same things for in6addr_loopback and mktime - found in libc. For these two we
-# only get link-time error, luckily.
-#
-# -Dlongjmp=kernel_longjmp prevents anything from referencing the libpthread.a
-# embedded copy of longjmp, same thing for setjmp.
-#
-# These apply to USER_CFLAGS to.
-
-KBUILD_CFLAGS += $(CFLAGS) $(CFLAGS-y) -D__arch_um__ \
-	$(ARCH_INCLUDE) $(MODE_INCLUDE) -Dvmap=kernel_vmap	\
-	-Dlongjmp=kernel_longjmp -Dsetjmp=kernel_setjmp \
-	-Din6addr_loopback=kernel_in6addr_loopback \
-	-Din6addr_any=kernel_in6addr_any -Dstrrchr=kernel_strrchr
-
-KBUILD_AFLAGS += $(ARCH_INCLUDE)
-
-USER_CFLAGS = $(patsubst $(KERNEL_DEFINES),,$(patsubst -I%,,$(KBUILD_CFLAGS))) \
-		$(ARCH_INCLUDE) $(MODE_INCLUDE) $(filter -I%,$(CFLAGS)) \
-		-D_FILE_OFFSET_BITS=64 -idirafter $(srctree)/include \
-		-idirafter $(objtree)/include -D__KERNEL__ -D__UM_HOST__
-
-#This will adjust *FLAGS accordingly to the platform.
-include $(ARCH_DIR)/Makefile-os-$(OS)
-
-KBUILD_CPPFLAGS += -I$(srctree)/$(HOST_DIR)/include \
-		   -I$(srctree)/$(HOST_DIR)/include/uapi \
-		   -I$(objtree)/$(HOST_DIR)/include/generated \
-		   -I$(objtree)/$(HOST_DIR)/include/generated/uapi
-
-# -Derrno=kernel_errno - This turns all kernel references to errno into
-# kernel_errno to separate them from the libc errno.  This allows -fno-common
-# in KBUILD_CFLAGS.  Otherwise, it would cause ld to complain about the two different
-# errnos.
-# These apply to kernelspace only.
-#
-# strip leading and trailing whitespace to make the USER_CFLAGS removal of these
-# defines more robust
-
-KERNEL_DEFINES = $(strip -Derrno=kernel_errno -Dsigprocmask=kernel_sigprocmask \
-			 -Dmktime=kernel_mktime $(ARCH_KERNEL_DEFINES))
-KBUILD_CFLAGS += $(KERNEL_DEFINES)
-
-PHONY += linux
-
-all: linux
-
-linux: vmlinux
-	@echo '  LINK $@'
-	$(Q)ln -f $< $@
-
-define archhelp
-  echo '* linux		- Binary kernel image (./linux) - for backward'
-  echo '		   compatibility only, this creates a hard link to the'
-  echo '		   real kernel binary, the "vmlinux" binary you'
-  echo '		   find in the kernel root.'
-endef
-
-archheaders:
-	$(Q)$(MAKE) -f $(srctree)/Makefile ARCH=$(HEADER_ARCH) asm-generic archheaders
-
-archprepare:
-	$(Q)$(MAKE) $(build)=$(HOST_DIR)/um include/generated/user_constants.h
-
-LINK-$(CONFIG_LD_SCRIPT_STATIC) += -static
-LINK-$(CONFIG_LD_SCRIPT_DYN) += -Wl,-rpath,/lib $(call cc-option, -no-pie)
-
-CFLAGS_NO_HARDENING := $(call cc-option, -fno-PIC,) $(call cc-option, -fno-pic,) \
-	$(call cc-option, -fno-stack-protector,) \
-	$(call cc-option, -fno-stack-protector-all,)
-
-# Options used by linker script
-export LDS_START      := $(START)
-export LDS_ELF_ARCH   := $(ELF_ARCH)
-export LDS_ELF_FORMAT := $(ELF_FORMAT)
-
-# The wrappers will select whether using "malloc" or the kernel allocator.
-LINK_WRAPS = -Wl,--wrap,malloc -Wl,--wrap,free -Wl,--wrap,calloc
-
-LD_FLAGS_CMDLINE = $(foreach opt,$(KBUILD_LDFLAGS),-Wl,$(opt))
-
-# Used by link-vmlinux.sh which has special support for um link
-export CFLAGS_vmlinux := $(LINK-y) $(LINK_WRAPS) $(LD_FLAGS_CMDLINE)
-
-# When cleaning we don't include .config, so we don't include
-# TT or skas makefiles and don't clean skas_ptregs.h.
-CLEAN_FILES += linux x.i gmon.out
-
-archclean:
-	@find . \( -name '*.bb' -o -name '*.bbg' -o -name '*.da' \
-		-o -name '*.gcov' \) -type f -print | xargs rm -f
+ifneq ($(SUBARCH),lkl)
+  include $(ARCH_DIR)/Makefile.um
+endif
 
-export HEADER_ARCH SUBARCH USER_CFLAGS CFLAGS_NO_HARDENING OS DEV_NULL_PATH
+export HEADER_ARCH SUBARCH
diff --git a/arch/um/auto.conf b/arch/um/auto.conf
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/arch/um/include/asm/Kbuild b/arch/um/include/asm/Kbuild
index 398006d27e40..6c2aa280f1d9 100644
--- a/arch/um/include/asm/Kbuild
+++ b/arch/um/include/asm/Kbuild
@@ -17,6 +17,11 @@ generic-y += kdebug.h
 generic-y += mcs_spinlock.h
 generic-y += mm-arch-hooks.h
 generic-y += mmiowb.h
+generic-$(UML_LKL) += mmu.h
+generic-$(UML_LKL) += mmu_context.h
+generic-$(UML_LKL) += module.h
+generic-$(UML_LKL) += msgbuf.h
+generic-$(UML_LKL) += page.h
 generic-y += param.h
 generic-y += pci.h
 generic-y += percpu.h
diff --git a/arch/um/lkl/um/Makefile b/arch/um/lkl/um/Makefile
new file mode 100644
index 000000000000..f66554cd5c45
--- /dev/null
+++ b/arch/um/lkl/um/Makefile
@@ -0,0 +1 @@
+# SPDX-License-Identifier: GPL-2.0
diff --git a/arch/um/lkl/um/include/sysdep/kernel-offsets.h b/arch/um/lkl/um/include/sysdep/kernel-offsets.h
new file mode 100644
index 000000000000..27004731b0ab
--- /dev/null
+++ b/arch/um/lkl/um/include/sysdep/kernel-offsets.h
@@ -0,0 +1,4 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Dummy kernel-offsets.h file. Required by kbuild and ready to be used
+ * - hint!
+ */
-- 
2.20.1 (Apple Git-117)



* [RFC PATCH 45/47] um lkl: add CI tests
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (43 preceding siblings ...)
  2019-10-23  4:38 ` [RFC PATCH 44/47] um lkl: use ARCH=um SUBARCH=lkl for tools/lkl Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 46/47] um: use lkl virtio_net_tap device as UML device Hajime Tazaki
                   ` (4 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Hajime Tazaki, Akira Moroo

We use CircleCI to run the tests, which should catch regressions before
merging.

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
---
 .circleci/config.yml             | 246 +++++++++++++++++++++++++++++++
 tools/lkl/scripts/checkpatch.sh  |  60 ++++++++
 tools/lkl/scripts/lkl-jenkins.sh |  21 +++
 3 files changed, 327 insertions(+)
 create mode 100644 .circleci/config.yml
 create mode 100755 tools/lkl/scripts/checkpatch.sh
 create mode 100755 tools/lkl/scripts/lkl-jenkins.sh

diff --git a/.circleci/config.yml b/.circleci/config.yml
new file mode 100644
index 000000000000..7d140d9a2acb
--- /dev/null
+++ b/.circleci/config.yml
@@ -0,0 +1,246 @@
+version: 2
+general:
+  artifacts:
+
+do_steps: &do_steps
+ steps:
+  - run: echo "$CROSS_COMPILE" > ~/_cross_compile
+  - restore_cache:
+      key: code-tree-shallow-{{ .Environment.CACHE_VERSION }}
+  - run:
+      name: checkout build tree
+      command: |
+        mkdir -p ~/.ssh/
+        ssh-keyscan -H github.com >> ~/.ssh/known_hosts
+        if ! [ -d .git ]; then
+          git clone --depth=1 $CIRCLE_REPOSITORY_URL .;
+        fi
+        if [[ $CIRCLE_BRANCH == pull/* ]]; then
+           git fetch --depth=1 origin $CIRCLE_BRANCH/head;
+        else
+           git fetch --depth=1 origin $CIRCLE_BRANCH;
+        fi
+        git reset --hard $CIRCLE_SHA1
+  - save_cache:
+      key: code-tree-shallow-{{ .Environment.CACHE_VERSION }}-{{ epoch }}
+      paths:
+        - /home/ubuntu/project/.git
+  - run:
+      name: clean
+      command: |
+        make mrproper
+        cd tools/lkl && make clean-conf
+        rm -rf ~/junit
+  - run: mkdir -p /home/ubuntu/.ccache
+  - restore_cache:
+      key: compiler-cache-{{ checksum "~/_cross_compile" }}-{{ .Environment.CACHE_VERSION }}
+  - run:
+      name: build DPDK
+      command: |
+        if [ "$MKARG" = "dpdk=yes" ]; then
+          sudo apt-get update
+          if ! sudo apt-get install -y linux-headers-$(uname -r) ; then
+             cd /lib/modules && sudo ln -sf 4.4.0-97-generic `uname -r` && \
+               cd /home/ubuntu/project
+          fi
+          cd tools/lkl && ./scripts/dpdk-sdk-build.sh;
+        fi
+  - run:
+      name: copy mingw binutils
+      command: |
+        if [ "$CROSS_COMPILE" = "i686-w64-mingw32-" ]; then
+          wget https://github.com/lkl/linux/raw/master/tools/lkl/bin/i686-w64-mingw32-as
+          wget https://github.com/lkl/linux/raw/master/tools/lkl/bin/i686-w64-mingw32-ld
+          wget https://github.com/lkl/linux/raw/master/tools/lkl/bin/i686-w64-mingw32-objcopy
+          sudo cp i686-w64-mingw32-* /usr/bin;
+        elif [ "$CROSS_COMPILE" = "arm-linux-androideabi-" ]; then
+          wget https://github.com/lkl/linux/raw/master/tools/lkl/bin/arm-linux-androideabi-ld.gold
+          sudo cp arm-linux-androideabi-ld.gold /usr/bin/arm-linux-androideabi-ld;
+        fi
+  - run:
+      name: start emulator
+      command: |
+        if [[ $CROSS_COMPILE == *android* ]]; then
+          emulator -avd Nexus5_API24 -no-window -no-audio -no-boot-anim;
+        elif [[ $CROSS_COMPILE == *freebsd* ]]; then
+          cd /home/ubuntu && $QEMU
+        fi
+      background: true
+  - run: cd tools/lkl && make -j8 ${MKARG}
+  - run: mkdir -p ~/destdir && cd tools/lkl && make DESTDIR=~/destdir
+  - save_cache:
+     paths:
+       - /home/ubuntu/.ccache
+     key: compiler-cache-{{ checksum "~/_cross_compile" }}-{{ .Environment.CACHE_VERSION }}-{{ epoch }}
+  - run:
+      name: wait emulator to boot
+      command: |
+        if [[ $CROSS_COMPILE == *android* ]]; then
+          /home/ubuntu/circle-android.sh wait-for-boot;
+        elif [[ $CROSS_COMPILE == *freebsd* ]]; then
+          while ! $MYSSH -o ConnectTimeout=1 exit 2> /dev/null
+          do
+             sleep 5
+          done
+        fi
+  - run:
+      name: run tests
+      command: |
+        mkdir -p ~/junit
+        make -C tools/lkl run-tests tests="--junit-dir ~/junit"
+        find ./tools/lkl/ -type f -name "*.xml" -exec mv {} ~/junit/ \;
+      no_output_timeout: "90m"
+  - store_test_results:
+      path: ~/junit
+  - store_artifacts:
+      path: ~/junit
+
+
+do_uml_steps: &do_uml_steps
+ steps:
+  - run: echo "$CROSS_COMPILE" > ~/_cross_compile
+  - restore_cache:
+      key: code-tree-shallow-{{ .Environment.CACHE_VERSION }}
+  - run:
+      name: checkout build tree
+      command: |
+        mkdir -p ~/.ssh/
+        ssh-keyscan -H github.com >> ~/.ssh/known_hosts
+        if ! [ -d .git ]; then
+          git clone --depth=1 $CIRCLE_REPOSITORY_URL .;
+        fi
+        if [[ $CIRCLE_BRANCH == pull/* ]]; then
+           git fetch --depth=1 origin $CIRCLE_BRANCH/head;
+        else
+           git fetch --depth=1 origin $CIRCLE_BRANCH;
+        fi
+        git reset --hard $CIRCLE_SHA1
+  - save_cache:
+      key: code-tree-shallow-{{ .Environment.CACHE_VERSION }}-{{ epoch }}
+      paths:
+        - /home/ubuntu/project/.git
+  - run: mkdir -p /home/ubuntu/.ccache
+  - restore_cache:
+      key: compiler-cache-{{ checksum "~/_cross_compile" }}-{{ .Environment.CACHE_VERSION }}
+  - run:
+      name: build
+      command: |
+        make -C tools/lkl/
+        make defconfig ARCH=um
+        make ARCH=um
+  - save_cache:
+     paths:
+       - /home/ubuntu/.ccache
+     key: compiler-cache-{{ checksum "~/_cross_compile" }}-{{ .Environment.CACHE_VERSION }}-{{ epoch }}
+  - run:
+      name: test
+      command: |
+        ./linux rootfstype=hostfs ro mem=1g loglevel=10 init="/bin/bash -c exit" || export RETVAL=$?
+        # SIGABRT=6 => 128+6
+        if [ $RETVAL != "134" ]; then
+          exit 1
+        fi
+
+## Customize the test machine
+jobs:
+  x86_64:
+   docker:
+     - image: lkldocker/circleci-x86_64:0.7
+   environment:
+     CROSS_COMPILE: ""
+     MKARG: "dpdk=no"
+   <<: *do_steps
+
+  i386:
+   docker:
+     - image: lkldocker/circleci-i386:0.1
+   environment:
+     CROSS_COMPILE: ""
+   <<: *do_steps
+
+  mingw32:
+   docker:
+     - image: lkldocker/circleci-mingw:0.6
+   environment:
+     CROSS_COMPILE: "i686-w64-mingw32-"
+   <<: *do_steps
+
+  android-arm32:
+   docker:
+     - image: lkldocker/circleci-android-arm32:0.6
+   environment:
+     CROSS_COMPILE: "arm-linux-androideabi-"
+     LKL_ANDROID_TEST: 1
+     ANDROID_SDK_ROOT: /home/ubuntu/android-sdk
+   <<: *do_steps
+
+  android-aarch64:
+   docker:
+     - image: lkldocker/circleci-android-arm64:0.6
+   environment:
+     CROSS_COMPILE: "aarch64-linux-android-"
+     LKL_ANDROID_TEST: 1
+     ANDROID_SDK_ROOT: /home/ubuntu/android-sdk
+   <<: *do_steps
+
+  freebsd11_x86_64:
+   docker:
+     - image: lkldocker/circleci-freebsd11-x86_64:0.4
+   environment:
+     CROSS_COMPILE: "x86_64-pc-freebsd11-"
+   <<: *do_steps
+
+  x86_64_valgrind:
+   docker:
+     - image: lkldocker/circleci-x86_64:0.7
+   environment:
+     CROSS_COMPILE: ""
+     MKARG: "dpdk=no"
+     VALGRIND: 1
+   <<: *do_steps
+
+  x86_64_uml:
+   docker:
+     - image: lkldocker/circleci-x86_64:0.7
+   environment:
+     CROSS_COMPILE: ""
+     TMPDIR: "/tmp" # required for not using /dev/shm
+   <<: *do_uml_steps
+
+  checkpatch:
+   docker:
+     - image: lkldocker/circleci:0.5
+   environment:
+   steps:
+     - restore_cache:
+        key: code-tree-full-history-{{ .Environment.CACHE_VERSION }}
+     - checkout
+     - run: tools/lkl/scripts/checkpatch.sh
+     - save_cache:
+        key: code-tree-full-history-{{ .Environment.CACHE_VERSION }}-{{ epoch }}
+        paths:
+          - /home/ubuntu/project/.git
+        when: always
+
+workflows:
+  version: 2
+  build:
+    jobs:
+     - x86_64
+     - mingw32
+     - android-arm32
+     - android-aarch64
+     - freebsd11_x86_64
+     - checkpatch
+     - i386
+     - x86_64_uml
+  nightly:
+    triggers:
+      - schedule:
+          cron: "0 0 * * *"
+          filters:
+            branches:
+              only:
+                - master
+    jobs:
+      - x86_64_valgrind
diff --git a/tools/lkl/scripts/checkpatch.sh b/tools/lkl/scripts/checkpatch.sh
new file mode 100755
index 000000000000..0c02ca6b21a2
--- /dev/null
+++ b/tools/lkl/scripts/checkpatch.sh
@@ -0,0 +1,60 @@
+#!/bin/sh -ex
+# SPDX-License-Identifier: GPL-2.0
+
+if [ -z "$origin_master" ]; then
+    origin_master="origin/master"
+fi
+
+UPSTREAM=git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
+LKL=github.com:lkl/linux.git
+
+upstream=`git remote -v | grep $UPSTREAM | cut -f1 | head -n1`
+lkl=`git remote -v | grep $LKL | cut -f1 | head -n1`
+
+if [ -z "$upstream" ]; then
+    git fetch --tags --progress git://$UPSTREAM
+else
+    git fetch --tags $upstream
+fi
+
+if [ -z "$lkl" ]; then
+    git remote add lkl-upstream git@$LKL || true
+    lkl=`git remote -v | grep $LKL | cut -f1 | head -n1`
+fi
+
+if [ -z "$lkl" ]; then
+    echo "can't find lkl remote, quitting"
+    exit 1
+fi
+
+git fetch $lkl
+git fetch --tags $upstream
+
+# find the last upstream tag to avoid checking upstream commits during
+# upstream merges
+tag=`git tag --sort='-*authordate' | grep ^v | head -n1`
+tmp=`mktemp -d`
+
+commits=$(git log --no-merges --pretty=format:%h HEAD ^$lkl/master ^$tag)
+for c in $commits; do
+    git format-patch -1 -o $tmp $c
+done
+
+if [ -z "$c" ]; then
+    echo "there are no commits/patches to check, quitting."
+    rmdir $tmp
+    exit 0
+fi
+
+./scripts/checkpatch.pl --ignore FILE_PATH_CHANGES $tmp/*.patch
+rm $tmp/*.patch
+
+# checkpatch.pl does not know how to deal with 3 way diffs which would
+# be useful to check the conflict resolutions during merges...
+#for c in `git log --merges --pretty=format:%h HEAD ^$origin_master ^$tag`; do
+#    git log --pretty=email $c -1 > $tmp/$c.patch
+#    git diff $c $c^1 $c^2 >> $tmp/$c.patch
+#done
+
+rmdir $tmp
+
diff --git a/tools/lkl/scripts/lkl-jenkins.sh b/tools/lkl/scripts/lkl-jenkins.sh
new file mode 100755
index 000000000000..eaadc6e90143
--- /dev/null
+++ b/tools/lkl/scripts/lkl-jenkins.sh
@@ -0,0 +1,21 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+set -e
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+basedir=$(cd $script_dir/../../..; pwd)
+
+export PATH=$PATH:/sbin
+
+build_and_test()
+{
+    cd $basedir
+    make mrproper
+    cd tools/lkl
+    make clean-conf
+    make -j4
+    make run-tests
+}
+
+build_and_test
-- 
2.20.1 (Apple Git-117)


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um


^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC PATCH 46/47] um: use lkl virtio_net_tap device as UML device
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (44 preceding siblings ...)
  2019-10-23  4:38 ` [RFC PATCH 45/47] um lkl: add CI tests Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-23  4:38 ` [RFC PATCH 47/47] um: add lkl virtio-blk device Hajime Tazaki
                   ` (3 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Hajime Tazaki, Akira Moroo

This also extends support to the virtio-mmio driver, which requires several
Kconfig changes as well.

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
---
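Note for readers: the CI command line in this patch passes
veth0=tap,tap0,0xc803, which lkl_eth_setup() in this patch splits into
<iftype>, <ifparams> and <offload>.  The following is a minimal standalone
sketch of that splitting, not the code from the patch.  struct veth_opt and
parse_veth_opt() are hypothetical names introduced only for illustration,
and the sketch assumes the <offload> field may be omitted.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the parsed fields; not part of the patch. */
struct veth_opt {
	unsigned long devid;
	const char *iftype;
	const char *ifparams;
	const char *ifoffload;	/* NULL when the offload field is omitted */
};

/*
 * Parse "<n>=<iftype>,<ifparams>[,<offload>]" in place.
 * "str" is the text that follows the "veth" prefix; it is modified.
 * Returns 0 on success, -1 on malformed input.
 */
int parse_veth_opt(char *str, struct veth_opt *out)
{
	char *end, *p;

	/* device number, followed by '=' */
	out->devid = strtoul(str, &end, 0);
	if (end == str || *end != '=')
		return -1;

	out->iftype = end + 1;

	/* <ifparams> is mandatory, so the first comma must exist */
	p = strchr(end + 1, ',');
	if (p == NULL)
		return -1;
	*p++ = '\0';
	out->ifparams = p;

	/* <offload> is treated as optional in this sketch */
	p = strchr(p, ',');
	if (p != NULL)
		*p++ = '\0';
	out->ifoffload = p;	/* stays NULL without a second comma */

	return 0;
}
```

With the string "0=tap,tap0,0xc803" this yields devid 0, iftype "tap",
ifparams "tap0" and ifoffload "0xc803".  Checking the second strchr()
result before writing through it is the main point of the sketch.
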
 .circleci/config.yml             |   2 +-
 arch/um/Kconfig                  |   6 --
 arch/um/Makefile.um              |   4 +
 arch/um/configs/x86_64_defconfig |   5 ++
 arch/um/include/asm/Kbuild       |   1 +
 arch/um/include/asm/io.h         |   4 +
 arch/um/os-Linux/Makefile        |   5 ++
 arch/um/os-Linux/lkl_dev.c       | 134 +++++++++++++++++++++++++++++++
 arch/x86/um/syscalls_64.c        |  53 ++++++++++++
 tools/lkl/lib/Makefile           |  32 ++++++++
 tools/lkl/lib/virtio.c           |  17 +++-
 tools/lkl/lib/virtio.h           |  22 +++++
 tools/lkl/lib/virtio_net.c       |  23 +++++-
 tools/lkl/lib/virtio_net_fd.c    |  22 -----
 tools/lkl/lib/virtio_net_fd.h    |  22 +++++
 15 files changed, 320 insertions(+), 32 deletions(-)
 create mode 100644 arch/um/os-Linux/lkl_dev.c
 create mode 100644 tools/lkl/lib/Makefile

diff --git a/.circleci/config.yml b/.circleci/config.yml
index 7d140d9a2acb..cec55ad93dc6 100644
--- a/.circleci/config.yml
+++ b/.circleci/config.yml
@@ -135,7 +135,7 @@ do_uml_steps: &do_uml_steps
   - run:
       name: test
       command: |
-        ./linux rootfstype=hostfs ro mem=1g loglevel=10 init="/bin/bash -c exit" || export RETVAL=$?
+        ./linux rootfstype=hostfs ro mem=1g loglevel=10 veth0=tap,tap0,0xc803 init="/bin/bash -c exit" || export RETVAL=$?
         # SIGABRT=6 => 128+6
         if [ $RETVAL != "134" ]; then
           exit 1
diff --git a/arch/um/Kconfig b/arch/um/Kconfig
index d7e9af63cf8f..a32dd84f0bf2 100644
--- a/arch/um/Kconfig
+++ b/arch/um/Kconfig
@@ -23,9 +23,6 @@ config MMU
 	bool
 	default y if !UML_LKL
 
-config NO_IOMEM
-	def_bool y
-
 config ISA
 	bool
 
@@ -160,9 +157,6 @@ config MMAPPER
 	  This driver allows a host file to be used as emulated IO memory inside
 	  UML.
 
-config NO_DMA
-	def_bool y
-
 config PGTABLE_LEVELS
 	int
 	default 3 if 3_LEVEL_PGTABLES
diff --git a/arch/um/Makefile.um b/arch/um/Makefile.um
index 24a088e5df04..65cfc4393e3d 100644
--- a/arch/um/Makefile.um
+++ b/arch/um/Makefile.um
@@ -11,6 +11,8 @@ core-y			+= $(ARCH_DIR)/kernel/		\
 			 $(ARCH_DIR)/drivers/		\
 			 $(ARCH_DIR)/os-$(OS)/
 
+core-y			+= $(srctree)/tools/lkl/lib/
+
 ifdef CONFIG_64BIT
 	KBUILD_CFLAGS += -mcmodel=large
 endif
@@ -52,6 +54,8 @@ KBUILD_CPPFLAGS += -I$(srctree)/$(HOST_DIR)/include \
 		   -I$(objtree)/$(HOST_DIR)/include/generated \
 		   -I$(objtree)/$(HOST_DIR)/include/generated/uapi
 
+KBUILD_CPPFLAGS += -I$(srctree)/$(ARCH_DIR)/lkl/include -I$(srctree)/$(ARCH_DIR)/
+
 # -Derrno=kernel_errno - This turns all kernel references to errno into
 # kernel_errno to separate them from the libc errno.  This allows -fno-common
 # in KBUILD_CFLAGS.  Otherwise, it would cause ld to complain about the two different
diff --git a/arch/um/configs/x86_64_defconfig b/arch/um/configs/x86_64_defconfig
index 3281d7600225..917982b6cd60 100644
--- a/arch/um/configs/x86_64_defconfig
+++ b/arch/um/configs/x86_64_defconfig
@@ -70,3 +70,8 @@ CONFIG_NLS=y
 CONFIG_DEBUG_INFO=y
 CONFIG_FRAME_WARN=1024
 CONFIG_DEBUG_KERNEL=y
+CONFIG_VIRTIO=y
+CONFIG_VIRTIO_MENU=y
+CONFIG_VIRTIO_MMIO=y
+CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES=y
+CONFIG_VIRTIO_NET=y
diff --git a/arch/um/include/asm/Kbuild b/arch/um/include/asm/Kbuild
index 6c2aa280f1d9..54037cdb320e 100644
--- a/arch/um/include/asm/Kbuild
+++ b/arch/um/include/asm/Kbuild
@@ -5,6 +5,7 @@ generic-y += compat.h
 generic-y += current.h
 generic-y += delay.h
 generic-y += device.h
+generic-y += dma-mapping.h
 generic-y += emergency-restart.h
 generic-y += exec.h
 generic-y += extable.h
diff --git a/arch/um/include/asm/io.h b/arch/um/include/asm/io.h
index 96f77b5232aa..f23700d3c071 100644
--- a/arch/um/include/asm/io.h
+++ b/arch/um/include/asm/io.h
@@ -2,11 +2,15 @@
 #ifndef _ASM_UM_IO_H
 #define _ASM_UM_IO_H
 
+#ifndef CONFIG_HAS_IOMEM
 #define ioremap ioremap
 static inline void __iomem *ioremap(phys_addr_t offset, size_t size)
 {
 	return (void __iomem *)(unsigned long)offset;
 }
+#else
+#include <lkl/include/asm/io.h>
+#endif
 
 #define iounmap iounmap
 static inline void iounmap(void __iomem *addr)
diff --git a/arch/um/os-Linux/Makefile b/arch/um/os-Linux/Makefile
index 839915b8c31c..d90d88a2f34e 100644
--- a/arch/um/os-Linux/Makefile
+++ b/arch/um/os-Linux/Makefile
@@ -11,9 +11,14 @@ obj-y = execvp.o file.o helper.o irq.o main.o mem.o process.o \
 	umid.o user_syms.o util.o drivers/ skas/
 
 obj-$(CONFIG_ARCH_REUSE_HOST_VSYSCALL_AREA) += elf_aux.o
+obj-y += lkl_dev.o
+
+CFLAGS_lkl_dev.o:=-I$(srctree)/tools/lkl/include -Wno-undef
 
 USER_OBJS := $(user-objs-y) elf_aux.o execvp.o file.o helper.o irq.o \
 	main.o mem.o process.o registers.o sigio.o signal.o start_up.o time.o \
 	tty.o umid.o util.o
 
+USER_OBJS += lkl_dev.o
+
 include arch/um/scripts/Makefile.rules
diff --git a/arch/um/os-Linux/lkl_dev.c b/arch/um/os-Linux/lkl_dev.c
new file mode 100644
index 000000000000..698062917ed5
--- /dev/null
+++ b/arch/um/os-Linux/lkl_dev.c
@@ -0,0 +1,134 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <stdlib.h>
+#include <string.h>
+#include <init.h>
+#include <os.h>
+#include <kern_util.h>
+#include <errno.h>
+
+#include <lkl.h>
+#include <lkl_host.h>
+
+extern struct lkl_host_operations lkl_host_ops;
+struct lkl_host_operations *lkl_ops = &lkl_host_ops;
+
+static struct lkl_netdev *nd;
+
+int __init uml_netdev_prepare(char *iftype, char *ifparams, char *ifoffload)
+{
+	int offload = 0;
+
+	if (ifoffload)
+		offload = strtol(ifoffload, NULL, 0);
+
+	if ((strcmp(iftype, "tap") == 0)) {
+		nd = lkl_netdev_tap_create(ifparams, offload);
+#ifdef notyet
+	} else if ((strcmp(iftype, "macvtap") == 0)) {
+		nd = lkl_netdev_macvtap_create(ifparams, offload);
+#endif
+	} else {
+		if (offload) {
+			lkl_printf("WARN: %s isn't supported on %s\n",
+				   "LKL_HIJACK_OFFLOAD",
+				   iftype);
+			lkl_printf(
+				"WARN: Disabling offload features.\n");
+		}
+		offload = 0;
+	}
+#ifdef notyet
+	if (strcmp(iftype, "raw") == 0)
+		nd = lkl_netdev_raw_create(ifparams);
+#endif
+
+	return 0;
+}
+
+
+int __init uml_netdev_add(void)
+{
+	if (nd)
+		lkl_netdev_add(nd, NULL);
+
+	return 0;
+}
+__initcall(uml_netdev_add);
+
+static int __init lkl_eth_setup(char *str, int *niu)
+{
+	char *end, *iftype, *ifparams, *ifoffload;
+	int devid, err = -EINVAL;
+
+	/* veth */
+	devid = strtoul(str, &end, 0);
+	if (end == str) {
+		os_warn("Bad device number\n");
+		return err;
+	}
+
+	/* = */
+	str = end;
+	if (*str != '=') {
+		os_warn("Expected '=' after device number\n");
+		return err;
+	}
+	str++;
+
+	/* <iftype> */
+	iftype = str;
+
+	/* <ifparams> */
+	ifparams = strchr(str, ',');
+	if (ifparams == NULL) {
+		os_warn("failed to parse ifparams\n");
+		return -1;
+	}
+	*ifparams = '\0';
+	ifparams++;
+
+	str = ifparams;
+	/* <offload> */
+	ifoffload = strchr(str, ',');
+	if (ifoffload)
+		*ifoffload++ = '\0';
+
+	os_info("str=%s, iftype=%s, ifparams=%s, offload=%s\n",
+		str, iftype, ifparams, ifoffload);
+
+	/* preparation */
+	uml_netdev_prepare(iftype, ifparams, ifoffload);
+
+	return 1;
+}
+
+__uml_setup("veth", lkl_eth_setup,
+"veth[0-9]+=<iftype>,<ifparams>,<offload>\n"
+"    Configure a network device.\n\n"
+);
+
+/* stub functions */
+int lkl_is_running(void)
+{
+	return 1;
+}
+
+
+void lkl_put_irq(int i, const char *user)
+{
+}
+
+/* XXX */
+static int free_irqs[2] = {5, 13};
+int lkl_get_free_irq(const char *user)
+{
+	static int irq_idx;
+	return free_irqs[irq_idx++];
+}
+
+int lkl_trigger_irq(int irq)
+{
+	do_IRQ(irq, NULL);
+	return 0;
+}
diff --git a/arch/x86/um/syscalls_64.c b/arch/x86/um/syscalls_64.c
index 58f51667e2e4..e70dc7f76b19 100644
--- a/arch/x86/um/syscalls_64.c
+++ b/arch/x86/um/syscalls_64.c
@@ -9,6 +9,7 @@
 #include <linux/sched/mm.h>
 #include <linux/syscalls.h>
 #include <linux/uaccess.h>
+#include <linux/platform_device.h>
 #include <asm/prctl.h> /* XXX This should get the constants from libc */
 #include <os.h>
 
@@ -87,3 +88,55 @@ void arch_switch_to(struct task_struct *to)
 
 	arch_prctl(to, ARCH_SET_FS, (void __user *) to->thread.arch.fs);
 }
+
+SYSCALL_DEFINE3(virtio_mmio_device_add, long, base, long, size, unsigned int,
+		irq)
+{
+	struct platform_device *pdev;
+	int ret;
+
+	struct resource res[] = {
+		[0] = {
+		       .start = base,
+		       .end = base + size - 1,
+		       .flags = IORESOURCE_MEM,
+		       },
+		[1] = {
+		       .start = irq,
+		       .end = irq,
+		       .flags = IORESOURCE_IRQ,
+		       },
+	};
+
+	pdev = platform_device_alloc("virtio-mmio", PLATFORM_DEVID_AUTO);
+	if (!pdev) {
+		/* pdev is NULL: do not dereference it for dev_err() */
+		pr_err("%s: Unable to allocate virtio-mmio device\n",
+		       __func__);
+		return -ENOMEM;
+	}
+
+	ret = platform_device_add_resources(pdev, res, ARRAY_SIZE(res));
+	if (ret) {
+		dev_err(&pdev->dev,
+			"%s: Unable to add resources for %s%d\n", __func__,
+			pdev->name, pdev->id);
+		goto exit_device_put;
+	}
+
+	ret = platform_device_add(pdev);
+	if (ret < 0) {
+		dev_err(&pdev->dev, "%s: Unable to add %s%d\n", __func__,
+			pdev->name, pdev->id);
+		goto exit_release_pdev;
+	}
+
+	return pdev->id;
+
+exit_release_pdev:
+	platform_device_del(pdev);
+exit_device_put:
+	platform_device_put(pdev);
+
+	return ret;
+}
diff --git a/tools/lkl/lib/Makefile b/tools/lkl/lib/Makefile
new file mode 100644
index 000000000000..8dc0009c680e
--- /dev/null
+++ b/tools/lkl/lib/Makefile
@@ -0,0 +1,32 @@
+
+USER_CFLAGS += -I$(srctree)/tools/lkl/include \
+		-Wno-strict-prototypes -Wno-undef \
+		-Wframe-larger-than=20480 -O0 -g
+
+USER_OBJS += fs.o iomem.o net.o jmp_buf.o virtio.o virtio_net.o \
+	 virtio_net_fd.o virtio_net_tap.o utils.o posix-host.o \
+	../../perf/pmu-events/jsmn.o
+
+#obj-y += fs.o
+obj-y += iomem.o
+#obj-y += net.o
+obj-y += jmp_buf.o
+obj-y += posix-host.o
+#obj-$(LKL_HOST_CONFIG_NT) += nt-host.o
+obj-y += utils.o
+#obj-y += virtio_blk.o
+obj-y += virtio.o
+#obj-y += dbg.o
+#obj-y += dbg_handler.o
+obj-y += virtio_net.o
+obj-y += virtio_net_fd.o
+obj-y += virtio_net_tap.o
+#obj-$(LKL_HOST_CONFIG_VIRTIO_NET) += virtio_net_raw.o
+#obj-$(LKL_HOST_CONFIG_VIRTIO_NET_MACVTAP) += virtio_net_macvtap.o
+#obj-$(LKL_HOST_CONFIG_VIRTIO_NET_DPDK) += virtio_net_dpdk.o
+#obj-$(LKL_HOST_CONFIG_VIRTIO_NET_VDE) += virtio_net_vde.o
+#obj-$(LKL_HOST_CONFIG_VIRTIO_NET) += virtio_net_pipe.o
+obj-y += ../../perf/pmu-events/jsmn.o
+#obj-y += config.o
+
+include arch/um/scripts/Makefile.rules
diff --git a/tools/lkl/lib/virtio.c b/tools/lkl/lib/virtio.c
index 4b3dbba607c3..98539e270320 100644
--- a/tools/lkl/lib/virtio.c
+++ b/tools/lkl/lib/virtio.c
@@ -46,6 +46,12 @@
 		lkl_host_ops.panic();					\
 	} while (0)
 
+#ifdef __arch_um__
+extern unsigned long uml_physmem;
+#else
+static unsigned long uml_physmem;
+#endif
+
 struct virtio_queue {
 	uint32_t num_max;
 	uint32_t num;
@@ -216,7 +222,8 @@ static void add_dev_buf_from_vring_desc(struct virtio_req *req,
 {
 	struct iovec *buf = &req->buf[req->buf_count++];
 
-	buf->iov_base = (void *)(uintptr_t)le64toh(vring_desc->addr);
+	buf->iov_base = (void *)(uintptr_t)le64toh(vring_desc->addr)
+		+ uml_physmem;
 	buf->iov_len = le32toh(vring_desc->len);
 
 	if (!(buf->iov_base && buf->iov_len))
@@ -304,8 +311,10 @@ void virtio_process_queue(struct virtio_dev *dev, uint32_t qidx)
 	if (!q->ready)
 		return;
 
+#ifndef __arch_um__
 	if (dev->ops->acquire_queue)
 		dev->ops->acquire_queue(dev, qidx);
+#endif
 
 	while (q->last_avail_idx != le16toh(q->avail->idx)) {
 		/*
@@ -319,8 +328,10 @@ void virtio_process_queue(struct virtio_dev *dev, uint32_t qidx)
 			virtio_set_avail_event(q, q->avail->idx);
 	}
 
+#ifndef __arch_um__
 	if (dev->ops->release_queue)
 		dev->ops->release_queue(dev, qidx);
+#endif
 }
 
 static inline uint32_t virtio_read_device_features(struct virtio_dev *dev)
@@ -406,7 +417,7 @@ static inline void set_ptr_low(void **ptr, uint32_t val)
 	uint64_t tmp = (uintptr_t)*ptr;
 
 	tmp = (tmp & 0xFFFFFFFF00000000) | val;
-	*ptr = (void *)(long)tmp;
+	*ptr = (void *)(long)tmp + uml_physmem;
 }
 
 static inline void set_ptr_high(void **ptr, uint32_t val)
@@ -579,6 +590,7 @@ int virtio_dev_setup(struct virtio_dev *dev, int queues, int num_max)
 
 int virtio_dev_cleanup(struct virtio_dev *dev)
 {
+#ifndef __arch_um__
 	char devname[100];
 	long fd, ret;
 	long mount_ret;
@@ -622,6 +634,7 @@ int virtio_dev_cleanup(struct virtio_dev *dev)
 	lkl_put_irq(dev->irq, "virtio");
 	unregister_iomem(dev->base);
 	lkl_host_ops.mem_free(dev->queue);
+#endif
 	return 0;
 }
 
diff --git a/tools/lkl/lib/virtio.h b/tools/lkl/lib/virtio.h
index 7427aa8fad79..be06ef09f8b0 100644
--- a/tools/lkl/lib/virtio.h
+++ b/tools/lkl/lib/virtio.h
@@ -87,6 +87,28 @@ void virtio_req_complete(struct virtio_req *req, uint32_t len);
 void virtio_process_queue(struct virtio_dev *dev, uint32_t qidx);
 void virtio_set_queue_max_merge_len(struct virtio_dev *dev, int q, int len);
 
+#ifdef __arch_um__
+//#include <irq_kern.h>
+#include <irq_user.h>
+enum irqreturn {
+	IRQ_HANDLED		= (1 << 0),
+	IRQ_WAKE_THREAD		= (1 << 1),
+};
+
+typedef enum irqreturn irqreturn_t;
+typedef irqreturn_t (*irq_handler_t)(int, void *);
+
+#define IRQF_SHARED		0x00000080
+
+extern int um_request_irq(unsigned int irq, int fd, int type,
+			  irq_handler_t handler,
+			  unsigned long irqflags,  const char *devname,
+			  void *dev_id);
+
+long sys_virtio_mmio_device_add(long base, long size, unsigned int irq);
+#define lkl_sys_virtio_mmio_device_add sys_virtio_mmio_device_add
+#endif /* __arch_um__ */
+
 #define container_of(ptr, type, member) \
 	(type *)((char *)(ptr) - __builtin_offsetof(type, member))
 
diff --git a/tools/lkl/lib/virtio_net.c b/tools/lkl/lib/virtio_net.c
index 60743109215b..224d7bf50702 100644
--- a/tools/lkl/lib/virtio_net.c
+++ b/tools/lkl/lib/virtio_net.c
@@ -2,6 +2,7 @@
 #include <string.h>
 #include <lkl_host.h>
 #include "virtio.h"
+#include "virtio_net_fd.h"
 #include "endian.h"
 
 #include <lkl/linux/virtio_net.h>
@@ -211,9 +212,21 @@ static struct lkl_mutex **init_queue_locks(int num_queues)
 	return ret;
 }
 
+#ifdef __arch_um__
+static irqreturn_t um_virtio_intr(int irq, void *dev_id)
+{
+	struct virtio_dev *dev = dev_id;
+
+	virtio_process_queue(dev, 0);
+	return IRQ_HANDLED;
+}
+#endif
+
 int lkl_netdev_add(struct lkl_netdev *nd, struct lkl_netdev_args *args)
 {
 	struct virtio_net_dev *dev;
+	struct lkl_netdev_fd *nd_fd =
+		container_of(nd, struct lkl_netdev_fd, dev);
 	int ret = -LKL_ENOMEM;
 
 	dev = lkl_host_ops.mem_alloc(sizeof(*dev));
@@ -251,16 +264,22 @@ int lkl_netdev_add(struct lkl_netdev *nd, struct lkl_netdev_args *args)
 	if (ret)
 		goto out_free;
 
+#ifdef __arch_um__
+	um_request_irq(dev->dev.irq, nd_fd->fd_rx, IRQ_READ, um_virtio_intr,
+		       IRQF_SHARED, "virtio", dev);
+#endif
+
 	/*
 	 * We may receive upto 64KB TSO packet so collect as many descriptors as
 	 * there are available up to 64KB in total len.
 	 */
 	if (dev->dev.device_features & BIT(LKL_VIRTIO_NET_F_MRG_RXBUF))
 		virtio_set_queue_max_merge_len(&dev->dev, RX_QUEUE_IDX, 65536);
-
+#ifndef __arch_um__
 	dev->poll_tid = lkl_host_ops.thread_create(poll_thread, dev);
 	if (dev->poll_tid == 0)
 		goto out_cleanup_dev;
+#endif
 
 	ret = dev_register(dev);
 	if (ret < 0)
@@ -279,6 +298,7 @@ int lkl_netdev_add(struct lkl_netdev *nd, struct lkl_netdev_args *args)
 	return ret;
 }
 
+#ifndef __arch_um__
 /* Return 0 for success, -1 for failure. */
 void lkl_netdev_remove(int id)
 {
@@ -314,6 +334,7 @@ void lkl_netdev_remove(int id)
 	free_queue_locks(dev->queue_locks, NUM_QUEUES);
 	lkl_host_ops.mem_free(dev);
 }
+#endif
 
 void lkl_netdev_free(struct lkl_netdev *nd)
 {
diff --git a/tools/lkl/lib/virtio_net_fd.c b/tools/lkl/lib/virtio_net_fd.c
index f8664455e696..a19193cfeca9 100644
--- a/tools/lkl/lib/virtio_net_fd.c
+++ b/tools/lkl/lib/virtio_net_fd.c
@@ -25,28 +25,6 @@
 #include "virtio.h"
 #include "virtio_net_fd.h"
 
-struct lkl_netdev_fd {
-	struct lkl_netdev dev;
-	/* file-descriptor based device */
-	int fd_rx;
-	int fd_tx;
-	/*
-	 * Controlls the poll mask for fd. Can be acccessed concurrently from
-	 * poll, tx, or rx routines but there is no need for syncronization
-	 * because:
-	 *
-	 * (a) TX and RX routines set different variables so even if they update
-	 * at the same time there is no race condition
-	 *
-	 * (b) Even if poll and TX / RX update at the same time poll cannot
-	 * stall: when poll resets the poll variable we know that TX / RX will
-	 * run which means that eventually the poll variable will be set.
-	 */
-	int poll_tx, poll_rx;
-	/* controle pipe */
-	int pipe[2];
-};
-
 static int fd_net_tx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
 {
 	int ret;
diff --git a/tools/lkl/lib/virtio_net_fd.h b/tools/lkl/lib/virtio_net_fd.h
index 713ba13cca7c..fe6d6d8e3ab4 100644
--- a/tools/lkl/lib/virtio_net_fd.h
+++ b/tools/lkl/lib/virtio_net_fd.h
@@ -4,6 +4,28 @@
 
 struct ifreq;
 
+struct lkl_netdev_fd {
+	struct lkl_netdev dev;
+	/* file-descriptor based device */
+	int fd_rx;
+	int fd_tx;
+	/*
+	 * Controls the poll mask for fd. Can be accessed concurrently from
+	 * poll, tx, or rx routines but there is no need for synchronization
+	 * because:
+	 *
+	 * (a) TX and RX routines set different variables so even if they update
+	 * at the same time there is no race condition
+	 *
+	 * (b) Even if poll and TX / RX update at the same time poll cannot
+	 * stall: when poll resets the poll variable we know that TX / RX will
+	 * run which means that eventually the poll variable will be set.
+	 */
+	int poll_tx, poll_rx;
+	/* control pipe */
+	int pipe[2];
+};
+
 /**
  * lkl_register_netdev_linux_fdnet - register a file descriptor-based network
  * device as a NIC
-- 
2.20.1 (Apple Git-117)


^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC PATCH 47/47] um: add lkl virtio-blk device
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (45 preceding siblings ...)
  2019-10-23  4:38 ` [RFC PATCH 46/47] um: use lkl virtio_net_tap device as UML device Hajime Tazaki
@ 2019-10-23  4:38 ` Hajime Tazaki
  2019-10-25 21:34   ` Richard Weinberger
                   ` (2 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-23  4:38 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Hajime Tazaki, Akira Moroo

UML can now use a virtio-blk device via 'vubd0=<filename>' over the
virtio-mmio driver.

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
---
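Note for readers: the 'vubd<n>=<filename>' option is parsed by
lkl_ubd_setup() in this patch, which opens the file and stores the
descriptor in a zero-initialized static struct that uml_blkdev_add() later
tests with 'if (disk.fd)'.  The following is a minimal standalone sketch of
the parse-and-open step, not the code from the patch.  open_vubd_backing()
is a hypothetical name, and returning -1 for "no disk" is an assumption of
this sketch, chosen because 0 is itself a valid file descriptor.

```c
#include <fcntl.h>
#include <stdlib.h>

/*
 * Parse "<n>=<filename>" (the text after the "vubd" prefix) and open the
 * backing file read-write.  *devid receives <n>.
 * Returns the file descriptor, or -1 on malformed input or open() failure.
 */
int open_vubd_backing(const char *str, unsigned long *devid)
{
	char *end;

	/* device number, then '=', then a non-empty file name */
	*devid = strtoul(str, &end, 0);
	if (end == str || *end != '=' || end[1] == '\0')
		return -1;

	return open(end + 1, O_RDWR);
}
```

A caller would keep the returned value and register the disk only when it
is >= 0, for example after open_vubd_backing("0=disk.img", &id).
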
 .circleci/config.yml             |  4 ++-
 arch/um/configs/x86_64_defconfig |  1 +
 arch/um/os-Linux/lkl_dev.c       | 56 +++++++++++++++++++++++++++++++-
 tools/lkl/lib/Makefile           |  6 ++--
 4 files changed, 62 insertions(+), 5 deletions(-)

diff --git a/.circleci/config.yml b/.circleci/config.yml
index cec55ad93dc6..266ed58b6fd7 100644
--- a/.circleci/config.yml
+++ b/.circleci/config.yml
@@ -135,7 +135,9 @@ do_uml_steps: &do_uml_steps
   - run:
       name: test
       command: |
-        ./linux rootfstype=hostfs ro mem=1g loglevel=10 veth0=tap,tap0,0xc803 init="/bin/bash -c exit" || export RETVAL=$?
+        dd if=/dev/zero of=disk.img bs=1024 count=20480
+        mkfs.ext4 disk.img
+        ./linux rootfstype=hostfs ro mem=1g loglevel=10 veth0=tap,tap0,0xc803 vubd0=disk.img init='/bin/bash -x -c "mount -t ext4 /dev/vda /mnt ; ls -l /mnt/; ip addr ; exit"' || export RETVAL=$?
         # SIGABRT=6 => 128+6
         if [ $RETVAL != "134" ]; then
           exit 1
diff --git a/arch/um/configs/x86_64_defconfig b/arch/um/configs/x86_64_defconfig
index 917982b6cd60..e5b7c048a701 100644
--- a/arch/um/configs/x86_64_defconfig
+++ b/arch/um/configs/x86_64_defconfig
@@ -75,3 +75,4 @@ CONFIG_VIRTIO_MENU=y
 CONFIG_VIRTIO_MMIO=y
 CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES=y
 CONFIG_VIRTIO_NET=y
+CONFIG_VIRTIO_BLK=y
\ No newline at end of file
diff --git a/arch/um/os-Linux/lkl_dev.c b/arch/um/os-Linux/lkl_dev.c
index 698062917ed5..e08f113dfc0b 100644
--- a/arch/um/os-Linux/lkl_dev.c
+++ b/arch/um/os-Linux/lkl_dev.c
@@ -6,6 +6,7 @@
 #include <os.h>
 #include <kern_util.h>
 #include <errno.h>
+#include <fcntl.h>
 
 #include <lkl.h>
 #include <lkl_host.h>
@@ -14,6 +15,7 @@ extern struct lkl_host_operations lkl_host_ops;
 struct lkl_host_operations *lkl_ops = &lkl_host_ops;
 
 static struct lkl_netdev *nd;
+static struct lkl_disk disk;
 
 int __init uml_netdev_prepare(char *iftype, char *ifparams, char *ifoffload)
 {
@@ -108,13 +110,65 @@ __uml_setup("veth", lkl_eth_setup,
 "    Configure a network device.\n\n"
 );
 
+int __init uml_blkdev_add(void)
+{
+	int disk_id = 0;
+
+	if (disk.fd)
+		disk_id = lkl_disk_add(&disk);
+
+	if (disk_id < 0)
+		return -1;
+
+	return 0;
+}
+__initcall(uml_blkdev_add);
+
+static int __init lkl_ubd_setup(char *str, int *niu)
+{
+	char *end, *fname;
+	int devid, err = -EINVAL;
+
+	/* vubd */
+	devid = strtoul(str, &end, 0);
+	if (end == str) {
+		os_warn("Bad device number\n");
+		return err;
+	}
+
+	/* = */
+	str = end;
+	if (*str != '=') {
+		os_warn("Expected '=' after device number\n");
+		return err;
+	}
+	str++;
+
+	/* <filename> */
+	fname = str;
+
+	os_info("fname=%s\n", fname);
+	/* create */
+	disk.fd = open(fname, O_RDWR);
+	if (disk.fd < 0)
+		return -1;
+
+	disk.ops = NULL;
+
+	return 1;
+}
+__uml_setup("vubd", lkl_ubd_setup,
+"vubd<n>=<filename>\n"
+"    Configure a block device.\n\n"
+);
+
+
 /* stub functions */
 int lkl_is_running(void)
 {
 	return 1;
 }
 
-
 void lkl_put_irq(int i, const char *user)
 {
 }
diff --git a/tools/lkl/lib/Makefile b/tools/lkl/lib/Makefile
index 8dc0009c680e..3cee5b0133b1 100644
--- a/tools/lkl/lib/Makefile
+++ b/tools/lkl/lib/Makefile
@@ -3,9 +3,9 @@ USER_CFLAGS += -I$(srctree)/tools/lkl/include \
 		-Wno-strict-prototypes -Wno-undef \
 		-Wframe-larger-than=20480 -O0 -g
 
-USER_OBJS += fs.o iomem.o net.o jmp_buf.o virtio.o virtio_net.o \
+USER_OBJS += iomem.o jmp_buf.o virtio.o virtio_net.o \
 	 virtio_net_fd.o virtio_net_tap.o utils.o posix-host.o \
-	../../perf/pmu-events/jsmn.o
+	 virtio_blk.o ../../perf/pmu-events/jsmn.o
 
 #obj-y += fs.o
 obj-y += iomem.o
@@ -14,7 +14,7 @@ obj-y += jmp_buf.o
 obj-y += posix-host.o
 #obj-$(LKL_HOST_CONFIG_NT) += nt-host.o
 obj-y += utils.o
-#obj-y += virtio_blk.o
+obj-y += virtio_blk.o
 obj-y += virtio.o
 #obj-y += dbg.o
 #obj-y += dbg_handler.o
-- 
2.20.1 (Apple Git-117)


^ permalink raw reply related	[flat|nested] 206+ messages in thread

* Re: [RFC PATCH 00/47] Unifying LKL into UML
@ 2019-10-25 21:34   ` Richard Weinberger
  0 siblings, 0 replies; 206+ messages in thread
From: Richard Weinberger @ 2019-10-25 21:34 UTC (permalink / raw)
  To: Hajime Tazaki; +Cc: Octavian Purdila, Linux-Arch, linux-um, Akira Moroo

On Wed, Oct 23, 2019 at 6:39 AM Hajime Tazaki <thehajime@gmail.com> wrote:
>
> This RFC patchset is to ask opinions from UML people, whether LKL codes is
> good to integrate into UML code base.  We wish to have any kind of feedback
> from your kind reviews.  There are numbers of commits which should be asked
> for reviews to other mailing lists; we will do it later once we got
> discussed in this mailing list.

Thanks a lot for doing this, this effort is much appreciated! :-)

> # sorry for the long list of patches: we can make it smaller by only
>   including basic set of LKL (e.g., removing foreign OS support, etc) if
>   you wish.

Let us see how the review goes. First I'll give it a high-level review to make
sure we are all talking about the same things.

Please CC linux-arch@vger.kernel.org for the next patch round.
Integrating LKL (into arch/um/) is something which needs more audience and
feedback from Arnd Bergmann, our global arch maintainer.


>
>
> LKL (Linux Kernel Library) is aiming to allow reusing the Linux kernel code
> as extensively as possible with minimal effort and reduced maintenance
> overhead.
>
> Examples of how LKL can be used are: creating userspace applications
> (running on Linux and other operating systems) that can read or write Linux
> filesystems or can use the Linux networking stack, creating kernel drivers
> for other operating systems that can read Linux filesystems, bootloaders
> support for reading/writing Linux filesystems, etc.
>
> With LKL, the kernel code is compiled into an object file that can be
> directly linked by applications. The API offered by LKL is based on the
> Linux system call interface.
>
> LKL is originally implemented as an architecture port in arch/lkl, but this
> series of commits tries to integrate this into arch/um as one of the mode
> of UML.  This was discussed during RFC email of LKL (*1).
>
> The latest LKL version can be found at https://github.com/lkl/linux
>
> Milestone
> =========
> This patches is just a first step toward upstreaming *library mode* of
> Linux kernel, but we think we need to have several steps toward our goal,
> describing in the below.
>
> 1. Put LKL code under arch/um (arch/um/lkl), and build it in a
> separate way from UML.

Makes sense.

> 2. Share common parts of implementation between UML and LKL.

Since both UML and LKL are usermode ports there is a lot of potential.
From my side it is also no big deal if there is some duplication which can be
resolved in later releases. Unifying needs deep thought and minding odd corner
cases.

> 3. Reimplement UML features with LKL API (if we wish)

Yep. In the last release UML got virtio support, so there is hope. ;-)

> For the step 1, we put LKL as one of SUBARCH in order to make less effort
> to integrate (make ARCH=um SUBARCH=lkl).  The modification to existing UML
> code is trying to be minimized.

I'm not sure if SUBARCH is the right approach. How do I build an i386
LKL on x86_64?
Maybe we can use another variable like UMMODE={library,kernel}?
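
To make the idea concrete, here is a minimal sketch of how the mode could be
passed separately, so that SUBARCH stays free to name the architecture. UMMODE
does not exist in kbuild today; the variable and the helper name
um_make_args are assumptions for illustration only, not part of this series.

```shell
#!/bin/sh
# Hypothetical sketch: UMMODE is only proposed in this thread; it is not an
# existing kbuild variable, and um_make_args is an illustrative helper name.
um_make_args() {
	mode="$1"     # proposed values: library or kernel
	subarch="$2"  # architecture to build for, e.g. i386 or x86_64

	case "$mode" in
	library|kernel) ;;
	*) echo "unknown UMMODE: $mode" >&2; return 1 ;;
	esac

	# SUBARCH keeps naming the architecture; the mode travels separately.
	echo "ARCH=um SUBARCH=$subarch UMMODE=$mode"
}

# Print the make arguments for an i386 library-mode build.
um_make_args library i386
```

With something like this, "i386 LKL on x86_64" would just be a different
SUBARCH value with the same UMMODE, rather than overloading SUBARCH=lkl.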

> The RFC patches also includes and a bit of step 2 as a proof of possibility
> to share the code.  For this, we used the virtio device code of LKL and use
> it from UML by enabling virtio-mmio driver with UML code.
>
>
>
> Building LKL the host library and LKL applications
> ==================================================
>
> % cd tools/lkl
> % make

Is there a reason why tools/lkl is not under arch/um?

> will build LKL as a object file, it will install it in tools/lkl/lib together
> with the headers files in tools/lkl/include then will build the host library,
> tests and a few of application examples:
>
> * tests/boot - a simple applications that uses LKL and exercises the basic
> LKL APIs
>
> * tests/net-test - a simple applications that uses network feature of
> LKL and exercises the basic network-related APIs
>
> * fs2tar - a tool that converts a filesystem image to a tar archive
>
> * cptofs/cpfromfs - a tool that copies files to/from a filesystem image
>
> % make run-tests
>
> should run the above `tests/boot` and `tests/net-test` and report errors if
> there are any.
>
> Supported hosts
> ===============
>
> Currently LKL supports POSIX and Windows userspace applications. New hosts
> can be added relatively easy if the host supports gcc and GNU ld. Previous
> versions of LKL supported Windows kernel and Haiku kernel hosts, and we
> also have WIP patches (not included in this RFC) with rump-hypercall
> interface, used in UEFI, as well as macOS userspace (part of POSIX?).
>
> There is also musl-libc port for LKL, which might be interested in for some
> folks.
>
>
> Further readings about LKL
> =========================
>
> - Discussion in github LKL issue
> https://github.com/lkl/linux/issues/304
>
> - LKL (an article)
> https://www.researchgate.net/profile/Nicolae_Tapus2/publication/224164682_LKL_The_Linux_kernel_library/links/02bfe50fd921ab4f7c000000.pdf
>
> *1 RFC email to LKML (back in 2015)
> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1012277.html
>
>
>
> Please review the following changes for suitability for inclusion. If you have
> any objections or suggestions for improvement, please respond to the patches. If
> you agree with the changes, please provide your Acked-by.
>
> The following changes since commit 73625ed66389d4c620520058d828f43a93ab4d0c:
>
>   um: irq: Fix LAST_IRQ usage in init_IRQ() (2019-09-16 08:38:58 +0200)
>
> are available in the Git repository at:
>
>   git://github.com/thehajime/linux d380ec02dd0cd97afe08706093b59329e6b09fe2
>   https://github.com/thehajime/linux/tree/upstream-to-uml-5.5-rc1
>
> Akira Moroo (2):
>   Revert "vmlinux.lds.h: remove stale <linux/export.h> include"
>   um lkl: use ARCH=um SUBARCH=lkl for tools/lkl
>
> Andreas Abel (1):
>   kallsyms: Add a config option to select section for kallsyms
>
> Hajime Tazaki (9):
>   lkl: Android ARM (arm/arm64) support
>   Revert "export.h: remove code for prefixing symbols with underscore"
>   Revert "linux/linkage.h: replace VMLINUX_SYMBOL_STR() with
>     __stringify()"
>   Revert "vmlinux.lds.h: remove no-op macro VMLINUX_SYMBOL()"
>   Revert "kbuild: remove CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX"
>   Revert "kallsyms: remove symbol prefix support"
>   um lkl: add CI tests
>   um: use lkl virtio_net_tap device as UML device
>   um: add lkl virtio-blk device
>
> Octavian Purdila (34):
>   asm-generic: atomic64: allow using generic atomic64 on 64bit platforms
>   kbuild: allow architectures to automatically define kconfig symbols
>   lkl: architecture skeleton for Linux kernel library
>   lkl: host interface
>   lkl: memory handling
>   lkl: kernel threads support
>   lkl: interrupt support
>   lkl: system call interface and application API
>   lkl: timers, time and delay support
>   lkl: memory mapped I/O support
>   lkl: basic kernel console support
>   lkl: initialization and cleanup
>   lkl: plug in the build system
>   lkl tools: skeleton for host side library, tests and tools
>   lkl tools: host lib: add utilities functions
>   lkl tools: host lib: memory mapped I/O helpers
>   lkl tools: host lib: virtio devices
>   lkl tools: host lib: virtio block device
>   lkl tools: host lib: filesystem helpers
>   lkl tools: host lib: posix host operations
>   lkl tools: "boot" test
>   lkl tools: tool that converts a filesystem image to tar
>   lkl tools: tool that reads/writes to/from a filesystem image
>   lkl tools: virtio: add network device support
>   lkl: add support for Windows hosts
>   lkl tools: add support for Windows host
>   lkl tools: add lklfuse
>   lkl: add initial system call hijack support (a.k.a. NUSE of libos)
>   lkl: add documentation
>   cpu: add cpu_yield_to_irqs
>   signal: use CONFIG_X86_32 instead of __i386__
>   arch: add __SYSCALL_DEFINE_ARCH
>   xfs: support for non-mmu architectures
>   checkpatch: avoid showing BIT_ULL warnings for tools/ files
>
> Thomas Liebetraut (1):
>   tools: Add the lkl host library to the common tools Makefile
>
>  .circleci/config.yml                          | 248 +++++
>  Documentation/lkl.txt                         | 470 +++++++++
>  MAINTAINERS                                   |   8 +
>  Makefile                                      |   3 +
>  README.md                                     |   1 +
>  arch/Kconfig                                  |   6 +
>  arch/um/Kconfig                               |  56 +-
>  arch/um/Makefile                              | 115 +--
>  arch/um/Makefile.um                           | 121 +++
>  arch/um/auto.conf                             |   0
>  arch/um/configs/x86_64_defconfig              |   6 +
>  arch/um/include/asm/Kbuild                    |   6 +
>  arch/um/include/asm/io.h                      |   4 +
>  arch/um/lkl/.gitignore                        |   2 +
>  arch/um/lkl/Kconfig                           |  96 ++
>  arch/um/lkl/Kconfig.debug                     |   0
>  arch/um/lkl/Makefile                          |   0
>  arch/um/lkl/Makefile.um                       |  70 ++
>  arch/um/lkl/auto.conf                         |   1 +
>  arch/um/lkl/configs/lkl_defconfig             |  95 ++
>  arch/um/lkl/include/asm/Kbuild                |  80 ++
>  arch/um/lkl/include/asm/bitsperlong.h         |  11 +
>  arch/um/lkl/include/asm/byteorder.h           |   7 +
>  arch/um/lkl/include/asm/cpu.h                 |  14 +
>  arch/um/lkl/include/asm/elf.h                 |  15 +
>  arch/um/lkl/include/asm/host_ops.h            |  12 +
>  arch/um/lkl/include/asm/io.h                  | 104 ++
>  arch/um/lkl/include/asm/irq.h                 |  15 +
>  arch/um/lkl/include/asm/mutex.h               |   7 +
>  arch/um/lkl/include/asm/page.h                |  14 +
>  arch/um/lkl/include/asm/pgtable.h             |  62 ++
>  arch/um/lkl/include/asm/processor.h           |  60 ++
>  arch/um/lkl/include/asm/ptrace.h              |  25 +
>  arch/um/lkl/include/asm/sched.h               |  23 +
>  arch/um/lkl/include/asm/setup.h               |   7 +
>  arch/um/lkl/include/asm/syscalls.h            |  18 +
>  arch/um/lkl/include/asm/syscalls_32.h         |  43 +
>  arch/um/lkl/include/asm/thread_info.h         |  70 ++
>  arch/um/lkl/include/asm/tlb.h                 |  12 +
>  arch/um/lkl/include/asm/uaccess.h             |  64 ++
>  arch/um/lkl/include/asm/unistd.h              |  29 +
>  arch/um/lkl/include/asm/unistd_32.h           |  31 +
>  arch/um/lkl/include/asm/vmlinux.lds.h         |  14 +
>  arch/um/lkl/include/asm/xor.h                 |   9 +
>  arch/um/lkl/include/system/stdarg.h           |   2 +
>  arch/um/lkl/include/uapi/asm/Kbuild           |   9 +
>  arch/um/lkl/include/uapi/asm/bitsperlong.h    |  13 +
>  arch/um/lkl/include/uapi/asm/byteorder.h      |  11 +
>  arch/um/lkl/include/uapi/asm/host_ops.h       | 153 +++
>  arch/um/lkl/include/uapi/asm/irq.h            |  36 +
>  arch/um/lkl/include/uapi/asm/sigcontext.h     |  16 +
>  arch/um/lkl/include/uapi/asm/siginfo.h        |  11 +
>  arch/um/lkl/include/uapi/asm/swab.h           |  11 +
>  arch/um/lkl/include/uapi/asm/syscalls.h       | 348 +++++++
>  arch/um/lkl/include/uapi/asm/unistd.h         |  18 +
>  arch/um/lkl/kernel/Makefile                   |   4 +
>  arch/um/lkl/kernel/asm-offsets.c              |   2 +
>  arch/um/lkl/kernel/console.c                  |  42 +
>  arch/um/lkl/kernel/cpu.c                      | 223 +++++
>  arch/um/lkl/kernel/irq.c                      | 193 ++++
>  arch/um/lkl/kernel/misc.c                     |  60 ++
>  arch/um/lkl/kernel/setup.c                    | 193 ++++
>  arch/um/lkl/kernel/syscalls.c                 | 246 +++++
>  arch/um/lkl/kernel/syscalls_32.c              | 159 +++
>  arch/um/lkl/kernel/threads.c                  | 227 +++++
>  arch/um/lkl/kernel/time.c                     | 145 +++
>  arch/um/lkl/kernel/vmlinux.lds.S              |  51 +
>  arch/um/lkl/mm/Makefile                       |   1 +
>  arch/um/lkl/mm/bootmem.c                      |  66 ++
>  arch/um/lkl/scripts/headers_install.py        | 195 ++++
>  arch/um/lkl/um/Makefile                       |   1 +
>  .../um/lkl/um/include/sysdep/kernel-offsets.h |   4 +
>  arch/um/os-Linux/Makefile                     |   5 +
>  arch/um/os-Linux/lkl_dev.c                    | 188 ++++
>  arch/x86/um/syscalls_64.c                     |  53 +
>  crypto/xor.c                                  |   2 +
>  fs/xfs/xfs_buf.c                              |  26 +
>  include/asm-generic/atomic64.h                |   2 +
>  include/asm-generic/export.h                  |  34 +-
>  include/asm-generic/vmlinux.lds.h             | 293 +++---
>  include/linux/compiler_attributes.h           |   4 +
>  include/linux/cpu.h                           |   1 +
>  include/linux/export.h                        |  23 +-
>  include/linux/linkage.h                       |  12 +-
>  include/linux/syscalls.h                      |   6 +
>  init/Kconfig                                  |  12 +
>  kernel/cpu.c                                  |   5 +
>  kernel/signal.c                               |   2 +-
>  lib/.gitignore                                |   2 +
>  lib/raid6/.gitignore                          |   1 +
>  lib/raid6/algos.c                             |   9 +-
>  scripts/.gitignore                            |   2 +
>  scripts/Makefile.build                        |   7 +-
>  scripts/adjust_autoksyms.sh                   |   7 +-
>  scripts/basic/.gitignore                      |   1 +
>  scripts/checkpatch.pl                         |   3 +-
>  scripts/kallsyms.c                            |  58 +-
>  scripts/kconfig/.gitignore                    |   1 +
>  scripts/link-vmlinux.sh                       |  10 +
>  scripts/mod/.gitignore                        |   1 +
>  tools/Makefile                                |  11 +-
>  tools/lkl/.gitignore                          |  14 +
>  tools/lkl/Build                               |   6 +
>  tools/lkl/Makefile                            | 130 +++
>  tools/lkl/Makefile.autoconf                   | 114 +++
>  tools/lkl/Targets                             |  27 +
>  tools/lkl/bin/arm-linux-androideabi-ld        |   1 +
>  tools/lkl/bin/lkl-hijack.sh                   |  23 +
>  tools/lkl/cptofs.c                            | 635 ++++++++++++
>  tools/lkl/fs2tar.c                            | 410 ++++++++
>  tools/lkl/include/.gitignore                  |   1 +
>  tools/lkl/include/lkl.h                       | 928 ++++++++++++++++++
>  tools/lkl/include/lkl_config.h                |  61 ++
>  tools/lkl/include/lkl_host.h                  | 160 +++
>  tools/lkl/include/mingw32/sys/socket.h        |   4 +
>  tools/lkl/lib/.gitignore                      |   3 +
>  tools/lkl/lib/Build                           |  25 +
>  tools/lkl/lib/Makefile                        |  32 +
>  tools/lkl/lib/config.c                        | 793 +++++++++++++++
>  tools/lkl/lib/dbg.c                           | 300 ++++++
>  tools/lkl/lib/dbg_handler.c                   |  44 +
>  tools/lkl/lib/endian.h                        |  31 +
>  tools/lkl/lib/fs.c                            | 433 ++++++++
>  tools/lkl/lib/hijack/Build                    |   4 +
>  tools/lkl/lib/hijack/hijack.c                 | 618 ++++++++++++
>  tools/lkl/lib/hijack/init.c                   | 252 +++++
>  tools/lkl/lib/hijack/init.h                   |   8 +
>  tools/lkl/lib/hijack/xlate.c                  | 613 ++++++++++++
>  tools/lkl/lib/hijack/xlate.h                  |  13 +
>  tools/lkl/lib/iomem.c                         |  88 ++
>  tools/lkl/lib/iomem.h                         |  15 +
>  tools/lkl/lib/jmp_buf.c                       |  14 +
>  tools/lkl/lib/jmp_buf.h                       |   8 +
>  tools/lkl/lib/net.c                           | 818 +++++++++++++++
>  tools/lkl/lib/nt-host.c                       | 375 +++++++
>  tools/lkl/lib/posix-host.c                    | 439 +++++++++
>  tools/lkl/lib/utils.c                         | 266 +++++
>  tools/lkl/lib/virtio.c                        | 644 ++++++++++++
>  tools/lkl/lib/virtio.h                        | 115 +++
>  tools/lkl/lib/virtio_blk.c                    | 132 +++
>  tools/lkl/lib/virtio_net.c                    | 342 +++++++
>  tools/lkl/lib/virtio_net_dpdk.c               | 480 +++++++++
>  tools/lkl/lib/virtio_net_fd.c                 | 195 ++++
>  tools/lkl/lib/virtio_net_fd.h                 |  50 +
>  tools/lkl/lib/virtio_net_macvtap.c            |  32 +
>  tools/lkl/lib/virtio_net_pipe.c               |  76 ++
>  tools/lkl/lib/virtio_net_raw.c                |  94 ++
>  tools/lkl/lib/virtio_net_tap.c                | 111 +++
>  tools/lkl/lib/virtio_net_vde.c                | 168 ++++
>  tools/lkl/lklfuse.c                           | 658 +++++++++++++
>  tools/lkl/scripts/checkpatch.sh               |  60 ++
>  tools/lkl/scripts/dpdk-sdk-build.sh           |  18 +
>  tools/lkl/scripts/lkl-jenkins.sh              |  21 +
>  tools/lkl/tests/Build                         |   3 +
>  tools/lkl/tests/boot.c                        | 562 +++++++++++
>  tools/lkl/tests/boot.sh                       |   9 +
>  tools/lkl/tests/cla.c                         | 159 +++
>  tools/lkl/tests/cla.h                         |  33 +
>  tools/lkl/tests/disk.c                        | 189 ++++
>  tools/lkl/tests/disk.sh                       |  70 ++
>  tools/lkl/tests/hijack-test.sh                | 760 ++++++++++++++
>  tools/lkl/tests/lklfuse.sh                    | 110 +++
>  tools/lkl/tests/net-setup.sh                  | 134 +++
>  tools/lkl/tests/net-test.c                    | 317 ++++++
>  tools/lkl/tests/net.sh                        | 186 ++++
>  tools/lkl/tests/run.py                        | 186 ++++
>  tools/lkl/tests/run_netperf.sh                |  98 ++
>  tools/lkl/tests/tap13.py                      | 209 ++++
>  tools/lkl/tests/test.c                        | 126 +++
>  tools/lkl/tests/test.h                        |  72 ++
>  tools/lkl/tests/test.sh                       | 240 +++++
>  tools/lkl/tests/valgrind.supp                 |  85 ++
>  tools/lkl/tests/valgrind2xunit.py             |  69 ++
>  173 files changed, 19464 insertions(+), 330 deletions(-)
>  create mode 100644 .circleci/config.yml
>  create mode 100644 Documentation/lkl.txt
>  create mode 120000 README.md
>  create mode 100644 arch/um/Makefile.um
>  create mode 100644 arch/um/auto.conf
>  create mode 100644 arch/um/lkl/.gitignore
>  create mode 100644 arch/um/lkl/Kconfig
>  create mode 100644 arch/um/lkl/Kconfig.debug
>  create mode 100644 arch/um/lkl/Makefile
>  create mode 100644 arch/um/lkl/Makefile.um
>  create mode 100644 arch/um/lkl/auto.conf
>  create mode 100644 arch/um/lkl/configs/lkl_defconfig
>  create mode 100644 arch/um/lkl/include/asm/Kbuild
>  create mode 100644 arch/um/lkl/include/asm/bitsperlong.h
>  create mode 100644 arch/um/lkl/include/asm/byteorder.h
>  create mode 100644 arch/um/lkl/include/asm/cpu.h
>  create mode 100644 arch/um/lkl/include/asm/elf.h
>  create mode 100644 arch/um/lkl/include/asm/host_ops.h
>  create mode 100644 arch/um/lkl/include/asm/io.h
>  create mode 100644 arch/um/lkl/include/asm/irq.h
>  create mode 100644 arch/um/lkl/include/asm/mutex.h
>  create mode 100644 arch/um/lkl/include/asm/page.h
>  create mode 100644 arch/um/lkl/include/asm/pgtable.h
>  create mode 100644 arch/um/lkl/include/asm/processor.h
>  create mode 100644 arch/um/lkl/include/asm/ptrace.h
>  create mode 100644 arch/um/lkl/include/asm/sched.h
>  create mode 100644 arch/um/lkl/include/asm/setup.h
>  create mode 100644 arch/um/lkl/include/asm/syscalls.h
>  create mode 100644 arch/um/lkl/include/asm/syscalls_32.h
>  create mode 100644 arch/um/lkl/include/asm/thread_info.h
>  create mode 100644 arch/um/lkl/include/asm/tlb.h
>  create mode 100644 arch/um/lkl/include/asm/uaccess.h
>  create mode 100644 arch/um/lkl/include/asm/unistd.h
>  create mode 100644 arch/um/lkl/include/asm/unistd_32.h
>  create mode 100644 arch/um/lkl/include/asm/vmlinux.lds.h
>  create mode 100644 arch/um/lkl/include/asm/xor.h
>  create mode 100644 arch/um/lkl/include/system/stdarg.h
>  create mode 100644 arch/um/lkl/include/uapi/asm/Kbuild
>  create mode 100644 arch/um/lkl/include/uapi/asm/bitsperlong.h
>  create mode 100644 arch/um/lkl/include/uapi/asm/byteorder.h
>  create mode 100644 arch/um/lkl/include/uapi/asm/host_ops.h
>  create mode 100644 arch/um/lkl/include/uapi/asm/irq.h
>  create mode 100644 arch/um/lkl/include/uapi/asm/sigcontext.h
>  create mode 100644 arch/um/lkl/include/uapi/asm/siginfo.h
>  create mode 100644 arch/um/lkl/include/uapi/asm/swab.h
>  create mode 100644 arch/um/lkl/include/uapi/asm/syscalls.h
>  create mode 100644 arch/um/lkl/include/uapi/asm/unistd.h
>  create mode 100644 arch/um/lkl/kernel/Makefile
>  create mode 100644 arch/um/lkl/kernel/asm-offsets.c
>  create mode 100644 arch/um/lkl/kernel/console.c
>  create mode 100644 arch/um/lkl/kernel/cpu.c
>  create mode 100644 arch/um/lkl/kernel/irq.c
>  create mode 100644 arch/um/lkl/kernel/misc.c
>  create mode 100644 arch/um/lkl/kernel/setup.c
>  create mode 100644 arch/um/lkl/kernel/syscalls.c
>  create mode 100644 arch/um/lkl/kernel/syscalls_32.c
>  create mode 100644 arch/um/lkl/kernel/threads.c
>  create mode 100644 arch/um/lkl/kernel/time.c
>  create mode 100644 arch/um/lkl/kernel/vmlinux.lds.S
>  create mode 100644 arch/um/lkl/mm/Makefile
>  create mode 100644 arch/um/lkl/mm/bootmem.c
>  create mode 100755 arch/um/lkl/scripts/headers_install.py
>  create mode 100644 arch/um/lkl/um/Makefile
>  create mode 100644 arch/um/lkl/um/include/sysdep/kernel-offsets.h
>  create mode 100644 arch/um/os-Linux/lkl_dev.c
>  create mode 100644 tools/lkl/.gitignore
>  create mode 100644 tools/lkl/Build
>  create mode 100644 tools/lkl/Makefile
>  create mode 100644 tools/lkl/Makefile.autoconf
>  create mode 100644 tools/lkl/Targets
>  create mode 120000 tools/lkl/bin/arm-linux-androideabi-ld
>  create mode 100755 tools/lkl/bin/lkl-hijack.sh
>  create mode 100644 tools/lkl/cptofs.c
>  create mode 100644 tools/lkl/fs2tar.c
>  create mode 100644 tools/lkl/include/.gitignore
>  create mode 100644 tools/lkl/include/lkl.h
>  create mode 100644 tools/lkl/include/lkl_config.h
>  create mode 100644 tools/lkl/include/lkl_host.h
>  create mode 100644 tools/lkl/include/mingw32/sys/socket.h
>  create mode 100644 tools/lkl/lib/.gitignore
>  create mode 100644 tools/lkl/lib/Build
>  create mode 100644 tools/lkl/lib/Makefile
>  create mode 100644 tools/lkl/lib/config.c
>  create mode 100644 tools/lkl/lib/dbg.c
>  create mode 100644 tools/lkl/lib/dbg_handler.c
>  create mode 100644 tools/lkl/lib/endian.h
>  create mode 100644 tools/lkl/lib/fs.c
>  create mode 100644 tools/lkl/lib/hijack/Build
>  create mode 100644 tools/lkl/lib/hijack/hijack.c
>  create mode 100644 tools/lkl/lib/hijack/init.c
>  create mode 100644 tools/lkl/lib/hijack/init.h
>  create mode 100644 tools/lkl/lib/hijack/xlate.c
>  create mode 100644 tools/lkl/lib/hijack/xlate.h
>  create mode 100644 tools/lkl/lib/iomem.c
>  create mode 100644 tools/lkl/lib/iomem.h
>  create mode 100644 tools/lkl/lib/jmp_buf.c
>  create mode 100644 tools/lkl/lib/jmp_buf.h
>  create mode 100644 tools/lkl/lib/net.c
>  create mode 100644 tools/lkl/lib/nt-host.c
>  create mode 100644 tools/lkl/lib/posix-host.c
>  create mode 100644 tools/lkl/lib/utils.c
>  create mode 100644 tools/lkl/lib/virtio.c
>  create mode 100644 tools/lkl/lib/virtio.h
>  create mode 100644 tools/lkl/lib/virtio_blk.c
>  create mode 100644 tools/lkl/lib/virtio_net.c
>  create mode 100644 tools/lkl/lib/virtio_net_dpdk.c
>  create mode 100644 tools/lkl/lib/virtio_net_fd.c
>  create mode 100644 tools/lkl/lib/virtio_net_fd.h
>  create mode 100644 tools/lkl/lib/virtio_net_macvtap.c
>  create mode 100644 tools/lkl/lib/virtio_net_pipe.c
>  create mode 100644 tools/lkl/lib/virtio_net_raw.c
>  create mode 100644 tools/lkl/lib/virtio_net_tap.c
>  create mode 100644 tools/lkl/lib/virtio_net_vde.c
>  create mode 100644 tools/lkl/lklfuse.c
>  create mode 100755 tools/lkl/scripts/checkpatch.sh
>  create mode 100755 tools/lkl/scripts/dpdk-sdk-build.sh
>  create mode 100755 tools/lkl/scripts/lkl-jenkins.sh
>  create mode 100644 tools/lkl/tests/Build
>  create mode 100644 tools/lkl/tests/boot.c
>  create mode 100755 tools/lkl/tests/boot.sh
>  create mode 100644 tools/lkl/tests/cla.c
>  create mode 100644 tools/lkl/tests/cla.h
>  create mode 100644 tools/lkl/tests/disk.c
>  create mode 100755 tools/lkl/tests/disk.sh
>  create mode 100755 tools/lkl/tests/hijack-test.sh
>  create mode 100755 tools/lkl/tests/lklfuse.sh
>  create mode 100644 tools/lkl/tests/net-setup.sh
>  create mode 100644 tools/lkl/tests/net-test.c
>  create mode 100755 tools/lkl/tests/net.sh
>  create mode 100755 tools/lkl/tests/run.py
>  create mode 100755 tools/lkl/tests/run_netperf.sh
>  create mode 100644 tools/lkl/tests/tap13.py
>  create mode 100644 tools/lkl/tests/test.c
>  create mode 100644 tools/lkl/tests/test.h
>  create mode 100644 tools/lkl/tests/test.sh
>  create mode 100644 tools/lkl/tests/valgrind.supp
>  create mode 100755 tools/lkl/tests/valgrind2xunit.py
>
> --
> 2.20.1 (Apple Git-117)
>
>
> _______________________________________________
> linux-um mailing list
> linux-um@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-um



-- 
Thanks,
//richard


^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC PATCH 03/47] lkl: architecture skeleton for Linux kernel library
  2019-10-23  4:37 ` [RFC PATCH 03/47] lkl: architecture skeleton for Linux kernel library Hajime Tazaki
@ 2019-10-25 21:40   ` Richard Weinberger
  2019-10-27  2:36     ` Hajime Tazaki
  0 siblings, 1 reply; 206+ messages in thread
From: Richard Weinberger @ 2019-10-25 21:40 UTC (permalink / raw)
  To: Hajime Tazaki
  Cc: H . K . Jerry Chu, Levente Kurusa, Matthieu Coudron,
	Conrad Meyer, Octavian Purdila, Lai Jiangshan, Jens Staal,
	Motomu Utsumi, linux-um, Akira Moroo, Petros Angelatos,
	Andreas Abel, Xiao Jia, Mark Stillwell, Edison M . Castro,
	Patrick Collins, Pierre-Hugues Husson, Michael Zimmermann,
	Luca Dariz, Yuan Liu

On Wed, Oct 23, 2019 at 6:39 AM Hajime Tazaki <thehajime@gmail.com> wrote:
>
> From: Octavian Purdila <tavi.purdila@gmail.com>
> +LINUX KERNEL LIBRARY
> +M:     Octavian Purdila <octavian.purdila@intel.com>
> +M:     Hajime Tazaki <thehajime@gmail.com>
> +L:     linux-kernel-library@freelists.org
> +S:     Maintained
> +F:     arch/lkl/
> +F:     tools/lkl/
> +

The arch/lkl path is outdated.
So, you and Octavian will maintain LKL?

Do you want to be sub-maintainers of arch/um/lkl and send pull requests to me,
or co-maintain the whole UML ecosystem together with me and Anton?

I'm perfectly fine with both variants but lean toward the latter one since
it is less overhead.
In case you need your PGP keys signed, next week I'll be in Lyon at
OSS, ELCE, ...

-- 
Thanks,
//richard


^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC PATCH 00/47] Unifying LKL into UML
  2019-10-25 21:34   ` Richard Weinberger
@ 2019-10-27  2:34     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-27  2:34 UTC (permalink / raw)
  To: richard.weinberger; +Cc: linux-um, tavi.purdila, retrage01, linux-arch


Hello Richard,

Thanks for the review.

On Sat, 26 Oct 2019 06:34:59 +0900,
Richard Weinberger wrote:
> 
> On Wed, Oct 23, 2019 at 6:39 AM Hajime Tazaki <thehajime@gmail.com> wrote:
> >
> > This RFC patchset is to ask opinions from UML people, whether LKL codes is
> > good to integrate into UML code base.  We wish to have any kind of feedback
> > from your kind reviews.  There are numbers of commits which should be asked
> > for reviews to other mailing lists; we will do it later once we got
> > discussed in this mailing list.
> 
> Thanks a lot for doing this, this effort is much appreciated! :-)
> 
> > # sorry for the long list of patches: we can make it smaller by only
> >   including basic set of LKL (e.g., removing foreign OS support, etc) if
> >   you wish.
> 
> Let us see how the review goes. First I'll give it a high-level review to make
> sure we are all talking about the same things.

Thanks.

> Please CC linux-arch@vger.kernel.org for the next patch round.
> Integrating LKL (into arch/um/) is something which needs more audience and
> feedback from Arnd Bergmann, our global arch maintainer.

Sure, will Cc.

> >
> > LKL (Linux Kernel Library) is aiming to allow reusing the Linux kernel code
> > as extensively as possible with minimal effort and reduced maintenance
> > overhead.
> >
> > Examples of how LKL can be used are: creating userspace applications
> > (running on Linux and other operating systems) that can read or write Linux
> > filesystems or can use the Linux networking stack, creating kernel drivers
> > for other operating systems that can read Linux filesystems, bootloaders
> > support for reading/writing Linux filesystems, etc.
> >
> > With LKL, the kernel code is compiled into an object file that can be
> > directly linked by applications. The API offered by LKL is based on the
> > Linux system call interface.
> >
> > LKL is originally implemented as an architecture port in arch/lkl, but this
> > series of commits tries to integrate this into arch/um as one of the mode
> > of UML.  This was discussed during RFC email of LKL (*1).
> >
> > The latest LKL version can be found at https://github.com/lkl/linux
> >
> > Milestone
> > =========
> > This patches is just a first step toward upstreaming *library mode* of
> > Linux kernel, but we think we need to have several steps toward our goal,
> > describing in the below.
> >
> > 1. Put LKL code under arch/um (arch/um/lkl), and build it in a
> > separate way from UML.
> 
> Makes sense.
> 
> > 2. Share common parts of implementation between UML and LKL.
> 
> Since both UML and LKL are usermode ports there is a lot of potential.
> From my side it is also no big deal if there is some duplication which can be
> resolved in later releases. Unifying needs deep thought and minding odd corner
> cases.

I understand.

> > 3. Reimplement UML features with LKL API (if we wish)
> 
> Yep. In the last release UML got virtio support, so there is hope. ;-)

Good news.

> > For the step 1, we put LKL as one of SUBARCH in order to make less effort
> > to integrate (make ARCH=um SUBARCH=lkl).  The modification to existing UML
> > code is trying to be minimized.
> 
> I'm not sure if SUBARCH is the right approach. How do I build a i386
> lkl on x86_64?

This is currently handled under tools/lkl: building the
arch/um/lkl part only requires toolchain information (e.g.,
CROSS_COMPILE=).

So to build an i386 liblkl.a, you run `make ARCH=um SUBARCH=lkl`,
which might not be intuitive.

> Maybe we can use another variable like UMMODE={library,kernel}?

We will try to switch the mode this way instead.

> > The RFC patches also includes and a bit of step 2 as a proof of possibility
> > to share the code.  For this, we used the virtio device code of LKL and use
> > it from UML by enabling virtio-mmio driver with UML code.
> >
> >
> >
> > Building LKL the host library and LKL applications
> > ==================================================
> >
> > % cd tools/lkl
> > % make
> 
> Is there a reason why tools/lkl is not under arch/um?

I thought this layout makes a clear distinction between the part
that is *dependent* on the host hardware/environment (tools/lkl)
and the *independent* part (arch/um/lkl).

We can rename it to tools/um instead.

But if adding a new tools directory is too noisy, we would try to
move it under arch/um.

-- Hajime


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um



* Re: [RFC PATCH 03/47] lkl: architecture skeleton for Linux kernel library
  2019-10-25 21:40   ` Richard Weinberger
@ 2019-10-27  2:36     ` Hajime Tazaki
  2019-10-29  4:04       ` Lai Jiangshan
  0 siblings, 1 reply; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-27  2:36 UTC (permalink / raw)
  To: richard.weinberger
  Cc: levex, mattator, cem, tavi.purdila, jiangshanlai, staal1978,
	motomuman, linux-um, retrage01, petrosagg, edisonmcastro, xiaoj,
	mark, pscollins, phh, sigmaepsilon92, luca.dariz, liuyuan


# dropping two Cc's since they are not reachable.

On Sat, 26 Oct 2019 06:40:05 +0900,
Richard Weinberger wrote:
> 
> On Wed, Oct 23, 2019 at 6:39 AM Hajime Tazaki <thehajime@gmail.com> wrote:
> >
> > From: Octavian Purdila <tavi.purdila@gmail.com>
> > +LINUX KERNEL LIBRARY
> > +M:     Octavian Purdila <octavian.purdila@intel.com>
> > +M:     Hajime Tazaki <thehajime@gmail.com>
> > +L:     linux-kernel-library@freelists.org
> > +S:     Maintained
> > +F:     arch/lkl/
> > +F:     tools/lkl/
> > +
> 
> The arch/lkl path is outdated.

Ah, should be updated.  We will fix it.

> So, you and Octavian will maintain LKL?

Yes.

> Do you want to be sub maintainers of arch/um/lkl and send pull requests to me
> or co-maintain the whole UML ecosystem together with me and Anton?
>
> I'm perfectly fine with both variants but tend to the latter one since
> it is less overhead.

I had not thought carefully enough about the maintenance
procedure.  I agree that the latter is better, but for the
early stage of this integration, I think starting with the
former (sending pull requests from LKL to you/Anton) would be
nice.

LKL is now on GitHub and uses several useful features (CI
tests on each pull request, issue tracking, wiki).  If
possible, I'd also like to migrate those tools, or make them
available to UML, because this makes maintaining LKL easier.

What do you think?

> In case you need your PGP keys signed, next week I'll be in Lyon at
> OSS, ELCE, ...

I won't be there, unfortunately.

-- Hajime


* Re: [RFC PATCH 03/47] lkl: architecture skeleton for Linux kernel library
  2019-10-27  2:36     ` Hajime Tazaki
@ 2019-10-29  4:04       ` Lai Jiangshan
  2019-10-29  7:13         ` Hajime Tazaki
  0 siblings, 1 reply; 206+ messages in thread
From: Lai Jiangshan @ 2019-10-29  4:04 UTC (permalink / raw)
  To: Hajime Tazaki
  Cc: levex, mattator, cem, richard.weinberger, staal1978, motomuman,
	linux-um, retrage01, petrosagg, tavi.purdila, xiaoj, mark,
	edisonmcastro, pscollins, phh, sigmaepsilon92, luca.dariz,
	liuyuan

Hello, Hajime

I can't see how UML and LKL are going to be unified, even after
reading the cover letter of the patchset. After a quick glance, what I
understand is that the patchset just puts LKL under
arch/um/lkl rather than arch/lkl. It still looks like a separate "arch" to me.

Could you explain the plan to unify them in more detail, please?

Thanks
Lai

On Sun, Oct 27, 2019 at 10:36 AM Hajime Tazaki <thehajime@gmail.com> wrote:
>
>
> # dropping two Cc's since those are not reachable..
>
> On Sat, 26 Oct 2019 06:40:05 +0900,
> Richard Weinberger wrote:
> >
> > On Wed, Oct 23, 2019 at 6:39 AM Hajime Tazaki <thehajime@gmail.com> wrote:
> > >
> > > From: Octavian Purdila <tavi.purdila@gmail.com>
> > > +LINUX KERNEL LIBRARY
> > > +M:     Octavian Purdila <octavian.purdila@intel.com>
> > > +M:     Hajime Tazaki <thehajime@gmail.com>
> > > +L:     linux-kernel-library@freelists.org
> > > +S:     Maintained
> > > +F:     arch/lkl/
> > > +F:     tools/lkl/
> > > +
> >
> > The arch/lkl path is outdated.
>
> Ah, should be updated.  We will fix it.
>
> > So, you and Octavian will maintain LKL?
>
> Yes.
>
> > Do you want to be sub maintainers of arch/um/lkl and send pull requests to me
> > or co-maintain the whole UML ecosystem together with me and Anton?
> >
> > I'm perfectly fine with both variants but tend to the latter one since
> > it is less overhead.
>
> I was not thinking well enough for the maintenance procedure;
> I agree that the latter case is better, but for the early
> stage of this integration, I think starting with the former
> (send pull-req from LKL to you/Anton) would be nice.
>
> LKL is now on github and utilizes several useful features
> (CI test at each pull request, issue tracking, wiki), and if
> possible I'd also like to migrate those tools, or make them
> available to UML because this makes easier maintenance for
> LKL.
>
> What do you think ?
>
> > In case you need your PGP keys signed, next week I'll be in Lyon at
> > OSS, ELCE, ...
>
> I won't be there, unfortunately..
>
> -- Hajime


* Re: [RFC PATCH 03/47] lkl: architecture skeleton for Linux kernel library
  2019-10-29  4:04       ` Lai Jiangshan
@ 2019-10-29  7:13         ` Hajime Tazaki
  2019-10-29  7:57           ` Johannes Berg
  0 siblings, 1 reply; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-29  7:13 UTC (permalink / raw)
  To: jiangshanlai
  Cc: levex, mattator, cem, richard.weinberger, staal1978, motomuman,
	linux-um, retrage01, petrosagg, tavi.purdila, xiaoj, mark,
	edisonmcastro, pscollins, phh, sigmaepsilon92, luca.dariz,
	liuyuan


Hello Lai,

On Tue, 29 Oct 2019 13:04:59 +0900,
Lai Jiangshan wrote:
> 
> Hello, Hajime
> 
> I can't get how UML&LKL is going to unify even I read
> the cover-letter of the patchset. After quick glance, what I
> understand is that the patchset just puts LKL under
> arch/um/lkl rather than arch/lkl. It is still separated "arch" for me.
> 
> Could you put me in more detail of the plan to unify them please?

Thanks for the comment.

You're right: the current patchset only shares Makefile(s)
between UML and LKL.  This is the first step toward
unification in my plan, described in the milestone section of
the cover letter (copy-pasted below).

> Milestone
> =========
> (snip)
> 1. Put LKL code under arch/um (arch/um/lkl), and build it in a
> separate way from UML.
> 2. Share common parts of implementation between UML and LKL.
> 3. Reimplement UML features with LKL API (if we wish)

For step 2, [PATCH 46/47] and [47/47] are examples of this
level of unification (sorry for the very dirty code).

For step 3, I have neither WIP code nor a detailed plan.
Implementing ptrace-based (or similar) syscall interception
would be one of the development items (I believe there are more).
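
To make "ptrace-based syscall interception" concrete, here is a
minimal sketch, assuming a Linux host where ptrace() of a child
process is permitted.  It is not code from this patchset, and
count_syscall_stops() and abandon_tracee() are made-up names.  The
parent only counts the syscall stops of a traced child; a real
interception layer would inspect and redirect each call at those
stops.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Kill and reap a tracee we can no longer drive, then report failure. */
static long abandon_tracee(pid_t pid)
{
	kill(pid, SIGKILL);
	waitpid(pid, NULL, 0);
	return -1;
}

/*
 * Fork a child, trace it, and count its syscall stops (entry and exit
 * stops are counted separately).  Returns the count, or -1 on error.
 */
long count_syscall_stops(void)
{
	int status;
	long stops = 0;
	pid_t pid = fork();

	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: ask to be traced, then stop until the parent is ready. */
		if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
			_exit(1);
		raise(SIGSTOP);
		getpid();	/* some work for the tracer to observe */
		_exit(0);
	}

	assert(pid > 0);

	/* Parent: wait for the child's initial SIGSTOP. */
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	if (!WIFSTOPPED(status))
		return -1;	/* child exited early, e.g. tracing refused */

	/* Report syscall stops as SIGTRAP|0x80 so they differ from signals. */
	if (ptrace(PTRACE_SETOPTIONS, pid, NULL,
		   (void *)(long)PTRACE_O_TRACESYSGOOD) < 0)
		return abandon_tracee(pid);

	for (;;) {
		/* Resume the child until its next syscall entry or exit. */
		if (ptrace(PTRACE_SYSCALL, pid, NULL, NULL) < 0)
			return abandon_tracee(pid);
		if (waitpid(pid, &status, 0) < 0)
			return -1;
		if (WIFEXITED(status) || WIFSIGNALED(status))
			break;
		if (WIFSTOPPED(status) && WSTOPSIG(status) == (SIGTRAP | 0x80))
			stops++;
	}
	return stops;
}
```

The exact count depends on the C library, so a caller should only
rely on it being positive when tracing works.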

Offering the UML feature set and keeping compatibility, while
benefiting from LKL (e.g., support for various underlying
environments), would be a very high-level goal, since there
are many users of UML (various test tool projects, including
the upcoming KUnit).



Having similar archs in the Linux kernel is unlikely to be
accepted; that was the feedback on the RFC email for LKL (in
2015).  This motivates us to upstream the LKL code into arch/um.

-- Hajime


* Re: [RFC PATCH 03/47] lkl: architecture skeleton for Linux kernel library
  2019-10-29  7:13         ` Hajime Tazaki
@ 2019-10-29  7:57           ` Johannes Berg
  2019-10-29  8:15             ` Richard Weinberger
  2019-10-30  3:19             ` Hajime Tazaki
  0 siblings, 2 replies; 206+ messages in thread
From: Johannes Berg @ 2019-10-29  7:57 UTC (permalink / raw)
  To: Hajime Tazaki, jiangshanlai
  Cc: levex, mattator, cem, richard.weinberger, liuyuan, staal1978,
	motomuman, linux-um, retrage01, petrosagg, tavi.purdila, xiaoj,
	mark, pscollins, phh, sigmaepsilon92, luca.dariz, edisonmcastro

Hi,

> You're right: current patchset only share Makefile(s)
> between UML and LKL.  This is the first step toward the
> unification in my plan, described in the milestone of the
> cover letter (copy-pasted below).
> 
> > Milestone
> > =========
> > (snip)
> > 1. Put LKL code under arch/um (arch/um/lkl), and build it in a
> > separate way from UML.
> > 2. Share common parts of implementation between UML and LKL.
> > 3. Reimplement UML features with LKL API (if we wish)
> 
> For the step 2, [PATCH 46/47] and [47/47] are the kind of
> examples for this level of unification (sorry for the very
> dirty code).
> 
> For the step 3, I don't have any WIP code nor detailed plan.
> Implementing ptrace-based (or similar) syscall interception
> would be one of the development (I believe there are more).
> 
> Offering UML feature-sets, keeping compatibility, while
> benefiting from LKL (e.g., various underlying environment
> support) would be very high-level goal since there are many
> users of UML (various test tool projects, including coming
> Kunit).

Aren't you going about this the wrong way around?

I mean, this reads like you're proposing to start from LKL and
reimplement UML on top of it, but we currently have UML in the tree and
LKL isn't. Seems backward to me.

Also, looking at the patches, I'm not a huge fan of the whole "drop LKL
into UML". UML is already complex enough as is, with its memory model
and all, mixing in LKL makes it way more complex.

I don't think "drop LKL under UML" was what people had in mind when they
suggested that the two merge ...

johannes



* Re: [RFC PATCH 00/47] Unifying LKL into UML
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
                   ` (47 preceding siblings ...)
  2019-10-25 21:34   ` Richard Weinberger
@ 2019-10-29  7:57 ` Johannes Berg
  2019-10-29 15:45   ` Hajime Tazaki
  2019-11-08  5:02   ` Hajime Tazaki
  49 siblings, 1 reply; 206+ messages in thread
From: Johannes Berg @ 2019-10-29  7:57 UTC (permalink / raw)
  To: Hajime Tazaki, linux-um; +Cc: Octavian Purdila, Akira Moroo

On Wed, 2019-10-23 at 13:37 +0900, Hajime Tazaki wrote:
> 
> LKL (Linux Kernel Library) is aiming to allow reusing the Linux kernel code
> as extensively as possible with minimal effort and reduced maintenance
> overhead.

[snip]

Can you comment a bit on what's what?

For example, I look at patch 24 ("lkl tools: virtio: add network device
support") and wonder what that really is? Is it a hypervisor-side
implementation of the virtio-net device? If so, why is that considered
"tools"? (In ARCH=um, the hv-code usually lives in
arch/um/drivers/*_user.c or similar.)

Also, taking that as an example again, I think that's something we
should rather leave out initially - it looks like it has hooks into DPDK
and all kinds of other network interfaces on the host, which duplicates
a lot of existing functionality in ARCH=um.

Additionally, we (Intel) recently contributed a vhost-user backend, so
we don't really *need* a hypervisor implementation of e.g. DPDK
integration at all, that should be possible over vhost-user instead.

Looking further at the series - many of your patches really need better
commit logs explaining what and why they do something. Particularly the
reverts, but even trivial patches like the first one in the series.

patch 2: doesn't explain why it's necessary - how is this not covered by
adding a "config SOMETHING\n  def_bool y" in the architecture?

patch 4: kernel-doc doesn't parse, it's also very awkward to add this
without any users, why not add a very simple version of the struct with
the first patch needing it, and then extend it in each next patch?

[oops, out of time, will continue later]

johannes



* Re: [RFC PATCH 03/47] lkl: architecture skeleton for Linux kernel library
  2019-10-29  7:57           ` Johannes Berg
@ 2019-10-29  8:15             ` Richard Weinberger
  2019-10-30  3:19             ` Hajime Tazaki
  1 sibling, 0 replies; 206+ messages in thread
From: Richard Weinberger @ 2019-10-29  8:15 UTC (permalink / raw)
  To: Johannes Berg
  Cc: liuyuan, levex, mattator, cem, tavi purdila, linux-um, staal1978,
	motomuman, jiangshanlai, retrage01, petrosagg, edisonmcastro,
	xiaoj, mark, pscollins, phh, sigmaepsilon92, luca dariz,
	Hajime Tazaki

----- Original Message -----
> From: "Johannes Berg" <johannes@sipsolutions.net>
> To: "Hajime Tazaki" <thehajime@gmail.com>, jiangshanlai@gmail.com
> CC: levex@linux.com, mattator@gmail.com, cem@freebsd.org, "Richard Weinberger" <richard.weinberger@gmail.com>,
> staal1978@gmail.com, motomuman@gmail.com, "linux-um" <linux-um@lists.infradead.org>, retrage01@gmail.com,
> petrosagg@gmail.com, "tavi purdila" <tavi.purdila@gmail.com>, xiaoj@google.com, mark@stillwell.me,
> edisonmcastro@hotmail.com, pscollins@google.com, phh@phh.me, sigmaepsilon92@gmail.com, "luca dariz"
> <luca.dariz@gmail.com>, liuyuan@google.com
> Sent: Tuesday, 29 October 2019 08:57:43
> Subject: Re: [RFC PATCH 03/47] lkl: architecture skeleton for Linux kernel library

> Hi,
> 
>> You're right: current patchset only share Makefile(s)
>> between UML and LKL.  This is the first step toward the
>> unification in my plan, described in the milestone of the
>> cover letter (copy-pasted below).
>> 
>> > Milestone
>> > =========
>> > (snip)
>> > 1. Put LKL code under arch/um (arch/um/lkl), and build it in a
>> > separate way from UML.
>> > 2. Share common parts of implementation between UML and LKL.
>> > 3. Reimplement UML features with LKL API (if we wish)
>> 
>> For the step 2, [PATCH 46/47] and [47/47] are the kind of
>> examples for this level of unification (sorry for the very
>> dirty code).
>> 
>> For the step 3, I don't have any WIP code nor detailed plan.
>> Implementing ptrace-based (or similar) syscall interception
>> would be one of the development (I believe there are more).
>> 
>> Offering UML feature-sets, keeping compatibility, while
>> benefiting from LKL (e.g., various underlying environment
>> support) would be very high-level goal since there are many
>> users of UML (various test tool projects, including coming
>> Kunit).
> 
> Aren't you going about this the wrong way around?
> 
> I mean, this reads like you're proposing to start from LKL and
> reimplement UML on top of it, but we currently have UML in the tree and
> LKL isn't. Seems backward to me.

This is surely not intended. 
 
> Also, looking at the patches, I'm not a huge fan of the whole "drop LKL
> into UML". UML is already complex enough as is, with its memory model
> and all, mixing in LKL makes it way more complex.
> 
> I don't think "drop LKL under UML" was what people had in mind when they
> suggested that the two merge ...

Yes, just having an lkl folder under arch/um/ is for sure not the final solution.
The goal is merging as much as possible.
But having purely lkl stuff in a subfolder like arch/um/lkl/ is not a big deal.
 
This patch series is just a start, not the final solution that will get merged. :-)

Thanks,
//richard


* Re: [RFC PATCH 00/47] Unifying LKL into UML
  2019-10-29  7:57 ` Johannes Berg
@ 2019-10-29 15:45   ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-29 15:45 UTC (permalink / raw)
  To: johannes; +Cc: tavi.purdila, linux-um, retrage01


Hello Johannes,

Thanks for the review.

On Tue, 29 Oct 2019 16:57:54 +0900,
Johannes Berg wrote:
> 
> On Wed, 2019-10-23 at 13:37 +0900, Hajime Tazaki wrote:
> > 
> > LKL (Linux Kernel Library) is aiming to allow reusing the Linux kernel code
> > as extensively as possible with minimal effort and reduced maintenance
> > overhead.
> 
> [snip]
> 
> Can you comment a bit on what's what?
> 
> For example, I look at patch 24 ("lkl tools: virtio: add network device
> support") and wonder what that really is?

Indeed.  We will describe this in more detail so it is understandable.

> Is it a hypervisor-side implementation of the virtio-net device? If so,
> why is that considered "tools"? (In ARCH=um, the hv-code usually lives in
> arch/um/drivers/*_user.c or similar.)

Yes, this is a device-side (hv) implementation.

LKL currently uses a two-stage build: the pure kernel part (arch/um/lkl)
and the userspace part (tools/lkl).  Though I think it's good to keep a
clear distinction by using different directories, we can move tools/lkl
under arch/um if that is the better way.

> Also, taking that as an example again, I think that's something we
> should rather leave out initially - it looks like it has hooks into DPDK
> and all kinds of other network interfaces on the host, which duplicates
> a lot of existing functionality in ARCH=um.
> 
> Additionally, we (Intel) recently contributed a vhost-user backend, so
> we don't really *need* a hypervisor implementation of e.g. DPDK
> integration at all, that should be possible over vhost-user instead.

I understand.

If there is a need for an in-process virtio device implementation (as LKL
has) that connects to the vanilla virtio-mmio driver, I hope it is still
valuable.

> Looking further at the series - many of your patches really need better
> commit logs explaining what and why they do something. Particularly the
> reverts, but even trivial patches like the first one in the series.

Thanks.  The revert commits are required for the Windows host target,
which uses a '_' prefix for symbols.

We will refine the logs throughout all the commits and get back to you.
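
As a rough illustration of why the '_' prefix matters to build-time
tooling, the sketch below compares a linker-level symbol name with the
C-level name under a configurable prefix.  It is only a sketch under my
own assumptions, not code from the reverted patches; symbol_matches()
is a made-up helper.

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if @linker_name is how a toolchain that prepends @prefix
 * ("" or "_") to C symbols would emit @c_name, 0 otherwise.
 */
int symbol_matches(const char *linker_name, const char *c_name,
		   const char *prefix)
{
	size_t plen;

	assert(linker_name && c_name && prefix);
	plen = strlen(prefix);

	/* The linker-level name must start with the expected prefix ... */
	if (strncmp(linker_name, prefix, plen) != 0)
		return 0;
	/* ... and the remainder must be exactly the C-level name. */
	return strcmp(linker_name + plen, c_name) == 0;
}
```

With an empty prefix the two names must be identical; with "_" a tool
looking for start_kernel has to search for _start_kernel instead.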

> patch 2: doesn't explain why it's necessary - how is this not covered by
> adding a "config SOMETHING\n  def_bool y" in the architecture?

Ah, good catch.
Maybe giving an example of the contents of auto.conf would be helpful.
We will address this in the next patchset.

> patch 4: kernel-doc doesn't parse, it's also very awkward to add this
> without any users, why not add a very simple version of the struct with
> the first patch needing it, and then extend it in each next patch?

I agree to start with a simple struct and expand it afterward.  We will
address this in the next round.

kernel-doc is something we haven't tested so far.
We will also test it in advance.
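
To show what "start simple, then extend" could look like, here is a
minimal sketch with a kernel-doc style comment.  All names
(host_ops_sketch, capture_print, host_puts) are hypothetical and do
not come from the series; the struct carries only the one callback
that its first user needs, and later patches would append members as
they gain users.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/**
 * struct host_ops_sketch - hypothetical minimal host operations table
 * @print: write len bytes starting at str to the host console
 */
struct host_ops_sketch {
	void (*print)(const char *str, int len);
};

/* Example host side: capture output in a buffer instead of a console. */
char captured[64];
size_t captured_len;

void capture_print(const char *str, int len)
{
	size_t n;

	if (len <= 0)
		return;
	n = (size_t)len;
	if (n > sizeof(captured) - captured_len)
		n = sizeof(captured) - captured_len;	/* truncate, never overflow */
	memcpy(captured + captured_len, str, n);
	captured_len += n;
}

/* Kernel side: every host call goes through the table. */
int host_puts(const struct host_ops_sketch *ops, const char *msg)
{
	assert(msg != NULL);
	if (!ops || !ops->print)
		return -1;	/* this host does not provide the callback */
	ops->print(msg, (int)strlen(msg));
	return 0;
}
```

A host that does not fill in a callback gets an error back instead of
a NULL dereference, which keeps each extension step testable.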

> [oops, out of time, will continue later]

Thank you again for the detailed review.

-- Hajime


* Re: [RFC PATCH 03/47] lkl: architecture skeleton for Linux kernel library
  2019-10-29  7:57           ` Johannes Berg
  2019-10-29  8:15             ` Richard Weinberger
@ 2019-10-30  3:19             ` Hajime Tazaki
  1 sibling, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-10-30  3:19 UTC (permalink / raw)
  To: johannes
  Cc: levex, mattator, cem, richard.weinberger, linux-um, staal1978,
	motomuman, jiangshanlai, retrage01, petrosagg, tavi.purdila,
	xiaoj, mark, edisonmcastro, pscollins, phh, sigmaepsilon92,
	luca.dariz, liuyuan


Hello,

On Tue, 29 Oct 2019 16:57:43 +0900,
Johannes Berg wrote:

> > Offering UML feature-sets, keeping compatibility, while
> > benefiting from LKL (e.g., various underlying environment
> > support) would be very high-level goal since there are many
> > users of UML (various test tool projects, including coming
> > Kunit).
> 
> Aren't you going about this the wrong way around?
> 
> I mean, this reads like you're proposing to start from LKL and
> reimplement UML on top of it, but we currently have UML in the tree and
> LKL isn't. Seems backward to me.

I see your point.

Our basic standpoint is to follow the project ideas listed on
the old UML web page.

http://user-mode-linux.sourceforge.net/old/projects.html

In particular, LKL should be able to contribute the following
ideas to UML.

- Architecture Ports (e.g., run on arm32)
- OS Ports (e.g., Windows host)
- UML as a normal userspace library

I do not intend to break any existing facility of UML with
this introduction.  If you find any breakage, I'm happy to
fix it.

> Also, looking at the patches, I'm not a huge fan of the whole "drop LKL
> into UML". UML is already complex enough as is, with its memory model
> and all, mixing in LKL makes it way more complex.
> 
> I don't think "drop LKL under UML" was what people had in mind when they
> suggested that the two merge ...

As Richard explained, adding a new arch/um/lkl folder is
not our final goal.

-- Hajime


* [RFC v2 00/37] Unifying LKL into UML
  2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
@ 2019-11-08  5:02   ` Hajime Tazaki
  2019-10-23  4:37 ` [RFC PATCH 02/47] kbuild: allow architectures to automatically define kconfig symbols Hajime Tazaki
                     ` (48 subsequent siblings)
  49 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Hajime Tazaki

This RFC patchset asks for opinions on whether the LKL code is suitable
for integration into the UML code base.  We would appreciate any kind of
feedback from your reviews.  A number of commits should also be sent to
other mailing lists for review; we will do that later, once they have
been discussed on this mailing list.

# sorry for the long list of patches: we can make it smaller by only
  including the basic set of LKL (e.g., removing foreign OS support, etc.)
  if you wish.


rfc v2:
- use UMMODE instead of SUBARCH to switch between UML and LKL
- the tools/lkl directory is still there. I confirmed we can move it under
  arch/um (e.g., arch/um/lkl/hosts).  I will move it IF this is preferable.
- drop several patches that touch non-UML directories
- drop several patches which are not required
- refine commit logs
- update documentation




LKL (Linux Kernel Library) is aiming to allow reusing the Linux kernel code
as extensively as possible with minimal effort and reduced maintenance
overhead.

Examples of how LKL can be used are: creating userspace applications
(running on Linux and other operating systems) that can read or write Linux
filesystems or can use the Linux networking stack, creating kernel drivers
for other operating systems that can read Linux filesystems, bootloader
support for reading/writing Linux filesystems, etc.

With LKL, the kernel code is compiled into an object file that can be
directly linked by applications. The API offered by LKL is based on the
Linux system call interface.
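
As a toy illustration of an API "based on the system call interface",
the sketch below dispatches calls through an in-process table indexed
by call number instead of trapping into a separate kernel.  It is a
self-contained model under my own assumptions, not LKL code: the
SKETCH_NR_* numbers, the two handlers and sketch_syscall() are all
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical call numbers for this toy model. */
#define SKETCH_NR_GETPID 0
#define SKETCH_NR_ADD    1
#define SKETCH_NR_MAX    2

typedef long (*sketch_syscall_fn)(const long *args);

/* Toy handlers standing in for kernel entry points. */
static long sketch_getpid(const long *args)
{
	(void)args;
	return 1;	/* pretend the library always runs as pid 1 */
}

static long sketch_add(const long *args)
{
	assert(args != NULL);
	return args[0] + args[1];
}

static const sketch_syscall_fn sketch_table[SKETCH_NR_MAX] = {
	[SKETCH_NR_GETPID] = sketch_getpid,
	[SKETCH_NR_ADD]    = sketch_add,
};

/*
 * Single entry point: the application passes a call number and an
 * argument array, and gets back a result or a negative errno value.
 */
long sketch_syscall(long nr, const long *args)
{
	if (nr < 0 || nr >= SKETCH_NR_MAX || !sketch_table[nr])
		return -ENOSYS;
	return sketch_table[nr](args);
}
```

The point of the shape is that the application links against the
entry point directly, so a "system call" is an ordinary function call.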

LKL was originally implemented as an architecture port in arch/lkl, but this
series of commits tries to integrate it into arch/um as one of the modes
of UML.  This was discussed in response to the RFC email for LKL (*1).

The latest LKL version can be found at https://github.com/lkl/linux

Milestone
=========
These patches are just a first step toward upstreaming the *library mode*
of the Linux kernel; we think several steps are needed to reach our goal,
as described below.

1. Put LKL code under arch/um (arch/um/lkl), and build it in a
separate way from UML.
2. Share common parts of implementation between UML and LKL.
3. Reimplement UML features with LKL API (if we wish)

For step 1, we add LKL as one of the UMMODE values to reduce the
integration effort (make ARCH=um UMMODE=library).  We try to minimize
modifications to the existing UML code.

The RFC patches also include a bit of step 2 as a proof that the code can
be shared.  For this, we took the virtio device code of LKL and used it
from UML by enabling the virtio-mmio driver in the UML code.



Building LKL, the host library and LKL applications
===================================================

% cd tools/lkl
% make

will build LKL as an object file, install it in tools/lkl/lib together
with the header files in tools/lkl/include, and then build the host library,
tests and a few example applications:

* tests/boot - a simple application that uses LKL and exercises the basic
LKL APIs

* tests/net-test - a simple application that uses the network feature of
LKL and exercises the basic network-related APIs

* fs2tar - a tool that converts a filesystem image to a tar archive

* cptofs/cpfromfs - tools that copy files to/from a filesystem image

% make run-tests

should run the above `tests/boot` and `tests/net-test` and report errors if
there are any.

Supported hosts
===============

Currently LKL supports POSIX and Windows userspace applications. New hosts
can be added relatively easily if the host supports gcc and GNU ld. Previous
versions of LKL supported Windows kernel and Haiku kernel hosts, and we
also have WIP patches (not included in this RFC) with a rump-hypercall
interface, used in UEFI, as well as for macOS userspace (part of POSIX?).

There is also a musl-libc port for LKL, which might be of interest to some
folks.


Further reading about LKL
=========================

- Discussion in an LKL GitHub issue
https://github.com/lkl/linux/issues/304

- LKL (an article)
https://www.researchgate.net/profile/Nicolae_Tapus2/publication/224164682_LKL_The_Linux_kernel_library/links/02bfe50fd921ab4f7c000000.pdf

*1 RFC email to LKML (back in 2015)
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1012277.html



Please review the following changes for suitability for inclusion. If you have
any objections or suggestions for improvement, please respond to the patches. If
you agree with the changes, please provide your Acked-by.

The following changes since commit 73625ed66389d4c620520058d828f43a93ab4d0c:

  um: irq: Fix LAST_IRQ usage in init_IRQ() (2019-09-16 08:38:58 +0200)

are available in the Git repository at:

  git://github.com/thehajime/linux 61b15bfb52c7f1f066685c90a1cfe8346b3faec9
  https://github.com/thehajime/linux/tree/upstream-to-uml-5.5-rc1-v2

Andreas Abel (1):
  kallsyms: Add a config option to select section for kallsyms

Hajime Tazaki (6):
  lkl: add system call hijack support
  scripts: revert CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX patches
  lkl: Android ARM (arm/arm64) support
  um lkl: add CI tests
  um: use lkl virtio_net_tap device as UML device
  um: add lkl virtio-blk device

Octavian Purdila (29):
  asm-generic: atomic64: allow using generic atomic64 on 64bit platforms
  arch: add __SYSCALL_DEFINE_ARCH
  lkl: architecture skeleton for Linux kernel library
  lkl: host interface
  lkl: memory handling
  lkl: kernel threads support
  lkl: interrupt support
  lkl: system call interface and application API
  lkl: timers, time and delay support
  lkl: memory mapped I/O support
  lkl: basic kernel console support
  lkl: initialization and cleanup
  lkl: plug in the build system
  lkl tools: skeleton for host side library, tests and tools
  lkl tools: host lib: add utilities functions
  lkl tools: host lib: memory mapped I/O helpers
  lkl tools: host lib: virtio devices
  lkl tools: host lib: virtio block device
  lkl tools: host lib: filesystem helpers
  lkl tools: host lib: posix host operations
  lkl tools: "boot" test
  lkl tools: tool that reads/writes to/from a filesystem image
  lkl tools: tool that converts a filesystem image to tar
  lkl tools: virtio: add network device support
  checkpatch: avoid showing BIT_ULL warnings for tools/ files
  lkl tools: add lklfuse
  lkl: add documentation
  lkl: add support for Windows hosts
  lkl tools: add support for Windows host

Thomas Liebetraut (1):
  tools: Add the lkl host library to the common tools Makefile

 .circleci/config.yml                       | 276 ++++++
 Documentation/virt/uml/lkl.txt             | 453 ++++++++++
 MAINTAINERS                                |   8 +
 Makefile                                   |   4 +-
 arch/Kconfig                               |   6 +
 arch/um/Kconfig                            |  32 +-
 arch/um/Makefile                           | 151 +---
 arch/um/Makefile.um                        | 152 ++++
 arch/um/configs/x86_64_defconfig           |   6 +
 arch/um/include/asm/Kbuild                 |   1 +
 arch/um/include/asm/io.h                   |   4 +
 arch/um/kernel/syscall.c                   |  53 ++
 arch/um/lkl/.gitignore                     |   2 +
 arch/um/lkl/Kconfig                        |  91 ++
 arch/um/lkl/Kconfig.debug                  |   0
 arch/um/lkl/Makefile                       |  69 ++
 arch/um/lkl/auto.conf                      |   1 +
 arch/um/lkl/configs/lkl_defconfig          |  91 ++
 arch/um/lkl/include/asm/Kbuild             |  80 ++
 arch/um/lkl/include/asm/bitsperlong.h      |  11 +
 arch/um/lkl/include/asm/byteorder.h        |   7 +
 arch/um/lkl/include/asm/cpu.h              |  14 +
 arch/um/lkl/include/asm/elf.h              |  15 +
 arch/um/lkl/include/asm/host_ops.h         |  10 +
 arch/um/lkl/include/asm/io.h               | 104 +++
 arch/um/lkl/include/asm/irq.h              |  15 +
 arch/um/lkl/include/asm/mutex.h            |   7 +
 arch/um/lkl/include/asm/page.h             |  14 +
 arch/um/lkl/include/asm/pgtable.h          |  62 ++
 arch/um/lkl/include/asm/processor.h        |  60 ++
 arch/um/lkl/include/asm/ptrace.h           |  25 +
 arch/um/lkl/include/asm/sched.h            |  23 +
 arch/um/lkl/include/asm/setup.h            |   7 +
 arch/um/lkl/include/asm/syscalls.h         |  18 +
 arch/um/lkl/include/asm/syscalls_32.h      |  43 +
 arch/um/lkl/include/asm/thread_info.h      |  70 ++
 arch/um/lkl/include/asm/tlb.h              |  12 +
 arch/um/lkl/include/asm/uaccess.h          |  64 ++
 arch/um/lkl/include/asm/unistd.h           |  29 +
 arch/um/lkl/include/asm/unistd_32.h        |  31 +
 arch/um/lkl/include/asm/vmlinux.lds.h      |  14 +
 arch/um/lkl/include/asm/xor.h              |   9 +
 arch/um/lkl/include/system/stdarg.h        |   2 +
 arch/um/lkl/include/uapi/asm/Kbuild        |   9 +
 arch/um/lkl/include/uapi/asm/bitsperlong.h |  13 +
 arch/um/lkl/include/uapi/asm/byteorder.h   |  11 +
 arch/um/lkl/include/uapi/asm/host_ops.h    | 153 ++++
 arch/um/lkl/include/uapi/asm/irq.h         |  36 +
 arch/um/lkl/include/uapi/asm/sigcontext.h  |  16 +
 arch/um/lkl/include/uapi/asm/siginfo.h     |  11 +
 arch/um/lkl/include/uapi/asm/swab.h        |  11 +
 arch/um/lkl/include/uapi/asm/syscalls.h    | 348 ++++++++
 arch/um/lkl/include/uapi/asm/unistd.h      |  18 +
 arch/um/lkl/kernel/Makefile                |   4 +
 arch/um/lkl/kernel/asm-offsets.c           |   2 +
 arch/um/lkl/kernel/console.c               |  42 +
 arch/um/lkl/kernel/cpu.c                   | 223 +++++
 arch/um/lkl/kernel/irq.c                   | 193 +++++
 arch/um/lkl/kernel/misc.c                  |  60 ++
 arch/um/lkl/kernel/setup.c                 | 193 +++++
 arch/um/lkl/kernel/syscalls.c              | 246 ++++++
 arch/um/lkl/kernel/syscalls_32.c           | 159 ++++
 arch/um/lkl/kernel/threads.c               | 227 +++++
 arch/um/lkl/kernel/time.c                  | 145 ++++
 arch/um/lkl/kernel/vmlinux.lds.S           |  51 ++
 arch/um/lkl/mm/Makefile                    |   1 +
 arch/um/lkl/mm/bootmem.c                   |  66 ++
 arch/um/lkl/scripts/headers_install.py     | 195 +++++
 arch/um/os-Linux/Makefile                  |   5 +
 arch/um/os-Linux/lkl_dev.c                 | 188 +++++
 certs/system_certificates.S                |  16 +-
 include/asm-generic/atomic64.h             |   2 +
 include/asm-generic/export.h               |  34 +-
 include/asm-generic/vmlinux.lds.h          | 279 ++++---
 include/linux/compiler_attributes.h        |   4 +
 include/linux/export.h                     |  23 +-
 include/linux/linkage.h                    |  12 +-
 include/linux/syscalls.h                   |   6 +
 init/Kconfig                               |  12 +
 lib/.gitignore                             |   2 +
 lib/raid6/.gitignore                       |   1 +
 scripts/.gitignore                         |   2 +
 scripts/Makefile.build                     |   9 +-
 scripts/adjust_autoksyms.sh                |   6 +
 scripts/basic/.gitignore                   |   1 +
 scripts/checkpatch.pl                      |  13 +-
 scripts/depmod.sh                          |  25 +-
 scripts/genksyms/genksyms.c                |  11 +-
 scripts/kallsyms.c                         |  54 +-
 scripts/kconfig/.gitignore                 |   1 +
 scripts/link-vmlinux.sh                    |  10 +
 scripts/mod/.gitignore                     |   1 +
 scripts/mod/modpost.c                      |  30 +-
 tools/Makefile                             |  11 +-
 tools/lkl/.gitignore                       |  15 +
 tools/lkl/Build                            |   6 +
 tools/lkl/Makefile                         | 130 +++
 tools/lkl/Makefile.autoconf                | 114 +++
 tools/lkl/Targets                          |  25 +
 tools/lkl/bin/lkl-hijack.sh                |  23 +
 tools/lkl/cptofs.c                         | 635 ++++++++++++++
 tools/lkl/fs2tar.c                         | 410 +++++++++
 tools/lkl/include/.gitignore               |   1 +
 tools/lkl/include/lkl.h                    | 928 +++++++++++++++++++++
 tools/lkl/include/lkl_config.h             |  61 ++
 tools/lkl/include/lkl_host.h               | 160 ++++
 tools/lkl/include/mingw32/sys/socket.h     |   4 +
 tools/lkl/lib/.gitignore                   |   3 +
 tools/lkl/lib/Build                        |  26 +
 tools/lkl/lib/Makefile                     |  33 +
 tools/lkl/lib/config.c                     | 793 ++++++++++++++++++
 tools/lkl/lib/dbg.c                        | 300 +++++++
 tools/lkl/lib/dbg_handler.c                |  44 +
 tools/lkl/lib/endian.h                     |  31 +
 tools/lkl/lib/fs.c                         | 433 ++++++++++
 tools/lkl/lib/hijack/Build                 |   4 +
 tools/lkl/lib/hijack/hijack.c              | 620 ++++++++++++++
 tools/lkl/lib/hijack/init.c                | 252 ++++++
 tools/lkl/lib/hijack/init.h                |   8 +
 tools/lkl/lib/hijack/xlate.c               | 613 ++++++++++++++
 tools/lkl/lib/hijack/xlate.h               |  13 +
 tools/lkl/lib/iomem.c                      |  88 ++
 tools/lkl/lib/iomem.h                      |  15 +
 tools/lkl/lib/jmp_buf.c                    |  14 +
 tools/lkl/lib/jmp_buf.h                    |   8 +
 tools/lkl/lib/net.c                        | 818 ++++++++++++++++++
 tools/lkl/lib/nt-host.c                    | 375 +++++++++
 tools/lkl/lib/posix-host.c                 | 439 ++++++++++
 tools/lkl/lib/utils.c                      | 266 ++++++
 tools/lkl/lib/virtio.c                     | 644 ++++++++++++++
 tools/lkl/lib/virtio.h                     | 115 +++
 tools/lkl/lib/virtio_blk.c                 | 132 +++
 tools/lkl/lib/virtio_net.c                 | 345 ++++++++
 tools/lkl/lib/virtio_net_dpdk.c            | 480 +++++++++++
 tools/lkl/lib/virtio_net_fd.c              | 195 +++++
 tools/lkl/lib/virtio_net_fd.h              |  50 ++
 tools/lkl/lib/virtio_net_macvtap.c         |  32 +
 tools/lkl/lib/virtio_net_pipe.c            |  76 ++
 tools/lkl/lib/virtio_net_raw.c             |  94 +++
 tools/lkl/lib/virtio_net_tap.c             | 111 +++
 tools/lkl/lib/virtio_net_vde.c             | 168 ++++
 tools/lkl/lklfuse.c                        | 658 +++++++++++++++
 tools/lkl/scripts/checkpatch.sh            |  60 ++
 tools/lkl/scripts/lkl-jenkins.sh           |  21 +
 tools/lkl/tests/Build                      |   3 +
 tools/lkl/tests/boot.c                     | 562 +++++++++++++
 tools/lkl/tests/boot.sh                    |   9 +
 tools/lkl/tests/cla.c                      | 159 ++++
 tools/lkl/tests/cla.h                      |  33 +
 tools/lkl/tests/disk.c                     | 189 +++++
 tools/lkl/tests/disk.sh                    |  70 ++
 tools/lkl/tests/hijack-test.sh             | 760 +++++++++++++++++
 tools/lkl/tests/lklfuse.sh                 | 110 +++
 tools/lkl/tests/net-setup.sh               | 134 +++
 tools/lkl/tests/net-test.c                 | 317 +++++++
 tools/lkl/tests/net.sh                     | 186 +++++
 tools/lkl/tests/run.py                     | 182 ++++
 tools/lkl/tests/run_netperf.sh             |  98 +++
 tools/lkl/tests/tap13.py                   | 209 +++++
 tools/lkl/tests/test.c                     | 126 +++
 tools/lkl/tests/test.h                     |  72 ++
 tools/lkl/tests/test.sh                    | 240 ++++++
 tools/lkl/tests/valgrind.supp              |  85 ++
 tools/lkl/tests/valgrind2xunit.py          |  69 ++
 usr/initramfs_data.S                       |   4 +-
 165 files changed, 19489 insertions(+), 354 deletions(-)
 create mode 100644 .circleci/config.yml
 create mode 100644 Documentation/virt/uml/lkl.txt
 create mode 100644 arch/um/Makefile.um
 create mode 100644 arch/um/lkl/.gitignore
 create mode 100644 arch/um/lkl/Kconfig
 create mode 100644 arch/um/lkl/Kconfig.debug
 create mode 100644 arch/um/lkl/Makefile
 create mode 100644 arch/um/lkl/auto.conf
 create mode 100644 arch/um/lkl/configs/lkl_defconfig
 create mode 100644 arch/um/lkl/include/asm/Kbuild
 create mode 100644 arch/um/lkl/include/asm/bitsperlong.h
 create mode 100644 arch/um/lkl/include/asm/byteorder.h
 create mode 100644 arch/um/lkl/include/asm/cpu.h
 create mode 100644 arch/um/lkl/include/asm/elf.h
 create mode 100644 arch/um/lkl/include/asm/host_ops.h
 create mode 100644 arch/um/lkl/include/asm/io.h
 create mode 100644 arch/um/lkl/include/asm/irq.h
 create mode 100644 arch/um/lkl/include/asm/mutex.h
 create mode 100644 arch/um/lkl/include/asm/page.h
 create mode 100644 arch/um/lkl/include/asm/pgtable.h
 create mode 100644 arch/um/lkl/include/asm/processor.h
 create mode 100644 arch/um/lkl/include/asm/ptrace.h
 create mode 100644 arch/um/lkl/include/asm/sched.h
 create mode 100644 arch/um/lkl/include/asm/setup.h
 create mode 100644 arch/um/lkl/include/asm/syscalls.h
 create mode 100644 arch/um/lkl/include/asm/syscalls_32.h
 create mode 100644 arch/um/lkl/include/asm/thread_info.h
 create mode 100644 arch/um/lkl/include/asm/tlb.h
 create mode 100644 arch/um/lkl/include/asm/uaccess.h
 create mode 100644 arch/um/lkl/include/asm/unistd.h
 create mode 100644 arch/um/lkl/include/asm/unistd_32.h
 create mode 100644 arch/um/lkl/include/asm/vmlinux.lds.h
 create mode 100644 arch/um/lkl/include/asm/xor.h
 create mode 100644 arch/um/lkl/include/system/stdarg.h
 create mode 100644 arch/um/lkl/include/uapi/asm/Kbuild
 create mode 100644 arch/um/lkl/include/uapi/asm/bitsperlong.h
 create mode 100644 arch/um/lkl/include/uapi/asm/byteorder.h
 create mode 100644 arch/um/lkl/include/uapi/asm/host_ops.h
 create mode 100644 arch/um/lkl/include/uapi/asm/irq.h
 create mode 100644 arch/um/lkl/include/uapi/asm/sigcontext.h
 create mode 100644 arch/um/lkl/include/uapi/asm/siginfo.h
 create mode 100644 arch/um/lkl/include/uapi/asm/swab.h
 create mode 100644 arch/um/lkl/include/uapi/asm/syscalls.h
 create mode 100644 arch/um/lkl/include/uapi/asm/unistd.h
 create mode 100644 arch/um/lkl/kernel/Makefile
 create mode 100644 arch/um/lkl/kernel/asm-offsets.c
 create mode 100644 arch/um/lkl/kernel/console.c
 create mode 100644 arch/um/lkl/kernel/cpu.c
 create mode 100644 arch/um/lkl/kernel/irq.c
 create mode 100644 arch/um/lkl/kernel/misc.c
 create mode 100644 arch/um/lkl/kernel/setup.c
 create mode 100644 arch/um/lkl/kernel/syscalls.c
 create mode 100644 arch/um/lkl/kernel/syscalls_32.c
 create mode 100644 arch/um/lkl/kernel/threads.c
 create mode 100644 arch/um/lkl/kernel/time.c
 create mode 100644 arch/um/lkl/kernel/vmlinux.lds.S
 create mode 100644 arch/um/lkl/mm/Makefile
 create mode 100644 arch/um/lkl/mm/bootmem.c
 create mode 100755 arch/um/lkl/scripts/headers_install.py
 create mode 100644 arch/um/os-Linux/lkl_dev.c
 create mode 100644 tools/lkl/.gitignore
 create mode 100644 tools/lkl/Build
 create mode 100644 tools/lkl/Makefile
 create mode 100644 tools/lkl/Makefile.autoconf
 create mode 100644 tools/lkl/Targets
 create mode 100755 tools/lkl/bin/lkl-hijack.sh
 create mode 100644 tools/lkl/cptofs.c
 create mode 100644 tools/lkl/fs2tar.c
 create mode 100644 tools/lkl/include/.gitignore
 create mode 100644 tools/lkl/include/lkl.h
 create mode 100644 tools/lkl/include/lkl_config.h
 create mode 100644 tools/lkl/include/lkl_host.h
 create mode 100644 tools/lkl/include/mingw32/sys/socket.h
 create mode 100644 tools/lkl/lib/.gitignore
 create mode 100644 tools/lkl/lib/Build
 create mode 100644 tools/lkl/lib/Makefile
 create mode 100644 tools/lkl/lib/config.c
 create mode 100644 tools/lkl/lib/dbg.c
 create mode 100644 tools/lkl/lib/dbg_handler.c
 create mode 100644 tools/lkl/lib/endian.h
 create mode 100644 tools/lkl/lib/fs.c
 create mode 100644 tools/lkl/lib/hijack/Build
 create mode 100644 tools/lkl/lib/hijack/hijack.c
 create mode 100644 tools/lkl/lib/hijack/init.c
 create mode 100644 tools/lkl/lib/hijack/init.h
 create mode 100644 tools/lkl/lib/hijack/xlate.c
 create mode 100644 tools/lkl/lib/hijack/xlate.h
 create mode 100644 tools/lkl/lib/iomem.c
 create mode 100644 tools/lkl/lib/iomem.h
 create mode 100644 tools/lkl/lib/jmp_buf.c
 create mode 100644 tools/lkl/lib/jmp_buf.h
 create mode 100644 tools/lkl/lib/net.c
 create mode 100644 tools/lkl/lib/nt-host.c
 create mode 100644 tools/lkl/lib/posix-host.c
 create mode 100644 tools/lkl/lib/utils.c
 create mode 100644 tools/lkl/lib/virtio.c
 create mode 100644 tools/lkl/lib/virtio.h
 create mode 100644 tools/lkl/lib/virtio_blk.c
 create mode 100644 tools/lkl/lib/virtio_net.c
 create mode 100644 tools/lkl/lib/virtio_net_dpdk.c
 create mode 100644 tools/lkl/lib/virtio_net_fd.c
 create mode 100644 tools/lkl/lib/virtio_net_fd.h
 create mode 100644 tools/lkl/lib/virtio_net_macvtap.c
 create mode 100644 tools/lkl/lib/virtio_net_pipe.c
 create mode 100644 tools/lkl/lib/virtio_net_raw.c
 create mode 100644 tools/lkl/lib/virtio_net_tap.c
 create mode 100644 tools/lkl/lib/virtio_net_vde.c
 create mode 100644 tools/lkl/lklfuse.c
 create mode 100755 tools/lkl/scripts/checkpatch.sh
 create mode 100755 tools/lkl/scripts/lkl-jenkins.sh
 create mode 100644 tools/lkl/tests/Build
 create mode 100644 tools/lkl/tests/boot.c
 create mode 100755 tools/lkl/tests/boot.sh
 create mode 100644 tools/lkl/tests/cla.c
 create mode 100644 tools/lkl/tests/cla.h
 create mode 100644 tools/lkl/tests/disk.c
 create mode 100755 tools/lkl/tests/disk.sh
 create mode 100755 tools/lkl/tests/hijack-test.sh
 create mode 100755 tools/lkl/tests/lklfuse.sh
 create mode 100644 tools/lkl/tests/net-setup.sh
 create mode 100644 tools/lkl/tests/net-test.c
 create mode 100755 tools/lkl/tests/net.sh
 create mode 100755 tools/lkl/tests/run.py
 create mode 100755 tools/lkl/tests/run_netperf.sh
 create mode 100644 tools/lkl/tests/tap13.py
 create mode 100644 tools/lkl/tests/test.c
 create mode 100644 tools/lkl/tests/test.h
 create mode 100644 tools/lkl/tests/test.sh
 create mode 100644 tools/lkl/tests/valgrind.supp
 create mode 100755 tools/lkl/tests/valgrind2xunit.py

-- 
2.20.1 (Apple Git-117)


* [RFC v2 00/37] Unifying LKL into UML
@ 2019-11-08  5:02   ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, linux-kernel-library, linux-arch,
	Hajime Tazaki, Akira Moroo

This RFC patchset asks for opinions on whether the LKL code is suitable for
integration into the UML code base.  We welcome any kind of feedback from
your reviews.  A number of commits should also be sent to other mailing
lists for review; we will do that later, once they have been discussed on
this mailing list.

# sorry for the long list of patches: we can make it smaller by including
  only the basic set of LKL (e.g., removing foreign OS support, etc.) if
  you wish.


rfc v2:
- use UMMODE instead of SUBARCH to switch between UML and LKL
- the tools/lkl directory is still there. I confirmed that we can move it
  under arch/um (e.g., arch/um/lkl/hosts).  I will move it IF this is
  preferable.
- drop several patches that involve non-UML directories
- drop several patches which are not required
- refine commit logs
- update the documentation




LKL (Linux Kernel Library) is aiming to allow reusing the Linux kernel code
as extensively as possible with minimal effort and reduced maintenance
overhead.

Examples of how LKL can be used are: creating userspace applications
(running on Linux and other operating systems) that can read or write Linux
filesystems or can use the Linux networking stack, creating kernel drivers
for other operating systems that can read Linux filesystems, bootloader
support for reading/writing Linux filesystems, etc.

With LKL, the kernel code is compiled into an object file that can be
directly linked by applications. The API offered by LKL is based on the
Linux system call interface.

LKL was originally implemented as an architecture port in arch/lkl, but this
series of commits tries to integrate it into arch/um as one of the modes of
UML.  This was discussed in the RFC email thread for LKL (*1).

The latest LKL version can be found at https://github.com/lkl/linux

Milestone
=========
This patch series is just a first step toward upstreaming the *library mode*
of the Linux kernel; we think several steps are needed to reach our goal, as
described below.

1. Put LKL code under arch/um (arch/um/lkl), and build it in a
separate way from UML.
2. Share common parts of implementation between UML and LKL.
3. Reimplement UML features with LKL API (if we wish)

For step 1, we add LKL as one of the UMMODE values in order to reduce the
integration effort (make ARCH=um UMMODE=library).  We have tried to keep
modifications to the existing UML code to a minimum.

The RFC patches also include a bit of step 2 as a proof of concept for
sharing code.  For this, we took the virtio device code of LKL and used it
from UML by enabling the virtio-mmio driver in UML.



Building LKL, the host library and LKL applications
===================================================

% cd tools/lkl
% make

will build LKL as an object file, install it in tools/lkl/lib together with
the header files in tools/lkl/include, and then build the host library, the
tests and a few example applications:

* tests/boot - a simple application that uses LKL and exercises the basic
LKL APIs

* tests/net-test - a simple application that uses the network features of
LKL and exercises the basic network-related APIs

* fs2tar - a tool that converts a filesystem image to a tar archive

* cptofs/cpfromfs - a tool that copies files to/from a filesystem image

% make run-tests

should run the above `tests/boot` and `tests/net-test` and report errors if
there are any.

Supported hosts
===============

Currently LKL supports POSIX and Windows userspace applications. New hosts
can be added relatively easily if the host supports gcc and GNU ld. Previous
versions of LKL supported Windows kernel and Haiku kernel hosts, and we
also have WIP patches (not included in this RFC) with a rump-hypercall
interface, used in UEFI, as well as for macOS userspace (part of POSIX?).
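
One way to picture why a new host can be added with little effort is a table
of host operations that the library calls instead of touching the host
directly.  The sketch below is only an illustration under that assumption:
struct demo_host_ops, its three callbacks and demo_library_boot() are
hypothetical names, far smaller than, and not taken from, the host
interface in this series.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, much-reduced table of operations a host must provide. */
struct demo_host_ops {
	void (*print)(const char *str, int len);
	void *(*mem_alloc)(unsigned long size);
	void (*mem_free)(void *ptr);
};

/* A libc-only host implementation, standing in for a POSIX host. */
static unsigned long demo_bytes_printed;

static void demo_posix_print(const char *str, int len)
{
	fwrite(str, 1, (size_t)len, stdout);
	demo_bytes_printed += (unsigned long)len;
}

static void *demo_posix_alloc(unsigned long size)
{
	return malloc(size);
}

static void demo_posix_free(void *ptr)
{
	free(ptr);
}

static const struct demo_host_ops demo_posix_host = {
	.print = demo_posix_print,
	.mem_alloc = demo_posix_alloc,
	.mem_free = demo_posix_free,
};

/*
 * Library side: it reaches the host only through the ops table, so
 * porting to another host means supplying another table.
 */
int demo_library_boot(const struct demo_host_ops *ops, unsigned long mem_size)
{
	static const char banner[] = "demo library booted\n";
	void *mem;

	assert(ops && ops->print && ops->mem_alloc && ops->mem_free);

	mem = ops->mem_alloc(mem_size);
	if (!mem)
		return -1;
	memset(mem, 0, mem_size);	/* stand-in for real initialization */

	ops->print(banner, (int)(sizeof(banner) - 1));
	ops->mem_free(mem);
	return 0;
}
```

A caller would run demo_library_boot(&demo_posix_host, 4096); a different
host would pass its own table with the same three callbacks and leave the
library side untouched.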

There is also a musl-libc port for LKL, which might be of interest to some
folks.


Further reading about LKL
=========================

- Discussion in an LKL GitHub issue
https://github.com/lkl/linux/issues/304

- LKL (an article)
https://www.researchgate.net/profile/Nicolae_Tapus2/publication/224164682_LKL_The_Linux_kernel_library/links/02bfe50fd921ab4f7c000000.pdf

*1 RFC email to LKML (back in 2015)
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1012277.html



Please review the following changes for suitability for inclusion. If you have
any objections or suggestions for improvement, please respond to the patches. If
you agree with the changes, please provide your Acked-by.

The following changes since commit 73625ed66389d4c620520058d828f43a93ab4d0c:

  um: irq: Fix LAST_IRQ usage in init_IRQ() (2019-09-16 08:38:58 +0200)

are available in the Git repository at:

  git://github.com/thehajime/linux 61b15bfb52c7f1f066685c90a1cfe8346b3faec9
  https://github.com/thehajime/linux/tree/upstream-to-uml-5.5-rc1-v2

Andreas Abel (1):
  kallsyms: Add a config option to select section for kallsyms

Hajime Tazaki (6):
  lkl: add system call hijack support
  scripts: revert CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX patches
  lkl: Android ARM (arm/arm64) support
  um lkl: add CI tests
  um: use lkl virtio_net_tap device as UML device
  um: add lkl virtio-blk device

Octavian Purdila (29):
  asm-generic: atomic64: allow using generic atomic64 on 64bit platforms
  arch: add __SYSCALL_DEFINE_ARCH
  lkl: architecture skeleton for Linux kernel library
  lkl: host interface
  lkl: memory handling
  lkl: kernel threads support
  lkl: interrupt support
  lkl: system call interface and application API
  lkl: timers, time and delay support
  lkl: memory mapped I/O support
  lkl: basic kernel console support
  lkl: initialization and cleanup
  lkl: plug in the build system
  lkl tools: skeleton for host side library, tests and tools
  lkl tools: host lib: add utilities functions
  lkl tools: host lib: memory mapped I/O helpers
  lkl tools: host lib: virtio devices
  lkl tools: host lib: virtio block device
  lkl tools: host lib: filesystem helpers
  lkl tools: host lib: posix host operations
  lkl tools: "boot" test
  lkl tools: tool that reads/writes to/from a filesystem image
  lkl tools: tool that converts a filesystem image to tar
  lkl tools: virtio: add network device support
  checkpatch: avoid showing BIT_ULL warnings for tools/ files
  lkl tools: add lklfuse
  lkl: add documentation
  lkl: add support for Windows hosts
  lkl tools: add support for Windows host

Thomas Liebetraut (1):
  tools: Add the lkl host library to the common tools Makefile

 .circleci/config.yml                       | 276 ++++++
 Documentation/virt/uml/lkl.txt             | 453 ++++++++++
 MAINTAINERS                                |   8 +
 Makefile                                   |   4 +-
 arch/Kconfig                               |   6 +
 arch/um/Kconfig                            |  32 +-
 arch/um/Makefile                           | 151 +---
 arch/um/Makefile.um                        | 152 ++++
 arch/um/configs/x86_64_defconfig           |   6 +
 arch/um/include/asm/Kbuild                 |   1 +
 arch/um/include/asm/io.h                   |   4 +
 arch/um/kernel/syscall.c                   |  53 ++
 arch/um/lkl/.gitignore                     |   2 +
 arch/um/lkl/Kconfig                        |  91 ++
 arch/um/lkl/Kconfig.debug                  |   0
 arch/um/lkl/Makefile                       |  69 ++
 arch/um/lkl/auto.conf                      |   1 +
 arch/um/lkl/configs/lkl_defconfig          |  91 ++
 arch/um/lkl/include/asm/Kbuild             |  80 ++
 arch/um/lkl/include/asm/bitsperlong.h      |  11 +
 arch/um/lkl/include/asm/byteorder.h        |   7 +
 arch/um/lkl/include/asm/cpu.h              |  14 +
 arch/um/lkl/include/asm/elf.h              |  15 +
 arch/um/lkl/include/asm/host_ops.h         |  10 +
 arch/um/lkl/include/asm/io.h               | 104 +++
 arch/um/lkl/include/asm/irq.h              |  15 +
 arch/um/lkl/include/asm/mutex.h            |   7 +
 arch/um/lkl/include/asm/page.h             |  14 +
 arch/um/lkl/include/asm/pgtable.h          |  62 ++
 arch/um/lkl/include/asm/processor.h        |  60 ++
 arch/um/lkl/include/asm/ptrace.h           |  25 +
 arch/um/lkl/include/asm/sched.h            |  23 +
 arch/um/lkl/include/asm/setup.h            |   7 +
 arch/um/lkl/include/asm/syscalls.h         |  18 +
 arch/um/lkl/include/asm/syscalls_32.h      |  43 +
 arch/um/lkl/include/asm/thread_info.h      |  70 ++
 arch/um/lkl/include/asm/tlb.h              |  12 +
 arch/um/lkl/include/asm/uaccess.h          |  64 ++
 arch/um/lkl/include/asm/unistd.h           |  29 +
 arch/um/lkl/include/asm/unistd_32.h        |  31 +
 arch/um/lkl/include/asm/vmlinux.lds.h      |  14 +
 arch/um/lkl/include/asm/xor.h              |   9 +
 arch/um/lkl/include/system/stdarg.h        |   2 +
 arch/um/lkl/include/uapi/asm/Kbuild        |   9 +
 arch/um/lkl/include/uapi/asm/bitsperlong.h |  13 +
 arch/um/lkl/include/uapi/asm/byteorder.h   |  11 +
 arch/um/lkl/include/uapi/asm/host_ops.h    | 153 ++++
 arch/um/lkl/include/uapi/asm/irq.h         |  36 +
 arch/um/lkl/include/uapi/asm/sigcontext.h  |  16 +
 arch/um/lkl/include/uapi/asm/siginfo.h     |  11 +
 arch/um/lkl/include/uapi/asm/swab.h        |  11 +
 arch/um/lkl/include/uapi/asm/syscalls.h    | 348 ++++++++
 arch/um/lkl/include/uapi/asm/unistd.h      |  18 +
 arch/um/lkl/kernel/Makefile                |   4 +
 arch/um/lkl/kernel/asm-offsets.c           |   2 +
 arch/um/lkl/kernel/console.c               |  42 +
 arch/um/lkl/kernel/cpu.c                   | 223 +++++
 arch/um/lkl/kernel/irq.c                   | 193 +++++
 arch/um/lkl/kernel/misc.c                  |  60 ++
 arch/um/lkl/kernel/setup.c                 | 193 +++++
 arch/um/lkl/kernel/syscalls.c              | 246 ++++++
 arch/um/lkl/kernel/syscalls_32.c           | 159 ++++
 arch/um/lkl/kernel/threads.c               | 227 +++++
 arch/um/lkl/kernel/time.c                  | 145 ++++
 arch/um/lkl/kernel/vmlinux.lds.S           |  51 ++
 arch/um/lkl/mm/Makefile                    |   1 +
 arch/um/lkl/mm/bootmem.c                   |  66 ++
 arch/um/lkl/scripts/headers_install.py     | 195 +++++
 arch/um/os-Linux/Makefile                  |   5 +
 arch/um/os-Linux/lkl_dev.c                 | 188 +++++
 certs/system_certificates.S                |  16 +-
 include/asm-generic/atomic64.h             |   2 +
 include/asm-generic/export.h               |  34 +-
 include/asm-generic/vmlinux.lds.h          | 279 ++++---
 include/linux/compiler_attributes.h        |   4 +
 include/linux/export.h                     |  23 +-
 include/linux/linkage.h                    |  12 +-
 include/linux/syscalls.h                   |   6 +
 init/Kconfig                               |  12 +
 lib/.gitignore                             |   2 +
 lib/raid6/.gitignore                       |   1 +
 scripts/.gitignore                         |   2 +
 scripts/Makefile.build                     |   9 +-
 scripts/adjust_autoksyms.sh                |   6 +
 scripts/basic/.gitignore                   |   1 +
 scripts/checkpatch.pl                      |  13 +-
 scripts/depmod.sh                          |  25 +-
 scripts/genksyms/genksyms.c                |  11 +-
 scripts/kallsyms.c                         |  54 +-
 scripts/kconfig/.gitignore                 |   1 +
 scripts/link-vmlinux.sh                    |  10 +
 scripts/mod/.gitignore                     |   1 +
 scripts/mod/modpost.c                      |  30 +-
 tools/Makefile                             |  11 +-
 tools/lkl/.gitignore                       |  15 +
 tools/lkl/Build                            |   6 +
 tools/lkl/Makefile                         | 130 +++
 tools/lkl/Makefile.autoconf                | 114 +++
 tools/lkl/Targets                          |  25 +
 tools/lkl/bin/lkl-hijack.sh                |  23 +
 tools/lkl/cptofs.c                         | 635 ++++++++++++++
 tools/lkl/fs2tar.c                         | 410 +++++++++
 tools/lkl/include/.gitignore               |   1 +
 tools/lkl/include/lkl.h                    | 928 +++++++++++++++++++++
 tools/lkl/include/lkl_config.h             |  61 ++
 tools/lkl/include/lkl_host.h               | 160 ++++
 tools/lkl/include/mingw32/sys/socket.h     |   4 +
 tools/lkl/lib/.gitignore                   |   3 +
 tools/lkl/lib/Build                        |  26 +
 tools/lkl/lib/Makefile                     |  33 +
 tools/lkl/lib/config.c                     | 793 ++++++++++++++++++
 tools/lkl/lib/dbg.c                        | 300 +++++++
 tools/lkl/lib/dbg_handler.c                |  44 +
 tools/lkl/lib/endian.h                     |  31 +
 tools/lkl/lib/fs.c                         | 433 ++++++++++
 tools/lkl/lib/hijack/Build                 |   4 +
 tools/lkl/lib/hijack/hijack.c              | 620 ++++++++++++++
 tools/lkl/lib/hijack/init.c                | 252 ++++++
 tools/lkl/lib/hijack/init.h                |   8 +
 tools/lkl/lib/hijack/xlate.c               | 613 ++++++++++++++
 tools/lkl/lib/hijack/xlate.h               |  13 +
 tools/lkl/lib/iomem.c                      |  88 ++
 tools/lkl/lib/iomem.h                      |  15 +
 tools/lkl/lib/jmp_buf.c                    |  14 +
 tools/lkl/lib/jmp_buf.h                    |   8 +
 tools/lkl/lib/net.c                        | 818 ++++++++++++++++++
 tools/lkl/lib/nt-host.c                    | 375 +++++++++
 tools/lkl/lib/posix-host.c                 | 439 ++++++++++
 tools/lkl/lib/utils.c                      | 266 ++++++
 tools/lkl/lib/virtio.c                     | 644 ++++++++++++++
 tools/lkl/lib/virtio.h                     | 115 +++
 tools/lkl/lib/virtio_blk.c                 | 132 +++
 tools/lkl/lib/virtio_net.c                 | 345 ++++++++
 tools/lkl/lib/virtio_net_dpdk.c            | 480 +++++++++++
 tools/lkl/lib/virtio_net_fd.c              | 195 +++++
 tools/lkl/lib/virtio_net_fd.h              |  50 ++
 tools/lkl/lib/virtio_net_macvtap.c         |  32 +
 tools/lkl/lib/virtio_net_pipe.c            |  76 ++
 tools/lkl/lib/virtio_net_raw.c             |  94 +++
 tools/lkl/lib/virtio_net_tap.c             | 111 +++
 tools/lkl/lib/virtio_net_vde.c             | 168 ++++
 tools/lkl/lklfuse.c                        | 658 +++++++++++++++
 tools/lkl/scripts/checkpatch.sh            |  60 ++
 tools/lkl/scripts/lkl-jenkins.sh           |  21 +
 tools/lkl/tests/Build                      |   3 +
 tools/lkl/tests/boot.c                     | 562 +++++++++++++
 tools/lkl/tests/boot.sh                    |   9 +
 tools/lkl/tests/cla.c                      | 159 ++++
 tools/lkl/tests/cla.h                      |  33 +
 tools/lkl/tests/disk.c                     | 189 +++++
 tools/lkl/tests/disk.sh                    |  70 ++
 tools/lkl/tests/hijack-test.sh             | 760 +++++++++++++++++
 tools/lkl/tests/lklfuse.sh                 | 110 +++
 tools/lkl/tests/net-setup.sh               | 134 +++
 tools/lkl/tests/net-test.c                 | 317 +++++++
 tools/lkl/tests/net.sh                     | 186 +++++
 tools/lkl/tests/run.py                     | 182 ++++
 tools/lkl/tests/run_netperf.sh             |  98 +++
 tools/lkl/tests/tap13.py                   | 209 +++++
 tools/lkl/tests/test.c                     | 126 +++
 tools/lkl/tests/test.h                     |  72 ++
 tools/lkl/tests/test.sh                    | 240 ++++++
 tools/lkl/tests/valgrind.supp              |  85 ++
 tools/lkl/tests/valgrind2xunit.py          |  69 ++
 usr/initramfs_data.S                       |   4 +-
 165 files changed, 19489 insertions(+), 354 deletions(-)
 create mode 100644 .circleci/config.yml
 create mode 100644 Documentation/virt/uml/lkl.txt
 create mode 100644 arch/um/Makefile.um
 create mode 100644 arch/um/lkl/.gitignore
 create mode 100644 arch/um/lkl/Kconfig
 create mode 100644 arch/um/lkl/Kconfig.debug
 create mode 100644 arch/um/lkl/Makefile
 create mode 100644 arch/um/lkl/auto.conf
 create mode 100644 arch/um/lkl/configs/lkl_defconfig
 create mode 100644 arch/um/lkl/include/asm/Kbuild
 create mode 100644 arch/um/lkl/include/asm/bitsperlong.h
 create mode 100644 arch/um/lkl/include/asm/byteorder.h
 create mode 100644 arch/um/lkl/include/asm/cpu.h
 create mode 100644 arch/um/lkl/include/asm/elf.h
 create mode 100644 arch/um/lkl/include/asm/host_ops.h
 create mode 100644 arch/um/lkl/include/asm/io.h
 create mode 100644 arch/um/lkl/include/asm/irq.h
 create mode 100644 arch/um/lkl/include/asm/mutex.h
 create mode 100644 arch/um/lkl/include/asm/page.h
 create mode 100644 arch/um/lkl/include/asm/pgtable.h
 create mode 100644 arch/um/lkl/include/asm/processor.h
 create mode 100644 arch/um/lkl/include/asm/ptrace.h
 create mode 100644 arch/um/lkl/include/asm/sched.h
 create mode 100644 arch/um/lkl/include/asm/setup.h
 create mode 100644 arch/um/lkl/include/asm/syscalls.h
 create mode 100644 arch/um/lkl/include/asm/syscalls_32.h
 create mode 100644 arch/um/lkl/include/asm/thread_info.h
 create mode 100644 arch/um/lkl/include/asm/tlb.h
 create mode 100644 arch/um/lkl/include/asm/uaccess.h
 create mode 100644 arch/um/lkl/include/asm/unistd.h
 create mode 100644 arch/um/lkl/include/asm/unistd_32.h
 create mode 100644 arch/um/lkl/include/asm/vmlinux.lds.h
 create mode 100644 arch/um/lkl/include/asm/xor.h
 create mode 100644 arch/um/lkl/include/system/stdarg.h
 create mode 100644 arch/um/lkl/include/uapi/asm/Kbuild
 create mode 100644 arch/um/lkl/include/uapi/asm/bitsperlong.h
 create mode 100644 arch/um/lkl/include/uapi/asm/byteorder.h
 create mode 100644 arch/um/lkl/include/uapi/asm/host_ops.h
 create mode 100644 arch/um/lkl/include/uapi/asm/irq.h
 create mode 100644 arch/um/lkl/include/uapi/asm/sigcontext.h
 create mode 100644 arch/um/lkl/include/uapi/asm/siginfo.h
 create mode 100644 arch/um/lkl/include/uapi/asm/swab.h
 create mode 100644 arch/um/lkl/include/uapi/asm/syscalls.h
 create mode 100644 arch/um/lkl/include/uapi/asm/unistd.h
 create mode 100644 arch/um/lkl/kernel/Makefile
 create mode 100644 arch/um/lkl/kernel/asm-offsets.c
 create mode 100644 arch/um/lkl/kernel/console.c
 create mode 100644 arch/um/lkl/kernel/cpu.c
 create mode 100644 arch/um/lkl/kernel/irq.c
 create mode 100644 arch/um/lkl/kernel/misc.c
 create mode 100644 arch/um/lkl/kernel/setup.c
 create mode 100644 arch/um/lkl/kernel/syscalls.c
 create mode 100644 arch/um/lkl/kernel/syscalls_32.c
 create mode 100644 arch/um/lkl/kernel/threads.c
 create mode 100644 arch/um/lkl/kernel/time.c
 create mode 100644 arch/um/lkl/kernel/vmlinux.lds.S
 create mode 100644 arch/um/lkl/mm/Makefile
 create mode 100644 arch/um/lkl/mm/bootmem.c
 create mode 100755 arch/um/lkl/scripts/headers_install.py
 create mode 100644 arch/um/os-Linux/lkl_dev.c
 create mode 100644 tools/lkl/.gitignore
 create mode 100644 tools/lkl/Build
 create mode 100644 tools/lkl/Makefile
 create mode 100644 tools/lkl/Makefile.autoconf
 create mode 100644 tools/lkl/Targets
 create mode 100755 tools/lkl/bin/lkl-hijack.sh
 create mode 100644 tools/lkl/cptofs.c
 create mode 100644 tools/lkl/fs2tar.c
 create mode 100644 tools/lkl/include/.gitignore
 create mode 100644 tools/lkl/include/lkl.h
 create mode 100644 tools/lkl/include/lkl_config.h
 create mode 100644 tools/lkl/include/lkl_host.h
 create mode 100644 tools/lkl/include/mingw32/sys/socket.h
 create mode 100644 tools/lkl/lib/.gitignore
 create mode 100644 tools/lkl/lib/Build
 create mode 100644 tools/lkl/lib/Makefile
 create mode 100644 tools/lkl/lib/config.c
 create mode 100644 tools/lkl/lib/dbg.c
 create mode 100644 tools/lkl/lib/dbg_handler.c
 create mode 100644 tools/lkl/lib/endian.h
 create mode 100644 tools/lkl/lib/fs.c
 create mode 100644 tools/lkl/lib/hijack/Build
 create mode 100644 tools/lkl/lib/hijack/hijack.c
 create mode 100644 tools/lkl/lib/hijack/init.c
 create mode 100644 tools/lkl/lib/hijack/init.h
 create mode 100644 tools/lkl/lib/hijack/xlate.c
 create mode 100644 tools/lkl/lib/hijack/xlate.h
 create mode 100644 tools/lkl/lib/iomem.c
 create mode 100644 tools/lkl/lib/iomem.h
 create mode 100644 tools/lkl/lib/jmp_buf.c
 create mode 100644 tools/lkl/lib/jmp_buf.h
 create mode 100644 tools/lkl/lib/net.c
 create mode 100644 tools/lkl/lib/nt-host.c
 create mode 100644 tools/lkl/lib/posix-host.c
 create mode 100644 tools/lkl/lib/utils.c
 create mode 100644 tools/lkl/lib/virtio.c
 create mode 100644 tools/lkl/lib/virtio.h
 create mode 100644 tools/lkl/lib/virtio_blk.c
 create mode 100644 tools/lkl/lib/virtio_net.c
 create mode 100644 tools/lkl/lib/virtio_net_dpdk.c
 create mode 100644 tools/lkl/lib/virtio_net_fd.c
 create mode 100644 tools/lkl/lib/virtio_net_fd.h
 create mode 100644 tools/lkl/lib/virtio_net_macvtap.c
 create mode 100644 tools/lkl/lib/virtio_net_pipe.c
 create mode 100644 tools/lkl/lib/virtio_net_raw.c
 create mode 100644 tools/lkl/lib/virtio_net_tap.c
 create mode 100644 tools/lkl/lib/virtio_net_vde.c
 create mode 100644 tools/lkl/lklfuse.c
 create mode 100755 tools/lkl/scripts/checkpatch.sh
 create mode 100755 tools/lkl/scripts/lkl-jenkins.sh
 create mode 100644 tools/lkl/tests/Build
 create mode 100644 tools/lkl/tests/boot.c
 create mode 100755 tools/lkl/tests/boot.sh
 create mode 100644 tools/lkl/tests/cla.c
 create mode 100644 tools/lkl/tests/cla.h
 create mode 100644 tools/lkl/tests/disk.c
 create mode 100755 tools/lkl/tests/disk.sh
 create mode 100755 tools/lkl/tests/hijack-test.sh
 create mode 100755 tools/lkl/tests/lklfuse.sh
 create mode 100644 tools/lkl/tests/net-setup.sh
 create mode 100644 tools/lkl/tests/net-test.c
 create mode 100755 tools/lkl/tests/net.sh
 create mode 100755 tools/lkl/tests/run.py
 create mode 100755 tools/lkl/tests/run_netperf.sh
 create mode 100644 tools/lkl/tests/tap13.py
 create mode 100644 tools/lkl/tests/test.c
 create mode 100644 tools/lkl/tests/test.h
 create mode 100644 tools/lkl/tests/test.sh
 create mode 100644 tools/lkl/tests/valgrind.supp
 create mode 100755 tools/lkl/tests/valgrind2xunit.py

-- 
2.20.1 (Apple Git-117)


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um


^ permalink raw reply	[flat|nested] 206+ messages in thread

* [RFC v2 01/37] asm-generic: atomic64: allow using generic atomic64 on 64bit platforms
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch

From: Octavian Purdila <tavi.purdila@gmail.com>

With CONFIG_64BIT enabled, the generic atomic64 implementation selected
by CONFIG_GENERIC_ATOMIC64 does not compile because its atomic64_t
definition conflicts with the one in linux/types.h.

This commit fixes the issue and allows using the generic atomic64 ops.
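
As a rough illustration of the conflict and of the guard, here is a
minimal standalone sketch. It assumes that linux/types.h already defines
atomic64_t when CONFIG_64BIT is set; every DEMO_/demo_ name is a
hypothetical stand-in and not a kernel symbol.

```c
#include <stdint.h>

/* Hypothetical stand-in for CONFIG_64BIT; not the real Kconfig symbol. */
#define DEMO_CONFIG_64BIT 1

/* Stand-in for linux/types.h: 64-bit configs already provide the type. */
#ifdef DEMO_CONFIG_64BIT
typedef struct {
	int64_t counter;
} demo_atomic64_t;
#endif

/* Stand-in for asm-generic/atomic64.h: define the type only if missing. */
#ifndef DEMO_CONFIG_64BIT
typedef struct {
	int64_t counter;
} demo_atomic64_t;
#endif

#define DEMO_ATOMIC64_INIT(i)	{ (i) }

/* Plain (non-atomic) read, only to show the single definition is usable. */
static int64_t demo_atomic64_read(const demo_atomic64_t *v)
{
	return v->counter;
}
```

Without the second guard, the two anonymous struct typedefs would be
distinct types with the same name, and the translation unit would fail
to compile.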

Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 include/asm-generic/atomic64.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/asm-generic/atomic64.h b/include/asm-generic/atomic64.h
index 370f01d4450f..9b15847baae5 100644
--- a/include/asm-generic/atomic64.h
+++ b/include/asm-generic/atomic64.h
@@ -9,9 +9,11 @@
 #define _ASM_GENERIC_ATOMIC64_H
 #include <linux/types.h>
 
+#ifndef CONFIG_64BIT
 typedef struct {
 	s64 counter;
 } atomic64_t;
+#endif
 
 #define ATOMIC64_INIT(i)	{ (i) }
 
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 02/37] arch: add __SYSCALL_DEFINE_ARCH
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch

From: Octavian Purdila <tavi.purdila@gmail.com>

This allows the architecture code to process the system call
definitions. It is used by LKL to create strongly typed function
definitions for system calls.
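
The hook pattern can be sketched outside the kernel as follows. This is
only an illustration under assumptions: the DEMO_/demo_ macros and
functions are hypothetical, take a single argument, and are not the
macros LKL actually defines.

```c
/*
 * Hypothetical architecture hook (all DEMO_/demo_ names are illustrative):
 * for each definition, declare the implementation and emit a typed wrapper.
 */
#define DEMO_DEFINE_ARCH(name, type, arg)			\
	static long demo_sys_##name(type arg);			\
	static inline long demo_lkl_##name(type arg)		\
	{							\
		return demo_sys_##name(arg);			\
	}

/* Generic code: fall back to an empty hook when the arch defines none. */
#ifndef DEMO_DEFINE_ARCH
#define DEMO_DEFINE_ARCH(name, type, arg)
#endif

/* Generic definition macro: run the hook, then open the function body. */
#define DEMO_DEFINE1(name, type, arg)				\
	DEMO_DEFINE_ARCH(name, type, arg)			\
	static long demo_sys_##name(type arg)

DEMO_DEFINE1(double_it, long, value)
{
	return value * 2;
}
```

Architectures that do not define the hook get the empty fallback, so
their system call definitions expand exactly as before.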

Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 include/linux/syscalls.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 88145da7d140..77e52fe19923 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -203,9 +203,14 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
 }
 #endif
 
+#ifndef __SYSCALL_DEFINE_ARCH
+#define __SYSCALL_DEFINE_ARCH(x, sname, ...)
+#endif
+
 #ifndef SYSCALL_DEFINE0
 #define SYSCALL_DEFINE0(sname)					\
 	SYSCALL_METADATA(_##sname, 0);				\
+	__SYSCALL_DEFINE_ARCH(0, _##sname);			\
 	asmlinkage long sys_##sname(void);			\
 	ALLOW_ERROR_INJECTION(sys_##sname, ERRNO);		\
 	asmlinkage long sys_##sname(void)
@@ -222,6 +227,7 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
 
 #define SYSCALL_DEFINEx(x, sname, ...)				\
 	SYSCALL_METADATA(sname, x, __VA_ARGS__)			\
+	__SYSCALL_DEFINE_ARCH(x, sname, __VA_ARGS__)		\
 	__SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
 
 #define __PROTECT(...) asmlinkage_protect(__VA_ARGS__)
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 03/37] lkl: architecture skeleton for Linux kernel library
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Conrad Meyer, Edison M . Castro, Hajime Tazaki, Jens Staal,
	Lai Jiangshan, Levente Kurusa, Luca Dariz, Mark Stillwell,
	Matthieu Coudron, Michael Zimmermann, Motomu Utsumi,
	Patrick Collins, Petros Angelatos, Pierre-Hugues Husson,
	Xiao Jia, Yuan Liu

From: Octavian Purdila <tavi.purdila@gmail.com>

Adds the LKL Kconfig, vmlinux linker script, basic architecture
headers and miscellaneous basic functions or stubs such as
dump_stack(), show_regs() and cpuinfo proc ops.

The headers we introduce in this patch are simple wrappers to the
asm-generic headers or stubs for things we don't support, such as
ptrace, DMA, signals, ELF handling and low level processor operations.

The kernel configuration is automatically updated to reflect the
host's endianness, 64-bit support and the output format for vmlinux's
linker script. We do this by looking at ld's default output format.
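
For illustration, the 64-bit decision made in the Kconfig of this patch
can be expressed as a small C helper. This is a sketch only:
demo_format_is_64bit() is a hypothetical name, and it covers just the
two output formats that the Kconfig lists.

```c
#include <string.h>

/*
 * Hypothetical helper: returns 1 when the linker's default output format
 * is one of the two names for which the Kconfig selects 64BIT, else 0.
 */
static int demo_format_is_64bit(const char *output_format)
{
	return strcmp(output_format, "elf64-x86-64") == 0 ||
	       strcmp(output_format, "elf64-x86-64-freebsd") == 0;
}
```

Any other format string falls through to the default of 64BIT=n in this
sketch.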

Signed-off-by: Andreas Abel <aabel@google.com>
Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Edison M. Castro <edisonmcastro@hotmail.com>
Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Jens Staal <staal1978@gmail.com>
Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Levente Kurusa <levex@linux.com>
Signed-off-by: Luca Dariz <luca.dariz@gmail.com>
Signed-off-by: Mark Stillwell <mark@stillwell.me>
Signed-off-by: Matthieu Coudron <mattator@gmail.com>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Pierre-Hugues Husson <phh@phh.me>
Signed-off-by: Xiao Jia <xiaoj@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 MAINTAINERS                                |   8 +
 arch/um/lkl/.gitignore                     |   2 +
 arch/um/lkl/Kconfig                        |  95 ++++++
 arch/um/lkl/Kconfig.debug                  |   0
 arch/um/lkl/configs/lkl_defconfig          |  91 ++++++
 arch/um/lkl/include/asm/Kbuild             |  80 +++++
 arch/um/lkl/include/asm/bitsperlong.h      |  11 +
 arch/um/lkl/include/asm/byteorder.h        |   7 +
 arch/um/lkl/include/asm/cpu.h              |  14 +
 arch/um/lkl/include/asm/elf.h              |  15 +
 arch/um/lkl/include/asm/mutex.h            |   7 +
 arch/um/lkl/include/asm/processor.h        |  60 ++++
 arch/um/lkl/include/asm/ptrace.h           |  25 ++
 arch/um/lkl/include/asm/sched.h            |  23 ++
 arch/um/lkl/include/asm/syscalls.h         |  18 ++
 arch/um/lkl/include/asm/syscalls_32.h      |  43 +++
 arch/um/lkl/include/asm/tlb.h              |  12 +
 arch/um/lkl/include/asm/uaccess.h          |  64 ++++
 arch/um/lkl/include/asm/unistd_32.h        |  31 ++
 arch/um/lkl/include/asm/vmlinux.lds.h      |  14 +
 arch/um/lkl/include/asm/xor.h              |   9 +
 arch/um/lkl/include/uapi/asm/Kbuild        |   9 +
 arch/um/lkl/include/uapi/asm/bitsperlong.h |  13 +
 arch/um/lkl/include/uapi/asm/byteorder.h   |  11 +
 arch/um/lkl/include/uapi/asm/siginfo.h     |  11 +
 arch/um/lkl/include/uapi/asm/swab.h        |  11 +
 arch/um/lkl/include/uapi/asm/syscalls.h    | 348 +++++++++++++++++++++
 arch/um/lkl/kernel/asm-offsets.c           |   2 +
 arch/um/lkl/kernel/misc.c                  |  60 ++++
 arch/um/lkl/kernel/vmlinux.lds.S           |  51 +++
 30 files changed, 1145 insertions(+)
 create mode 100644 arch/um/lkl/.gitignore
 create mode 100644 arch/um/lkl/Kconfig
 create mode 100644 arch/um/lkl/Kconfig.debug
 create mode 100644 arch/um/lkl/configs/lkl_defconfig
 create mode 100644 arch/um/lkl/include/asm/Kbuild
 create mode 100644 arch/um/lkl/include/asm/bitsperlong.h
 create mode 100644 arch/um/lkl/include/asm/byteorder.h
 create mode 100644 arch/um/lkl/include/asm/cpu.h
 create mode 100644 arch/um/lkl/include/asm/elf.h
 create mode 100644 arch/um/lkl/include/asm/mutex.h
 create mode 100644 arch/um/lkl/include/asm/processor.h
 create mode 100644 arch/um/lkl/include/asm/ptrace.h
 create mode 100644 arch/um/lkl/include/asm/sched.h
 create mode 100644 arch/um/lkl/include/asm/syscalls.h
 create mode 100644 arch/um/lkl/include/asm/syscalls_32.h
 create mode 100644 arch/um/lkl/include/asm/tlb.h
 create mode 100644 arch/um/lkl/include/asm/uaccess.h
 create mode 100644 arch/um/lkl/include/asm/unistd_32.h
 create mode 100644 arch/um/lkl/include/asm/vmlinux.lds.h
 create mode 100644 arch/um/lkl/include/asm/xor.h
 create mode 100644 arch/um/lkl/include/uapi/asm/Kbuild
 create mode 100644 arch/um/lkl/include/uapi/asm/bitsperlong.h
 create mode 100644 arch/um/lkl/include/uapi/asm/byteorder.h
 create mode 100644 arch/um/lkl/include/uapi/asm/siginfo.h
 create mode 100644 arch/um/lkl/include/uapi/asm/swab.h
 create mode 100644 arch/um/lkl/include/uapi/asm/syscalls.h
 create mode 100644 arch/um/lkl/kernel/asm-offsets.c
 create mode 100644 arch/um/lkl/kernel/misc.c
 create mode 100644 arch/um/lkl/kernel/vmlinux.lds.S

diff --git a/MAINTAINERS b/MAINTAINERS
index e7a47b5210fd..df8151c6fd9e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9369,6 +9369,14 @@ F:	Documentation/core-api/atomic_ops.rst
 F:	Documentation/core-api/refcount-vs-atomic.rst
 F:	Documentation/memory-barriers.txt
 
+LINUX KERNEL LIBRARY
+M:	Octavian Purdila <tavi.purdila@gmail.com>
+M:	Hajime Tazaki <thehajime@gmail.com>
+L:	linux-kernel-library@freelists.org
+S:	Maintained
+F:	arch/um/lkl/
+F:	tools/lkl/
+
 LIS3LV02D ACCELEROMETER DRIVER
 M:	Eric Piel <eric.piel@tremplin-utc.net>
 S:	Maintained
diff --git a/arch/um/lkl/.gitignore b/arch/um/lkl/.gitignore
new file mode 100644
index 000000000000..ced1c60d8235
--- /dev/null
+++ b/arch/um/lkl/.gitignore
@@ -0,0 +1,2 @@
+kernel/vmlinux.lds
+include/generated
diff --git a/arch/um/lkl/Kconfig b/arch/um/lkl/Kconfig
new file mode 100644
index 000000000000..1dae70f16c43
--- /dev/null
+++ b/arch/um/lkl/Kconfig
@@ -0,0 +1,95 @@
+# SPDX-License-Identifier: GPL-2.0
+
+config UML_LKL
+       def_bool y
+       depends on !SMP && !MMU && !COREDUMP && !SECCOMP && !UPROBES && !COMPAT && !USER_RETURN_NOTIFIER
+       select ARCH_THREAD_STACK_ALLOCATOR
+       select RWSEM_GENERIC_SPINLOCK
+       select GENERIC_ATOMIC64
+       select GENERIC_HWEIGHT
+       select FLATMEM
+       select FLAT_NODE_MEM_MAP
+       select GENERIC_CLOCKEVENTS
+       select GENERIC_CPU_DEVICES
+       select NO_HZ_IDLE
+       select NO_PREEMPT
+       select ARCH_WANT_FRAME_POINTERS
+       select HAS_DMA
+       select DMA_DIRECT_OPS
+       select PHYS_ADDR_T_64BIT if 64BIT
+       select 64BIT if "$(OUTPUT_FORMAT)" = "elf64-x86-64"
+       select 64BIT if "$(OUTPUT_FORMAT)" = "elf64-x86-64-freebsd"
+       select NET
+       select MULTIUSER
+       select INET
+       select IPV6
+       select IP_PNP
+       select IP_PNP_DHCP
+       select TCP_CONG_ADVANCED
+       select TCP_CONG_BBR
+       select HIGH_RES_TIMERS
+       select NET_SCHED
+       select NET_SCH_FQ
+       select IP_MULTICAST
+       select IPV6_MULTICAST
+       select IP_MULTIPLE_TABLES
+       select IPV6_MULTIPLE_TABLES
+       select IP_ROUTE_MULTIPATH
+       select IPV6_ROUTE_MULTIPATH
+       select IP_ADVANCED_ROUTER
+       select IPV6_ADVANCED_ROUTER
+       select ARCH_NO_COHERENT_DMA_MMAP
+       select HAVE_MEMBLOCK
+       select NO_BOOTMEM
+
+config OUTPUT_FORMAT
+       string "Output format"
+       default "$(OUTPUT_FORMAT)"
+
+config ARCH_DMA_ADDR_T_64BIT
+       def_bool 64BIT
+
+config 64BIT
+       def_bool n
+
+config COREDUMP
+       def_bool n
+
+config BIG_ENDIAN
+       def_bool n
+
+config GENERIC_CSUM
+       def_bool y
+
+config GENERIC_HWEIGHT
+       def_bool y
+
+config NO_IOPORT_MAP
+       def_bool y
+
+config RWSEM_GENERIC_SPINLOCK
+	bool
+	default y
+
+config BTRFS_FS
+	tristate
+	default n
+
+config XFS_FS
+	tristate
+	default n
+
+config HZ
+        int
+        default 100
+
+config CONSOLE_LOGLEVEL_QUIET
+	int "quiet console loglevel (1-15)"
+	range 1 15
+	default "4"
+	help
+	  loglevel to use when "quiet" is passed on the kernel commandline.
+
+	  When "quiet" is passed on the kernel commandline this loglevel
+	  will be used as the loglevel. IOW passing "quiet" will be the
+	  equivalent of passing "loglevel=<CONSOLE_LOGLEVEL_QUIET>"
diff --git a/arch/um/lkl/Kconfig.debug b/arch/um/lkl/Kconfig.debug
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/arch/um/lkl/configs/lkl_defconfig b/arch/um/lkl/configs/lkl_defconfig
new file mode 100644
index 000000000000..1a281480839b
--- /dev/null
+++ b/arch/um/lkl/configs/lkl_defconfig
@@ -0,0 +1,91 @@
+# CONFIG_LOCALVERSION_AUTO is not set
+CONFIG_NO_HZ_IDLE=y
+# CONFIG_SYSFS_SYSCALL is not set
+CONFIG_KALLSYMS_USE_DATA_SECTION=y
+CONFIG_KALLSYMS_ALL=y
+# CONFIG_BASE_FULL is not set
+# CONFIG_FUTEX is not set
+# CONFIG_SIGNALFD is not set
+# CONFIG_TIMERFD is not set
+# CONFIG_AIO is not set
+# CONFIG_ADVISE_SYSCALLS is not set
+CONFIG_EMBEDDED=y
+# CONFIG_VM_EVENT_COUNTERS is not set
+# CONFIG_COMPAT_BRK is not set
+# CONFIG_BLK_DEV_BSG is not set
+CONFIG_NET=y
+CONFIG_INET=y
+# CONFIG_WIRELESS is not set
+# CONFIG_UEVENT_HELPER is not set
+# CONFIG_FW_LOADER is not set
+CONFIG_VIRTIO_BLK=y
+CONFIG_NETDEVICES=y
+CONFIG_VIRTIO_NET=y
+# CONFIG_ETHERNET is not set
+# CONFIG_WLAN is not set
+# CONFIG_VT is not set
+CONFIG_VIRTIO_MMIO=y
+CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES=y
+CONFIG_EXT4_FS=y
+CONFIG_EXT4_FS_POSIX_ACL=y
+CONFIG_EXT4_FS_SECURITY=y
+# CONFIG_FILE_LOCKING is not set
+# CONFIG_DNOTIFY is not set
+# CONFIG_INOTIFY_USER is not set
+CONFIG_VFAT_FS=y
+CONFIG_NLS_CODEPAGE_437=y
+CONFIG_NLS_CODEPAGE_737=y
+CONFIG_NLS_CODEPAGE_775=y
+CONFIG_NLS_CODEPAGE_850=y
+CONFIG_NLS_CODEPAGE_852=y
+CONFIG_NLS_CODEPAGE_855=y
+CONFIG_NLS_CODEPAGE_857=y
+CONFIG_NLS_CODEPAGE_860=y
+CONFIG_NLS_CODEPAGE_861=y
+CONFIG_NLS_CODEPAGE_862=y
+CONFIG_NLS_CODEPAGE_863=y
+CONFIG_NLS_CODEPAGE_864=y
+CONFIG_NLS_CODEPAGE_865=y
+CONFIG_NLS_CODEPAGE_866=y
+CONFIG_NLS_CODEPAGE_869=y
+CONFIG_NLS_CODEPAGE_936=y
+CONFIG_NLS_CODEPAGE_950=y
+CONFIG_NLS_CODEPAGE_932=y
+CONFIG_NLS_CODEPAGE_949=y
+CONFIG_NLS_CODEPAGE_874=y
+CONFIG_NLS_ISO8859_8=y
+CONFIG_NLS_CODEPAGE_1250=y
+CONFIG_NLS_CODEPAGE_1251=y
+CONFIG_NLS_ASCII=y
+CONFIG_NLS_ISO8859_1=y
+CONFIG_NLS_ISO8859_2=y
+CONFIG_NLS_ISO8859_3=y
+CONFIG_NLS_ISO8859_4=y
+CONFIG_NLS_ISO8859_5=y
+CONFIG_NLS_ISO8859_6=y
+CONFIG_NLS_ISO8859_7=y
+CONFIG_NLS_ISO8859_9=y
+CONFIG_NLS_ISO8859_13=y
+CONFIG_NLS_ISO8859_14=y
+CONFIG_NLS_ISO8859_15=y
+CONFIG_NLS_KOI8_R=y
+CONFIG_NLS_KOI8_U=y
+CONFIG_NLS_MAC_ROMAN=y
+CONFIG_NLS_MAC_CELTIC=y
+CONFIG_NLS_MAC_CENTEURO=y
+CONFIG_NLS_MAC_CROATIAN=y
+CONFIG_NLS_MAC_CYRILLIC=y
+CONFIG_NLS_MAC_GAELIC=y
+CONFIG_NLS_MAC_GREEK=y
+CONFIG_NLS_MAC_ICELAND=y
+CONFIG_NLS_MAC_INUIT=y
+CONFIG_NLS_MAC_ROMANIAN=y
+CONFIG_NLS_MAC_TURKISH=y
+CONFIG_NLS_UTF8=y
+CONFIG_HZ_100=y
+CONFIG_CRYPTO_ANSI_CPRNG=y
+CONFIG_PRINTK_TIME=y
+CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_REDUCED=y
+# CONFIG_ENABLE_WARN_DEPRECATED is not set
+# CONFIG_ENABLE_MUST_CHECK is not set
diff --git a/arch/um/lkl/include/asm/Kbuild b/arch/um/lkl/include/asm/Kbuild
new file mode 100644
index 000000000000..f6308985c61c
--- /dev/null
+++ b/arch/um/lkl/include/asm/Kbuild
@@ -0,0 +1,80 @@
+generic-y += atomic.h
+generic-y += barrier.h
+generic-y += bitops.h
+generic-y += bug.h
+generic-y += bugs.h
+generic-y += cache.h
+generic-y += cacheflush.h
+generic-y += checksum.h
+generic-y += cmpxchg-local.h
+generic-y += cmpxchg.h
+generic-y += compat.h
+generic-y += cputime.h
+generic-y += current.h
+generic-y += delay.h
+generic-y += device.h
+generic-y += div64.h
+generic-y += dma.h
+generic-y += dma-mapping.h
+generic-y += emergency-restart.h
+generic-y += errno.h
+generic-y += extable.h
+generic-y += exec.h
+generic-y += ftrace.h
+generic-y += futex.h
+generic-y += hardirq.h
+generic-y += hw_irq.h
+generic-y += ioctl.h
+generic-y += ipcbuf.h
+generic-y += irq_regs.h
+generic-y += irqflags.h
+generic-y += irq_work.h
+generic-y += kdebug.h
+generic-y += kmap_types.h
+generic-y += linkage.h
+generic-y += local.h
+generic-y += local64.h
+generic-y += mcs_spinlock.h
+generic-y += mmiowb.h
+generic-y += mmu.h
+generic-y += mmu_context.h
+generic-y += module.h
+generic-y += msgbuf.h
+generic-y += param.h
+generic-y += parport.h
+generic-y += pci.h
+generic-y += percpu.h
+generic-y += pgalloc.h
+generic-y += poll.h
+generic-y += preempt.h
+generic-y += resource.h
+generic-y += rwsem.h
+generic-y += scatterlist.h
+generic-y += seccomp.h
+generic-y += sections.h
+generic-y += segment.h
+generic-y += sembuf.h
+generic-y += serial.h
+generic-y += shmbuf.h
+generic-y += signal.h
+generic-y += simd.h
+generic-y += sizes.h
+generic-y += socket.h
+generic-y += sockios.h
+generic-y += stat.h
+generic-y += statfs.h
+generic-y += string.h
+generic-y += swab.h
+generic-y += switch_to.h
+generic-y += syscall.h
+generic-y += termbits.h
+generic-y += termios.h
+generic-y += time.h
+generic-y += timex.h
+generic-y += tlbflush.h
+generic-y += topology.h
+generic-y += trace_clock.h
+generic-y += unaligned.h
+generic-y += user.h
+generic-y += word-at-a-time.h
+generic-y += kprobes.h
diff --git a/arch/um/lkl/include/asm/bitsperlong.h b/arch/um/lkl/include/asm/bitsperlong.h
new file mode 100644
index 000000000000..5745d5e51274
--- /dev/null
+++ b/arch/um/lkl/include/asm/bitsperlong.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __LKL_BITSPERLONG_H
+#define __LKL_BITSPERLONG_H
+
+#include <uapi/asm/bitsperlong.h>
+
+#define BITS_PER_LONG __BITS_PER_LONG
+
+#define BITS_PER_LONG_LONG 64
+
+#endif
diff --git a/arch/um/lkl/include/asm/byteorder.h b/arch/um/lkl/include/asm/byteorder.h
new file mode 100644
index 000000000000..5d0c4efaa44b
--- /dev/null
+++ b/arch/um/lkl/include/asm/byteorder.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_BYTEORDER_H
+#define _ASM_LKL_BYTEORDER_H
+
+#include <uapi/asm/byteorder.h>
+
+#endif /* _ASM_LKL_BYTEORDER_H */
diff --git a/arch/um/lkl/include/asm/cpu.h b/arch/um/lkl/include/asm/cpu.h
new file mode 100644
index 000000000000..d2b8c501c7b1
--- /dev/null
+++ b/arch/um/lkl/include/asm/cpu.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_CPU_H
+#define _ASM_LKL_CPU_H
+
+int lkl_cpu_get(void);
+void lkl_cpu_put(void);
+int lkl_cpu_try_run_irq(int irq);
+int lkl_cpu_init(void);
+void lkl_cpu_shutdown(void);
+void lkl_cpu_wait_shutdown(void);
+void lkl_cpu_change_owner(lkl_thread_t owner);
+void lkl_cpu_set_irqs_pending(void);
+
+#endif /* _ASM_LKL_CPU_H */
diff --git a/arch/um/lkl/include/asm/elf.h b/arch/um/lkl/include/asm/elf.h
new file mode 100644
index 000000000000..bb2456d638f4
--- /dev/null
+++ b/arch/um/lkl/include/asm/elf.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_ELF_H
+#define _ASM_LKL_ELF_H
+
+#define elf_check_arch(x) 0
+
+#ifdef CONFIG_64BIT
+#define ELF_CLASS ELFCLASS64
+#else
+#define ELF_CLASS ELFCLASS32
+#endif
+
+#define elf_gregset_t long
+#define elf_fpregset_t double
+#endif
diff --git a/arch/um/lkl/include/asm/mutex.h b/arch/um/lkl/include/asm/mutex.h
new file mode 100644
index 000000000000..492d04183f9c
--- /dev/null
+++ b/arch/um/lkl/include/asm/mutex.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_MUTEX_H
+#define _ASM_LKL_MUTEX_H
+
+#include <asm-generic/mutex-dec.h>
+
+#endif
diff --git a/arch/um/lkl/include/asm/processor.h b/arch/um/lkl/include/asm/processor.h
new file mode 100644
index 000000000000..c1aa8b3a266e
--- /dev/null
+++ b/arch/um/lkl/include/asm/processor.h
@@ -0,0 +1,60 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_PROCESSOR_H
+#define _ASM_LKL_PROCESSOR_H
+
+struct task_struct;
+
+static inline void cpu_relax(void)
+{
+	unsigned long flags;
+
+	/* since this is usually called in a tight loop waiting for some
+	 * external condition (e.g. jiffies) lets run interrupts now to allow
+	 * the external condition to propagate
+	 */
+	local_irq_save(flags);
+	local_irq_restore(flags);
+}
+
+#define current_text_addr() ({ __label__ _l; _l: &&_l; })
+
+static inline unsigned long thread_saved_pc(struct task_struct *tsk)
+{
+	return 0;
+}
+
+static inline void release_thread(struct task_struct *dead_task)
+{
+}
+
+static inline void prepare_to_copy(struct task_struct *tsk)
+{
+}
+
+static inline unsigned long get_wchan(struct task_struct *p)
+{
+	return 0;
+}
+
+static inline void flush_thread(void)
+{
+}
+
+static inline void trap_init(void)
+{
+}
+
+struct thread_struct { };
+
+#define INIT_THREAD { }
+
+#define task_pt_regs(tsk) (struct pt_regs *)(NULL)
+
+/* We don't have strict user/kernel spaces */
+#define TASK_SIZE ((unsigned long)-1)
+#define TASK_UNMAPPED_BASE 0
+
+#define KSTK_EIP(tsk) (0)
+#define KSTK_ESP(tsk) (0)
+
+#endif
diff --git a/arch/um/lkl/include/asm/ptrace.h b/arch/um/lkl/include/asm/ptrace.h
new file mode 100644
index 000000000000..28199be26dc0
--- /dev/null
+++ b/arch/um/lkl/include/asm/ptrace.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_PTRACE_H
+#define _ASM_LKL_PTRACE_H
+
+#include <linux/errno.h>
+
+struct task_struct;
+
+#define user_mode(regs) 0
+#define kernel_mode(regs) 1
+#define profile_pc(regs) 0
+#define instruction_pointer(regs) 0
+#define user_stack_pointer(regs) 0
+
+static inline long arch_ptrace(struct task_struct *child, long request,
+			       unsigned long addr, unsigned long data)
+{
+	return -EINVAL;
+}
+
+static inline void ptrace_disable(struct task_struct *child)
+{
+}
+
+#endif
diff --git a/arch/um/lkl/include/asm/sched.h b/arch/um/lkl/include/asm/sched.h
new file mode 100644
index 000000000000..4c2635921ec8
--- /dev/null
+++ b/arch/um/lkl/include/asm/sched.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_SCHED_H
+#define _ASM_LKL_SCHED_H
+
+#include <linux/sched.h>
+#include <uapi/asm/host_ops.h>
+
+static inline void thread_sched_jb(void)
+{
+	if (test_ti_thread_flag(current_thread_info(), TIF_HOST_THREAD)) {
+		set_ti_thread_flag(current_thread_info(), TIF_SCHED_JB);
+		set_current_state(TASK_UNINTERRUPTIBLE);
+		lkl_ops->jmp_buf_set(&current_thread_info()->sched_jb,
+				     schedule);
+	} else {
+		lkl_bug("%s() can be used only for host task\n", __func__);
+	}
+}
+
+void switch_to_host_task(struct task_struct *);
+int host_task_stub(void *unused);
+
+#endif /*  _ASM_LKL_SCHED_H */
diff --git a/arch/um/lkl/include/asm/syscalls.h b/arch/um/lkl/include/asm/syscalls.h
new file mode 100644
index 000000000000..2eaa870a9020
--- /dev/null
+++ b/arch/um/lkl/include/asm/syscalls.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_SYSCALLS_H
+#define _ASM_LKL_SYSCALLS_H
+
+int syscalls_init(void);
+void syscalls_cleanup(void);
+long lkl_syscall(long no, long *params);
+void wakeup_idle_host_task(void);
+
+#define sys_mmap sys_mmap_pgoff
+#define sys_mmap2 sys_mmap_pgoff
+#define sys_clone sys_ni_syscall
+#define sys_vfork sys_ni_syscall
+#define sys_rt_sigreturn sys_ni_syscall
+
+#include <asm-generic/syscalls.h>
+
+#endif /* _ASM_LKL_SYSCALLS_H */
diff --git a/arch/um/lkl/include/asm/syscalls_32.h b/arch/um/lkl/include/asm/syscalls_32.h
new file mode 100644
index 000000000000..0e1a7649c81b
--- /dev/null
+++ b/arch/um/lkl/include/asm/syscalls_32.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_SYSCALLS_32_H
+#define _ASM_SYSCALLS_32_H
+
+#include <linux/compiler.h>
+#include <linux/linkage.h>
+#include <linux/types.h>
+#include <linux/signal.h>
+
+#if __BITS_PER_LONG == 32
+
+/* kernel/syscalls_32.c */
+asmlinkage long sys32_truncate64(const char __user *, unsigned long,
+				 unsigned long);
+asmlinkage long sys32_ftruncate64(unsigned int, unsigned long, unsigned long);
+
+#ifdef CONFIG_MMU
+struct mmap_arg_struct32;
+asmlinkage long sys32_mmap(struct mmap_arg_struct32 __user *);
+#endif
+
+asmlinkage long sys32_wait4(pid_t, unsigned int __user *, int,
+			    struct rusage __user *);
+
+asmlinkage long sys32_pread64(unsigned int, char __user *, u32, u32, u32);
+asmlinkage long sys32_pwrite64(unsigned int, const char __user *, u32, u32,
+			       u32);
+
+long sys32_fadvise64_64(int a, __u32 b, __u32 c, __u32 d, __u32 e, int f);
+
+asmlinkage ssize_t sys32_readahead(int, unsigned int, unsigned int, size_t);
+asmlinkage long sys32_sync_file_range(int, unsigned int, unsigned int,
+				      unsigned int, unsigned int, unsigned int);
+asmlinkage long sys32_sync_file_range2(int, unsigned int, unsigned int,
+				       unsigned int, unsigned int,
+				       unsigned int);
+asmlinkage long sys32_fadvise64(int, unsigned int, unsigned int, size_t, int);
+asmlinkage long sys32_fallocate(int, int, unsigned int, unsigned int,
+				unsigned int, unsigned int);
+
+#endif /* __BITS_PER_LONG */
+
+#endif /* _ASM_SYSCALLS_32_H */
diff --git a/arch/um/lkl/include/asm/tlb.h b/arch/um/lkl/include/asm/tlb.h
new file mode 100644
index 000000000000..d474890d317d
--- /dev/null
+++ b/arch/um/lkl/include/asm/tlb.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_TLB_H
+#define _ASM_LKL_TLB_H
+
+#define tlb_start_vma(tlb, vma)				do { } while (0)
+#define tlb_end_vma(tlb, vma)				do { } while (0)
+#define __tlb_remove_tlb_entry(tlb, pte, address)	do { } while (0)
+#define tlb_flush(tlb)					do { } while (0)
+
+#include <asm-generic/tlb.h>
+
+#endif /* _ASM_LKL_TLB_H */
diff --git a/arch/um/lkl/include/asm/uaccess.h b/arch/um/lkl/include/asm/uaccess.h
new file mode 100644
index 000000000000..f267ac3be8b3
--- /dev/null
+++ b/arch/um/lkl/include/asm/uaccess.h
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_UACCESS_H
+#define _ASM_LKL_UACCESS_H
+
+/* copied from old include/asm-generic/uaccess.h */
+static inline __must_check long
+raw_copy_from_user(void *to, const void __user *from, unsigned long n)
+{
+	if (__builtin_constant_p(n)) {
+		switch (n) {
+		case 1:
+			*(u8 *)to = *(u8 __force *)from;
+			return 0;
+		case 2:
+			*(u16 *)to = *(u16 __force *)from;
+			return 0;
+		case 4:
+			*(u32 *)to = *(u32 __force *)from;
+			return 0;
+#ifdef CONFIG_64BIT
+		case 8:
+			*(u64 *)to = *(u64 __force *)from;
+			return 0;
+#endif
+		default:
+			break;
+		}
+	}
+
+	memcpy(to, (const void __force *)from, n);
+	return 0;
+}
+
+static inline __must_check long
+raw_copy_to_user(void __user *to, const void *from, unsigned long n)
+{
+	if (__builtin_constant_p(n)) {
+		switch (n) {
+		case 1:
+			*(u8 __force *)to = *(u8 *)from;
+			return 0;
+		case 2:
+			*(u16 __force *)to = *(u16 *)from;
+			return 0;
+		case 4:
+			*(u32 __force *)to = *(u32 *)from;
+			return 0;
+#ifdef CONFIG_64BIT
+		case 8:
+			*(u64 __force *)to = *(u64 *)from;
+			return 0;
+#endif
+		default:
+			break;
+		}
+	}
+
+	memcpy((void __force *)to, from, n);
+	return 0;
+}
+
+#include <asm-generic/uaccess.h>
+
+#endif
diff --git a/arch/um/lkl/include/asm/unistd_32.h b/arch/um/lkl/include/asm/unistd_32.h
new file mode 100644
index 000000000000..8582a55e61e2
--- /dev/null
+++ b/arch/um/lkl/include/asm/unistd_32.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#include <asm/bitsperlong.h>
+
+#ifndef __SYSCALL
+#define __SYSCALL(x, y)
+#endif
+
+#if __BITS_PER_LONG == 32
+__SYSCALL(__NR3264_truncate, sys32_truncate64)
+__SYSCALL(__NR3264_ftruncate, sys32_ftruncate64)
+
+#ifdef CONFIG_MMU
+__SYSCALL(__NR3264_mmap, sys32_mmap)
+#endif
+
+__SYSCALL(__NR_wait4, sys32_wait4)
+
+__SYSCALL(__NR_pread64, sys32_pread64)
+__SYSCALL(__NR_pwrite64, sys32_pwrite64)
+
+__SYSCALL(__NR_readahead, sys32_readahead)
+#ifdef __ARCH_WANT_SYNC_FILE_RANGE2
+__SYSCALL(__NR_sync_file_range2, sys32_sync_file_range2)
+#else
+__SYSCALL(__NR_sync_file_range, sys32_sync_file_range)
+#endif
+/* mm/fadvise.c */
+__SYSCALL(__NR3264_fadvise64, sys32_fadvise64_64)
+__SYSCALL(__NR_fallocate, sys32_fallocate)
+
+#endif
diff --git a/arch/um/lkl/include/asm/vmlinux.lds.h b/arch/um/lkl/include/asm/vmlinux.lds.h
new file mode 100644
index 000000000000..a3c285882dc4
--- /dev/null
+++ b/arch/um/lkl/include/asm/vmlinux.lds.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_VMLINUX_LDS_H
+#define _LKL_VMLINUX_LDS_H
+
+/* we encode our own __ro_after_init section */
+#define RO_AFTER_INIT_DATA
+
+#ifdef __MINGW32__
+#define RODATA_SECTION .rdata
+#endif
+
+#include <asm-generic/vmlinux.lds.h>
+
+#endif
diff --git a/arch/um/lkl/include/asm/xor.h b/arch/um/lkl/include/asm/xor.h
new file mode 100644
index 000000000000..286ce75b5d9d
--- /dev/null
+++ b/arch/um/lkl/include/asm/xor.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_XOR_H
+#define _ASM_LKL_XOR_H
+
+#include <asm-generic/xor.h>
+
+#define XOR_SELECT_TEMPLATE(x) (&xor_block_8regs)
+
+#endif /* _ASM_LKL_XOR_H */
diff --git a/arch/um/lkl/include/uapi/asm/Kbuild b/arch/um/lkl/include/uapi/asm/Kbuild
new file mode 100644
index 000000000000..39d9a1f2e8f5
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/Kbuild
@@ -0,0 +1,9 @@
+# UAPI Header export list
+
+generic-y += elf.h
+generic-y += kvm_para.h
+generic-y += shmparam.h
+generic-y += timex.h
+
+# no header-y since we need special user headers handling
+# see arch/lkl/script/headers.py
diff --git a/arch/um/lkl/include/uapi/asm/bitsperlong.h b/arch/um/lkl/include/uapi/asm/bitsperlong.h
new file mode 100644
index 000000000000..8b4ebf2b0264
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/bitsperlong.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_UAPI_LKL_BITSPERLONG_H
+#define _ASM_UAPI_LKL_BITSPERLONG_H
+
+#ifdef CONFIG_64BIT
+#define __BITS_PER_LONG 64
+#else
+#define __BITS_PER_LONG 32
+#endif
+
+#define __ARCH_WANT_STAT64
+
+#endif /* _ASM_UAPI_LKL_BITSPERLONG_H */
diff --git a/arch/um/lkl/include/uapi/asm/byteorder.h b/arch/um/lkl/include/uapi/asm/byteorder.h
new file mode 100644
index 000000000000..3c4a58d2062f
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/byteorder.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_UAPI_LKL_BYTEORDER_H
+#define _ASM_UAPI_LKL_BYTEORDER_H
+
+#if defined(CONFIG_BIG_ENDIAN)
+#include <linux/byteorder/big_endian.h>
+#else
+#include <linux/byteorder/little_endian.h>
+#endif
+
+#endif /* _ASM_UAPI_LKL_BYTEORDER_H */
diff --git a/arch/um/lkl/include/uapi/asm/siginfo.h b/arch/um/lkl/include/uapi/asm/siginfo.h
new file mode 100644
index 000000000000..811916cf42c8
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/siginfo.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_LKL_SIGINFO_H
+#define _ASM_LKL_SIGINFO_H
+
+#ifdef CONFIG_64BIT
+#define __ARCH_SI_PREAMBLE_SIZE	(4 * sizeof(int))
+#endif
+
+#include <asm-generic/siginfo.h>
+
+#endif /* _ASM_LKL_SIGINFO_H */
diff --git a/arch/um/lkl/include/uapi/asm/swab.h b/arch/um/lkl/include/uapi/asm/swab.h
new file mode 100644
index 000000000000..1a1773e1bd35
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/swab.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_LKL_SWAB_H
+#define _ASM_LKL_SWAB_H
+
+#ifndef __arch_swab32
+#define __arch_swab32(x) ___constant_swab32(x)
+#endif
+
+#include <asm-generic/swab.h>
+
+#endif /* _ASM_LKL_SWAB_H */
diff --git a/arch/um/lkl/include/uapi/asm/syscalls.h b/arch/um/lkl/include/uapi/asm/syscalls.h
new file mode 100644
index 000000000000..a81534ffccb7
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/syscalls.h
@@ -0,0 +1,348 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_UAPI_LKL_SYSCALLS_H
+#define _ASM_UAPI_LKL_SYSCALLS_H
+
+#include <autoconf.h>
+#include <linux/types.h>
+
+typedef __kernel_uid32_t	qid_t;
+typedef __kernel_fd_set		fd_set;
+typedef __kernel_mode_t		mode_t;
+typedef unsigned short		umode_t;
+typedef __u32			nlink_t;
+typedef __kernel_off_t		off_t;
+typedef __kernel_pid_t		pid_t;
+typedef __kernel_key_t		key_t;
+typedef __kernel_suseconds_t	suseconds_t;
+typedef __kernel_timer_t	timer_t;
+typedef __kernel_clockid_t	clockid_t;
+typedef __kernel_mqd_t		mqd_t;
+typedef __kernel_uid32_t	uid_t;
+typedef __kernel_gid32_t	gid_t;
+typedef __kernel_uid16_t        uid16_t;
+typedef __kernel_gid16_t        gid16_t;
+typedef unsigned long		uintptr_t;
+#ifdef CONFIG_UID16
+typedef __kernel_old_uid_t	old_uid_t;
+typedef __kernel_old_gid_t	old_gid_t;
+#endif
+typedef __kernel_loff_t		loff_t;
+typedef __kernel_size_t		size_t;
+typedef __kernel_ssize_t	ssize_t;
+typedef __kernel_time_t		time_t;
+typedef __kernel_clock_t	clock_t;
+typedef __u32			u32;
+typedef __s32			s32;
+typedef __u64			u64;
+typedef __s64			s64;
+
+#define __user
+
+#include <asm/unistd.h>
+/* Temporarily undefine system calls that don't have data types defined in UAPI
+ * headers
+ */
+#undef __NR_kexec_load
+#undef __NR_getcpu
+#undef __NR_sched_getattr
+#undef __NR_sched_setattr
+#undef __NR_sched_setparam
+#undef __NR_sched_getparam
+#undef __NR_sched_setscheduler
+#undef __NR_name_to_handle_at
+#undef __NR_open_by_handle_at
+
+/* deprecated system calls */
+#undef __NR_epoll_create
+#undef __NR_epoll_wait
+#undef __NR_access
+#undef __NR_chmod
+#undef __NR_chown
+#undef __NR_lchown
+#undef __NR_open
+#undef __NR_creat
+#undef __NR_readlink
+#undef __NR_pipe
+#undef __NR_mknod
+#undef __NR_mkdir
+#undef __NR_rmdir
+#undef __NR_unlink
+#undef __NR_symlink
+#undef __NR_link
+#undef __NR_rename
+#undef __NR_getdents
+#undef __NR_select
+#undef __NR_poll
+#undef __NR_dup2
+#undef __NR_futimesat
+#undef __NR_utimes
+#undef __NR_ustat
+#undef __NR_eventfd
+#undef __NR_bdflush
+#undef __NR_send
+#undef __NR_recv
+
+#undef __NR_umount
+#define __NR_umount __NR_umount2
+
+#ifdef CONFIG_64BIT
+#define __NR_newfstat __NR3264_fstat
+#define __NR_newfstatat __NR3264_fstatat
+#endif
+
+#define __NR_mmap_pgoff __NR3264_mmap
+
+#include <linux/time.h>
+#include <linux/times.h>
+#include <linux/timex.h>
+#include <linux/capability.h>
+#define __KERNEL__ /* to pull in S_ definitions */
+#include <linux/stat.h>
+#undef __KERNEL__
+#include <linux/errno.h>
+#include <linux/fcntl.h>
+#include <linux/fs.h>
+#include <asm/statfs.h>
+#include <asm/stat.h>
+#include <linux/bpf.h>
+#include <linux/msg.h>
+#include <linux/resource.h>
+#include <linux/sysinfo.h>
+#include <linux/shm.h>
+#include <linux/aio_abi.h>
+#include <linux/socket.h>
+#include <linux/perf_event.h>
+#include <linux/sem.h>
+#include <linux/futex.h>
+#include <linux/poll.h>
+#include <linux/mqueue.h>
+#include <linux/eventpoll.h>
+#include <linux/uio.h>
+#include <asm/signal.h>
+#include <asm/siginfo.h>
+#include <linux/utime.h>
+#include <asm/socket.h>
+#include <linux/icmp.h>
+#include <linux/ip.h>
+
+/* Define data structures used in system calls that are not defined in UAPI
+ * headers
+ */
+struct sockaddr {
+	unsigned short int sa_family;
+	char sa_data[14];
+};
+
+#define __UAPI_DEF_IF_NET_DEVICE_FLAGS_LOWER_UP_DORMANT_ECHO 1
+#define __UAPI_DEF_IF_IFNAMSIZ	1
+#define __UAPI_DEF_IF_NET_DEVICE_FLAGS 1
+#define __UAPI_DEF_IF_IFREQ	1
+#define __UAPI_DEF_IF_IFMAP	1
+#include <linux/if.h>
+#define __UAPI_DEF_IN_IPPROTO	1
+#define __UAPI_DEF_IN_ADDR	1
+#define __UAPI_DEF_IN6_ADDR	1
+#define __UAPI_DEF_IP_MREQ	1
+#define __UAPI_DEF_IN_PKTINFO	1
+#define __UAPI_DEF_SOCKADDR_IN	1
+#define __UAPI_DEF_IN_CLASS	1
+#include <linux/in.h>
+#include <linux/in6.h>
+#include <linux/sockios.h>
+#include <linux/route.h>
+#include <linux/ipv6_route.h>
+#include <linux/ipv6.h>
+#include <linux/netlink.h>
+#include <linux/neighbour.h>
+#include <linux/rtnetlink.h>
+#include <linux/fib_rules.h>
+
+#include <linux/kdev_t.h>
+#include <asm/irq.h>
+#include <linux/virtio_blk.h>
+#include <linux/virtio_net.h>
+#include <linux/virtio_ring.h>
+#include <linux/pkt_sched.h>
+#include <linux/io_uring.h>
+
+struct user_msghdr {
+	void		__user *msg_name;
+	int		msg_namelen;
+	struct iovec	__user *msg_iov;
+	__kernel_size_t	msg_iovlen;
+	void		__user *msg_control;
+	__kernel_size_t	msg_controllen;
+	unsigned int	msg_flags;
+};
+
+typedef __u32 key_serial_t;
+
+struct mmsghdr {
+	struct user_msghdr  msg_hdr;
+	unsigned int        msg_len;
+};
+
+struct linux_dirent64 {
+	u64		d_ino;
+	s64		d_off;
+	unsigned short	d_reclen;
+	unsigned char	d_type;
+	char		d_name[0];
+};
+
+struct linux_dirent {
+	unsigned long	d_ino;
+	unsigned long	d_off;
+	unsigned short	d_reclen;
+	char		d_name[1];
+};
+
+struct ustat {
+	__kernel_daddr_t	f_tfree;
+	__kernel_ino_t		f_tinode;
+	char			f_fname[6];
+	char			f_fpack[6];
+};
+
+typedef __kernel_rwf_t		rwf_t;
+
+#define AF_UNSPEC       0
+#define AF_UNIX         1
+#define AF_LOCAL        1
+#define AF_INET         2
+#define AF_AX25         3
+#define AF_IPX          4
+#define AF_APPLETALK    5
+#define AF_NETROM       6
+#define AF_BRIDGE       7
+#define AF_ATMPVC       8
+#define AF_X25          9
+#define AF_INET6        10
+#define AF_ROSE         11
+#define AF_DECnet       12
+#define AF_NETBEUI      13
+#define AF_SECURITY     14
+#define AF_KEY          15
+#define AF_NETLINK      16
+#define AF_ROUTE        AF_NETLINK
+#define AF_PACKET       17
+#define AF_ASH          18
+#define AF_ECONET       19
+#define AF_ATMSVC       20
+#define AF_RDS          21
+#define AF_SNA          22
+#define AF_IRDA         23
+#define AF_PPPOX        24
+#define AF_WANPIPE      25
+#define AF_LLC          26
+#define AF_IB           27
+#define AF_MPLS         28
+#define AF_CAN          29
+#define AF_TIPC         30
+#define AF_BLUETOOTH    31
+#define AF_IUCV         32
+#define AF_RXRPC        33
+#define AF_ISDN         34
+#define AF_PHONET       35
+#define AF_IEEE802154   36
+#define AF_CAIF         37
+#define AF_ALG          38
+#define AF_NFC          39
+#define AF_VSOCK        40
+
+#define SOCK_STREAM		1
+#define SOCK_DGRAM		2
+#define SOCK_RAW		3
+#define SOCK_RDM		4
+#define SOCK_SEQPACKET		5
+#define SOCK_DCCP		6
+#define SOCK_PACKET		10
+
+#define MSG_TRUNC 0x20
+#define MSG_DONTWAIT 0x40
+
+/* avoid collision with system header defines */
+#define sa_handler sa_handler
+#define st_atime st_atime
+#define st_mtime st_mtime
+#define st_ctime st_ctime
+#define s_addr s_addr
+
+long lkl_syscall(long no, long *params);
+long lkl_sys_halt(void);
+
+#define __MAP0(m, ...)
+#define __MAP1(m, t, a) m(t, a)
+#define __MAP2(m, t, a, ...) m(t, a), __MAP1(m, __VA_ARGS__)
+#define __MAP3(m, t, a, ...) m(t, a), __MAP2(m, __VA_ARGS__)
+#define __MAP4(m, t, a, ...) m(t, a), __MAP3(m, __VA_ARGS__)
+#define __MAP5(m, t, a, ...) m(t, a), __MAP4(m, __VA_ARGS__)
+#define __MAP6(m, t, a, ...) m(t, a), __MAP5(m, __VA_ARGS__)
+#define __MAP(n, ...) __MAP##n(__VA_ARGS__)
+
+#define __SC_LONG(t, a) (long)a
+#define __SC_TABLE(t, a) {sizeof(t), (long long)(a)}
+#define __SC_DECL(t, a) t a
+
+#define LKL_SYSCALL0(name)					       \
+	static inline long lkl_sys##name(void)			       \
+	{							       \
+		long params[6];					       \
+		return lkl_syscall(__lkl__NR##name, params);	       \
+	}
+
+#if __BITS_PER_LONG == 32
+#define LKL_SYSCALLx(x, name, ...)					\
+	static inline							\
+	long lkl_sys##name(__MAP(x, __SC_DECL, __VA_ARGS__))		\
+	{								\
+		struct {						\
+			unsigned int size;				\
+			long long value;				\
+		} lkl_params[x] = { __MAP(x, __SC_TABLE, __VA_ARGS__) }; \
+		long sys_params[6], i, k;				\
+		for (i = k = 0; i < x && k < 6; i++, k++) {		\
+			if (lkl_params[i].size > sizeof(long) &&	\
+			    k + 1 < 6) {				\
+				sys_params[k] =				\
+					(long)(lkl_params[i].value & (-1UL)); \
+				k++;					\
+				sys_params[k] =				\
+					(long)(lkl_params[i].value >>	\
+					       __BITS_PER_LONG);	\
+			} else {					\
+				sys_params[k] = (long)(lkl_params[i].value); \
+			}						\
+		}							\
+		return lkl_syscall(__lkl__NR##name, sys_params);	\
+	}
+#else
+#define LKL_SYSCALLx(x, name, ...)					\
+	static inline							\
+	long lkl_sys##name(__MAP(x, __SC_DECL, __VA_ARGS__))		\
+	{								\
+		long lkl_params[6] = { __MAP(x, __SC_LONG, __VA_ARGS__) }; \
+		return lkl_syscall(__lkl__NR##name, lkl_params);	\
+	}
+#endif
+
+#define SYSCALL_DEFINE0(name, ...) LKL_SYSCALL0(name)
+#define SYSCALL_DEFINE1(name, ...) LKL_SYSCALLx(1, name, __VA_ARGS__)
+#define SYSCALL_DEFINE2(name, ...) LKL_SYSCALLx(2, name, __VA_ARGS__)
+#define SYSCALL_DEFINE3(name, ...) LKL_SYSCALLx(3, name, __VA_ARGS__)
+#define SYSCALL_DEFINE4(name, ...) LKL_SYSCALLx(4, name, __VA_ARGS__)
+#define SYSCALL_DEFINE5(name, ...) LKL_SYSCALLx(5, name, __VA_ARGS__)
+#define SYSCALL_DEFINE6(name, ...) LKL_SYSCALLx(6, name, __VA_ARGS__)
+
+#if __BITS_PER_LONG == 32
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wpointer-to-int-cast"
+#endif
+
+#include <asm/syscall_defs.h>
+
+#if __BITS_PER_LONG == 32
+#pragma GCC diagnostic pop
+#endif
+
+#endif
diff --git a/arch/um/lkl/kernel/asm-offsets.c b/arch/um/lkl/kernel/asm-offsets.c
new file mode 100644
index 000000000000..6be0763698dc
--- /dev/null
+++ b/arch/um/lkl/kernel/asm-offsets.c
@@ -0,0 +1,2 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Dummy asm-offsets.c file. Required by kbuild and ready to be used - hint! */
diff --git a/arch/um/lkl/kernel/misc.c b/arch/um/lkl/kernel/misc.c
new file mode 100644
index 000000000000..60f048f02ae6
--- /dev/null
+++ b/arch/um/lkl/kernel/misc.c
@@ -0,0 +1,60 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/kallsyms.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/seq_file.h>
+#include <asm/ptrace.h>
+#include <asm/host_ops.h>
+
+#ifdef CONFIG_PRINTK
+void dump_stack(void)
+{
+	unsigned long dummy;
+	unsigned long *stack = &dummy;
+	unsigned long addr;
+
+	pr_info("Call Trace:\n");
+	while (((long)stack & (THREAD_SIZE - 1)) != 0) {
+		addr = *stack;
+		if (__kernel_text_address(addr)) {
+			pr_info("%p:  [<%08lx>] %pS", stack, addr,
+				(void *)addr);
+			pr_cont("\n");
+		}
+		stack++;
+	}
+	pr_info("\n");
+}
+#endif
+
+void show_regs(struct pt_regs *regs)
+{
+}
+
+#ifdef CONFIG_PROC_FS
+static void *cpuinfo_start(struct seq_file *m, loff_t *pos)
+{
+	return NULL;
+}
+
+static void *cpuinfo_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	return NULL;
+}
+
+static void cpuinfo_stop(struct seq_file *m, void *v)
+{
+}
+
+static int show_cpuinfo(struct seq_file *m, void *v)
+{
+	return 0;
+}
+
+const struct seq_operations cpuinfo_op = {
+	.start	= cpuinfo_start,
+	.next	= cpuinfo_next,
+	.stop	= cpuinfo_stop,
+	.show	= show_cpuinfo,
+};
+#endif
diff --git a/arch/um/lkl/kernel/vmlinux.lds.S b/arch/um/lkl/kernel/vmlinux.lds.S
new file mode 100644
index 000000000000..efe420f38110
--- /dev/null
+++ b/arch/um/lkl/kernel/vmlinux.lds.S
@@ -0,0 +1,51 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#include <asm/vmlinux.lds.h>
+#include <asm/thread_info.h>
+#include <asm/page.h>
+#include <asm/cache.h>
+#include <linux/export.h>
+
+OUTPUT_FORMAT(CONFIG_OUTPUT_FORMAT)
+
+VMLINUX_SYMBOL(jiffies) = VMLINUX_SYMBOL(jiffies_64);
+
+SECTIONS
+{
+	VMLINUX_SYMBOL(__init_begin) = .;
+	HEAD_TEXT_SECTION
+	INIT_TEXT_SECTION(PAGE_SIZE)
+	INIT_DATA_SECTION(16)
+	PERCPU_SECTION(L1_CACHE_BYTES)
+	VMLINUX_SYMBOL(__init_end) = .;
+
+	VMLINUX_SYMBOL(_stext) = .;
+	VMLINUX_SYMBOL(_text) = . ;
+	VMLINUX_SYMBOL(text) = . ;
+	.text      :
+	{
+		TEXT_TEXT
+		SCHED_TEXT
+		LOCK_TEXT
+		CPUIDLE_TEXT
+	}
+	VMLINUX_SYMBOL(_etext) = .;
+
+	VMLINUX_SYMBOL(_sdata) = .;
+	RO_DATA_SECTION(PAGE_SIZE)
+	RW_DATA_SECTION(L1_CACHE_BYTES, PAGE_SIZE, THREAD_SIZE)
+	VMLINUX_SYMBOL(_edata) = .;
+
+	VMLINUX_SYMBOL(__start_ro_after_init) = .;
+	.data..ro_after_init : { *(.data..ro_after_init)}
+	EXCEPTION_TABLE(16)
+	VMLINUX_SYMBOL(__end_ro_after_init) = .;
+	NOTES
+
+	BSS_SECTION(0, 0, 0)
+	VMLINUX_SYMBOL(_end) = .;
+
+	STABS_DEBUG
+	DWARF_DEBUG
+
+	DISCARDS
+}
-- 
2.20.1 (Apple Git-117)
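
A note on the 32-bit LKL_SYSCALLx() wrapper in uapi/asm/syscalls.h above:
it widens every argument to long long, then copies the arguments into at
most six long-sized slots, spending two consecutive slots (low half first)
on any argument wider than a long. What follows is a minimal standalone
sketch of that packing step, not the macro itself. pack_args32() and
struct lkl_arg_sketch are hypothetical names, and uint32_t stands in for a
32-bit long so that the sketch builds on any host.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of the macro's lkl_params[] table. */
struct lkl_arg_sketch {
	unsigned int size;	/* sizeof() of the declared parameter type */
	long long value;	/* parameter value widened to 64 bits */
};

/*
 * Pack n arguments into at most six 32-bit slots. An argument wider than
 * one slot takes two consecutive slots, low half first, as long as two
 * slots are still free. Returns the number of slots consumed.
 */
static int pack_args32(const struct lkl_arg_sketch *args, int n,
		       uint32_t slots[6])
{
	int i, k;

	for (i = k = 0; i < n && k < 6; i++, k++) {
		if (args[i].size > sizeof(uint32_t) && k + 1 < 6) {
			/* low 32 bits go into the current slot */
			slots[k] = (uint32_t)args[i].value;
			k++;
			/* high 32 bits go into the next slot */
			slots[k] = (uint32_t)((unsigned long long)
					      args[i].value >> 32);
		} else {
			/* fits in one slot, or no room left to split */
			slots[k] = (uint32_t)args[i].value;
		}
	}
	return k;
}
```

For example, a call with a 4-byte fd, an 8-byte offset and a 4-byte length
consumes four slots. Like the macro, the sketch stops splitting once fewer
than two slots remain, in which case only the low half of that argument is
passed.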


* [RFC v2 03/37] lkl: architecture skeleton for Linux kernel library
@ 2019-11-08  5:02     ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: linux-arch, Patrick Collins, Levente Kurusa, Matthieu Coudron,
	Conrad Meyer, Octavian Purdila, Jens Staal, Motomu Utsumi,
	Lai Jiangshan, Akira Moroo, Petros Angelatos, Yuan Liu, Xiao Jia,
	Mark Stillwell, Hajime Tazaki, linux-kernel-library,
	Pierre-Hugues Husson, Michael Zimmermann, Luca Dariz,
	Edison M . Castro

From: Octavian Purdila <tavi.purdila@gmail.com>

Adds the LKL Kconfig, vmlinux linker script, basic architecture
headers and miscellaneous basic functions or stubs such as
dump_stack(), show_regs() and cpuinfo proc ops.

The headers we introduce in this patch are simple wrappers to the
asm-generic headers or stubs for things we don't support, such as
ptrace, DMA, signals, ELF handling and low level processor operations.

The kernel configuration is automatically updated to reflect the host's
endianness, 64-bit support and the output format for vmlinux's linker
script. We do this by looking at ld's default output format.
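
To illustrate the 64-bit part: in the Kconfig added by this patch, 64BIT is
selected only when $(OUTPUT_FORMAT) is exactly "elf64-x86-64" or
"elf64-x86-64-freebsd". A minimal C sketch of that exact-match test follows.
The helper name is hypothetical and only mirrors the two Kconfig "select"
lines; the real decision is made by Kconfig, not by C code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns true for the output formats for which the
 * Kconfig in this patch selects 64BIT. Any other string leaves 64BIT at
 * its default (n).
 */
static bool output_format_is_64bit(const char *fmt)
{
	static const char *const formats_64bit[] = {
		"elf64-x86-64",
		"elf64-x86-64-freebsd",
	};
	size_t i;

	for (i = 0; i < sizeof(formats_64bit) / sizeof(formats_64bit[0]); i++) {
		if (strcmp(fmt, formats_64bit[i]) == 0)
			return true;
	}
	return false;
}
```

Because the comparison is an exact string match, a format that is not one
of the two listed strings is treated as 32-bit by this sketch, just as it
would leave 64BIT unset in the Kconfig above.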

Signed-off-by: Andreas Abel <aabel@google.com>
Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Edison M. Castro <edisonmcastro@hotmail.com>
Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Jens Staal <staal1978@gmail.com>
Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Levente Kurusa <levex@linux.com>
Signed-off-by: Luca Dariz <luca.dariz@gmail.com>
Signed-off-by: Mark Stillwell <mark@stillwell.me>
Signed-off-by: Matthieu Coudron <mattator@gmail.com>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Pierre-Hugues Husson <phh@phh.me>
Signed-off-by: Xiao Jia <xiaoj@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 MAINTAINERS                                |   8 +
 arch/um/lkl/.gitignore                     |   2 +
 arch/um/lkl/Kconfig                        |  95 ++++++
 arch/um/lkl/Kconfig.debug                  |   0
 arch/um/lkl/configs/lkl_defconfig          |  91 ++++++
 arch/um/lkl/include/asm/Kbuild             |  80 +++++
 arch/um/lkl/include/asm/bitsperlong.h      |  11 +
 arch/um/lkl/include/asm/byteorder.h        |   7 +
 arch/um/lkl/include/asm/cpu.h              |  14 +
 arch/um/lkl/include/asm/elf.h              |  15 +
 arch/um/lkl/include/asm/mutex.h            |   7 +
 arch/um/lkl/include/asm/processor.h        |  60 ++++
 arch/um/lkl/include/asm/ptrace.h           |  25 ++
 arch/um/lkl/include/asm/sched.h            |  23 ++
 arch/um/lkl/include/asm/syscalls.h         |  18 ++
 arch/um/lkl/include/asm/syscalls_32.h      |  43 +++
 arch/um/lkl/include/asm/tlb.h              |  12 +
 arch/um/lkl/include/asm/uaccess.h          |  64 ++++
 arch/um/lkl/include/asm/unistd_32.h        |  31 ++
 arch/um/lkl/include/asm/vmlinux.lds.h      |  14 +
 arch/um/lkl/include/asm/xor.h              |   9 +
 arch/um/lkl/include/uapi/asm/Kbuild        |   9 +
 arch/um/lkl/include/uapi/asm/bitsperlong.h |  13 +
 arch/um/lkl/include/uapi/asm/byteorder.h   |  11 +
 arch/um/lkl/include/uapi/asm/siginfo.h     |  11 +
 arch/um/lkl/include/uapi/asm/swab.h        |  11 +
 arch/um/lkl/include/uapi/asm/syscalls.h    | 348 +++++++++++++++++++++
 arch/um/lkl/kernel/asm-offsets.c           |   2 +
 arch/um/lkl/kernel/misc.c                  |  60 ++++
 arch/um/lkl/kernel/vmlinux.lds.S           |  51 +++
 30 files changed, 1145 insertions(+)
 create mode 100644 arch/um/lkl/.gitignore
 create mode 100644 arch/um/lkl/Kconfig
 create mode 100644 arch/um/lkl/Kconfig.debug
 create mode 100644 arch/um/lkl/configs/lkl_defconfig
 create mode 100644 arch/um/lkl/include/asm/Kbuild
 create mode 100644 arch/um/lkl/include/asm/bitsperlong.h
 create mode 100644 arch/um/lkl/include/asm/byteorder.h
 create mode 100644 arch/um/lkl/include/asm/cpu.h
 create mode 100644 arch/um/lkl/include/asm/elf.h
 create mode 100644 arch/um/lkl/include/asm/mutex.h
 create mode 100644 arch/um/lkl/include/asm/processor.h
 create mode 100644 arch/um/lkl/include/asm/ptrace.h
 create mode 100644 arch/um/lkl/include/asm/sched.h
 create mode 100644 arch/um/lkl/include/asm/syscalls.h
 create mode 100644 arch/um/lkl/include/asm/syscalls_32.h
 create mode 100644 arch/um/lkl/include/asm/tlb.h
 create mode 100644 arch/um/lkl/include/asm/uaccess.h
 create mode 100644 arch/um/lkl/include/asm/unistd_32.h
 create mode 100644 arch/um/lkl/include/asm/vmlinux.lds.h
 create mode 100644 arch/um/lkl/include/asm/xor.h
 create mode 100644 arch/um/lkl/include/uapi/asm/Kbuild
 create mode 100644 arch/um/lkl/include/uapi/asm/bitsperlong.h
 create mode 100644 arch/um/lkl/include/uapi/asm/byteorder.h
 create mode 100644 arch/um/lkl/include/uapi/asm/siginfo.h
 create mode 100644 arch/um/lkl/include/uapi/asm/swab.h
 create mode 100644 arch/um/lkl/include/uapi/asm/syscalls.h
 create mode 100644 arch/um/lkl/kernel/asm-offsets.c
 create mode 100644 arch/um/lkl/kernel/misc.c
 create mode 100644 arch/um/lkl/kernel/vmlinux.lds.S

diff --git a/MAINTAINERS b/MAINTAINERS
index e7a47b5210fd..df8151c6fd9e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9369,6 +9369,14 @@ F:	Documentation/core-api/atomic_ops.rst
 F:	Documentation/core-api/refcount-vs-atomic.rst
 F:	Documentation/memory-barriers.txt
 
+LINUX KERNEL LIBRARY
+M:	Octavian Purdila <tavi.purdila@gmail.com>
+M:	Hajime Tazaki <thehajime@gmail.com>
+L:	linux-kernel-library@freelists.org
+S:	Maintained
+F:	arch/um/lkl/
+F:	tools/lkl/
+
 LIS3LV02D ACCELEROMETER DRIVER
 M:	Eric Piel <eric.piel@tremplin-utc.net>
 S:	Maintained
diff --git a/arch/um/lkl/.gitignore b/arch/um/lkl/.gitignore
new file mode 100644
index 000000000000..ced1c60d8235
--- /dev/null
+++ b/arch/um/lkl/.gitignore
@@ -0,0 +1,2 @@
+kernel/vmlinux.lds
+include/generated
diff --git a/arch/um/lkl/Kconfig b/arch/um/lkl/Kconfig
new file mode 100644
index 000000000000..1dae70f16c43
--- /dev/null
+++ b/arch/um/lkl/Kconfig
@@ -0,0 +1,95 @@
+# SPDX-License-Identifier: GPL-2.0
+
+config UML_LKL
+       def_bool y
+       depends on !SMP && !MMU && !COREDUMP && !SECCOMP && !UPROBES && !COMPAT && !USER_RETURN_NOTIFIER
+       select ARCH_THREAD_STACK_ALLOCATOR
+       select RWSEM_GENERIC_SPINLOCK
+       select GENERIC_ATOMIC64
+       select GENERIC_HWEIGHT
+       select FLATMEM
+       select FLAT_NODE_MEM_MAP
+       select GENERIC_CLOCKEVENTS
+       select GENERIC_CPU_DEVICES
+       select NO_HZ_IDLE
+       select NO_PREEMPT
+       select ARCH_WANT_FRAME_POINTERS
+       select HAS_DMA
+       select DMA_DIRECT_OPS
+       select PHYS_ADDR_T_64BIT if 64BIT
+       select 64BIT if "$(OUTPUT_FORMAT)" = "elf64-x86-64"
+       select 64BIT if "$(OUTPUT_FORMAT)" = "elf64-x86-64-freebsd"
+       select NET
+       select MULTIUSER
+       select INET
+       select IPV6
+       select IP_PNP
+       select IP_PNP_DHCP
+       select TCP_CONG_ADVANCED
+       select TCP_CONG_BBR
+       select HIGH_RES_TIMERS
+       select NET_SCHED
+       select NET_SCH_FQ
+       select IP_MULTICAST
+       select IPV6_MULTICAST
+       select IP_MULTIPLE_TABLES
+       select IPV6_MULTIPLE_TABLES
+       select IP_ROUTE_MULTIPATH
+       select IPV6_ROUTE_MULTIPATH
+       select IP_ADVANCED_ROUTER
+       select IPV6_ADVANCED_ROUTER
+       select ARCH_NO_COHERENT_DMA_MMAP
+       select HAVE_MEMBLOCK
+       select NO_BOOTMEM
+
+config OUTPUT_FORMAT
+       string "Output format"
+       default "$(OUTPUT_FORMAT)"
+
+config ARCH_DMA_ADDR_T_64BIT
+       def_bool 64BIT
+
+config 64BIT
+       def_bool n
+
+config COREDUMP
+       def_bool n
+
+config BIG_ENDIAN
+       def_bool n
+
+config GENERIC_CSUM
+       def_bool y
+
+config GENERIC_HWEIGHT
+       def_bool y
+
+config NO_IOPORT_MAP
+       def_bool y
+
+config RWSEM_GENERIC_SPINLOCK
+	bool
+	default y
+
+config BTRFS_FS
+	tristate
+	default n
+
+config XFS_FS
+	tristate
+	default n
+
+config HZ
+        int
+        default 100
+
+config CONSOLE_LOGLEVEL_QUIET
+	int "quiet console loglevel (1-15)"
+	range 1 15
+	default "4"
+	help
+	  loglevel to use when "quiet" is passed on the kernel commandline.
+
+	  When "quiet" is passed on the kernel commandline this loglevel
+	  will be used as the loglevel. IOW passing "quiet" will be the
+	  equivalent of passing "loglevel=<CONSOLE_LOGLEVEL_QUIET>"
diff --git a/arch/um/lkl/Kconfig.debug b/arch/um/lkl/Kconfig.debug
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/arch/um/lkl/configs/lkl_defconfig b/arch/um/lkl/configs/lkl_defconfig
new file mode 100644
index 000000000000..1a281480839b
--- /dev/null
+++ b/arch/um/lkl/configs/lkl_defconfig
@@ -0,0 +1,91 @@
+# CONFIG_LOCALVERSION_AUTO is not set
+CONFIG_NO_HZ_IDLE=y
+# CONFIG_SYSFS_SYSCALL is not set
+CONFIG_KALLSYMS_USE_DATA_SECTION=y
+CONFIG_KALLSYMS_ALL=y
+# CONFIG_BASE_FULL is not set
+# CONFIG_FUTEX is not set
+# CONFIG_SIGNALFD is not set
+# CONFIG_TIMERFD is not set
+# CONFIG_AIO is not set
+# CONFIG_ADVISE_SYSCALLS is not set
+CONFIG_EMBEDDED=y
+# CONFIG_VM_EVENT_COUNTERS is not set
+# CONFIG_COMPAT_BRK is not set
+# CONFIG_BLK_DEV_BSG is not set
+CONFIG_NET=y
+CONFIG_INET=y
+# CONFIG_WIRELESS is not set
+# CONFIG_UEVENT_HELPER is not set
+# CONFIG_FW_LOADER is not set
+CONFIG_VIRTIO_BLK=y
+CONFIG_NETDEVICES=y
+CONFIG_VIRTIO_NET=y
+# CONFIG_ETHERNET is not set
+# CONFIG_WLAN is not set
+# CONFIG_VT is not set
+CONFIG_VIRTIO_MMIO=y
+CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES=y
+CONFIG_EXT4_FS=y
+CONFIG_EXT4_FS_POSIX_ACL=y
+CONFIG_EXT4_FS_SECURITY=y
+# CONFIG_FILE_LOCKING is not set
+# CONFIG_DNOTIFY is not set
+# CONFIG_INOTIFY_USER is not set
+CONFIG_VFAT_FS=y
+CONFIG_NLS_CODEPAGE_437=y
+CONFIG_NLS_CODEPAGE_737=y
+CONFIG_NLS_CODEPAGE_775=y
+CONFIG_NLS_CODEPAGE_850=y
+CONFIG_NLS_CODEPAGE_852=y
+CONFIG_NLS_CODEPAGE_855=y
+CONFIG_NLS_CODEPAGE_857=y
+CONFIG_NLS_CODEPAGE_860=y
+CONFIG_NLS_CODEPAGE_861=y
+CONFIG_NLS_CODEPAGE_862=y
+CONFIG_NLS_CODEPAGE_863=y
+CONFIG_NLS_CODEPAGE_864=y
+CONFIG_NLS_CODEPAGE_865=y
+CONFIG_NLS_CODEPAGE_866=y
+CONFIG_NLS_CODEPAGE_869=y
+CONFIG_NLS_CODEPAGE_936=y
+CONFIG_NLS_CODEPAGE_950=y
+CONFIG_NLS_CODEPAGE_932=y
+CONFIG_NLS_CODEPAGE_949=y
+CONFIG_NLS_CODEPAGE_874=y
+CONFIG_NLS_ISO8859_8=y
+CONFIG_NLS_CODEPAGE_1250=y
+CONFIG_NLS_CODEPAGE_1251=y
+CONFIG_NLS_ASCII=y
+CONFIG_NLS_ISO8859_1=y
+CONFIG_NLS_ISO8859_2=y
+CONFIG_NLS_ISO8859_3=y
+CONFIG_NLS_ISO8859_4=y
+CONFIG_NLS_ISO8859_5=y
+CONFIG_NLS_ISO8859_6=y
+CONFIG_NLS_ISO8859_7=y
+CONFIG_NLS_ISO8859_9=y
+CONFIG_NLS_ISO8859_13=y
+CONFIG_NLS_ISO8859_14=y
+CONFIG_NLS_ISO8859_15=y
+CONFIG_NLS_KOI8_R=y
+CONFIG_NLS_KOI8_U=y
+CONFIG_NLS_MAC_ROMAN=y
+CONFIG_NLS_MAC_CELTIC=y
+CONFIG_NLS_MAC_CENTEURO=y
+CONFIG_NLS_MAC_CROATIAN=y
+CONFIG_NLS_MAC_CYRILLIC=y
+CONFIG_NLS_MAC_GAELIC=y
+CONFIG_NLS_MAC_GREEK=y
+CONFIG_NLS_MAC_ICELAND=y
+CONFIG_NLS_MAC_INUIT=y
+CONFIG_NLS_MAC_ROMANIAN=y
+CONFIG_NLS_MAC_TURKISH=y
+CONFIG_NLS_UTF8=y
+CONFIG_HZ_100=y
+CONFIG_CRYPTO_ANSI_CPRNG=y
+CONFIG_PRINTK_TIME=y
+CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_REDUCED=y
+# CONFIG_ENABLE_WARN_DEPRECATED is not set
+# CONFIG_ENABLE_MUST_CHECK is not set
diff --git a/arch/um/lkl/include/asm/Kbuild b/arch/um/lkl/include/asm/Kbuild
new file mode 100644
index 000000000000..f6308985c61c
--- /dev/null
+++ b/arch/um/lkl/include/asm/Kbuild
@@ -0,0 +1,80 @@
+generic-y += atomic.h
+generic-y += barrier.h
+generic-y += bitops.h
+generic-y += bug.h
+generic-y += bugs.h
+generic-y += cache.h
+generic-y += cacheflush.h
+generic-y += checksum.h
+generic-y += cmpxchg-local.h
+generic-y += cmpxchg.h
+generic-y += compat.h
+generic-y += cputime.h
+generic-y += current.h
+generic-y += delay.h
+generic-y += device.h
+generic-y += div64.h
+generic-y += dma.h
+generic-y += dma-mapping.h
+generic-y += emergency-restart.h
+generic-y += errno.h
+generic-y += extable.h
+generic-y += exec.h
+generic-y += ftrace.h
+generic-y += futex.h
+generic-y += hardirq.h
+generic-y += hw_irq.h
+generic-y += ioctl.h
+generic-y += ipcbuf.h
+generic-y += irq_regs.h
+generic-y += irqflags.h
+generic-y += irq_work.h
+generic-y += kdebug.h
+generic-y += kmap_types.h
+generic-y += linkage.h
+generic-y += local.h
+generic-y += local64.h
+generic-y += mcs_spinlock.h
+generic-y += mmiowb.h
+generic-y += mmu.h
+generic-y += mmu_context.h
+generic-y += module.h
+generic-y += msgbuf.h
+generic-y += param.h
+generic-y += parport.h
+generic-y += pci.h
+generic-y += percpu.h
+generic-y += pgalloc.h
+generic-y += poll.h
+generic-y += preempt.h
+generic-y += resource.h
+generic-y += rwsem.h
+generic-y += scatterlist.h
+generic-y += seccomp.h
+generic-y += sections.h
+generic-y += segment.h
+generic-y += sembuf.h
+generic-y += serial.h
+generic-y += shmbuf.h
+generic-y += signal.h
+generic-y += simd.h
+generic-y += sizes.h
+generic-y += socket.h
+generic-y += sockios.h
+generic-y += stat.h
+generic-y += statfs.h
+generic-y += string.h
+generic-y += swab.h
+generic-y += switch_to.h
+generic-y += syscall.h
+generic-y += termbits.h
+generic-y += termios.h
+generic-y += time.h
+generic-y += timex.h
+generic-y += tlbflush.h
+generic-y += topology.h
+generic-y += trace_clock.h
+generic-y += unaligned.h
+generic-y += user.h
+generic-y += word-at-a-time.h
+generic-y += kprobes.h
diff --git a/arch/um/lkl/include/asm/bitsperlong.h b/arch/um/lkl/include/asm/bitsperlong.h
new file mode 100644
index 000000000000..5745d5e51274
--- /dev/null
+++ b/arch/um/lkl/include/asm/bitsperlong.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __LKL_BITSPERLONG_H
+#define __LKL_BITSPERLONG_H
+
+#include <uapi/asm/bitsperlong.h>
+
+#define BITS_PER_LONG __BITS_PER_LONG
+
+#define BITS_PER_LONG_LONG 64
+
+#endif
diff --git a/arch/um/lkl/include/asm/byteorder.h b/arch/um/lkl/include/asm/byteorder.h
new file mode 100644
index 000000000000..5d0c4efaa44b
--- /dev/null
+++ b/arch/um/lkl/include/asm/byteorder.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_BYTEORDER_H
+#define _ASM_LKL_BYTEORDER_H
+
+#include <uapi/asm/byteorder.h>
+
+#endif /* _ASM_LKL_BYTEORDER_H */
diff --git a/arch/um/lkl/include/asm/cpu.h b/arch/um/lkl/include/asm/cpu.h
new file mode 100644
index 000000000000..d2b8c501c7b1
--- /dev/null
+++ b/arch/um/lkl/include/asm/cpu.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_CPU_H
+#define _ASM_LKL_CPU_H
+
+int lkl_cpu_get(void);
+void lkl_cpu_put(void);
+int lkl_cpu_try_run_irq(int irq);
+int lkl_cpu_init(void);
+void lkl_cpu_shutdown(void);
+void lkl_cpu_wait_shutdown(void);
+void lkl_cpu_change_owner(lkl_thread_t owner);
+void lkl_cpu_set_irqs_pending(void);
+
+#endif /* _ASM_LKL_CPU_H */
diff --git a/arch/um/lkl/include/asm/elf.h b/arch/um/lkl/include/asm/elf.h
new file mode 100644
index 000000000000..bb2456d638f4
--- /dev/null
+++ b/arch/um/lkl/include/asm/elf.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_ELF_H
+#define _ASM_LKL_ELF_H
+
+#define elf_check_arch(x) 0
+
+#ifdef CONFIG_64BIT
+#define ELF_CLASS ELFCLASS64
+#else
+#define ELF_CLASS ELFCLASS32
+#endif
+
+#define elf_gregset_t long
+#define elf_fpregset_t double
+#endif
diff --git a/arch/um/lkl/include/asm/mutex.h b/arch/um/lkl/include/asm/mutex.h
new file mode 100644
index 000000000000..492d04183f9c
--- /dev/null
+++ b/arch/um/lkl/include/asm/mutex.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_MUTEX_H
+#define _ASM_LKL_MUTEX_H
+
+#include <asm-generic/mutex-dec.h>
+
+#endif
diff --git a/arch/um/lkl/include/asm/processor.h b/arch/um/lkl/include/asm/processor.h
new file mode 100644
index 000000000000..c1aa8b3a266e
--- /dev/null
+++ b/arch/um/lkl/include/asm/processor.h
@@ -0,0 +1,60 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_PROCESSOR_H
+#define _ASM_LKL_PROCESSOR_H
+
+struct task_struct;
+
+static inline void cpu_relax(void)
+{
+	unsigned long flags;
+
+	/* since this is usually called in a tight loop waiting for some
+	 * external condition (e.g. jiffies) let's run interrupts now to allow
+	 * the external condition to propagate
+	 */
+	local_irq_save(flags);
+	local_irq_restore(flags);
+}
+
+#define current_text_addr() ({ __label__ _l; _l: &&_l; })
+
+static inline unsigned long thread_saved_pc(struct task_struct *tsk)
+{
+	return 0;
+}
+
+static inline void release_thread(struct task_struct *dead_task)
+{
+}
+
+static inline void prepare_to_copy(struct task_struct *tsk)
+{
+}
+
+static inline unsigned long get_wchan(struct task_struct *p)
+{
+	return 0;
+}
+
+static inline void flush_thread(void)
+{
+}
+
+static inline void trap_init(void)
+{
+}
+
+struct thread_struct { };
+
+#define INIT_THREAD { }
+
+#define task_pt_regs(tsk) (struct pt_regs *)(NULL)
+
+/* We don't have strict user/kernel spaces */
+#define TASK_SIZE ((unsigned long)-1)
+#define TASK_UNMAPPED_BASE 0
+
+#define KSTK_EIP(tsk) (0)
+#define KSTK_ESP(tsk) (0)
+
+#endif
diff --git a/arch/um/lkl/include/asm/ptrace.h b/arch/um/lkl/include/asm/ptrace.h
new file mode 100644
index 000000000000..28199be26dc0
--- /dev/null
+++ b/arch/um/lkl/include/asm/ptrace.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_PTRACE_H
+#define _ASM_LKL_PTRACE_H
+
+#include <linux/errno.h>
+
+struct task_struct;
+
+#define user_mode(regs) 0
+#define kernel_mode(regs) 1
+#define profile_pc(regs) 0
+#define instruction_pointer(regs) 0
+#define user_stack_pointer(regs) 0
+
+static inline long arch_ptrace(struct task_struct *child, long request,
+			       unsigned long addr, unsigned long data)
+{
+	return -EINVAL;
+}
+
+static inline void ptrace_disable(struct task_struct *child)
+{
+}
+
+#endif
diff --git a/arch/um/lkl/include/asm/sched.h b/arch/um/lkl/include/asm/sched.h
new file mode 100644
index 000000000000..4c2635921ec8
--- /dev/null
+++ b/arch/um/lkl/include/asm/sched.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_SCHED_H
+#define _ASM_LKL_SCHED_H
+
+#include <linux/sched.h>
+#include <uapi/asm/host_ops.h>
+
+static inline void thread_sched_jb(void)
+{
+	if (test_ti_thread_flag(current_thread_info(), TIF_HOST_THREAD)) {
+		set_ti_thread_flag(current_thread_info(), TIF_SCHED_JB);
+		set_current_state(TASK_UNINTERRUPTIBLE);
+		lkl_ops->jmp_buf_set(&current_thread_info()->sched_jb,
+				     schedule);
+	} else {
+		lkl_bug("%s() can be used only for host task\n", __func__);
+	}
+}
+
+void switch_to_host_task(struct task_struct *);
+int host_task_stub(void *unused);
+
+#endif /*  _ASM_LKL_SCHED_H */
diff --git a/arch/um/lkl/include/asm/syscalls.h b/arch/um/lkl/include/asm/syscalls.h
new file mode 100644
index 000000000000..2eaa870a9020
--- /dev/null
+++ b/arch/um/lkl/include/asm/syscalls.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_SYSCALLS_H
+#define _ASM_LKL_SYSCALLS_H
+
+int syscalls_init(void);
+void syscalls_cleanup(void);
+long lkl_syscall(long no, long *params);
+void wakeup_idle_host_task(void);
+
+#define sys_mmap sys_mmap_pgoff
+#define sys_mmap2 sys_mmap_pgoff
+#define sys_clone sys_ni_syscall
+#define sys_vfork sys_ni_syscall
+#define sys_rt_sigreturn sys_ni_syscall
+
+#include <asm-generic/syscalls.h>
+
+#endif /* _ASM_LKL_SYSCALLS_H */
diff --git a/arch/um/lkl/include/asm/syscalls_32.h b/arch/um/lkl/include/asm/syscalls_32.h
new file mode 100644
index 000000000000..0e1a7649c81b
--- /dev/null
+++ b/arch/um/lkl/include/asm/syscalls_32.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_SYSCALLS_32_H
+#define _ASM_SYSCALLS_32_H
+
+#include <linux/compiler.h>
+#include <linux/linkage.h>
+#include <linux/types.h>
+#include <linux/signal.h>
+
+#if __BITS_PER_LONG == 32
+
+/* kernel/syscalls_32.c */
+asmlinkage long sys32_truncate64(const char __user *, unsigned long,
+				 unsigned long);
+asmlinkage long sys32_ftruncate64(unsigned int, unsigned long, unsigned long);
+
+#ifdef CONFIG_MMU
+struct mmap_arg_struct32;
+asmlinkage long sys32_mmap(struct mmap_arg_struct32 __user *);
+#endif
+
+asmlinkage long sys32_wait4(pid_t, unsigned int __user *, int,
+			    struct rusage __user *);
+
+asmlinkage long sys32_pread64(unsigned int, char __user *, u32, u32, u32);
+asmlinkage long sys32_pwrite64(unsigned int, const char __user *, u32, u32,
+			       u32);
+
+long sys32_fadvise64_64(int a, __u32 b, __u32 c, __u32 d, __u32 e, int f);
+
+asmlinkage ssize_t sys32_readahead(int, unsigned int, unsigned int, size_t);
+asmlinkage long sys32_sync_file_range(int, unsigned int, unsigned int,
+				      unsigned int, unsigned int, unsigned int);
+asmlinkage long sys32_sync_file_range2(int, unsigned int, unsigned int,
+				       unsigned int, unsigned int,
+				       unsigned int);
+asmlinkage long sys32_fadvise64(int, unsigned int, unsigned int, size_t, int);
+asmlinkage long sys32_fallocate(int, int, unsigned int, unsigned int,
+				unsigned int, unsigned int);
+
+#endif /* __BITS_PER_LONG */
+
+#endif /* _ASM_SYSCALLS_32_H */
diff --git a/arch/um/lkl/include/asm/tlb.h b/arch/um/lkl/include/asm/tlb.h
new file mode 100644
index 000000000000..d474890d317d
--- /dev/null
+++ b/arch/um/lkl/include/asm/tlb.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_TLB_H
+#define _ASM_LKL_TLB_H
+
+#define tlb_start_vma(tlb, vma)				do { } while (0)
+#define tlb_end_vma(tlb, vma)				do { } while (0)
+#define __tlb_remove_tlb_entry(tlb, pte, address)	do { } while (0)
+#define tlb_flush(tlb)					do { } while (0)
+
+#include <asm-generic/tlb.h>
+
+#endif /* _ASM_LKL_TLB_H */
diff --git a/arch/um/lkl/include/asm/uaccess.h b/arch/um/lkl/include/asm/uaccess.h
new file mode 100644
index 000000000000..f267ac3be8b3
--- /dev/null
+++ b/arch/um/lkl/include/asm/uaccess.h
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_UACCESS_H
+#define _ASM_LKL_UACCESS_H
+
+/* copied from old include/asm-generic/uaccess.h */
+static inline __must_check long
+raw_copy_from_user(void *to, const void __user *from, unsigned long n)
+{
+	if (__builtin_constant_p(n)) {
+		switch (n) {
+		case 1:
+			*(u8 *)to = *(u8 __force *)from;
+			return 0;
+		case 2:
+			*(u16 *)to = *(u16 __force *)from;
+			return 0;
+		case 4:
+			*(u32 *)to = *(u32 __force *)from;
+			return 0;
+#ifdef CONFIG_64BIT
+		case 8:
+			*(u64 *)to = *(u64 __force *)from;
+			return 0;
+#endif
+		default:
+			break;
+		}
+	}
+
+	memcpy(to, (const void __force *)from, n);
+	return 0;
+}
+
+static inline __must_check long
+raw_copy_to_user(void __user *to, const void *from, unsigned long n)
+{
+	if (__builtin_constant_p(n)) {
+		switch (n) {
+		case 1:
+			*(u8 __force *)to = *(u8 *)from;
+			return 0;
+		case 2:
+			*(u16 __force *)to = *(u16 *)from;
+			return 0;
+		case 4:
+			*(u32 __force *)to = *(u32 *)from;
+			return 0;
+#ifdef CONFIG_64BIT
+		case 8:
+			*(u64 __force *)to = *(u64 *)from;
+			return 0;
+#endif
+		default:
+			break;
+		}
+	}
+
+	memcpy((void __force *)to, from, n);
+	return 0;
+}
+
+#include <asm-generic/uaccess.h>
+
+#endif
diff --git a/arch/um/lkl/include/asm/unistd_32.h b/arch/um/lkl/include/asm/unistd_32.h
new file mode 100644
index 000000000000..8582a55e61e2
--- /dev/null
+++ b/arch/um/lkl/include/asm/unistd_32.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#include <asm/bitsperlong.h>
+
+#ifndef __SYSCALL
+#define __SYSCALL(x, y)
+#endif
+
+#if __BITS_PER_LONG == 32
+__SYSCALL(__NR3264_truncate, sys32_truncate64)
+__SYSCALL(__NR3264_ftruncate, sys32_ftruncate64)
+
+#ifdef CONFIG_MMU
+__SYSCALL(__NR3264_mmap, sys32_mmap)
+#endif
+
+__SYSCALL(__NR_wait4, sys32_wait4)
+
+__SYSCALL(__NR_pread64, sys32_pread64)
+__SYSCALL(__NR_pwrite64, sys32_pwrite64)
+
+__SYSCALL(__NR_readahead, sys32_readahead)
+#ifdef __ARCH_WANT_SYNC_FILE_RANGE2
+__SYSCALL(__NR_sync_file_range2, sys32_sync_file_range2)
+#else
+__SYSCALL(__NR_sync_file_range, sys32_sync_file_range)
+#endif
+/* mm/fadvise.c */
+__SYSCALL(__NR3264_fadvise64, sys32_fadvise64_64)
+__SYSCALL(__NR_fallocate, sys32_fallocate)
+
+#endif
diff --git a/arch/um/lkl/include/asm/vmlinux.lds.h b/arch/um/lkl/include/asm/vmlinux.lds.h
new file mode 100644
index 000000000000..a3c285882dc4
--- /dev/null
+++ b/arch/um/lkl/include/asm/vmlinux.lds.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_VMLINUX_LDS_H
+#define _LKL_VMLINUX_LDS_H
+
+/* we encode our own __ro_after_init section */
+#define RO_AFTER_INIT_DATA
+
+#ifdef __MINGW32__
+#define RODATA_SECTION .rdata
+#endif
+
+#include <asm-generic/vmlinux.lds.h>
+
+#endif
diff --git a/arch/um/lkl/include/asm/xor.h b/arch/um/lkl/include/asm/xor.h
new file mode 100644
index 000000000000..286ce75b5d9d
--- /dev/null
+++ b/arch/um/lkl/include/asm/xor.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_XOR_H
+#define _ASM_LKL_XOR_H
+
+#include <asm-generic/xor.h>
+
+#define XOR_SELECT_TEMPLATE(x) (&xor_block_8regs)
+
+#endif /* _ASM_LKL_XOR_H */
diff --git a/arch/um/lkl/include/uapi/asm/Kbuild b/arch/um/lkl/include/uapi/asm/Kbuild
new file mode 100644
index 000000000000..39d9a1f2e8f5
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/Kbuild
@@ -0,0 +1,9 @@
+# UAPI Header export list
+
+generic-y += elf.h
+generic-y += kvm_para.h
+generic-y += shmparam.h
+generic-y += timex.h
+
+# no header-y since we need special user headers handling
+# see arch/lkl/script/headers.py
diff --git a/arch/um/lkl/include/uapi/asm/bitsperlong.h b/arch/um/lkl/include/uapi/asm/bitsperlong.h
new file mode 100644
index 000000000000..8b4ebf2b0264
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/bitsperlong.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_UAPI_LKL_BITSPERLONG_H
+#define _ASM_UAPI_LKL_BITSPERLONG_H
+
+#ifdef CONFIG_64BIT
+#define __BITS_PER_LONG 64
+#else
+#define __BITS_PER_LONG 32
+#endif
+
+#define __ARCH_WANT_STAT64
+
+#endif /* _ASM_UAPI_LKL_BITSPERLONG_H */
diff --git a/arch/um/lkl/include/uapi/asm/byteorder.h b/arch/um/lkl/include/uapi/asm/byteorder.h
new file mode 100644
index 000000000000..3c4a58d2062f
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/byteorder.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_UAPI_LKL_BYTEORDER_H
+#define _ASM_UAPI_LKL_BYTEORDER_H
+
+#if defined(CONFIG_BIG_ENDIAN)
+#include <linux/byteorder/big_endian.h>
+#else
+#include <linux/byteorder/little_endian.h>
+#endif
+
+#endif /* _ASM_UAPI_LKL_BYTEORDER_H */
diff --git a/arch/um/lkl/include/uapi/asm/siginfo.h b/arch/um/lkl/include/uapi/asm/siginfo.h
new file mode 100644
index 000000000000..811916cf42c8
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/siginfo.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_LKL_SIGINFO_H
+#define _ASM_LKL_SIGINFO_H
+
+#ifdef CONFIG_64BIT
+#define __ARCH_SI_PREAMBLE_SIZE	(4 * sizeof(int))
+#endif
+
+#include <asm-generic/siginfo.h>
+
+#endif /* _ASM_LKL_SIGINFO_H */
diff --git a/arch/um/lkl/include/uapi/asm/swab.h b/arch/um/lkl/include/uapi/asm/swab.h
new file mode 100644
index 000000000000..1a1773e1bd35
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/swab.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_LKL_SWAB_H
+#define _ASM_LKL_SWAB_H
+
+#ifndef __arch_swab32
+#define __arch_swab32(x) ___constant_swab32(x)
+#endif
+
+#include <asm-generic/swab.h>
+
+#endif /* _ASM_LKL_SWAB_H */
diff --git a/arch/um/lkl/include/uapi/asm/syscalls.h b/arch/um/lkl/include/uapi/asm/syscalls.h
new file mode 100644
index 000000000000..a81534ffccb7
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/syscalls.h
@@ -0,0 +1,348 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_UAPI_LKL_SYSCALLS_H
+#define _ASM_UAPI_LKL_SYSCALLS_H
+
+#include <autoconf.h>
+#include <linux/types.h>
+
+typedef __kernel_uid32_t	qid_t;
+typedef __kernel_fd_set		fd_set;
+typedef __kernel_mode_t		mode_t;
+typedef unsigned short		umode_t;
+typedef __u32			nlink_t;
+typedef __kernel_off_t		off_t;
+typedef __kernel_pid_t		pid_t;
+typedef __kernel_key_t		key_t;
+typedef __kernel_suseconds_t	suseconds_t;
+typedef __kernel_timer_t	timer_t;
+typedef __kernel_clockid_t	clockid_t;
+typedef __kernel_mqd_t		mqd_t;
+typedef __kernel_uid32_t	uid_t;
+typedef __kernel_gid32_t	gid_t;
+typedef __kernel_uid16_t        uid16_t;
+typedef __kernel_gid16_t        gid16_t;
+typedef unsigned long		uintptr_t;
+#ifdef CONFIG_UID16
+typedef __kernel_old_uid_t	old_uid_t;
+typedef __kernel_old_gid_t	old_gid_t;
+#endif
+typedef __kernel_loff_t		loff_t;
+typedef __kernel_size_t		size_t;
+typedef __kernel_ssize_t	ssize_t;
+typedef __kernel_time_t		time_t;
+typedef __kernel_clock_t	clock_t;
+typedef __u32			u32;
+typedef __s32			s32;
+typedef __u64			u64;
+typedef __s64			s64;
+
+#define __user
+
+#include <asm/unistd.h>
+/* Temporarily undefine system calls that don't have data types defined in UAPI
+ * headers
+ */
+#undef __NR_kexec_load
+#undef __NR_getcpu
+#undef __NR_sched_getattr
+#undef __NR_sched_setattr
+#undef __NR_sched_setparam
+#undef __NR_sched_getparam
+#undef __NR_sched_setscheduler
+#undef __NR_name_to_handle_at
+#undef __NR_open_by_handle_at
+
+/* deprecated system calls */
+#undef __NR_epoll_create
+#undef __NR_epoll_wait
+#undef __NR_access
+#undef __NR_chmod
+#undef __NR_chown
+#undef __NR_lchown
+#undef __NR_open
+#undef __NR_creat
+#undef __NR_readlink
+#undef __NR_pipe
+#undef __NR_mknod
+#undef __NR_mkdir
+#undef __NR_rmdir
+#undef __NR_unlink
+#undef __NR_symlink
+#undef __NR_link
+#undef __NR_rename
+#undef __NR_getdents
+#undef __NR_select
+#undef __NR_poll
+#undef __NR_dup2
+#undef __NR_futimesat
+#undef __NR_utimes
+#undef __NR_ustat
+#undef __NR_eventfd
+#undef __NR_bdflush
+#undef __NR_send
+#undef __NR_recv
+
+#undef __NR_umount
+#define __NR_umount __NR_umount2
+
+#ifdef CONFIG_64BIT
+#define __NR_newfstat __NR3264_fstat
+#define __NR_newfstatat __NR3264_fstatat
+#endif
+
+#define __NR_mmap_pgoff __NR3264_mmap
+
+#include <linux/time.h>
+#include <linux/times.h>
+#include <linux/timex.h>
+#include <linux/capability.h>
+#define __KERNEL__ /* to pull in S_ definitions */
+#include <linux/stat.h>
+#undef __KERNEL__
+#include <linux/errno.h>
+#include <linux/fcntl.h>
+#include <linux/fs.h>
+#include <asm/statfs.h>
+#include <asm/stat.h>
+#include <linux/bpf.h>
+#include <linux/msg.h>
+#include <linux/resource.h>
+#include <linux/sysinfo.h>
+#include <linux/shm.h>
+#include <linux/aio_abi.h>
+#include <linux/socket.h>
+#include <linux/perf_event.h>
+#include <linux/sem.h>
+#include <linux/futex.h>
+#include <linux/poll.h>
+#include <linux/mqueue.h>
+#include <linux/eventpoll.h>
+#include <linux/uio.h>
+#include <asm/signal.h>
+#include <asm/siginfo.h>
+#include <linux/utime.h>
+#include <asm/socket.h>
+#include <linux/icmp.h>
+#include <linux/ip.h>
+
+/* Define data structures used in system calls that are not defined in UAPI
+ * headers
+ */
+struct sockaddr {
+	unsigned short int sa_family;
+	char sa_data[14];
+};
+
+#define __UAPI_DEF_IF_NET_DEVICE_FLAGS_LOWER_UP_DORMANT_ECHO 1
+#define __UAPI_DEF_IF_IFNAMSIZ	1
+#define __UAPI_DEF_IF_NET_DEVICE_FLAGS 1
+#define __UAPI_DEF_IF_IFREQ	1
+#define __UAPI_DEF_IF_IFMAP	1
+#include <linux/if.h>
+#define __UAPI_DEF_IN_IPPROTO	1
+#define __UAPI_DEF_IN_ADDR	1
+#define __UAPI_DEF_IN6_ADDR	1
+#define __UAPI_DEF_IP_MREQ	1
+#define __UAPI_DEF_IN_PKTINFO	1
+#define __UAPI_DEF_SOCKADDR_IN	1
+#define __UAPI_DEF_IN_CLASS	1
+#include <linux/in.h>
+#include <linux/in6.h>
+#include <linux/sockios.h>
+#include <linux/route.h>
+#include <linux/ipv6_route.h>
+#include <linux/ipv6.h>
+#include <linux/netlink.h>
+#include <linux/neighbour.h>
+#include <linux/rtnetlink.h>
+#include <linux/fib_rules.h>
+
+#include <linux/kdev_t.h>
+#include <asm/irq.h>
+#include <linux/virtio_blk.h>
+#include <linux/virtio_net.h>
+#include <linux/virtio_ring.h>
+#include <linux/pkt_sched.h>
+#include <linux/io_uring.h>
+
+struct user_msghdr {
+	void		__user *msg_name;
+	int		msg_namelen;
+	struct iovec	__user *msg_iov;
+	__kernel_size_t	msg_iovlen;
+	void		__user *msg_control;
+	__kernel_size_t	msg_controllen;
+	unsigned int	msg_flags;
+};
+
+typedef __u32 key_serial_t;
+
+struct mmsghdr {
+	struct user_msghdr  msg_hdr;
+	unsigned int        msg_len;
+};
+
+struct linux_dirent64 {
+	u64		d_ino;
+	s64		d_off;
+	unsigned short	d_reclen;
+	unsigned char	d_type;
+	char		d_name[0];
+};
+
+struct linux_dirent {
+	unsigned long	d_ino;
+	unsigned long	d_off;
+	unsigned short	d_reclen;
+	char		d_name[1];
+};
+
+struct ustat {
+	__kernel_daddr_t	f_tfree;
+	__kernel_ino_t		f_tinode;
+	char			f_fname[6];
+	char			f_fpack[6];
+};
+
+typedef __kernel_rwf_t		rwf_t;
+
+#define AF_UNSPEC       0
+#define AF_UNIX         1
+#define AF_LOCAL        1
+#define AF_INET         2
+#define AF_AX25         3
+#define AF_IPX          4
+#define AF_APPLETALK    5
+#define AF_NETROM       6
+#define AF_BRIDGE       7
+#define AF_ATMPVC       8
+#define AF_X25          9
+#define AF_INET6        10
+#define AF_ROSE         11
+#define AF_DECnet       12
+#define AF_NETBEUI      13
+#define AF_SECURITY     14
+#define AF_KEY          15
+#define AF_NETLINK      16
+#define AF_ROUTE        AF_NETLINK
+#define AF_PACKET       17
+#define AF_ASH          18
+#define AF_ECONET       19
+#define AF_ATMSVC       20
+#define AF_RDS          21
+#define AF_SNA          22
+#define AF_IRDA         23
+#define AF_PPPOX        24
+#define AF_WANPIPE      25
+#define AF_LLC          26
+#define AF_IB           27
+#define AF_MPLS         28
+#define AF_CAN          29
+#define AF_TIPC         30
+#define AF_BLUETOOTH    31
+#define AF_IUCV         32
+#define AF_RXRPC        33
+#define AF_ISDN         34
+#define AF_PHONET       35
+#define AF_IEEE802154   36
+#define AF_CAIF         37
+#define AF_ALG          38
+#define AF_NFC          39
+#define AF_VSOCK        40
+
+#define SOCK_STREAM		1
+#define SOCK_DGRAM		2
+#define SOCK_RAW		3
+#define SOCK_RDM		4
+#define SOCK_SEQPACKET		5
+#define SOCK_DCCP		6
+#define SOCK_PACKET		10
+
+#define MSG_TRUNC 0x20
+#define MSG_DONTWAIT 0x40
+
+/* avoid collision with system headers defines */
+#define sa_handler sa_handler
+#define st_atime st_atime
+#define st_mtime st_mtime
+#define st_ctime st_ctime
+#define s_addr s_addr
+
+long lkl_syscall(long no, long *params);
+long lkl_sys_halt(void);
+
+#define __MAP0(m, ...)
+#define __MAP1(m, t, a) m(t, a)
+#define __MAP2(m, t, a, ...) m(t, a), __MAP1(m, __VA_ARGS__)
+#define __MAP3(m, t, a, ...) m(t, a), __MAP2(m, __VA_ARGS__)
+#define __MAP4(m, t, a, ...) m(t, a), __MAP3(m, __VA_ARGS__)
+#define __MAP5(m, t, a, ...) m(t, a), __MAP4(m, __VA_ARGS__)
+#define __MAP6(m, t, a, ...) m(t, a), __MAP5(m, __VA_ARGS__)
+#define __MAP(n, ...) __MAP##n(__VA_ARGS__)
+
+#define __SC_LONG(t, a) (long)a
+#define __SC_TABLE(t, a) {sizeof(t), (long long)(a)}
+#define __SC_DECL(t, a) t a
+
+#define LKL_SYSCALL0(name)					       \
+	static inline long lkl_sys##name(void)			       \
+	{							       \
+		long params[6];					       \
+		return lkl_syscall(__lkl__NR##name, params);	       \
+	}
+
+#if __BITS_PER_LONG == 32
+#define LKL_SYSCALLx(x, name, ...)					\
+	static inline							\
+	long lkl_sys##name(__MAP(x, __SC_DECL, __VA_ARGS__))		\
+	{								\
+		struct {						\
+			unsigned int size;				\
+			long long value;				\
+		} lkl_params[x] = { __MAP(x, __SC_TABLE, __VA_ARGS__) }; \
+		long sys_params[6], i, k;				\
+		for (i = k = 0; i < x && k < 6; i++, k++) {		\
+			if (lkl_params[i].size > sizeof(long) &&	\
+			    k + 1 < 6) {				\
+				sys_params[k] =				\
+					(long)(lkl_params[i].value & (-1UL)); \
+				k++;					\
+				sys_params[k] =				\
+					(long)(lkl_params[i].value >>	\
+					       __BITS_PER_LONG);	\
+			} else {					\
+				sys_params[k] = (long)(lkl_params[i].value); \
+			}						\
+		}							\
+		return lkl_syscall(__lkl__NR##name, sys_params);	\
+	}
+#else
+#define LKL_SYSCALLx(x, name, ...)					\
+	static inline							\
+	long lkl_sys##name(__MAP(x, __SC_DECL, __VA_ARGS__))		\
+	{								\
+		long lkl_params[6] = { __MAP(x, __SC_LONG, __VA_ARGS__) }; \
+		return lkl_syscall(__lkl__NR##name, lkl_params);	\
+	}
+#endif
+
+#define SYSCALL_DEFINE0(name, ...) LKL_SYSCALL0(name)
+#define SYSCALL_DEFINE1(name, ...) LKL_SYSCALLx(1, name, __VA_ARGS__)
+#define SYSCALL_DEFINE2(name, ...) LKL_SYSCALLx(2, name, __VA_ARGS__)
+#define SYSCALL_DEFINE3(name, ...) LKL_SYSCALLx(3, name, __VA_ARGS__)
+#define SYSCALL_DEFINE4(name, ...) LKL_SYSCALLx(4, name, __VA_ARGS__)
+#define SYSCALL_DEFINE5(name, ...) LKL_SYSCALLx(5, name, __VA_ARGS__)
+#define SYSCALL_DEFINE6(name, ...) LKL_SYSCALLx(6, name, __VA_ARGS__)
+
+#if __BITS_PER_LONG == 32
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wpointer-to-int-cast"
+#endif
+
+#include <asm/syscall_defs.h>
+
+#if __BITS_PER_LONG == 32
+#pragma GCC diagnostic pop
+#endif
+
+#endif
diff --git a/arch/um/lkl/kernel/asm-offsets.c b/arch/um/lkl/kernel/asm-offsets.c
new file mode 100644
index 000000000000..6be0763698dc
--- /dev/null
+++ b/arch/um/lkl/kernel/asm-offsets.c
@@ -0,0 +1,2 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Dummy asm-offsets.c file. Required by kbuild and ready to be used - hint! */
diff --git a/arch/um/lkl/kernel/misc.c b/arch/um/lkl/kernel/misc.c
new file mode 100644
index 000000000000..60f048f02ae6
--- /dev/null
+++ b/arch/um/lkl/kernel/misc.c
@@ -0,0 +1,60 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/kallsyms.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/seq_file.h>
+#include <asm/ptrace.h>
+#include <asm/host_ops.h>
+
+#ifdef CONFIG_PRINTK
+void dump_stack(void)
+{
+	unsigned long dummy;
+	unsigned long *stack = &dummy;
+	unsigned long addr;
+
+	pr_info("Call Trace:\n");
+	while (((long)stack & (THREAD_SIZE - 1)) != 0) {
+		addr = *stack;
+		if (__kernel_text_address(addr)) {
+			pr_info("%p:  [<%08lx>] %pS", stack, addr,
+				(void *)addr);
+			pr_cont("\n");
+		}
+		stack++;
+	}
+	pr_info("\n");
+}
+#endif
+
+void show_regs(struct pt_regs *regs)
+{
+}
+
+#ifdef CONFIG_PROC_FS
+static void *cpuinfo_start(struct seq_file *m, loff_t *pos)
+{
+	return NULL;
+}
+
+static void *cpuinfo_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	return NULL;
+}
+
+static void cpuinfo_stop(struct seq_file *m, void *v)
+{
+}
+
+static int show_cpuinfo(struct seq_file *m, void *v)
+{
+	return 0;
+}
+
+const struct seq_operations cpuinfo_op = {
+	.start	= cpuinfo_start,
+	.next	= cpuinfo_next,
+	.stop	= cpuinfo_stop,
+	.show	= show_cpuinfo,
+};
+#endif
diff --git a/arch/um/lkl/kernel/vmlinux.lds.S b/arch/um/lkl/kernel/vmlinux.lds.S
new file mode 100644
index 000000000000..efe420f38110
--- /dev/null
+++ b/arch/um/lkl/kernel/vmlinux.lds.S
@@ -0,0 +1,51 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#include <asm/vmlinux.lds.h>
+#include <asm/thread_info.h>
+#include <asm/page.h>
+#include <asm/cache.h>
+#include <linux/export.h>
+
+OUTPUT_FORMAT(CONFIG_OUTPUT_FORMAT)
+
+VMLINUX_SYMBOL(jiffies) = VMLINUX_SYMBOL(jiffies_64);
+
+SECTIONS
+{
+	VMLINUX_SYMBOL(__init_begin) = .;
+	HEAD_TEXT_SECTION
+	INIT_TEXT_SECTION(PAGE_SIZE)
+	INIT_DATA_SECTION(16)
+	PERCPU_SECTION(L1_CACHE_BYTES)
+	VMLINUX_SYMBOL(__init_end) = .;
+
+	VMLINUX_SYMBOL(_stext) = .;
+	VMLINUX_SYMBOL(_text) = . ;
+	VMLINUX_SYMBOL(text) = . ;
+	.text      :
+	{
+		TEXT_TEXT
+		SCHED_TEXT
+		LOCK_TEXT
+		CPUIDLE_TEXT
+	}
+	VMLINUX_SYMBOL(_etext) = .;
+
+	VMLINUX_SYMBOL(_sdata) = .;
+	RO_DATA_SECTION(PAGE_SIZE)
+	RW_DATA_SECTION(L1_CACHE_BYTES, PAGE_SIZE, THREAD_SIZE)
+	VMLINUX_SYMBOL(_edata) = .;
+
+	VMLINUX_SYMBOL(__start_ro_after_init) = .;
+	.data..ro_after_init : { *(.data..ro_after_init)}
+	EXCEPTION_TABLE(16)
+	VMLINUX_SYMBOL(__end_ro_after_init) = .;
+	NOTES
+
+	BSS_SECTION(0, 0, 0)
+	VMLINUX_SYMBOL(_end) = .;
+
+	STABS_DEBUG
+	DWARF_DEBUG
+
+	DISCARDS
+}
-- 
2.20.1 (Apple Git-117)
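
As a reading aid for the 32-bit LKL_SYSCALLx() wrapper in
uapi/asm/syscalls.h from the patch above: an argument wider than a long
is split into two consecutive slots, low word first, as long as a second
slot is still free. The standalone sketch below mirrors that loop. The
names pack_args32 and struct arg_desc are made up for illustration, and
uint32_t stands in for a 32-bit long so the sketch behaves the same on
any host.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: size of the declared type plus its value. */
struct arg_desc {
	unsigned int size;
	long long value;
};

/*
 * Pack up to six slots. Returns the number of slots consumed.
 * Arguments wider than 32 bits take two slots (low word, then high
 * word) unless only one slot is left, in which case they are truncated.
 */
static size_t pack_args32(const struct arg_desc *args, size_t nargs,
			  uint32_t out[6])
{
	size_t i, k;

	for (i = 0; i < 6; i++)
		out[i] = 0;

	for (i = k = 0; i < nargs && k < 6; i++, k++) {
		unsigned long long v = (unsigned long long)args[i].value;

		if (args[i].size > sizeof(uint32_t) && k + 1 < 6) {
			out[k] = (uint32_t)(v & 0xffffffffULL);
			k++;
			out[k] = (uint32_t)(v >> 32);
		} else {
			out[k] = (uint32_t)v;
		}
	}
	return k;
}
```

With three arguments of sizes 4, 8 and 4 bytes, four slots are used; the
middle argument occupies slots 1 and 2.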


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um



* [RFC v2 04/37] lkl: host interface
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Hajime Tazaki, Michael Zimmermann, Patrick Collins,
	Pierre-Hugues Husson, Yuan Liu

From: Octavian Purdila <tavi.purdila@gmail.com>

This patch introduces the host operations that define the interface
between LKL and the host. These operations must be provided either
by a host library or by the application itself.
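
To illustrate the pattern only: struct lkl_host_operations is still
empty in this patch, so the sketch below uses a hypothetical
demo_host_ops table with assumed members (print, mem_alloc, mem_free).
It shows an application filling in a table of function pointers and the
library side calling only through the pointer it was given.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical operations table; member names are illustrative only. */
struct demo_host_ops {
	void (*print)(const char *str, int len);
	void *(*mem_alloc)(unsigned long size);
	void (*mem_free)(void *ptr);
};

/* Host-side implementations backed by the C library. */
static void demo_print(const char *str, int len)
{
	fwrite(str, 1, (size_t)len, stdout);
}

static void *demo_mem_alloc(unsigned long size)
{
	return malloc(size);
}

static void demo_mem_free(void *ptr)
{
	free(ptr);
}

static struct demo_host_ops demo_ops = {
	.print = demo_print,
	.mem_alloc = demo_mem_alloc,
	.mem_free = demo_mem_free,
};

/* The library side keeps a pointer to the table, much like lkl_ops. */
static struct demo_host_ops *demo_ops_ptr = &demo_ops;

/* Library-side code: no direct libc calls, only host operations. */
static int demo_library_hello(void)
{
	static const char msg[] = "hello from the library\n";
	char *buf = demo_ops_ptr->mem_alloc(sizeof(msg));

	if (!buf)
		return -1;
	memcpy(buf, msg, sizeof(msg));
	demo_ops_ptr->print(buf, (int)(sizeof(msg) - 1));
	demo_ops_ptr->mem_free(buf);
	return 0;
}
```

Swapping in a different table (for example one backed by another OS's
allocator) would not require touching demo_library_hello().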

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Pierre-Hugues Husson <phh@phh.me>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 arch/um/lkl/include/asm/host_ops.h      | 10 ++++++++++
 arch/um/lkl/include/uapi/asm/host_ops.h | 26 +++++++++++++++++++++++++
 2 files changed, 36 insertions(+)
 create mode 100644 arch/um/lkl/include/asm/host_ops.h
 create mode 100644 arch/um/lkl/include/uapi/asm/host_ops.h

diff --git a/arch/um/lkl/include/asm/host_ops.h b/arch/um/lkl/include/asm/host_ops.h
new file mode 100644
index 000000000000..65850f394b79
--- /dev/null
+++ b/arch/um/lkl/include/asm/host_ops.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_HOST_OPS_H
+#define _ASM_LKL_HOST_OPS_H
+
+#include "irq.h"
+#include <uapi/asm/host_ops.h>
+
+extern struct lkl_host_operations *lkl_ops;
+
+#endif
diff --git a/arch/um/lkl/include/uapi/asm/host_ops.h b/arch/um/lkl/include/uapi/asm/host_ops.h
new file mode 100644
index 000000000000..7cfb0a93e6a6
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/host_ops.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_UAPI_LKL_HOST_OPS_H
+#define _ASM_UAPI_LKL_HOST_OPS_H
+
+/* Defined in {posix,nt}-host.c */
+struct lkl_mutex;
+struct lkl_sem;
+struct lkl_tls_key;
+typedef unsigned long lkl_thread_t;
+struct lkl_jmp_buf {
+	unsigned long buf[32];
+};
+
+/**
+ * lkl_host_operations - host operations used by the Linux kernel
+ *
+ * These operations must be provided by a host library or by the application
+ * itself.
+ *
+ */
+struct lkl_host_operations {
+};
+
+void lkl_bug(const char *fmt, ...);
+
+#endif
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 05/37] lkl: memory handling
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Hajime Tazaki, Levente Kurusa, Yuan Liu

From: Octavian Purdila <tavi.purdila@gmail.com>

LKL is a non-MMU architecture, so there is little to do here beyond
initializing the boot allocator and providing the page and page table
definitions.
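
As an illustration of the boot allocator setup, the page trimming that
bootmem_init() applies to the host allocation can be sketched in plain
userspace C. This is only a sketch: SKETCH_PAGE_SIZE, struct sketch_range
and the sketch_* helpers are hypothetical names, and a 4096-byte page is
assumed.

```c
#include <assert.h>

/* Hypothetical page size for this sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096UL

struct sketch_range {
	unsigned long start;	/* first usable address, page aligned */
	unsigned long size;	/* usable bytes counted from start */
};

/* Round addr up to the next page boundary, as PAGE_ALIGN() does. */
static unsigned long sketch_page_align(unsigned long addr)
{
	return (addr + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * Mirror the trimming in bootmem_init(): if the block does not start on a
 * page boundary, skip the unaligned head and drop any partial page at the
 * tail. An already aligned block is returned unchanged.
 */
static struct sketch_range sketch_trim(unsigned long start, unsigned long size)
{
	struct sketch_range r = { start, size };
	unsigned long aligned = sketch_page_align(start);

	if (aligned != start) {
		/* Assumption: the block is larger than the skipped head. */
		assert(size >= aligned - start);
		r.size = size - (aligned - start);
		r.start = aligned;
		r.size = (r.size / SKETCH_PAGE_SIZE) * SKETCH_PAGE_SIZE;
	}
	return r;
}
```

With these assumptions, a block at 0x1010 of size 0x10000 is trimmed to
start 0x2000 and size 0xf000.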

The backing memory is allocated via a host operation, and its size is
specified when the kernel is started, in the lkl_start_kernel call.
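
For illustration, a host could back the two memory callbacks with malloc()
and free(). The sketch below uses a local mirror of the mem_alloc and
mem_free members added by this patch. The sketch_* names are hypothetical
and this is not the posix-host implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Local mirror of the two memory callbacks in struct lkl_host_operations. */
struct sketch_host_operations {
	void *(*mem_alloc)(unsigned long mem);
	void (*mem_free)(void *mem);
};

static void *sketch_mem_alloc(unsigned long size)
{
	return malloc(size);
}

static void sketch_mem_free(void *mem)
{
	free(mem);
}

static const struct sketch_host_operations sketch_ops = {
	.mem_alloc = sketch_mem_alloc,
	.mem_free = sketch_mem_free,
};

/*
 * Stand-in for the bootmem_init()/free_mem() pair: allocate the backing
 * memory through the host operation, touch both ends, then release it.
 * Returns 0 on success and -1 on failure.
 */
static int sketch_backing_roundtrip(const struct sketch_host_operations *ops,
				    unsigned long mem_size)
{
	unsigned char *base;
	int ok;

	assert(ops && ops->mem_alloc && ops->mem_free);
	if (!mem_size)
		return -1;

	base = ops->mem_alloc(mem_size);
	if (!base)
		return -1;

	base[0] = 0xa5;
	base[mem_size - 1] = 0x5a;
	ok = base[0] == 0xa5 && base[mem_size - 1] == 0x5a;

	ops->mem_free(base);
	return ok ? 0 : -1;
}
```

Keeping the allocator behind function pointers is what lets the same kernel
code run on hosts that have no malloc() at all, provided they supply
equivalent callbacks.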

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Levente Kurusa <levex@linux.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 arch/um/lkl/include/asm/page.h          | 14 ++++++
 arch/um/lkl/include/asm/pgtable.h       | 62 +++++++++++++++++++++++
 arch/um/lkl/include/uapi/asm/host_ops.h |  5 ++
 arch/um/lkl/mm/bootmem.c                | 66 +++++++++++++++++++++++++
 4 files changed, 147 insertions(+)
 create mode 100644 arch/um/lkl/include/asm/page.h
 create mode 100644 arch/um/lkl/include/asm/pgtable.h
 create mode 100644 arch/um/lkl/mm/bootmem.c

diff --git a/arch/um/lkl/include/asm/page.h b/arch/um/lkl/include/asm/page.h
new file mode 100644
index 000000000000..e77f3da22031
--- /dev/null
+++ b/arch/um/lkl/include/asm/page.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_PAGE_H
+#define _ASM_LKL_PAGE_H
+
+#define CONFIG_KERNEL_RAM_BASE_ADDRESS memory_start
+
+#ifndef __ASSEMBLY__
+void free_mem(void);
+void bootmem_init(unsigned long mem_size);
+#endif
+
+#include <asm-generic/page.h>
+
+#endif /* _ASM_LKL_PAGE_H */
diff --git a/arch/um/lkl/include/asm/pgtable.h b/arch/um/lkl/include/asm/pgtable.h
new file mode 100644
index 000000000000..b790296abfac
--- /dev/null
+++ b/arch/um/lkl/include/asm/pgtable.h
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_PGTABLE_H
+#define _LKL_PGTABLE_H
+
+#include <asm-generic/4level-fixup.h>
+
+/*
+ * (C) Copyright 2000-2002, Greg Ungerer <gerg@snapgear.com>
+ */
+
+#include <linux/slab.h>
+#include <asm/processor.h>
+#include <asm/io.h>
+
+#define pgd_present(pgd)	(1)
+#define pgd_none(pgd)		(0)
+#define pgd_bad(pgd)		(0)
+#define pgd_clear(pgdp)
+#define kern_addr_valid(addr)	(1)
+#define	pmd_offset(a, b)	((void *)0)
+
+#define PAGE_NONE		__pgprot(0)
+#define PAGE_SHARED		__pgprot(0)
+#define PAGE_COPY		__pgprot(0)
+#define PAGE_READONLY		__pgprot(0)
+#define PAGE_KERNEL		__pgprot(0)
+
+void paging_init(void);
+#define swapper_pg_dir		((pgd_t *)0)
+
+#define __swp_type(x)		(0)
+#define __swp_offset(x)		(0)
+#define __swp_entry(typ, off)	((swp_entry_t) { ((typ) | ((off) << 7)) })
+#define __pte_to_swp_entry(pte)	((swp_entry_t) { pte_val(pte) })
+#define __swp_entry_to_pte(x)	((pte_t) { (x).val })
+
+/*
+ * ZERO_PAGE is a global shared page that is always zero: used
+ * for zero-mapped memory areas, etc.
+ */
+extern void *empty_zero_page;
+#define ZERO_PAGE(vaddr)	(virt_to_page(empty_zero_page))
+
+/*
+ * No page table caches to initialise.
+ */
+#define pgtable_cache_init()	do { } while (0)
+
+/*
+ * All 32bit addresses are effectively valid for vmalloc...
+ * Sort of meaningless for non-VM targets.
+ */
+#define	VMALLOC_START		0
+#define	VMALLOC_END		0xffffffff
+#define	KMAP_START		0
+#define	KMAP_END		0xffffffff
+
+#include <asm-generic/pgtable.h>
+
+#define check_pgt_cache()	do { } while (0)
+
+#endif
diff --git a/arch/um/lkl/include/uapi/asm/host_ops.h b/arch/um/lkl/include/uapi/asm/host_ops.h
index 7cfb0a93e6a6..6bbc94c120be 100644
--- a/arch/um/lkl/include/uapi/asm/host_ops.h
+++ b/arch/um/lkl/include/uapi/asm/host_ops.h
@@ -17,8 +17,13 @@ struct lkl_jmp_buf {
  * These operations must be provided by a host library or by the application
  * itself.
  *
+ * @mem_alloc - allocate memory
+ * @mem_free - free memory
+ *
  */
 struct lkl_host_operations {
+	void *(*mem_alloc)(unsigned long mem);
+	void (*mem_free)(void *mem);
 };
 
 void lkl_bug(const char *fmt, ...);
diff --git a/arch/um/lkl/mm/bootmem.c b/arch/um/lkl/mm/bootmem.c
new file mode 100644
index 000000000000..39dd0d22b44e
--- /dev/null
+++ b/arch/um/lkl/mm/bootmem.c
@@ -0,0 +1,66 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/memblock.h>
+#include <linux/mm.h>
+#include <linux/swap.h>
+
+unsigned long memory_start, memory_end;
+static unsigned long _memory_start, mem_size;
+
+void *empty_zero_page;
+
+void __init bootmem_init(unsigned long mem_sz)
+{
+	mem_size = mem_sz;
+
+	_memory_start = (unsigned long)lkl_ops->mem_alloc(mem_size);
+	memory_start = _memory_start;
+	WARN_ON(!memory_start);
+	memory_end = memory_start + mem_size;
+
+	if (PAGE_ALIGN(memory_start) != memory_start) {
+		mem_size -= PAGE_ALIGN(memory_start) - memory_start;
+		memory_start = PAGE_ALIGN(memory_start);
+		mem_size = (mem_size / PAGE_SIZE) * PAGE_SIZE;
+	}
+	pr_info("memblock address range: 0x%lx - 0x%lx\n", memory_start,
+		memory_start + mem_size);
+	/*
+	 * Give all the memory to the bootmap allocator, tell it to put the
+	 * boot mem_map at the start of memory.
+	 */
+	max_low_pfn = virt_to_pfn(memory_end);
+	min_low_pfn = virt_to_pfn(memory_start);
+	memblock_add(memory_start, mem_size);
+
+	empty_zero_page = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
+	memset((void *)empty_zero_page, 0, PAGE_SIZE);
+
+	{
+		unsigned long zones_size[MAX_NR_ZONES] = {0, };
+
+		zones_size[ZONE_NORMAL] = (mem_size) >> PAGE_SHIFT;
+		free_area_init(zones_size);
+	}
+}
+
+void __init mem_init(void)
+{
+	max_mapnr = (((unsigned long)high_memory) - PAGE_OFFSET) >> PAGE_SHIFT;
+	/* this will put all memory onto the freelists */
+	totalram_pages_add(memblock_free_all());
+	pr_info("Memory available: %luk/%luk RAM\n",
+		(nr_free_pages() << PAGE_SHIFT) >> 10, mem_size >> 10);
+}
+
+/*
+ * In our case __init memory is not part of the page allocator so there is
+ * nothing to free.
+ */
+void free_initmem(void)
+{
+}
+
+void free_mem(void)
+{
+	lkl_ops->mem_free((void *)_memory_start);
+}
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 06/37] lkl: kernel threads support
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Hajime Tazaki, Lai Jiangshan, Patrick Collins, Yuan Liu

From: Octavian Purdila <tavi.purdila@gmail.com>

LKL does not support user processes, but it must support kernel threads
as part of the normal kernel workflow. It uses host operations to
create and terminate the host threads that run the kernel threads. It
also uses semaphores to synchronize those threads and to allow the
Linux kernel scheduler to control how the kernel threads run.

Each kernel thread runs in a host thread and has a host semaphore
associated with it: the thread's scheduling semaphore. The semaphore
counter is initialized to 0. The first thing a kernel thread does
after being spawned, before running any kernel code, is to perform a
down operation, which blocks the thread.
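
The start-up behaviour can be sketched with POSIX threads and unnamed
semaphores standing in for the host operations. This is a userspace sketch
under that assumption. The sketch_* names are hypothetical and EINTR
handling is omitted.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

struct sketch_thread {
	sem_t sched_sem;	/* scheduling semaphore, starts at 0 */
	int ran;		/* set once the thread body has run */
	pthread_t tid;
};

/* Like thread_bootstrap(): block first, run the body only when scheduled. */
static void *sketch_bootstrap(void *arg)
{
	struct sketch_thread *t = arg;

	sem_wait(&t->sched_sem);	/* down: wait until scheduled */
	t->ran = 1;			/* stands in for the kernel thread body */
	return NULL;
}

/* Create the host thread; it stays blocked on its semaphore. */
static int sketch_spawn(struct sketch_thread *t)
{
	t->ran = 0;
	if (sem_init(&t->sched_sem, 0, 0))
		return -1;
	return pthread_create(&t->tid, NULL, sketch_bootstrap, t) ? -1 : 0;
}

/* Schedule the thread with an up operation and wait for it to finish. */
static void sketch_schedule_and_join(struct sketch_thread *t)
{
	sem_post(&t->sched_sem);
	pthread_join(t->tid, NULL);
	sem_destroy(&t->sched_sem);
}
```

Because the semaphore starts at 0, the body cannot run between
sketch_spawn() and sketch_schedule_and_join(), no matter how the host
schedules the new thread.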

The kernel controls host thread scheduling by performing up and down
operations on the scheduling semaphores. In __switch_to, an up
operation on the next thread wakes it up, and a down operation on the
prev thread blocks it.
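
The hand-off can be sketched with two POSIX threads that pass a single
virtual CPU back and forth. This is a userspace sketch, not the kernel
code. sketch_switch(), sketch_worker(), sketch_run() and the trace array
are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

#define SKETCH_ROUNDS 3

static sem_t sched_sem[2];		/* one scheduling semaphore per thread */
static int trace[2 * SKETCH_ROUNDS];	/* order in which the threads ran */
static int trace_len;

/* The core of the switch: wake up next, then block prev. */
static void sketch_switch(int prev, int next)
{
	sem_post(&sched_sem[next]);
	sem_wait(&sched_sem[prev]);
}

static void *sketch_worker(void *arg)
{
	int self = (int)(long)arg;
	int other = 1 - self;
	int i;

	sem_wait(&sched_sem[self]);	/* initial down: wait to be scheduled */
	for (i = 0; i < SKETCH_ROUNDS; i++) {
		trace[trace_len++] = self;	/* only the running thread writes */
		sketch_switch(self, other);
	}
	sem_post(&sched_sem[other]);	/* let the peer leave its last down */
	return NULL;
}

/* Run both workers; returns the number of trace entries or -1 on error. */
static int sketch_run(void)
{
	pthread_t tid[2];
	long i;

	trace_len = 0;
	for (i = 0; i < 2; i++)
		if (sem_init(&sched_sem[i], 0, 0))
			return -1;
	for (i = 0; i < 2; i++)
		if (pthread_create(&tid[i], NULL, sketch_worker, (void *)i))
			return -1;

	sem_post(&sched_sem[0]);	/* schedule thread 0 first */
	for (i = 0; i < 2; i++)
		pthread_join(tid[i], NULL);
	for (i = 0; i < 2; i++)
		sem_destroy(&sched_sem[i]);
	return trace_len;
}
```

At most one worker is runnable at any time, so the trace alternates
strictly between the two threads even though the host scheduler is free to
run them in any order.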

A thread is terminated by marking it as dead in free_thread_stack and
performing an up operation on its scheduling semaphore, at which point
the marked thread terminates itself.
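
The termination path, modelled on kill_thread() in this patch, can be
sketched the same way with POSIX threads. The sketch_* names and the runs
and exited counters are hypothetical and exist only to make the behaviour
observable.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>

struct sketch_ti {
	sem_t sched_sem;	/* scheduling semaphore, starts at 0 */
	bool dead;		/* set by the killer before the final up */
	int runs;		/* times the thread was scheduled normally */
	int exited;		/* set when the thread saw the dead mark */
	pthread_t tid;
};

static void *sketch_kthread(void *arg)
{
	struct sketch_ti *ti = arg;

	for (;;) {
		sem_wait(&ti->sched_sem);	/* blocked until an up */
		if (ti->dead) {
			ti->exited = 1;
			return NULL;		/* thread_exit() equivalent */
		}
		ti->runs++;
	}
}

static int sketch_create(struct sketch_ti *ti)
{
	ti->dead = false;
	ti->runs = 0;
	ti->exited = 0;
	if (sem_init(&ti->sched_sem, 0, 0))
		return -1;
	return pthread_create(&ti->tid, NULL, sketch_kthread, ti) ? -1 : 0;
}

/* Mark the thread dead, wake it up, wait for it, then free the semaphore. */
static void sketch_kill_thread(struct sketch_ti *ti)
{
	ti->dead = true;
	sem_post(&ti->sched_sem);
	pthread_join(ti->tid, NULL);
	sem_destroy(&ti->sched_sem);
}
```

Setting the mark before the up operation means the woken thread always
observes it, so the thread exits on its own stack instead of being
cancelled from outside.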

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 arch/um/lkl/include/asm/thread_info.h   |  70 ++++++++
 arch/um/lkl/include/uapi/asm/host_ops.h |  55 ++++++
 arch/um/lkl/kernel/cpu.c                | 223 +++++++++++++++++++++++
 arch/um/lkl/kernel/threads.c            | 227 ++++++++++++++++++++++++
 4 files changed, 575 insertions(+)
 create mode 100644 arch/um/lkl/include/asm/thread_info.h
 create mode 100644 arch/um/lkl/kernel/cpu.c
 create mode 100644 arch/um/lkl/kernel/threads.c

diff --git a/arch/um/lkl/include/asm/thread_info.h b/arch/um/lkl/include/asm/thread_info.h
new file mode 100644
index 000000000000..da4e75fc7b10
--- /dev/null
+++ b/arch/um/lkl/include/asm/thread_info.h
@@ -0,0 +1,70 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_THREAD_INFO_H
+#define _ASM_LKL_THREAD_INFO_H
+
+#define THREAD_SIZE	       (4096)
+
+#ifndef __ASSEMBLY__
+#include <asm/types.h>
+#include <asm/processor.h>
+#include <asm/host_ops.h>
+
+typedef struct {
+	unsigned long seg;
+} mm_segment_t;
+
+struct thread_info {
+	struct task_struct *task;
+	unsigned long flags;
+	int preempt_count;
+	mm_segment_t addr_limit;
+	struct lkl_sem *sched_sem;
+	struct lkl_jmp_buf sched_jb;
+	bool dead;
+	lkl_thread_t tid;
+	struct task_struct *prev_sched;
+	unsigned long stackend;
+};
+
+#define INIT_THREAD_INFO(tsk)				\
+{							\
+	.task		= &tsk,				\
+	.preempt_count	= INIT_PREEMPT_COUNT,		\
+	.flags		= 0,				\
+	.addr_limit	= KERNEL_DS,			\
+}
+
+/* how to get the thread information struct from C */
+extern struct thread_info *_current_thread_info;
+static inline struct thread_info *current_thread_info(void)
+{
+	return _current_thread_info;
+}
+
+/* thread information allocation */
+unsigned long *alloc_thread_stack_node(struct task_struct *, int node);
+void free_thread_stack(struct task_struct *tsk);
+
+void threads_init(void);
+void threads_cleanup(void);
+
+#define TIF_SYSCALL_TRACE		0
+#define TIF_NOTIFY_RESUME		1
+#define TIF_SIGPENDING			2
+#define TIF_NEED_RESCHED		3
+#define TIF_RESTORE_SIGMASK		4
+#define TIF_MEMDIE			5
+#define TIF_NOHZ			6
+#define TIF_SCHED_JB			7
+#define TIF_HOST_THREAD			8
+
+#define __HAVE_THREAD_FUNCTIONS
+
+#define task_thread_info(task)	((struct thread_info *)(task)->stack)
+#define task_stack_page(task)	((task)->stack)
+void setup_thread_stack(struct task_struct *p, struct task_struct *org);
+#define end_of_stack(p) (&task_thread_info(p)->stackend)
+
+#endif /* __ASSEMBLY__ */
+
+#endif
diff --git a/arch/um/lkl/include/uapi/asm/host_ops.h b/arch/um/lkl/include/uapi/asm/host_ops.h
index 6bbc94c120be..19924fc7c718 100644
--- a/arch/um/lkl/include/uapi/asm/host_ops.h
+++ b/arch/um/lkl/include/uapi/asm/host_ops.h
@@ -17,15 +17,70 @@ struct lkl_jmp_buf {
  * These operations must be provided by a host library or by the application
  * itself.
  *
+ * @sem_alloc - allocate a host semaphore and initialize it to count
+ * @sem_free - free a host semaphore
+ * @sem_up - perform an up operation on the semaphore
+ * @sem_down - perform a down operation on the semaphore
+ *
+ * @mutex_alloc - allocate and initialize a host mutex; the recursive parameter
+ * determines if the mutex is recursive or not
+ * @mutex_free - free a host mutex
+ * @mutex_lock - acquire the mutex
+ * @mutex_unlock - release the mutex
+ *
+ * @thread_create - create a new thread and run f(arg) in its context; returns a
+ * thread handle or 0 if the thread could not be created
+ * @thread_detach - on POSIX systems, free up resources held by
+ * pthreads. Noop on Win32.
+ * @thread_exit - terminates the current thread
+ * @thread_join - wait for the given thread to terminate. Returns 0
+ * for success, -1 otherwise
+ *
  * @mem_alloc - allocate memory
  * @mem_free - free memory
  *
+ * @gettid - returns the host thread id of the caller, which need not
+ * be the same as the handle returned by thread_create
+ *
+ * @jmp_buf_set - runs the given function and sets up a jump back point by
+ * saving the context in the jump buffer; jmp_buf_longjmp can be called from
+ * the given function or any callee in that function to return back to the
+ * jump back point
+ *
+ * NOTE: we can't return from jmp_buf_set before calling jmp_buf_longjmp or
+ * otherwise the saved context (stack) is not going to be valid, so we must pass
+ * the function that will eventually call longjmp here
+ *
+ * @jmp_buf_longjmp - perform a jump back to the saved jump buffer
  */
 struct lkl_host_operations {
+	struct lkl_sem *(*sem_alloc)(int count);
+	void (*sem_free)(struct lkl_sem *sem);
+	void (*sem_up)(struct lkl_sem *sem);
+	void (*sem_down)(struct lkl_sem *sem);
+
+	struct lkl_mutex *(*mutex_alloc)(int recursive);
+	void (*mutex_free)(struct lkl_mutex *mutex);
+	void (*mutex_lock)(struct lkl_mutex *mutex);
+	void (*mutex_unlock)(struct lkl_mutex *mutex);
+
+	lkl_thread_t (*thread_create)(void (*f)(void *), void *arg);
+	void (*thread_detach)(void);
+	void (*thread_exit)(void);
+	int (*thread_join)(lkl_thread_t tid);
+	lkl_thread_t (*thread_self)(void);
+	int (*thread_equal)(lkl_thread_t a, lkl_thread_t b);
+
 	void *(*mem_alloc)(unsigned long mem);
 	void (*mem_free)(void *mem);
+
+	long (*gettid)(void);
+
+	void (*jmp_buf_set)(struct lkl_jmp_buf *jmpb, void (*f)(void));
+	void (*jmp_buf_longjmp)(struct lkl_jmp_buf *jmpb, int val);
 };
 
+int lkl_printf(const char *fmt, ...);
 void lkl_bug(const char *fmt, ...);
 
 #endif
diff --git a/arch/um/lkl/kernel/cpu.c b/arch/um/lkl/kernel/cpu.c
new file mode 100644
index 000000000000..125af3b2d5dd
--- /dev/null
+++ b/arch/um/lkl/kernel/cpu.c
@@ -0,0 +1,223 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/kernel.h>
+#include <linux/sched/stat.h>
+#include <asm/host_ops.h>
+#include <asm/cpu.h>
+#include <asm/thread_info.h>
+#include <asm/unistd.h>
+#include <asm/sched.h>
+#include <asm/syscalls.h>
+
+/*
+ * This structure is used to get access to the "LKL CPU" that allows us to run
+ * Linux code. Because we have to deal with various synchronization requirements
+ * between idle thread, system calls, interrupts, "reentrancy", CPU shutdown,
+ * imbalance wake up (i.e. acquire the CPU from one thread and release it from
+ * another), we can't use a simple synchronization mechanism such as (recursive)
+ * mutex or semaphore. Instead, we use a mutex and a bunch of status data plus a
+ * semaphore.
+ */
+static struct lkl_cpu {
+	/* lock that protects the CPU status data */
+	struct lkl_mutex *lock;
+	/*
+	 * Since we must free the cpu lock during shutdown we need a
+	 * synchronization algorithm between lkl_cpu_shutdown() and the CPU
+	 * access functions since lkl_cpu_get() gets called from thread
+	 * destructor callback functions which may be scheduled after
+	 * lkl_cpu_shutdown() has freed the cpu lock.
+	 *
+	 * An atomic counter is used to keep track of the number of running
+	 * CPU access functions and allow the shutdown function to wait for
+	 * them.
+	 *
+	 * The shutdown function adds MAX_THREADS to this counter which allows
+	 * the CPU access functions to check if the shutdown process has
+	 * started.
+	 *
+	 * This algorithm assumes that we never have more than MAX_THREADS
+	 * requesting CPU access.
+	 */
+	#define MAX_THREADS 1000000
+	unsigned int shutdown_gate;
+	bool irqs_pending;
+	/* number of threads waiting for the CPU */
+	unsigned int sleepers;
+	/* number of times the current thread got the CPU */
+	unsigned int count;
+	/* current thread that owns the CPU */
+	lkl_thread_t owner;
+	/* semaphore for threads waiting for the CPU */
+	struct lkl_sem *sem;
+	/* semaphore used for shutdown */
+	struct lkl_sem *shutdown_sem;
+} cpu;
+
+static int __cpu_try_get_lock(int n)
+{
+	lkl_thread_t self;
+
+	if (__sync_fetch_and_add(&cpu.shutdown_gate, n) >= MAX_THREADS)
+		return -2;
+
+	lkl_ops->mutex_lock(cpu.lock);
+
+	if (cpu.shutdown_gate >= MAX_THREADS)
+		return -1;
+
+	self = lkl_ops->thread_self();
+
+	if (cpu.owner && !lkl_ops->thread_equal(cpu.owner, self))
+		return 0;
+
+	cpu.owner = self;
+	cpu.count++;
+
+	return 1;
+}
+
+static void __cpu_try_get_unlock(int lock_ret, int n)
+{
+	if (lock_ret >= -1)
+		lkl_ops->mutex_unlock(cpu.lock);
+	__sync_fetch_and_sub(&cpu.shutdown_gate, n);
+}
+
+void lkl_cpu_change_owner(lkl_thread_t owner)
+{
+	lkl_ops->mutex_lock(cpu.lock);
+	if (cpu.count > 1)
+		lkl_bug("bad count while changing owner\n");
+	cpu.owner = owner;
+	lkl_ops->mutex_unlock(cpu.lock);
+}
+
+int lkl_cpu_get(void)
+{
+	int ret;
+
+	ret = __cpu_try_get_lock(1);
+
+	while (ret == 0) {
+		cpu.sleepers++;
+		__cpu_try_get_unlock(ret, 0);
+		lkl_ops->sem_down(cpu.sem);
+		ret = __cpu_try_get_lock(0);
+	}
+
+	__cpu_try_get_unlock(ret, 1);
+
+	return ret;
+}
+
+void lkl_cpu_put(void)
+{
+	lkl_ops->mutex_lock(cpu.lock);
+
+	if (!cpu.count || !cpu.owner ||
+	    !lkl_ops->thread_equal(cpu.owner, lkl_ops->thread_self()))
+		lkl_bug("%s: unbalanced put\n", __func__);
+
+	while (cpu.irqs_pending && !irqs_disabled()) {
+		cpu.irqs_pending = false;
+		lkl_ops->mutex_unlock(cpu.lock);
+		run_irqs();
+		lkl_ops->mutex_lock(cpu.lock);
+	}
+
+	if (test_ti_thread_flag(current_thread_info(), TIF_HOST_THREAD) &&
+	    !single_task_running() && cpu.count == 1) {
+		if (in_interrupt())
+			lkl_bug("%s: in interrupt\n", __func__);
+		lkl_ops->mutex_unlock(cpu.lock);
+		thread_sched_jb();
+		return;
+	}
+
+	if (--cpu.count > 0) {
+		lkl_ops->mutex_unlock(cpu.lock);
+		return;
+	}
+
+	if (cpu.sleepers) {
+		cpu.sleepers--;
+		lkl_ops->sem_up(cpu.sem);
+	}
+
+	cpu.owner = 0;
+
+	lkl_ops->mutex_unlock(cpu.lock);
+}
+
+int lkl_cpu_try_run_irq(int irq)
+{
+	int ret;
+
+	ret = __cpu_try_get_lock(1);
+	if (!ret) {
+		set_irq_pending(irq);
+		cpu.irqs_pending = true;
+	}
+	__cpu_try_get_unlock(ret, 1);
+
+	return ret;
+}
+
+void lkl_cpu_shutdown(void)
+{
+	__sync_fetch_and_add(&cpu.shutdown_gate, MAX_THREADS);
+}
+
+void lkl_cpu_wait_shutdown(void)
+{
+	lkl_ops->sem_down(cpu.shutdown_sem);
+	lkl_ops->sem_free(cpu.shutdown_sem);
+}
+
+static void lkl_cpu_cleanup(bool shutdown)
+{
+	while (__sync_fetch_and_add(&cpu.shutdown_gate, 0) > MAX_THREADS)
+		;
+
+	if (shutdown)
+		lkl_ops->sem_up(cpu.shutdown_sem);
+	else if (cpu.shutdown_sem)
+		lkl_ops->sem_free(cpu.shutdown_sem);
+	if (cpu.sem)
+		lkl_ops->sem_free(cpu.sem);
+	if (cpu.lock)
+		lkl_ops->mutex_free(cpu.lock);
+}
+
+void arch_cpu_idle(void)
+{
+	if (cpu.shutdown_gate >= MAX_THREADS) {
+		lkl_ops->mutex_lock(cpu.lock);
+		while (cpu.sleepers--)
+			lkl_ops->sem_up(cpu.sem);
+		lkl_ops->mutex_unlock(cpu.lock);
+
+		lkl_cpu_cleanup(true);
+
+		lkl_ops->thread_exit();
+	}
+	/* enable irqs now to allow direct irqs to run */
+	local_irq_enable();
+
+	/* switch to idle_host_task */
+	wakeup_idle_host_task();
+}
+
+int lkl_cpu_init(void)
+{
+	cpu.lock = lkl_ops->mutex_alloc(0);
+	cpu.sem = lkl_ops->sem_alloc(0);
+	cpu.shutdown_sem = lkl_ops->sem_alloc(0);
+
+	if (!cpu.lock || !cpu.sem || !cpu.shutdown_sem) {
+		lkl_cpu_cleanup(false);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
diff --git a/arch/um/lkl/kernel/threads.c b/arch/um/lkl/kernel/threads.c
new file mode 100644
index 000000000000..4fe8c56ae5e0
--- /dev/null
+++ b/arch/um/lkl/kernel/threads.c
@@ -0,0 +1,227 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/sched/task.h>
+#include <linux/sched/signal.h>
+#include <asm/host_ops.h>
+#include <asm/cpu.h>
+#include <asm/sched.h>
+
+static int init_ti(struct thread_info *ti)
+{
+	ti->sched_sem = lkl_ops->sem_alloc(0);
+	if (!ti->sched_sem)
+		return -ENOMEM;
+
+	ti->dead = false;
+	ti->prev_sched = NULL;
+	ti->tid = 0;
+
+	return 0;
+}
+
+unsigned long *alloc_thread_stack_node(struct task_struct *task, int node)
+{
+	struct thread_info *ti;
+
+	ti = kmalloc(sizeof(*ti), GFP_KERNEL);
+	if (!ti)
+		return NULL;
+
+	if (init_ti(ti)) {
+		kfree(ti);
+		return NULL;
+	}
+	ti->task = task;
+
+	return (unsigned long *)ti;
+}
+
+/*
+ * The only new tasks created are kernel threads that have a predefined starting
+ * point thus no stack copy is required.
+ */
+void setup_thread_stack(struct task_struct *p, struct task_struct *org)
+{
+	struct thread_info *ti = task_thread_info(p);
+	struct thread_info *org_ti = task_thread_info(org);
+
+	ti->flags = org_ti->flags;
+	ti->preempt_count = org_ti->preempt_count;
+	ti->addr_limit = org_ti->addr_limit;
+}
+
+static void kill_thread(struct thread_info *ti)
+{
+	if (!test_ti_thread_flag(ti, TIF_HOST_THREAD)) {
+		ti->dead = true;
+		lkl_ops->sem_up(ti->sched_sem);
+		lkl_ops->thread_join(ti->tid);
+	}
+	lkl_ops->sem_free(ti->sched_sem);
+}
+
+void free_thread_stack(struct task_struct *tsk)
+{
+	struct thread_info *ti = task_thread_info(tsk);
+
+	kill_thread(ti);
+	kfree(ti);
+}
+
+struct thread_info *_current_thread_info = &init_thread_union.thread_info;
+
+/*
+ * schedule() expects the return of this function to be the task that we
+ * switched away from. Returning prev is not going to work because we are
+ * actually going to return the previous task that was scheduled before the
+ * task we are going to wake up, and not the current task, e.g.:
+ *
+ * swapper -> init: saved prev on swapper stack is swapper
+ * init -> ksoftirqd0: saved prev on init stack is init
+ * ksoftirqd0 -> swapper: returned prev is swapper
+ */
+static struct task_struct *abs_prev = &init_task;
+
+struct task_struct *__switch_to(struct task_struct *prev,
+				struct task_struct *next)
+{
+	struct thread_info *_prev = task_thread_info(prev);
+	struct thread_info *_next = task_thread_info(next);
+	unsigned long _prev_flags = _prev->flags;
+	struct lkl_jmp_buf _prev_jb;
+
+	_current_thread_info = task_thread_info(next);
+	_next->prev_sched = prev;
+	abs_prev = prev;
+
+	BUG_ON(!_next->tid);
+	lkl_cpu_change_owner(_next->tid);
+
+	if (test_bit(TIF_SCHED_JB, &_prev_flags)) {
+		/* Atomic. Must be done before wakeup next */
+		clear_ti_thread_flag(_prev, TIF_SCHED_JB);
+		_prev_jb = _prev->sched_jb;
+	}
+
+	lkl_ops->sem_up(_next->sched_sem);
+	if (test_bit(TIF_SCHED_JB, &_prev_flags))
+		lkl_ops->jmp_buf_longjmp(&_prev_jb, 1);
+	else
+		lkl_ops->sem_down(_prev->sched_sem);
+
+	if (_prev->dead)
+		lkl_ops->thread_exit();
+
+	return abs_prev;
+}
+
+int host_task_stub(void *unused)
+{
+	return 0;
+}
+
+void switch_to_host_task(struct task_struct *task)
+{
+	if (WARN_ON(!test_tsk_thread_flag(task, TIF_HOST_THREAD)))
+		return;
+
+	task_thread_info(task)->tid = lkl_ops->thread_self();
+
+	if (current == task)
+		return;
+
+	wake_up_process(task);
+	thread_sched_jb();
+	lkl_ops->sem_down(task_thread_info(task)->sched_sem);
+	schedule_tail(abs_prev);
+}
+
+struct thread_bootstrap_arg {
+	struct thread_info *ti;
+	int (*f)(void *arg);
+	void *arg;
+};
+
+static void thread_bootstrap(void *_tba)
+{
+	struct thread_bootstrap_arg *tba = (struct thread_bootstrap_arg *)_tba;
+	struct thread_info *ti = tba->ti;
+	int (*f)(void *) = tba->f;
+	void *arg = tba->arg;
+
+	lkl_ops->sem_down(ti->sched_sem);
+	kfree(tba);
+	if (ti->prev_sched)
+		schedule_tail(ti->prev_sched);
+
+	f(arg);
+	do_exit(0);
+}
+
+int copy_thread(unsigned long clone_flags, unsigned long esp,
+		unsigned long unused, struct task_struct *p)
+{
+	struct thread_info *ti = task_thread_info(p);
+	struct thread_bootstrap_arg *tba;
+
+	if ((int (*)(void *))esp == host_task_stub) {
+		set_ti_thread_flag(ti, TIF_HOST_THREAD);
+		return 0;
+	}
+
+	tba = kmalloc(sizeof(*tba), GFP_KERNEL);
+	if (!tba)
+		return -ENOMEM;
+
+	tba->f = (int (*)(void *))esp;
+	tba->arg = (void *)unused;
+	tba->ti = ti;
+
+	ti->tid = lkl_ops->thread_create(thread_bootstrap, tba);
+	if (!ti->tid) {
+		kfree(tba);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+void show_stack(struct task_struct *task, unsigned long *esp)
+{
+}
+
+/**
+ * This is called before the kernel initializes, so no kernel calls (including
+ * printk) can be made yet.
+ */
+void threads_init(void)
+{
+	int ret;
+	struct thread_info *ti = &init_thread_union.thread_info;
+
+	ret = init_ti(ti);
+	if (ret < 0)
+		lkl_printf("lkl: failed to allocate init schedule semaphore\n");
+
+	ti->tid = lkl_ops->thread_self();
+}
+
+void threads_cleanup(void)
+{
+	struct task_struct *p, *t;
+
+	for_each_process_thread(p, t) {
+		struct thread_info *ti = task_thread_info(t);
+
+		if (t->pid != 1 && !test_ti_thread_flag(ti, TIF_HOST_THREAD))
+			WARN(!(t->flags & PF_KTHREAD),
+			     "non kernel thread task %s\n", t->comm);
+		WARN(t->state == TASK_RUNNING,
+		     "thread %s still running while halting\n", t->comm);
+
+		kill_thread(ti);
+	}
+
+	lkl_ops->sem_free(init_thread_union.thread_info.sched_sem);
+}
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 06/37] lkl: kernel threads support
@ 2019-11-08  5:02     ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: linux-arch, Octavian Purdila, Lai Jiangshan, Akira Moroo,
	Yuan Liu, Patrick Collins, linux-kernel-library, Hajime Tazaki

From: Octavian Purdila <tavi.purdila@gmail.com>

LKL does not support user processes but it must support kernel threads
as part of the normal kernel workflow. It uses host operations to
create and terminate the host threads that run the kernel threads. It
also uses semaphores to synchronize those threads and to allow the
Linux kernel scheduler to control how the kernel threads run.

Each kernel thread runs in a host thread and has a host semaphore
associated with it - the thread's scheduling semaphore. The semaphore
counter is initialized to 0. The first thing a kernel thread does
after getting spawned, before running any kernel code, is to perform a
down operation to block itself.

The kernel controls host thread scheduling by performing up and down
operations on the scheduling semaphore. In __switch_to an up
operation on the next thread is performed to wake up a blocked thread,
and a down operation is performed on the prev thread to block it.

A thread is terminated by marking it dead in free_thread_stack and
performing an up operation on its scheduling semaphore, at which point
the marked thread will terminate itself.

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
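Note (illustration only, not part of the patch): the scheduling
semaphore handoff described above can be sketched in plain userspace C.
This is a minimal sketch assuming POSIX threads and unnamed POSIX
semaphores instead of lkl_ops; struct demo_thread, demo_bootstrap() and
demo_run() are hypothetical names. Unlike __switch_to, a thread here
exits after waking its successor rather than doing a down on its own
semaphore.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

#define NTHREADS 3

/* Hypothetical stand-in for thread_info: one scheduling semaphore each. */
struct demo_thread {
	pthread_t tid;
	sem_t sched_sem;	/* starts at 0, so the first down blocks */
	int id;
};

static struct demo_thread threads[NTHREADS];
static sem_t all_done;		/* lets the caller wait for the last thread */
static int run_order[NTHREADS];	/* records which thread ran, in order */
static int run_count;

static void *demo_bootstrap(void *arg)
{
	struct demo_thread *self = arg;

	/* Block before doing any work, until this thread is "scheduled". */
	sem_wait(&self->sched_sem);

	assert(run_count < NTHREADS);
	run_order[run_count++] = self->id;

	/* "Switch" to the next thread with an up on its semaphore. */
	if (self->id + 1 < NTHREADS)
		sem_post(&threads[self->id + 1].sched_sem);
	else
		sem_post(&all_done);
	return NULL;
}

/* Returns 0 if the threads ran strictly in the order 0, 1, 2. */
int demo_run(void)
{
	int i;

	run_count = 0;
	sem_init(&all_done, 0, 0);
	for (i = 0; i < NTHREADS; i++) {
		threads[i].id = i;
		sem_init(&threads[i].sched_sem, 0, 0);
	}

	/* Create in reverse order: creation order does not decide who runs. */
	for (i = NTHREADS - 1; i >= 0; i--)
		if (pthread_create(&threads[i].tid, NULL, demo_bootstrap,
				   &threads[i]))
			return -1;	/* sketch: no cleanup on failure */

	sem_post(&threads[0].sched_sem);	/* schedule the first thread */
	sem_wait(&all_done);

	for (i = 0; i < NTHREADS; i++) {
		pthread_join(threads[i].tid, NULL);
		sem_destroy(&threads[i].sched_sem);
	}
	sem_destroy(&all_done);

	for (i = 0; i < NTHREADS; i++)
		if (run_order[i] != i)
			return 1;
	return 0;
}
```

Calling demo_run() from a test program (linked with -pthread where
needed) should return 0: although all host threads exist at once, only
the one whose semaphore was upped makes progress.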
 arch/um/lkl/include/asm/thread_info.h   |  70 ++++++++
 arch/um/lkl/include/uapi/asm/host_ops.h |  55 ++++++
 arch/um/lkl/kernel/cpu.c                | 223 +++++++++++++++++++++++
 arch/um/lkl/kernel/threads.c            | 227 ++++++++++++++++++++++++
 4 files changed, 575 insertions(+)
 create mode 100644 arch/um/lkl/include/asm/thread_info.h
 create mode 100644 arch/um/lkl/kernel/cpu.c
 create mode 100644 arch/um/lkl/kernel/threads.c

diff --git a/arch/um/lkl/include/asm/thread_info.h b/arch/um/lkl/include/asm/thread_info.h
new file mode 100644
index 000000000000..da4e75fc7b10
--- /dev/null
+++ b/arch/um/lkl/include/asm/thread_info.h
@@ -0,0 +1,70 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_THREAD_INFO_H
+#define _ASM_LKL_THREAD_INFO_H
+
+#define THREAD_SIZE	       (4096)
+
+#ifndef __ASSEMBLY__
+#include <asm/types.h>
+#include <asm/processor.h>
+#include <asm/host_ops.h>
+
+typedef struct {
+	unsigned long seg;
+} mm_segment_t;
+
+struct thread_info {
+	struct task_struct *task;
+	unsigned long flags;
+	int preempt_count;
+	mm_segment_t addr_limit;
+	struct lkl_sem *sched_sem;
+	struct lkl_jmp_buf sched_jb;
+	bool dead;
+	lkl_thread_t tid;
+	struct task_struct *prev_sched;
+	unsigned long stackend;
+};
+
+#define INIT_THREAD_INFO(tsk)				\
+{							\
+	.task		= &tsk,				\
+	.preempt_count	= INIT_PREEMPT_COUNT,		\
+	.flags		= 0,				\
+	.addr_limit	= KERNEL_DS,			\
+}
+
+/* how to get the thread information struct from C */
+extern struct thread_info *_current_thread_info;
+static inline struct thread_info *current_thread_info(void)
+{
+	return _current_thread_info;
+}
+
+/* thread information allocation */
+unsigned long *alloc_thread_stack_node(struct task_struct *, int node);
+void free_thread_stack(struct task_struct *tsk);
+
+void threads_init(void);
+void threads_cleanup(void);
+
+#define TIF_SYSCALL_TRACE		0
+#define TIF_NOTIFY_RESUME		1
+#define TIF_SIGPENDING			2
+#define TIF_NEED_RESCHED		3
+#define TIF_RESTORE_SIGMASK		4
+#define TIF_MEMDIE			5
+#define TIF_NOHZ			6
+#define TIF_SCHED_JB			7
+#define TIF_HOST_THREAD			8
+
+#define __HAVE_THREAD_FUNCTIONS
+
+#define task_thread_info(task)	((struct thread_info *)(task)->stack)
+#define task_stack_page(task)	((task)->stack)
+void setup_thread_stack(struct task_struct *p, struct task_struct *org);
+#define end_of_stack(p) (&task_thread_info(p)->stackend)
+
+#endif /* __ASSEMBLY__ */
+
+#endif
diff --git a/arch/um/lkl/include/uapi/asm/host_ops.h b/arch/um/lkl/include/uapi/asm/host_ops.h
index 6bbc94c120be..19924fc7c718 100644
--- a/arch/um/lkl/include/uapi/asm/host_ops.h
+++ b/arch/um/lkl/include/uapi/asm/host_ops.h
@@ -17,15 +17,70 @@ struct lkl_jmp_buf {
  * These operations must be provided by a host library or by the application
  * itself.
  *
+ * @sem_alloc - allocate a host semaphore and initialize it to count
+ * @sem_free - free a host semaphore
+ * @sem_up - perform an up operation on the semaphore
+ * @sem_down - perform a down operation on the semaphore
+ *
+ * @mutex_alloc - allocate and initialize a host mutex; the recursive parameter
+ * determines if the mutex is recursive or not
+ * @mutex_free - free a host mutex
+ * @mutex_lock - acquire the mutex
+ * @mutex_unlock - release the mutex
+ *
+ * @thread_create - create a new thread and run f(arg) in its context; returns a
+ * thread handle or 0 if the thread could not be created
+ * @thread_detach - on POSIX systems, free up resources held by
+ * pthreads. Noop on Win32.
+ * @thread_exit - terminates the current thread
+ * @thread_join - wait for the given thread to terminate. Returns 0
+ * for success, -1 otherwise
+ *
  * @mem_alloc - allocate memory
  * @mem_free - free memory
  *
+ * @gettid - returns the host thread id of the caller, which need not
+ * be the same as the handle returned by thread_create
+ *
+ * @jmp_buf_set - runs the given function and sets up a jump back point by
+ * saving the context in the jump buffer; jmp_buf_longjmp can be called from
+ * the given function or any callee in that function to return to the jump
+ * back point
+ *
+ * NOTE: we can't return from jmp_buf_set before calling jmp_buf_longjmp or
+ * otherwise the saved context (stack) is not going to be valid, so we must pass
+ * the function that will eventually call longjmp here
+ *
+ * @jmp_buf_longjmp - perform a jump back to the saved jump buffer
  */
 struct lkl_host_operations {
+	struct lkl_sem *(*sem_alloc)(int count);
+	void (*sem_free)(struct lkl_sem *sem);
+	void (*sem_up)(struct lkl_sem *sem);
+	void (*sem_down)(struct lkl_sem *sem);
+
+	struct lkl_mutex *(*mutex_alloc)(int recursive);
+	void (*mutex_free)(struct lkl_mutex *mutex);
+	void (*mutex_lock)(struct lkl_mutex *mutex);
+	void (*mutex_unlock)(struct lkl_mutex *mutex);
+
+	lkl_thread_t (*thread_create)(void (*f)(void *), void *arg);
+	void (*thread_detach)(void);
+	void (*thread_exit)(void);
+	int (*thread_join)(lkl_thread_t tid);
+	lkl_thread_t (*thread_self)(void);
+	int (*thread_equal)(lkl_thread_t a, lkl_thread_t b);
+
 	void *(*mem_alloc)(unsigned long mem);
 	void (*mem_free)(void *mem);
+
+	long (*gettid)(void);
+
+	void (*jmp_buf_set)(struct lkl_jmp_buf *jmpb, void (*f)(void));
+	void (*jmp_buf_longjmp)(struct lkl_jmp_buf *jmpb, int val);
 };
 
+int lkl_printf(const char *fmt, ...);
 void lkl_bug(const char *fmt, ...);
 
 #endif
diff --git a/arch/um/lkl/kernel/cpu.c b/arch/um/lkl/kernel/cpu.c
new file mode 100644
index 000000000000..125af3b2d5dd
--- /dev/null
+++ b/arch/um/lkl/kernel/cpu.c
@@ -0,0 +1,223 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/kernel.h>
+#include <linux/sched/stat.h>
+#include <asm/host_ops.h>
+#include <asm/cpu.h>
+#include <asm/thread_info.h>
+#include <asm/unistd.h>
+#include <asm/sched.h>
+#include <asm/syscalls.h>
+
+/*
+ * This structure is used to get access to the "LKL CPU" that allows us to run
+ * Linux code. Because we have to deal with various synchronization requirements
+ * between idle thread, system calls, interrupts, "reentrancy", CPU shutdown,
+ * imbalance wake up (i.e. acquire the CPU from one thread and release it from
+ * another), we can't use a simple synchronization mechanism such as (recursive)
+ * mutex or semaphore. Instead, we use a mutex and a bunch of status data plus a
+ * semaphore.
+ */
+static struct lkl_cpu {
+	/* lock that protects the CPU status data */
+	struct lkl_mutex *lock;
+	/*
+	 * Since we must free the cpu lock during shutdown we need a
+	 * synchronization algorithm between lkl_cpu_shutdown() and the CPU
+	 * access functions since lkl_cpu_get() gets called from thread
+	 * destructor callback functions which may be scheduled after
+	 * lkl_cpu_shutdown() has freed the cpu lock.
+	 *
+	 * An atomic counter is used to keep track of the number of running
+	 * CPU access functions and allow the shutdown function to wait for
+	 * them.
+	 *
+	 * The shutdown function adds MAX_THREADS to this counter which allows
+	 * the CPU access functions to check if the shutdown process has
+	 * started.
+	 *
+	 * This algorithm assumes that we never have more than MAX_THREADS
+	 * threads requesting CPU access.
+	 */
+	#define MAX_THREADS 1000000
+	unsigned int shutdown_gate;
+	bool irqs_pending;
+	/* number of threads waiting for the CPU */
+	unsigned int sleepers;
+	/* number of times the current thread got the CPU */
+	unsigned int count;
+	/* current thread that owns the CPU */
+	lkl_thread_t owner;
+	/* semaphore for threads waiting for the CPU */
+	struct lkl_sem *sem;
+	/* semaphore used for shutdown */
+	struct lkl_sem *shutdown_sem;
+} cpu;
+
+static int __cpu_try_get_lock(int n)
+{
+	lkl_thread_t self;
+
+	if (__sync_fetch_and_add(&cpu.shutdown_gate, n) >= MAX_THREADS)
+		return -2;
+
+	lkl_ops->mutex_lock(cpu.lock);
+
+	if (cpu.shutdown_gate >= MAX_THREADS)
+		return -1;
+
+	self = lkl_ops->thread_self();
+
+	if (cpu.owner && !lkl_ops->thread_equal(cpu.owner, self))
+		return 0;
+
+	cpu.owner = self;
+	cpu.count++;
+
+	return 1;
+}
+
+static void __cpu_try_get_unlock(int lock_ret, int n)
+{
+	if (lock_ret >= -1)
+		lkl_ops->mutex_unlock(cpu.lock);
+	__sync_fetch_and_sub(&cpu.shutdown_gate, n);
+}
+
+void lkl_cpu_change_owner(lkl_thread_t owner)
+{
+	lkl_ops->mutex_lock(cpu.lock);
+	if (cpu.count > 1)
+		lkl_bug("bad count while changing owner\n");
+	cpu.owner = owner;
+	lkl_ops->mutex_unlock(cpu.lock);
+}
+
+int lkl_cpu_get(void)
+{
+	int ret;
+
+	ret = __cpu_try_get_lock(1);
+
+	while (ret == 0) {
+		cpu.sleepers++;
+		__cpu_try_get_unlock(ret, 0);
+		lkl_ops->sem_down(cpu.sem);
+		ret = __cpu_try_get_lock(0);
+	}
+
+	__cpu_try_get_unlock(ret, 1);
+
+	return ret;
+}
+
+void lkl_cpu_put(void)
+{
+	lkl_ops->mutex_lock(cpu.lock);
+
+	if (!cpu.count || !cpu.owner ||
+	    !lkl_ops->thread_equal(cpu.owner, lkl_ops->thread_self()))
+		lkl_bug("%s: unbalanced put\n", __func__);
+
+	while (cpu.irqs_pending && !irqs_disabled()) {
+		cpu.irqs_pending = false;
+		lkl_ops->mutex_unlock(cpu.lock);
+		run_irqs();
+		lkl_ops->mutex_lock(cpu.lock);
+	}
+
+	if (test_ti_thread_flag(current_thread_info(), TIF_HOST_THREAD) &&
+	    !single_task_running() && cpu.count == 1) {
+		if (in_interrupt())
+			lkl_bug("%s: in interrupt\n", __func__);
+		lkl_ops->mutex_unlock(cpu.lock);
+		thread_sched_jb();
+		return;
+	}
+
+	if (--cpu.count > 0) {
+		lkl_ops->mutex_unlock(cpu.lock);
+		return;
+	}
+
+	if (cpu.sleepers) {
+		cpu.sleepers--;
+		lkl_ops->sem_up(cpu.sem);
+	}
+
+	cpu.owner = 0;
+
+	lkl_ops->mutex_unlock(cpu.lock);
+}
+
+int lkl_cpu_try_run_irq(int irq)
+{
+	int ret;
+
+	ret = __cpu_try_get_lock(1);
+	if (!ret) {
+		set_irq_pending(irq);
+		cpu.irqs_pending = true;
+	}
+	__cpu_try_get_unlock(ret, 1);
+
+	return ret;
+}
+
+void lkl_cpu_shutdown(void)
+{
+	__sync_fetch_and_add(&cpu.shutdown_gate, MAX_THREADS);
+}
+
+void lkl_cpu_wait_shutdown(void)
+{
+	lkl_ops->sem_down(cpu.shutdown_sem);
+	lkl_ops->sem_free(cpu.shutdown_sem);
+}
+
+static void lkl_cpu_cleanup(bool shutdown)
+{
+	while (__sync_fetch_and_add(&cpu.shutdown_gate, 0) > MAX_THREADS)
+		;
+
+	if (shutdown)
+		lkl_ops->sem_up(cpu.shutdown_sem);
+	else if (cpu.shutdown_sem)
+		lkl_ops->sem_free(cpu.shutdown_sem);
+	if (cpu.sem)
+		lkl_ops->sem_free(cpu.sem);
+	if (cpu.lock)
+		lkl_ops->mutex_free(cpu.lock);
+}
+
+void arch_cpu_idle(void)
+{
+	if (cpu.shutdown_gate >= MAX_THREADS) {
+		lkl_ops->mutex_lock(cpu.lock);
+		while (cpu.sleepers--)
+			lkl_ops->sem_up(cpu.sem);
+		lkl_ops->mutex_unlock(cpu.lock);
+
+		lkl_cpu_cleanup(true);
+
+		lkl_ops->thread_exit();
+	}
+	/* enable irqs now to allow direct irqs to run */
+	local_irq_enable();
+
+	/* switch to idle_host_task */
+	wakeup_idle_host_task();
+}
+
+int lkl_cpu_init(void)
+{
+	cpu.lock = lkl_ops->mutex_alloc(0);
+	cpu.sem = lkl_ops->sem_alloc(0);
+	cpu.shutdown_sem = lkl_ops->sem_alloc(0);
+
+	if (!cpu.lock || !cpu.sem || !cpu.shutdown_sem) {
+		lkl_cpu_cleanup(false);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
diff --git a/arch/um/lkl/kernel/threads.c b/arch/um/lkl/kernel/threads.c
new file mode 100644
index 000000000000..4fe8c56ae5e0
--- /dev/null
+++ b/arch/um/lkl/kernel/threads.c
@@ -0,0 +1,227 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/sched/task.h>
+#include <linux/sched/signal.h>
+#include <asm/host_ops.h>
+#include <asm/cpu.h>
+#include <asm/sched.h>
+
+static int init_ti(struct thread_info *ti)
+{
+	ti->sched_sem = lkl_ops->sem_alloc(0);
+	if (!ti->sched_sem)
+		return -ENOMEM;
+
+	ti->dead = false;
+	ti->prev_sched = NULL;
+	ti->tid = 0;
+
+	return 0;
+}
+
+unsigned long *alloc_thread_stack_node(struct task_struct *task, int node)
+{
+	struct thread_info *ti;
+
+	ti = kmalloc(sizeof(*ti), GFP_KERNEL);
+	if (!ti)
+		return NULL;
+
+	if (init_ti(ti)) {
+		kfree(ti);
+		return NULL;
+	}
+	ti->task = task;
+
+	return (unsigned long *)ti;
+}
+
+/*
+ * The only new tasks created are kernel threads that have a predefined starting
+ * point thus no stack copy is required.
+ */
+void setup_thread_stack(struct task_struct *p, struct task_struct *org)
+{
+	struct thread_info *ti = task_thread_info(p);
+	struct thread_info *org_ti = task_thread_info(org);
+
+	ti->flags = org_ti->flags;
+	ti->preempt_count = org_ti->preempt_count;
+	ti->addr_limit = org_ti->addr_limit;
+}
+
+static void kill_thread(struct thread_info *ti)
+{
+	if (!test_ti_thread_flag(ti, TIF_HOST_THREAD)) {
+		ti->dead = true;
+		lkl_ops->sem_up(ti->sched_sem);
+		lkl_ops->thread_join(ti->tid);
+	}
+	lkl_ops->sem_free(ti->sched_sem);
+}
+
+void free_thread_stack(struct task_struct *tsk)
+{
+	struct thread_info *ti = task_thread_info(tsk);
+
+	kill_thread(ti);
+	kfree(ti);
+}
+
+struct thread_info *_current_thread_info = &init_thread_union.thread_info;
+
+/*
+ * schedule() expects the return of this function to be the task that we
+ * switched away from. Returning prev is not going to work because we are
+ * actually going to return the previous task that was scheduled before the
+ * task we are going to wake up, and not the current task, e.g.:
+ *
+ * swapper -> init: saved prev on swapper stack is swapper
+ * init -> ksoftirqd0: saved prev on init stack is init
+ * ksoftirqd0 -> swapper: returned prev is swapper
+ */
+static struct task_struct *abs_prev = &init_task;
+
+struct task_struct *__switch_to(struct task_struct *prev,
+				struct task_struct *next)
+{
+	struct thread_info *_prev = task_thread_info(prev);
+	struct thread_info *_next = task_thread_info(next);
+	unsigned long _prev_flags = _prev->flags;
+	struct lkl_jmp_buf _prev_jb;
+
+	_current_thread_info = task_thread_info(next);
+	_next->prev_sched = prev;
+	abs_prev = prev;
+
+	BUG_ON(!_next->tid);
+	lkl_cpu_change_owner(_next->tid);
+
+	if (test_bit(TIF_SCHED_JB, &_prev_flags)) {
+		/* Atomic. Must be done before wakeup next */
+		clear_ti_thread_flag(_prev, TIF_SCHED_JB);
+		_prev_jb = _prev->sched_jb;
+	}
+
+	lkl_ops->sem_up(_next->sched_sem);
+	if (test_bit(TIF_SCHED_JB, &_prev_flags))
+		lkl_ops->jmp_buf_longjmp(&_prev_jb, 1);
+	else
+		lkl_ops->sem_down(_prev->sched_sem);
+
+	if (_prev->dead)
+		lkl_ops->thread_exit();
+
+	return abs_prev;
+}
+
+int host_task_stub(void *unused)
+{
+	return 0;
+}
+
+void switch_to_host_task(struct task_struct *task)
+{
+	if (WARN_ON(!test_tsk_thread_flag(task, TIF_HOST_THREAD)))
+		return;
+
+	task_thread_info(task)->tid = lkl_ops->thread_self();
+
+	if (current == task)
+		return;
+
+	wake_up_process(task);
+	thread_sched_jb();
+	lkl_ops->sem_down(task_thread_info(task)->sched_sem);
+	schedule_tail(abs_prev);
+}
+
+struct thread_bootstrap_arg {
+	struct thread_info *ti;
+	int (*f)(void *arg);
+	void *arg;
+};
+
+static void thread_bootstrap(void *_tba)
+{
+	struct thread_bootstrap_arg *tba = (struct thread_bootstrap_arg *)_tba;
+	struct thread_info *ti = tba->ti;
+	int (*f)(void *) = tba->f;
+	void *arg = tba->arg;
+
+	lkl_ops->sem_down(ti->sched_sem);
+	kfree(tba);
+	if (ti->prev_sched)
+		schedule_tail(ti->prev_sched);
+
+	f(arg);
+	do_exit(0);
+}
+
+int copy_thread(unsigned long clone_flags, unsigned long esp,
+		unsigned long unused, struct task_struct *p)
+{
+	struct thread_info *ti = task_thread_info(p);
+	struct thread_bootstrap_arg *tba;
+
+	if ((int (*)(void *))esp == host_task_stub) {
+		set_ti_thread_flag(ti, TIF_HOST_THREAD);
+		return 0;
+	}
+
+	tba = kmalloc(sizeof(*tba), GFP_KERNEL);
+	if (!tba)
+		return -ENOMEM;
+
+	tba->f = (int (*)(void *))esp;
+	tba->arg = (void *)unused;
+	tba->ti = ti;
+
+	ti->tid = lkl_ops->thread_create(thread_bootstrap, tba);
+	if (!ti->tid) {
+		kfree(tba);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+void show_stack(struct task_struct *task, unsigned long *esp)
+{
+}
+
+/**
+ * This is called before the kernel initializes, so no kernel calls (including
+ * printk) can be made yet.
+ */
+void threads_init(void)
+{
+	int ret;
+	struct thread_info *ti = &init_thread_union.thread_info;
+
+	ret = init_ti(ti);
+	if (ret < 0)
+		lkl_printf("lkl: failed to allocate init schedule semaphore\n");
+
+	ti->tid = lkl_ops->thread_self();
+}
+
+void threads_cleanup(void)
+{
+	struct task_struct *p, *t;
+
+	for_each_process_thread(p, t) {
+		struct thread_info *ti = task_thread_info(t);
+
+		if (t->pid != 1 && !test_ti_thread_flag(ti, TIF_HOST_THREAD))
+			WARN(!(t->flags & PF_KTHREAD),
+			     "non kernel thread task %s\n", t->comm);
+		WARN(t->state == TASK_RUNNING,
+		     "thread %s still running while halting\n", t->comm);
+
+		kill_thread(ti);
+	}
+
+	lkl_ops->sem_free(init_thread_union.thread_info.sched_sem);
+}
-- 
2.20.1 (Apple Git-117)
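
Note (illustration only, not part of the patch): the shutdown gate
described in the struct lkl_cpu comment in cpu.c can be sketched on its
own. This is a minimal sketch assuming GCC's __sync builtins, as used in
the patch; the demo_* names and DEMO_MAX_THREADS are hypothetical, and
unlike __cpu_try_get_lock()/__cpu_try_get_unlock() the refusal and the
matching decrement are folded into one function.

```c
#include <assert.h>

#define DEMO_MAX_THREADS 1000000

/* Counts in-flight access functions; shutdown adds DEMO_MAX_THREADS. */
static unsigned int demo_gate;

/* Enter an access function; returns 0 on success, -1 if shutting down. */
int demo_gate_enter(void)
{
	if (__sync_fetch_and_add(&demo_gate, 1) >= DEMO_MAX_THREADS) {
		/* Shutdown already started: undo our increment and bail out. */
		__sync_fetch_and_sub(&demo_gate, 1);
		return -1;
	}
	return 0;
}

/* Leave an access function that was entered successfully. */
void demo_gate_leave(void)
{
	__sync_fetch_and_sub(&demo_gate, 1);
}

/* Start shutdown: every later demo_gate_enter() is refused. */
void demo_gate_shutdown(void)
{
	__sync_fetch_and_add(&demo_gate, DEMO_MAX_THREADS);
}

/* Number of access functions currently in flight. */
unsigned int demo_gate_in_flight(void)
{
	unsigned int v = __sync_fetch_and_add(&demo_gate, 0);

	assert(v < 2u * DEMO_MAX_THREADS);
	return v >= DEMO_MAX_THREADS ? v - DEMO_MAX_THREADS : v;
}
```

A cleanup path could spin until demo_gate_in_flight() drops to 0 before
freeing the lock. As the comment in the patch states, this only works
while fewer than MAX_THREADS callers are in flight at once.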


^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 07/37] lkl: interrupt support
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Hajime Tazaki, Michael Zimmermann

From: Octavian Purdila <tavi.purdila@gmail.com>

Add APIs that allow the host to reserve and free an interrupt number
and also to trigger an interrupt.

The trigger operation will simply store the interrupt data in a
queue. The interrupt handler is run later, at the first opportunity,
to avoid races with any kernel threads.

Currently, interrupts are run on the first interrupt enable operation
if interrupts are disabled and if we are not already in interrupt
context.

When triggering an interrupt, it uses GCC's built-in functions for
atomic memory access, together with simple boolean flags, to synchronize.

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
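Note (illustration only, not part of the patch): the pending-interrupt
bookkeeping in irq.c (a bitmap plus a one-word summary index, updated
with GCC's __sync builtins) can be sketched in userspace as follows.
This is a minimal sketch; the demo_* names are hypothetical, and the
handler here only records what it sees instead of calling
generic_handle_irq().

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_NR_IRQS	(BITS_PER_WORD * BITS_PER_WORD)

static unsigned long pending[BITS_PER_WORD];	/* one bit per irq */
static unsigned long pending_index;		/* one bit per word above */
static unsigned long demo_seen_sum;		/* sum of delivered irq numbers */

/* Example handler: just records which irqs were delivered. */
void demo_record(unsigned int irq)
{
	demo_seen_sum += irq;
}

/* Mark an irq pending: set its bit first, then the summary bit. */
void demo_set_pending(unsigned int irq)
{
	unsigned int word = irq / BITS_PER_WORD;
	unsigned int bit = irq % BITS_PER_WORD;

	assert(irq < DEMO_NR_IRQS);
	__sync_fetch_and_or(&pending[word], 1UL << bit);
	__sync_fetch_and_or(&pending_index, 1UL << word);
}

/*
 * Atomically take and clear everything pending, calling handler once per
 * irq. Returns the number of irqs delivered.
 */
int demo_run_pending(void (*handler)(unsigned int irq))
{
	unsigned long index = __sync_fetch_and_and(&pending_index, 0);
	unsigned int word, bit;
	int delivered = 0;

	for (word = 0; index; word++, index >>= 1) {
		unsigned long bits;

		if (!(index & 1))
			continue;	/* nothing pending in this word */
		bits = __sync_fetch_and_and(&pending[word], 0);
		for (bit = 0; bits; bit++, bits >>= 1) {
			if (bits & 1) {
				handler(word * BITS_PER_WORD + bit);
				delivered++;
			}
		}
	}
	return delivered;
}
```

Triggering the same irq twice before it is delivered sets the same bit
twice, so the handler runs once. The summary word means an idle scan
touches one long instead of the whole array.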
 arch/um/lkl/include/asm/irq.h             |  13 ++
 arch/um/lkl/include/uapi/asm/irq.h        |  36 ++++
 arch/um/lkl/include/uapi/asm/sigcontext.h |  16 ++
 arch/um/lkl/kernel/irq.c                  | 193 ++++++++++++++++++++++
 4 files changed, 258 insertions(+)
 create mode 100644 arch/um/lkl/include/asm/irq.h
 create mode 100644 arch/um/lkl/include/uapi/asm/irq.h
 create mode 100644 arch/um/lkl/include/uapi/asm/sigcontext.h
 create mode 100644 arch/um/lkl/kernel/irq.c

diff --git a/arch/um/lkl/include/asm/irq.h b/arch/um/lkl/include/asm/irq.h
new file mode 100644
index 000000000000..948fc54cb76c
--- /dev/null
+++ b/arch/um/lkl/include/asm/irq.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_IRQ_H
+#define _ASM_LKL_IRQ_H
+
+#define IRQ_STATUS_BITS		(sizeof(long) * 8)
+#define NR_IRQS			((int)(IRQ_STATUS_BITS * IRQ_STATUS_BITS))
+
+void run_irqs(void);
+void set_irq_pending(int irq);
+
+#include <uapi/asm/irq.h>
+
+#endif
diff --git a/arch/um/lkl/include/uapi/asm/irq.h b/arch/um/lkl/include/uapi/asm/irq.h
new file mode 100644
index 000000000000..666628b233eb
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/irq.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_UAPI_LKL_IRQ_H
+#define _ASM_UAPI_LKL_IRQ_H
+
+/**
+ * lkl_trigger_irq - generate an interrupt
+ *
+ * This function is used by the device host side to signal its Linux counterpart
+ * that some event happened.
+ *
+ * @irq - the irq number to signal
+ */
+int lkl_trigger_irq(int irq);
+
+/**
+ * lkl_get_free_irq - find and reserve a free IRQ number
+ *
+ * This function is called by the host device code to find an unused IRQ number
+ * and reserve it for its own use.
+ *
+ * @user - a string to identify the user
+ * @returns - an irq number that can be used by request_irq or a negative
+ * value in case of an error
+ */
+int lkl_get_free_irq(const char *user);
+
+/**
+ * lkl_put_irq - release an IRQ number previously obtained with lkl_get_free_irq
+ *
+ * @irq - irq number to release
+ * @user - string identifying the user; should be the same as the one passed to
+ * lkl_get_free_irq when the irq number was obtained
+ */
+void lkl_put_irq(int irq, const char *user);
+
+#endif
diff --git a/arch/um/lkl/include/uapi/asm/sigcontext.h b/arch/um/lkl/include/uapi/asm/sigcontext.h
new file mode 100644
index 000000000000..2f4848843d1d
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/sigcontext.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_UAPI_LKL_SIGCONTEXT_H
+#define _ASM_UAPI_LKL_SIGCONTEXT_H
+
+#include <asm/ptrace.h>
+
+struct pt_regs {
+	void *irq_data;
+};
+
+struct sigcontext {
+	struct pt_regs regs;
+	unsigned long oldmask;
+};
+
+#endif
diff --git a/arch/um/lkl/kernel/irq.c b/arch/um/lkl/kernel/irq.c
new file mode 100644
index 000000000000..e3b59e46ca50
--- /dev/null
+++ b/arch/um/lkl/kernel/irq.c
@@ -0,0 +1,193 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/irq.h>
+#include <linux/hardirq.h>
+#include <asm/irq_regs.h>
+#include <linux/sched.h>
+#include <linux/seq_file.h>
+#include <linux/tick.h>
+#include <asm/irqflags.h>
+#include <asm/host_ops.h>
+#include <asm/cpu.h>
+
+/*
+ * To avoid much overhead we use an indirect approach: the irqs are marked using
+ * a bitmap (array of longs) and a summary of the modified bits is kept in a
+ * separate "index" long - one bit for each long in the array. Thus we can
+ * support 4096 irqs on 64bit platforms and 1024 irqs on 32bit platforms.
+ *
+ * Whenever an irq is triggered both the array and the index are updated. To
+ * find which irqs were triggered we first search the index and then the
+ * corresponding part of the array.
+ */
+static unsigned long irq_status[NR_IRQS / IRQ_STATUS_BITS];
+static unsigned long irq_index_status;
+
+static inline unsigned long test_and_clear_irq_index_status(void)
+{
+	if (!irq_index_status)
+		return 0;
+	return __sync_fetch_and_and(&irq_index_status, 0);
+}
+
+static inline unsigned long test_and_clear_irq_status(int index)
+{
+	if (!irq_status[index])
+		return 0;
+	return __sync_fetch_and_and(&irq_status[index], 0);
+}
+
+void set_irq_pending(int irq)
+{
+	int index = irq / IRQ_STATUS_BITS;
+	int bit = irq % IRQ_STATUS_BITS;
+
+	__sync_fetch_and_or(&irq_status[index], BIT(bit));
+	__sync_fetch_and_or(&irq_index_status, BIT(index));
+}
+
+static struct irq_info {
+	const char *user;
+} irqs[NR_IRQS];
+
+static bool irqs_enabled;
+
+static struct pt_regs dummy;
+
+static void run_irq(int irq)
+{
+	unsigned long flags;
+	struct pt_regs *old_regs = set_irq_regs((struct pt_regs *)&dummy);
+
+	/* interrupt handlers need to run with interrupts disabled */
+	local_irq_save(flags);
+	irq_enter();
+	generic_handle_irq(irq);
+	irq_exit();
+	set_irq_regs(old_regs);
+	local_irq_restore(flags);
+}
+
+/**
+ * This function can be called from arbitrary host threads, so do not
+ * issue any Linux calls (e.g. printk) if lkl_cpu_get() was not issued
+ * before.
+ */
+int lkl_trigger_irq(int irq)
+{
+	int ret;
+
+	if (irq <= 0 || irq >= NR_IRQS)
+		return -EINVAL;
+
+	ret = lkl_cpu_try_run_irq(irq);
+	if (ret <= 0)
+		return ret;
+
+	/*
+	 * Since this can be called from Linux context (e.g. lkl_trigger_irq ->
+	 * IRQ -> softirq -> lkl_trigger_irq) make sure we are actually allowed
+	 * to run irqs at this point
+	 */
+	if (!irqs_enabled) {
+		set_irq_pending(irq);
+		lkl_cpu_put();
+		return 0;
+	}
+
+	run_irq(irq);
+
+	lkl_cpu_put();
+
+	return 0;
+}
+
+static inline void for_each_bit(unsigned long word, void (*f)(int, int), int j)
+{
+	int i = 0;
+
+	while (word) {
+		if (word & 1)
+			f(i, j);
+		word >>= 1;
+		i++;
+	}
+}
+
+static inline void deliver_irq(int bit, int index)
+{
+	run_irq(index * IRQ_STATUS_BITS + bit);
+}
+
+static inline void check_irq_status(int i, int unused)
+{
+	for_each_bit(test_and_clear_irq_status(i), deliver_irq, i);
+}
+
+void run_irqs(void)
+{
+	for_each_bit(test_and_clear_irq_index_status(), check_irq_status, 0);
+}
+
+int show_interrupts(struct seq_file *p, void *v)
+{
+	return 0;
+}
+
+int lkl_get_free_irq(const char *user)
+{
+	int i;
+	int ret = -EBUSY;
+
+	/* 0 is not a valid IRQ */
+	for (i = 1; i < NR_IRQS; i++) {
+		if (!irqs[i].user) {
+			irqs[i].user = user;
+			irq_set_chip_and_handler(i, &dummy_irq_chip,
+						 handle_simple_irq);
+			ret = i;
+			break;
+		}
+	}
+
+	return ret;
+}
+
+void lkl_put_irq(int i, const char *user)
+{
+	if (!irqs[i].user || strcmp(irqs[i].user, user) != 0) {
+		WARN(1, "%s tried to release %s's irq %d", user, irqs[i].user, i);
+		return;
+	}
+
+	irqs[i].user = NULL;
+}
+
+unsigned long arch_local_save_flags(void)
+{
+	return irqs_enabled;
+}
+
+void arch_local_irq_restore(unsigned long flags)
+{
+	if (flags == ARCH_IRQ_ENABLED && irqs_enabled == ARCH_IRQ_DISABLED &&
+	    !in_interrupt())
+		run_irqs();
+	irqs_enabled = flags;
+}
+
+void init_IRQ(void)
+{
+	int i;
+
+	for (i = 0; i < NR_IRQS; i++)
+		irq_set_chip_and_handler(i, &dummy_irq_chip, handle_simple_irq);
+
+	pr_info("lkl: irqs initialized\n");
+}
+
+void cpu_yield_to_irqs(void)
+{
+	cpu_relax();
+}
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 07/37] lkl: interrupt support
@ 2019-11-08  5:02     ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: linux-arch, Octavian Purdila, Akira Moroo, linux-kernel-library,
	Michael Zimmermann, Hajime Tazaki

From: Octavian Purdila <tavi.purdila@gmail.com>

Add APIs that allow the host to reserve and free an interrupt number
and also to trigger an interrupt.

The trigger operation will simply store the interrupt data in a
queue. The interrupt handler is run later, at the first opportunity,
to avoid races with any kernel threads.

Currently, interrupts are run on the first interrupt enable operation
if interrupts are disabled and if we are not already in interrupt
context.

When triggering an interrupt, it uses GCC's built-in functions for
atomic memory access, together with simple boolean flags, to synchronize.

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
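Note (illustration only, not part of the patch): the deferred delivery
described above (queue while interrupts are disabled, run on the first
enable) can be sketched as follows. This is a minimal single-threaded
sketch; the demo_* names are hypothetical, only irqs 0..31 are handled,
and the CPU ownership and in_interrupt() checks of the real code are
left out.

```c
#include <assert.h>
#include <stdbool.h>

static bool demo_irqs_enabled;		/* starts disabled, as in early boot */
static unsigned long demo_pending;	/* one bit per irq, irqs 0..31 only */
static int demo_handled;		/* number of handler invocations */

static void demo_handle(int irq)
{
	(void)irq;
	demo_handled++;
}

/* Run and clear everything queued so far. */
static void demo_run_irqs(void)
{
	unsigned long bits = __sync_fetch_and_and(&demo_pending, 0);
	int irq;

	for (irq = 0; bits; irq++, bits >>= 1)
		if (bits & 1)
			demo_handle(irq);
}

/* Trigger: run the handler now if allowed, otherwise just queue the irq. */
void demo_trigger(int irq)
{
	assert(irq >= 0 && irq < 32);
	if (!demo_irqs_enabled) {
		__sync_fetch_and_or(&demo_pending, 1UL << irq);
		return;
	}
	demo_handle(irq);
}

/* Restore flags; a disabled -> enabled transition flushes the queue. */
void demo_irq_restore(bool enable)
{
	if (enable && !demo_irqs_enabled)
		demo_run_irqs();
	demo_irqs_enabled = enable;
}

int demo_handled_count(void)
{
	return demo_handled;
}
```

As in arch_local_irq_restore(), the queue is flushed before the enabled
flag is updated, so the queued handlers still run with interrupts
marked as disabled.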
 arch/um/lkl/include/asm/irq.h             |  13 ++
 arch/um/lkl/include/uapi/asm/irq.h        |  36 ++++
 arch/um/lkl/include/uapi/asm/sigcontext.h |  16 ++
 arch/um/lkl/kernel/irq.c                  | 193 ++++++++++++++++++++++
 4 files changed, 258 insertions(+)
 create mode 100644 arch/um/lkl/include/asm/irq.h
 create mode 100644 arch/um/lkl/include/uapi/asm/irq.h
 create mode 100644 arch/um/lkl/include/uapi/asm/sigcontext.h
 create mode 100644 arch/um/lkl/kernel/irq.c
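
Reviewer note: the reserve/release contract of lkl_get_free_irq() and
lkl_put_irq() can be modelled in user space roughly as follows. This is
a sketch under stated assumptions: the sketch_* names, the table size
and the -1 return values are hypothetical (the patch returns -EBUSY when
no irq is free, and lkl_put_irq() returns void and warns on a mismatch).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NR_IRQS 64	/* hypothetical; the patch derives NR_IRQS */

static const char *sketch_owner[SKETCH_NR_IRQS];

/* Reserve the lowest free irq number (0 is never handed out); -1 if full. */
int sketch_get_free_irq(const char *user)
{
	assert(user != NULL);
	for (int i = 1; i < SKETCH_NR_IRQS; i++) {
		if (!sketch_owner[i]) {
			sketch_owner[i] = user;
			return i;
		}
	}
	return -1;
}

/* Release an irq; returns 0 on success, -1 if user does not own it. */
int sketch_put_irq(int irq, const char *user)
{
	assert(user != NULL);
	if (irq <= 0 || irq >= SKETCH_NR_IRQS)
		return -1;
	if (!sketch_owner[irq] || strcmp(sketch_owner[irq], user) != 0)
		return -1;
	sketch_owner[irq] = NULL;
	return 0;
}
```

As in the patch, only the pointer to the user string is stored, so the
caller has to keep that string alive for as long as it holds the irq.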

diff --git a/arch/um/lkl/include/asm/irq.h b/arch/um/lkl/include/asm/irq.h
new file mode 100644
index 000000000000..948fc54cb76c
--- /dev/null
+++ b/arch/um/lkl/include/asm/irq.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_IRQ_H
+#define _ASM_LKL_IRQ_H
+
+#define IRQ_STATUS_BITS		(sizeof(long) * 8)
+#define NR_IRQS			((int)(IRQ_STATUS_BITS * IRQ_STATUS_BITS))
+
+void run_irqs(void);
+void set_irq_pending(int irq);
+
+#include <uapi/asm/irq.h>
+
+#endif
diff --git a/arch/um/lkl/include/uapi/asm/irq.h b/arch/um/lkl/include/uapi/asm/irq.h
new file mode 100644
index 000000000000..666628b233eb
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/irq.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_UAPI_LKL_IRQ_H
+#define _ASM_UAPI_LKL_IRQ_H
+
+/**
+ * lkl_trigger_irq - generate an interrupt
+ *
+ * This function is used by the device host side to signal its Linux counterpart
+ * that some event happened.
+ *
+ * @irq - the irq number to signal
+ */
+int lkl_trigger_irq(int irq);
+
+/**
+ * lkl_get_free_irq - find and reserve a free IRQ number
+ *
+ * This function is called by the host device code to find an unused IRQ number
+ * and reserve it for its own use.
+ *
+ * @user - a string to identify the user
+ * @returns - an irq number that can be used by request_irq or a negative
+ * value in case of an error
+ */
+int lkl_get_free_irq(const char *user);
+
+/**
+ * lkl_put_irq - release an IRQ number previously obtained with lkl_get_free_irq
+ *
+ * @irq - irq number to release
+ * @user - string identifying the user; should be the same as the one passed to
+ * lkl_get_free_irq when the irq number was obtained
+ */
+void lkl_put_irq(int irq, const char *user);
+
+#endif
diff --git a/arch/um/lkl/include/uapi/asm/sigcontext.h b/arch/um/lkl/include/uapi/asm/sigcontext.h
new file mode 100644
index 000000000000..2f4848843d1d
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/sigcontext.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_UAPI_LKL_SIGCONTEXT_H
+#define _ASM_UAPI_LKL_SIGCONTEXT_H
+
+#include <asm/ptrace.h>
+
+struct pt_regs {
+	void *irq_data;
+};
+
+struct sigcontext {
+	struct pt_regs regs;
+	unsigned long oldmask;
+};
+
+#endif
diff --git a/arch/um/lkl/kernel/irq.c b/arch/um/lkl/kernel/irq.c
new file mode 100644
index 000000000000..e3b59e46ca50
--- /dev/null
+++ b/arch/um/lkl/kernel/irq.c
@@ -0,0 +1,193 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/irq.h>
+#include <linux/hardirq.h>
+#include <asm/irq_regs.h>
+#include <linux/sched.h>
+#include <linux/seq_file.h>
+#include <linux/tick.h>
+#include <asm/irqflags.h>
+#include <asm/host_ops.h>
+#include <asm/cpu.h>
+
+/*
+ * To keep overhead low we use an indirect approach: the irqs are marked in a
+ * bitmap (an array of longs) and a summary of the modified words is kept in a
+ * separate "index" long, one bit for each long in the array. Thus we can
+ * support 4096 irqs on 64bit platforms and 1024 irqs on 32bit platforms.
+ *
+ * Whenever an irq is triggered both the array and the index are updated. To
+ * find which irqs were triggered we first search the index and then the
+ * corresponding part of the array.
+ */
+static unsigned long irq_status[NR_IRQS / IRQ_STATUS_BITS];
+static unsigned long irq_index_status;
+
+static inline unsigned long test_and_clear_irq_index_status(void)
+{
+	if (!irq_index_status)
+		return 0;
+	return __sync_fetch_and_and(&irq_index_status, 0);
+}
+
+static inline unsigned long test_and_clear_irq_status(int index)
+{
+	if (!irq_status[index])
+		return 0;
+	return __sync_fetch_and_and(&irq_status[index], 0);
+}
+
+void set_irq_pending(int irq)
+{
+	int index = irq / IRQ_STATUS_BITS;
+	int bit = irq % IRQ_STATUS_BITS;
+
+	__sync_fetch_and_or(&irq_status[index], BIT(bit));
+	__sync_fetch_and_or(&irq_index_status, BIT(index));
+}
+
+static struct irq_info {
+	const char *user;
+} irqs[NR_IRQS];
+
+static bool irqs_enabled;
+
+static struct pt_regs dummy;
+
+static void run_irq(int irq)
+{
+	unsigned long flags;
+	struct pt_regs *old_regs = set_irq_regs((struct pt_regs *)&dummy);
+
+	/* interrupt handlers need to run with interrupts disabled */
+	local_irq_save(flags);
+	irq_enter();
+	generic_handle_irq(irq);
+	irq_exit();
+	set_irq_regs(old_regs);
+	local_irq_restore(flags);
+}
+
+/**
+ * This function can be called from arbitrary host threads, so do not
+ * issue any Linux calls (e.g. printk) if lkl_cpu_get() was not issued
+ * before.
+ */
+int lkl_trigger_irq(int irq)
+{
+	int ret;
+
+	if (irq <= 0 || irq >= NR_IRQS)
+		return -EINVAL;
+
+	ret = lkl_cpu_try_run_irq(irq);
+	if (ret <= 0)
+		return ret;
+
+	/*
+	 * Since this can be called from Linux context (e.g. lkl_trigger_irq ->
+	 * IRQ -> softirq -> lkl_trigger_irq) make sure we are actually allowed
+	 * to run irqs at this point
+	 */
+	if (!irqs_enabled) {
+		set_irq_pending(irq);
+		lkl_cpu_put();
+		return 0;
+	}
+
+	run_irq(irq);
+
+	lkl_cpu_put();
+
+	return 0;
+}
+
+static inline void for_each_bit(unsigned long word, void (*f)(int, int), int j)
+{
+	int i = 0;
+
+	while (word) {
+		if (word & 1)
+			f(i, j);
+		word >>= 1;
+		i++;
+	}
+}
+
+static inline void deliver_irq(int bit, int index)
+{
+	run_irq(index * IRQ_STATUS_BITS + bit);
+}
+
+static inline void check_irq_status(int i, int unused)
+{
+	for_each_bit(test_and_clear_irq_status(i), deliver_irq, i);
+}
+
+void run_irqs(void)
+{
+	for_each_bit(test_and_clear_irq_index_status(), check_irq_status, 0);
+}
+
+int show_interrupts(struct seq_file *p, void *v)
+{
+	return 0;
+}
+
+int lkl_get_free_irq(const char *user)
+{
+	int i;
+	int ret = -EBUSY;
+
+	/* 0 is not a valid IRQ */
+	for (i = 1; i < NR_IRQS; i++) {
+		if (!irqs[i].user) {
+			irqs[i].user = user;
+			irq_set_chip_and_handler(i, &dummy_irq_chip,
+						 handle_simple_irq);
+			ret = i;
+			break;
+		}
+	}
+
+	return ret;
+}
+
+void lkl_put_irq(int i, const char *user)
+{
+	if (!irqs[i].user || strcmp(irqs[i].user, user) != 0) {
+		WARN(1, "%s tried to release %s's irq %d", user, irqs[i].user, i);
+		return;
+	}
+
+	irqs[i].user = NULL;
+}
+
+unsigned long arch_local_save_flags(void)
+{
+	return irqs_enabled;
+}
+
+void arch_local_irq_restore(unsigned long flags)
+{
+	if (flags == ARCH_IRQ_ENABLED && irqs_enabled == ARCH_IRQ_DISABLED &&
+	    !in_interrupt())
+		run_irqs();
+	irqs_enabled = flags;
+}
+
+void init_IRQ(void)
+{
+	int i;
+
+	for (i = 0; i < NR_IRQS; i++)
+		irq_set_chip_and_handler(i, &dummy_irq_chip, handle_simple_irq);
+
+	pr_info("lkl: irqs initialized\n");
+}
+
+void cpu_yield_to_irqs(void)
+{
+	cpu_relax();
+}
-- 
2.20.1 (Apple Git-117)


^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 08/37] lkl: system call interface and application API
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Conrad Meyer, Hajime Tazaki, Jens Staal, Lai Jiangshan,
	Luca Dariz, Michael Zimmermann, Patrick Collins,
	Pierre-Hugues Husson, Yuan Liu

From: Octavian Purdila <tavi.purdila@gmail.com>

The LKL application API is based on the kernel system call interface
in order to offer a stable API to applications. Note that we can't
offer the full Linux system call interface due to LKL limitations such
as the lack of virtual memory, signals, user processes, etc.

The host uses the LKL interrupt mechanism (lkl_trigger_irq) to
initiate a system call. The system call is executed in the context of
the init process.

To avoid collisions between the Linux API and the LKL API (e.g. struct
stat, MKNOD, etc.) we use a Python script to modify the user headers
and to prefix all of the global symbols (structures, typedefs,
defines) with LKL, lkl, _LKL, _lkl, __LKL or __lkl.

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Jens Staal <staal1978@gmail.com>
Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Luca Dariz <luca.dariz@gmail.com>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Pierre-Hugues Husson <phh@phh.me>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
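Reviewer note (not part of the commit message): a small user-space
sketch of table-driven system call dispatch with a bounds check and a
"not implemented" default, in the spirit of run_syscall() in this patch.
Everything named sketch_* or SKETCH_* is hypothetical. Unlike the patch,
which pre-fills every slot with sys_ni_syscall through a GNU range
designator and calls handlers with six long arguments, the sketch leaves
unused slots NULL and passes the parameter array directly.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_NR_SYSCALLS 8	/* hypothetical table size */
#define SKETCH_NR_ADD      3	/* hypothetical syscall numbers */
#define SKETCH_NR_NEG      5

typedef long (*sketch_handler_t)(const long *params);

static long sketch_add(const long *p) { return p[0] + p[1]; }
static long sketch_neg(const long *p) { return -p[0]; }

/* Slots that are not listed stay NULL and mean "not implemented". */
static const sketch_handler_t sketch_table[SKETCH_NR_SYSCALLS] = {
	[SKETCH_NR_ADD] = sketch_add,
	[SKETCH_NR_NEG] = sketch_neg,
};

/* Look up the handler for no and call it; -ENOSYS if there is none. */
long sketch_syscall(long no, const long *params)
{
	assert(params != NULL);
	if (no < 0 || no >= SKETCH_NR_SYSCALLS || !sketch_table[no])
		return -ENOSYS;
	return sketch_table[no](params);
}
```

The calling convention mirrors lkl_syscall(no, params): the caller fills
an array of long arguments and receives either a result or a negative
errno value.
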
 arch/um/lkl/include/asm/unistd.h        |  29 +++
 arch/um/lkl/include/uapi/asm/host_ops.h |  14 ++
 arch/um/lkl/include/uapi/asm/unistd.h   |  18 ++
 arch/um/lkl/kernel/syscalls.c           | 246 ++++++++++++++++++++++++
 arch/um/lkl/kernel/syscalls_32.c        | 159 +++++++++++++++
 arch/um/lkl/scripts/headers_install.py  | 195 +++++++++++++++++++
 6 files changed, 661 insertions(+)
 create mode 100644 arch/um/lkl/include/asm/unistd.h
 create mode 100644 arch/um/lkl/include/uapi/asm/unistd.h
 create mode 100644 arch/um/lkl/kernel/syscalls.c
 create mode 100644 arch/um/lkl/kernel/syscalls_32.c
 create mode 100755 arch/um/lkl/scripts/headers_install.py
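
Reviewer note: the renaming rule applied by lkl_prefix() in
headers_install.py can be restated in C as below. This is an
illustrative port, not code from the patch; sketch_is_upper() and
sketch_lkl_prefix() are hypothetical names, and only ASCII identifiers
are considered.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Like Python's str.isupper(): no lowercase and at least one uppercase. */
static int sketch_is_upper(const char *w)
{
	int seen_upper = 0;

	for (; *w; w++) {
		if (islower((unsigned char)*w))
			return 0;
		if (isupper((unsigned char)*w))
			seen_upper = 1;
	}
	return seen_upper;
}

/* Write the prefixed form of identifier w into out and return out. */
char *sketch_lkl_prefix(const char *w, char *out, size_t len)
{
	const char *lead = "";

	assert(w != NULL && out != NULL && len > 0);
	if (strncmp(w, "__", 2) == 0)
		lead = "__";
	else if (w[0] == '_')
		lead = "_";

	/* leading underscores, LKL or lkl, "_" unless w has its own, then w */
	snprintf(out, len, "%s%s%s%s", lead,
		 sketch_is_upper(w) ? "LKL" : "lkl",
		 w[0] == '_' ? "" : "_", w);
	return out;
}
```

With this rule "stat" becomes "lkl_stat", "MKNOD" becomes "LKL_MKNOD",
"_IOC" becomes "_LKL_IOC" and "__u32" becomes "__lkl__u32".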

diff --git a/arch/um/lkl/include/asm/unistd.h b/arch/um/lkl/include/asm/unistd.h
new file mode 100644
index 000000000000..c0efc68bf41f
--- /dev/null
+++ b/arch/um/lkl/include/asm/unistd.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#include <uapi/asm/unistd.h>
+
+__SYSCALL(__NR_virtio_mmio_device_add, sys_virtio_mmio_device_add)
+
+#define __SC_ASCII(t, a) #t "," #a
+
+#define __ASCII_MAP0(m, ...)
+#define __ASCII_MAP1(m, t, a) m(t, a)
+#define __ASCII_MAP2(m, t, a, ...) m(t, a) "," __ASCII_MAP1(m, __VA_ARGS__)
+#define __ASCII_MAP3(m, t, a, ...) m(t, a) "," __ASCII_MAP2(m, __VA_ARGS__)
+#define __ASCII_MAP4(m, t, a, ...) m(t, a) "," __ASCII_MAP3(m, __VA_ARGS__)
+#define __ASCII_MAP5(m, t, a, ...) m(t, a) "," __ASCII_MAP4(m, __VA_ARGS__)
+#define __ASCII_MAP6(m, t, a, ...) m(t, a) "," __ASCII_MAP5(m, __VA_ARGS__)
+#define __ASCII_MAP(n, ...) __ASCII_MAP##n(__VA_ARGS__)
+
+#ifdef __MINGW32__
+#define SECTION_ATTRS "n0"
+#else
+#define SECTION_ATTRS "a"
+#endif
+
+#define __SYSCALL_DEFINE_ARCH(x, name, ...)				\
+	asm(".section .syscall_defs,\"" SECTION_ATTRS "\"\n"		\
+	    ".ascii \"#ifdef __NR" #name "\\n\"\n"			\
+	    ".ascii \"SYSCALL_DEFINE" #x "(" #name ","			\
+	    __ASCII_MAP(x, __SC_ASCII, __VA_ARGS__) ")\\n\"\n"		\
+	    ".ascii \"#endif\\n\"\n"					\
+	    ".section .text\n");
diff --git a/arch/um/lkl/include/uapi/asm/host_ops.h b/arch/um/lkl/include/uapi/asm/host_ops.h
index 19924fc7c718..1c839d7139f8 100644
--- a/arch/um/lkl/include/uapi/asm/host_ops.h
+++ b/arch/um/lkl/include/uapi/asm/host_ops.h
@@ -36,6 +36,15 @@ struct lkl_jmp_buf {
  * @thread_join - wait for the given thread to terminate. Returns 0
  * for success, -1 otherwise
  *
+ * @tls_alloc - allocate a thread local storage key; returns NULL on error; if
+ * destructor is not NULL it will be called when a thread terminates with its
+ * argument set to the current thread local storage value
+ * @tls_free - frees a thread local storage key
+ * @tls_set - associate data to the thread local storage key; returns 0 if
+ * successful
+ * @tls_get - return data associated with the thread local storage key or NULL
+ * on error
+ *
  * @mem_alloc - allocate memory
  * @mem_free - free memory
  *
@@ -71,6 +80,11 @@ struct lkl_host_operations {
 	lkl_thread_t (*thread_self)(void);
 	int (*thread_equal)(lkl_thread_t a, lkl_thread_t b);
 
+	struct lkl_tls_key *(*tls_alloc)(void (*destructor)(void *));
+	void (*tls_free)(struct lkl_tls_key *key);
+	int (*tls_set)(struct lkl_tls_key *key, void *data);
+	void *(*tls_get)(struct lkl_tls_key *key);
+
 	void *(*mem_alloc)(unsigned long mem);
 	void (*mem_free)(void *mem);
 
diff --git a/arch/um/lkl/include/uapi/asm/unistd.h b/arch/um/lkl/include/uapi/asm/unistd.h
new file mode 100644
index 000000000000..561a7036821e
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/unistd.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#define __ARCH_WANT_SYSCALL_NO_AT
+#define __ARCH_WANT_SYSCALL_DEPRECATED
+#define __ARCH_WANT_SYSCALL_NO_FLAGS
+#define __ARCH_WANT_RENAMEAT
+#define __ARCH_WANT_NEW_STAT
+#define __ARCH_WANT_SET_GET_RLIMIT
+#define __ARCH_WANT_TIME32_SYSCALLS
+
+#include <asm/bitsperlong.h>
+
+#if __BITS_PER_LONG == 64
+#define __ARCH_WANT_SYS_NEWFSTATAT
+#endif
+
+#include <asm-generic/unistd.h>
+
+#define __NR_virtio_mmio_device_add		(__NR_arch_specific_syscall + 0)
diff --git a/arch/um/lkl/kernel/syscalls.c b/arch/um/lkl/kernel/syscalls.c
new file mode 100644
index 000000000000..ce3923baa655
--- /dev/null
+++ b/arch/um/lkl/kernel/syscalls.c
@@ -0,0 +1,246 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/stat.h>
+#include <linux/irq.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/jhash.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/net.h>
+#include <linux/task_work.h>
+#include <linux/syscalls.h>
+#include <linux/kthread.h>
+#include <linux/platform_device.h>
+#include <asm/host_ops.h>
+#include <asm/syscalls.h>
+#include <asm/syscalls_32.h>
+#include <asm/cpu.h>
+#include <asm/sched.h>
+
+static asmlinkage long sys_virtio_mmio_device_add(long base, long size,
+						  unsigned int irq);
+
+typedef long (*syscall_handler_t)(long arg1, ...);
+
+#undef __SYSCALL
+#define __SYSCALL(nr, sym)[nr] = (syscall_handler_t)sym,
+
+static syscall_handler_t syscall_table[__NR_syscalls] = {
+	[0 ... __NR_syscalls - 1] = (syscall_handler_t)sys_ni_syscall,
+#include <asm/unistd.h>
+
+#if __BITS_PER_LONG == 32
+#include <asm/unistd_32.h>
+#endif
+};
+
+static long run_syscall(long no, long *params)
+{
+	long ret;
+
+	if (no < 0 || no >= __NR_syscalls)
+		return -ENOSYS;
+
+	ret = syscall_table[no](params[0], params[1], params[2], params[3],
+				params[4], params[5]);
+
+	task_work_run();
+
+	return ret;
+}
+
+
+#define CLONE_FLAGS (CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_THREAD |	\
+		     CLONE_SIGHAND | SIGCHLD)
+
+static int host_task_id;
+static struct task_struct *host0;
+
+static int new_host_task(struct task_struct **task)
+{
+	pid_t pid;
+
+	switch_to_host_task(host0);
+
+	pid = kernel_thread(host_task_stub, NULL, CLONE_FLAGS);
+	if (pid < 0)
+		return pid;
+
+	rcu_read_lock();
+	*task = find_task_by_pid_ns(pid, &init_pid_ns);
+	rcu_read_unlock();
+
+	host_task_id++;
+
+	snprintf((*task)->comm, sizeof((*task)->comm), "host%d", host_task_id);
+
+	return 0;
+}
+static void exit_task(void)
+{
+	do_exit(0);
+}
+
+static void del_host_task(void *arg)
+{
+	struct task_struct *task = (struct task_struct *)arg;
+	struct thread_info *ti = task_thread_info(task);
+
+	if (lkl_cpu_get() < 0)
+		return;
+
+	switch_to_host_task(task);
+	host_task_id--;
+	set_ti_thread_flag(ti, TIF_SCHED_JB);
+	lkl_ops->jmp_buf_set(&ti->sched_jb, exit_task);
+}
+
+static struct lkl_tls_key *task_key;
+
+long lkl_syscall(long no, long *params)
+{
+	struct task_struct *task = host0;
+	long ret;
+
+	ret = lkl_cpu_get();
+	if (ret < 0)
+		return ret;
+
+	if (lkl_ops->tls_get) {
+		task = lkl_ops->tls_get(task_key);
+		if (!task) {
+			ret = new_host_task(&task);
+			if (ret)
+				goto out;
+			lkl_ops->tls_set(task_key, task);
+		}
+	}
+
+	switch_to_host_task(task);
+
+	ret = run_syscall(no, params);
+
+	if (no == __NR_reboot) {
+		thread_sched_jb();
+		return ret;
+	}
+
+out:
+	lkl_cpu_put();
+
+	return ret;
+}
+
+static struct task_struct *idle_host_task;
+
+/* called from idle; must not fail and must not block */
+void wakeup_idle_host_task(void)
+{
+	if (!need_resched() && idle_host_task)
+		wake_up_process(idle_host_task);
+}
+
+static int idle_host_task_loop(void *unused)
+{
+	struct thread_info *ti = task_thread_info(current);
+
+	snprintf(current->comm, sizeof(current->comm), "idle_host_task");
+	set_thread_flag(TIF_HOST_THREAD);
+	idle_host_task = current;
+
+	for (;;) {
+		lkl_cpu_put();
+		lkl_ops->sem_down(ti->sched_sem);
+		if (idle_host_task == NULL) {
+			lkl_ops->thread_exit();
+			return 0;
+		}
+		schedule_tail(ti->prev_sched);
+	}
+}
+
+int syscalls_init(void)
+{
+	snprintf(current->comm, sizeof(current->comm), "host0");
+	set_thread_flag(TIF_HOST_THREAD);
+	host0 = current;
+
+	if (lkl_ops->tls_alloc) {
+		task_key = lkl_ops->tls_alloc(del_host_task);
+		if (!task_key)
+			return -1;
+	}
+
+	if (kernel_thread(idle_host_task_loop, NULL, CLONE_FLAGS) < 0) {
+		if (lkl_ops->tls_free)
+			lkl_ops->tls_free(task_key);
+		return -1;
+	}
+
+	return 0;
+}
+
+void syscalls_cleanup(void)
+{
+	if (idle_host_task) {
+		struct thread_info *ti = task_thread_info(idle_host_task);
+
+		idle_host_task = NULL;
+		lkl_ops->sem_up(ti->sched_sem);
+		lkl_ops->thread_join(ti->tid);
+	}
+
+	if (lkl_ops->tls_free)
+		lkl_ops->tls_free(task_key);
+}
+
+SYSCALL_DEFINE3(virtio_mmio_device_add, long, base, long, size, unsigned int,
+		irq)
+{
+	struct platform_device *pdev;
+	int ret;
+
+	struct resource res[] = {
+		[0] = {
+				.start = base,
+				.end = base + size - 1,
+				.flags = IORESOURCE_MEM,
+			},
+		[1] = {
+				.start = irq,
+				.end = irq,
+				.flags = IORESOURCE_IRQ,
+			},
+	};
+
+	pdev = platform_device_alloc("virtio-mmio", PLATFORM_DEVID_AUTO);
+	if (!pdev) {
+		/* pdev is NULL here, so dev_err(&pdev->dev, ...) would oops */
+		pr_err("%s: Unable to allocate device for virtio-mmio\n",
+		       __func__);
+		return -ENOMEM;
+	}
+
+	ret = platform_device_add_resources(pdev, res, ARRAY_SIZE(res));
+	if (ret) {
+		dev_err(&pdev->dev, "%s: Unable to add resources for %s%d\n",
+			__func__, pdev->name, pdev->id);
+		goto exit_device_put;
+	}
+
+	ret = platform_device_add(pdev);
+	if (ret < 0) {
+		dev_err(&pdev->dev, "%s: Unable to add %s%d\n", __func__,
+			pdev->name, pdev->id);
+		goto exit_release_pdev;
+	}
+
+	return pdev->id;
+
+exit_release_pdev:
+	platform_device_del(pdev);
+exit_device_put:
+	platform_device_put(pdev);
+
+	return ret;
+}
diff --git a/arch/um/lkl/kernel/syscalls_32.c b/arch/um/lkl/kernel/syscalls_32.c
new file mode 100644
index 000000000000..a4271593c338
--- /dev/null
+++ b/arch/um/lkl/kernel/syscalls_32.c
@@ -0,0 +1,159 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * sys_ia32.c: Conversion between 32bit and 64bit native syscalls. Based on
+ *             sys_sparc32
+ *
+ * Copyright (C) 2000		VA Linux Co
+ * Copyright (C) 2000		Don Dugger <n0ano@valinux.com>
+ * Copyright (C) 1999		Arun Sharma <arun.sharma@intel.com>
+ * Copyright (C) 1997,1998	Jakub Jelinek (jj@sunsite.mff.cuni.cz)
+ * Copyright (C) 1997		David S. Miller (davem@caip.rutgers.edu)
+ * Copyright (C) 2000		Hewlett-Packard Co.
+ * Copyright (C) 2000		David Mosberger-Tang <davidm@hpl.hp.com>
+ * Copyright (C) 2000,2001,2002	Andi Kleen, SuSE Labs (x86-64 port)
+ *
+ * These routines maintain argument size conversion between 32bit and 64bit
+ * environment. In 2.5 most of this should be moved to a generic directory.
+ *
+ * This file assumes that there is a hole at the end of user address space.
+ *
+ * Some of the functions are LE specific currently. These are
+ * hopefully all marked.  This should be fixed.
+ */
+
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/fs.h>
+#include <linux/file.h>
+#include <linux/signal.h>
+#include <linux/syscalls.h>
+#include <linux/times.h>
+#include <linux/utsname.h>
+#include <linux/mm.h>
+#include <linux/uio.h>
+#include <linux/poll.h>
+#include <linux/personality.h>
+#include <linux/stat.h>
+#include <linux/rwsem.h>
+#include <linux/compat.h>
+#include <linux/vfs.h>
+#include <linux/ptrace.h>
+#include <linux/highuid.h>
+#include <linux/sysctl.h>
+#include <linux/slab.h>
+#include <asm/types.h>
+#include <linux/atomic.h>
+#include <asm/syscalls_32.h>
+
+#define AA(__x)		((unsigned long)(__x))
+
+#if __BITS_PER_LONG == 32
+
+asmlinkage long sys32_truncate64(const char __user *filename,
+				 unsigned long offset_low,
+				 unsigned long offset_high)
+{
+	return sys_truncate64(filename,
+			      ((loff_t)offset_high << 32) | offset_low);
+}
+
+asmlinkage long sys32_ftruncate64(unsigned int fd, unsigned long offset_low,
+				  unsigned long offset_high)
+{
+	return sys_ftruncate64(fd, ((loff_t)offset_high << 32) | offset_low);
+}
+
+#ifdef CONFIG_MMU
+/*
+ * Linux/i386 didn't use to be able to handle more than
+ * 4 system call parameters, so these system calls used a memory
+ * block for parameter passing..
+ */
+
+struct mmap_arg_struct32 {
+	unsigned int addr;
+	unsigned int len;
+	unsigned int prot;
+	unsigned int flags;
+	unsigned int fd;
+	unsigned int offset;
+};
+
+asmlinkage long sys32_mmap(struct mmap_arg_struct32 __user *arg)
+{
+	struct mmap_arg_struct32 a;
+
+	if (copy_from_user(&a, arg, sizeof(a)))
+		return -EFAULT;
+
+	if (a.offset & ~PAGE_MASK)
+		return -EINVAL;
+
+	return sys_mmap_pgoff(a.addr, a.len, a.prot, a.flags, a.fd,
+			      a.offset >> PAGE_SHIFT);
+}
+#endif
+
+asmlinkage long sys32_wait4(pid_t pid, unsigned int __user *stat_addr,
+			    int options, struct rusage __user *ru)
+{
+	return sys_wait4(pid, stat_addr, options, ru);
+}
+
+asmlinkage long sys32_pread64(unsigned int fd, char __user *ubuf, u32 count,
+			      u32 poslo, u32 poshi)
+{
+	return sys_pread64(fd, ubuf, count,
+			   ((loff_t)AA(poshi) << 32) | AA(poslo));
+}
+
+asmlinkage long sys32_pwrite64(unsigned int fd, const char __user *ubuf,
+			       u32 count, u32 poslo, u32 poshi)
+{
+	return sys_pwrite64(fd, ubuf, count,
+			    ((loff_t)AA(poshi) << 32) | AA(poslo));
+}
+
+/*
+ * Some system calls that need sign extended arguments. This could be
+ * done by a generic wrapper.
+ */
+long sys32_fadvise64_64(int fd, __u32 offset_low, __u32 offset_high,
+			__u32 len_low, __u32 len_high, int advice)
+{
+	return sys_fadvise64_64(fd, (((u64)offset_high) << 32) | offset_low,
+				(((u64)len_high) << 32) | len_low, advice);
+}
+
+asmlinkage ssize_t sys32_readahead(int fd, unsigned int off_lo,
+				   unsigned int off_hi, size_t count)
+{
+	return sys_readahead(fd, ((u64)off_hi << 32) | off_lo, count);
+}
+
+asmlinkage long sys32_sync_file_range(int fd, unsigned int off_low,
+				      unsigned int off_hi, unsigned int n_low,
+				      unsigned int n_hi, unsigned int flags)
+{
+	return sys_sync_file_range(fd, ((u64)off_hi << 32) | off_low,
+				   ((u64)n_hi << 32) | n_low, flags);
+}
+
+asmlinkage long sys32_sync_file_range2(int fd, unsigned int flags,
+				       unsigned int off_low,
+				       unsigned int off_hi, unsigned int n_low,
+				       unsigned int n_hi)
+{
+	return sys_sync_file_range(fd, ((u64)off_hi << 32) | off_low,
+				   ((u64)n_hi << 32) | n_low, flags);
+}
+
+asmlinkage long sys32_fallocate(int fd, int mode, unsigned int offset_lo,
+				unsigned int offset_hi, unsigned int len_lo,
+				unsigned int len_hi)
+{
+	return sys_fallocate(fd, mode, ((u64)offset_hi << 32) | offset_lo,
+			     ((u64)len_hi << 32) | len_lo);
+}
+
+#endif
diff --git a/arch/um/lkl/scripts/headers_install.py b/arch/um/lkl/scripts/headers_install.py
new file mode 100755
index 000000000000..17a4d2b00681
--- /dev/null
+++ b/arch/um/lkl/scripts/headers_install.py
@@ -0,0 +1,195 @@
+#!/usr/bin/env python
+# SPDX-License-Identifier: GPL-2.0
+import re, os, sys, argparse, multiprocessing, fnmatch
+
+srctree = os.environ["srctree"]
+objtree = os.environ["objtree"]
+header_paths = [ "include/uapi/", "arch/um/lkl/include/uapi/",
+                 "arch/um/lkl/include/generated/uapi/", "include/generated/" ]
+
+headers = set()
+includes = set()
+
+def relpath2abspath(relpath):
+    if "generated" in relpath:
+        return objtree + "/" + relpath
+    else:
+        return srctree + "/" + relpath
+
+def find_headers(path):
+    headers.add(path)
+    f = open(relpath2abspath(path))
+    for l in f.readlines():
+        m = re.search("#include <(.*)>", l)
+        try:
+            i = m.group(1)
+            for p in header_paths:
+                if os.access(relpath2abspath(p + i), os.R_OK):
+                    if p + i not in headers:
+                        includes.add(i)
+                        headers.add(p + i)
+                        find_headers(p + i)
+        except:
+            pass
+    f.close()
+
+def has_lkl_prefix(w):
+  return w.startswith("lkl") or w.startswith("_lkl") or w.startswith("__lkl") \
+         or w.startswith("LKL") or w.startswith("_LKL") or w.startswith("__LKL")
+
+def find_symbols(regexp, store):
+    for h in headers:
+        f = open(h)
+        for l in f.readlines():
+            m = regexp.search(l)
+            if not m:
+                continue
+            for e in reversed(m.groups()):
+                if e:
+                    if not has_lkl_prefix(e):
+                        store.add(e)
+                    break
+        f.close()
+
+def find_ml_symbols(regexp, store):
+    for h in headers:
+        for i in regexp.finditer(open(h).read()):
+            for j in reversed(i.groups()):
+                if j:
+                    if not has_lkl_prefix(j):
+                        store.add(j)
+                    break
+
+def find_enums(block_regexp, symbol_regexp, store):
+    for h in headers:
+        # remove comments
+        content = re.sub(re.compile("(\/\*(\*(?!\/)|[^*])*\*\/)", re.S|re.M), " ", open(h).read())
+        # remove preprocesor lines
+        clean_content = ""
+        for l in content.split("\n"):
+            if re.match("\s*#", l):
+                continue
+            clean_content += l + "\n"
+        for i in block_regexp.finditer(clean_content):
+            for j in reversed(i.groups()):
+                if j:
+                    for k in symbol_regexp.finditer(j):
+                        for l in k.groups():
+                            if l:
+                                if not has_lkl_prefix(l):
+                                    store.add(l)
+                                break
+
+def lkl_prefix(w):
+    r = ""
+
+    if w.startswith("__"):
+        r = "__"
+    elif w.startswith("_"):
+        r = "_"
+
+    if w.isupper():
+        r += "LKL"
+    else:
+        r += "lkl"
+
+    if not w.startswith("_"):
+        r += "_"
+
+    r += w
+
+    return r
+
+def replace(h):
+    content = open(h).read()
+    for i in includes:
+        search_str = "(#[ \t]*include[ \t]*[<\"][ \t]*)" + i + "([ \t]*[>\"])"
+        replace_str = "\\1" + "lkl/" + i + "\\2"
+        content = re.sub(search_str, replace_str, content)
+    tmp = ""
+    for w in re.split("(\W+)", content):
+        if w in defines:
+            w = lkl_prefix(w)
+        tmp += w
+    content = tmp
+    for s in structs:
+        search_str = "(\W?struct\s+)" + s + "(\W)"
+        replace_str = "\\1" + lkl_prefix(s) + "\\2"
+        content = re.sub(search_str, replace_str, content, flags = re.MULTILINE)
+    for s in unions:
+        search_str = "(\W?union\s+)" + s + "(\W)"
+        replace_str = "\\1" + lkl_prefix(s) + "\\2"
+        content = re.sub(search_str, replace_str, content, flags = re.MULTILINE)
+    open(h, 'w').write(content)
+
+parser = argparse.ArgumentParser(description='install lkl headers')
+parser.add_argument('path', help='path to install to', )
+parser.add_argument('-j', '--jobs', help='number of parallel jobs', default=1, type=int)
+args = parser.parse_args()
+
+find_headers("arch/um/lkl/include/uapi/asm/syscalls.h")
+headers.add("arch/um/lkl/include/uapi/asm/host_ops.h")
+
+if 'LKL_INSTALL_ADDITIONAL_HEADERS' in os.environ:
+    with open(os.environ['LKL_INSTALL_ADDITIONAL_HEADERS'], 'r') as f:
+        for line in f.readlines():
+            line = line.split('#', 1)[0].strip()
+            if line != '':
+                headers.add(line)
+
+new_headers = set()
+
+for h in headers:
+    dir = os.path.dirname(h)
+    out_dir = args.path + "/" + re.sub("(arch/um/lkl/include/uapi/|arch/um/lkl/include/generated/uapi/|include/uapi/|include/generated/uapi/|include/generated)(.*)", "lkl/\\2", dir)
+    try:
+        os.makedirs(out_dir)
+    except:
+        pass
+    print("  INSTALL\t%s" % (out_dir + "/" + os.path.basename(h)))
+    os.system(srctree+"/scripts/headers_install.sh %s %s" % (os.path.abspath(h),
+                                                       out_dir + "/" + os.path.basename(h)))
+    new_headers.add(out_dir + "/" + os.path.basename(h))
+
+headers = new_headers
+
+defines = set()
+structs = set()
+unions = set()
+
+p = re.compile("#[ \t]*define[ \t]*(\w+)")
+find_symbols(p, defines)
+p = re.compile("typedef.*(\(\*(\w+)\)\(.*\)\s*|\W+(\w+)\s*|\s+(\w+)\(.*\)\s*);")
+find_symbols(p, defines)
+p = re.compile("typedef\s+(struct|union)\s+\w*\s*{[^\\{\}]*}\W*(\w+)\s*;", re.M|re.S)
+find_ml_symbols(p, defines)
+defines.add("siginfo_t")
+defines.add("sigevent_t")
+p = re.compile("struct\s+(\w+)\s*\{")
+find_symbols(p, structs)
+structs.add("iovec")
+p = re.compile("union\s+(\w+)\s*\{")
+find_symbols(p, unions)
+p = re.compile("static\s+__inline__(\s+\w+)+\s+(\w+)\([^)]*\)\s")
+find_symbols(p, defines)
+p = re.compile("static\s+__always_inline(\s+\w+)+\s+(\w+)\([^)]*\)\s")
+find_symbols(p, defines)
+p = re.compile("enum\s+(\w*)\s*{([^}]*)}", re.M|re.S)
+q = re.compile("(\w+)\s*(,|=[^,]*|$)", re.M|re.S)
+find_enums(p, q, defines)
+
+# needed for i386
+defines.add("__NR_stime")
+
+def process_header(h):
+    print("  REPLACE\t%s" % h)
+    replace(h)
+
+p = multiprocessing.Pool(args.jobs)
+try:
+    p.map_async(process_header, headers).wait(999999)
+    p.close()
+except:
+    p.terminate()
+finally:
+    p.join()
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 08/37] lkl: system call interface and application API
@ 2019-11-08  5:02     ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: linux-arch, Conrad Meyer, Octavian Purdila, Jens Staal,
	Lai Jiangshan, Akira Moroo, Yuan Liu, Patrick Collins,
	linux-kernel-library, Pierre-Hugues Husson, Michael Zimmermann,
	Luca Dariz, Hajime Tazaki

From: Octavian Purdila <tavi.purdila@gmail.com>

The LKL application API is based on the kernel system call interface
in order to offer a stable API to applications. Note that we can't
offer the full Linux system call interface due to LKL limitations such
as the lack of virtual memory, signals, user processes, etc.

The host uses the LKL interrupt mechanism (lkl_trigger_irq) to
initiate a system call. The system call is executed in the context of
the init process.

To avoid collisions between the Linux API and the LKL API (e.g. struct
stat, MKNOD, etc.) we use a Python script to modify the user headers
and to prefix all of the global symbols (structures, typedefs,
defines) with LKL, lkl, _LKL, _lkl, __LKL or __lkl.

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Jens Staal <staal1978@gmail.com>
Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Luca Dariz <luca.dariz@gmail.com>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Pierre-Hugues Husson <phh@phh.me>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
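Reviewer note (not part of the commit message): on 32-bit builds the
wrappers in syscalls_32.c rebuild each 64-bit offset or length from two
32-bit arguments. A minimal sketch of that step follows; the function
names are hypothetical and int64_t stands in for loff_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Join the low and high 32-bit halves of an offset into 64 bits. */
int64_t sketch_join_offset(uint32_t lo, uint32_t hi)
{
	/* Widen before shifting: shifting a 32-bit value by 32 is undefined. */
	uint64_t joined = ((uint64_t)hi << 32) | lo;

	return (int64_t)joined;
}

/* Inverse operation, as a 32-bit caller would do before the call. */
void sketch_split_offset(int64_t off, uint32_t *lo, uint32_t *hi)
{
	assert(lo != NULL && hi != NULL);
	*lo = (uint32_t)((uint64_t)off & 0xffffffffu);
	*hi = (uint32_t)((uint64_t)off >> 32);
}
```

The cast to a 64-bit type has to happen before the shift, which is why
the wrappers in the patch write ((loff_t)offset_high << 32) | offset_low
and ((u64)off_hi << 32) | off_lo.
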
 arch/um/lkl/include/asm/unistd.h        |  29 +++
 arch/um/lkl/include/uapi/asm/host_ops.h |  14 ++
 arch/um/lkl/include/uapi/asm/unistd.h   |  18 ++
 arch/um/lkl/kernel/syscalls.c           | 246 ++++++++++++++++++++++++
 arch/um/lkl/kernel/syscalls_32.c        | 159 +++++++++++++++
 arch/um/lkl/scripts/headers_install.py  | 195 +++++++++++++++++++
 6 files changed, 661 insertions(+)
 create mode 100644 arch/um/lkl/include/asm/unistd.h
 create mode 100644 arch/um/lkl/include/uapi/asm/unistd.h
 create mode 100644 arch/um/lkl/kernel/syscalls.c
 create mode 100644 arch/um/lkl/kernel/syscalls_32.c
 create mode 100755 arch/um/lkl/scripts/headers_install.py

diff --git a/arch/um/lkl/include/asm/unistd.h b/arch/um/lkl/include/asm/unistd.h
new file mode 100644
index 000000000000..c0efc68bf41f
--- /dev/null
+++ b/arch/um/lkl/include/asm/unistd.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#include <uapi/asm/unistd.h>
+
+__SYSCALL(__NR_virtio_mmio_device_add, sys_virtio_mmio_device_add)
+
+#define __SC_ASCII(t, a) #t "," #a
+
+#define __ASCII_MAP0(m, ...)
+#define __ASCII_MAP1(m, t, a) m(t, a)
+#define __ASCII_MAP2(m, t, a, ...) m(t, a) "," __ASCII_MAP1(m, __VA_ARGS__)
+#define __ASCII_MAP3(m, t, a, ...) m(t, a) "," __ASCII_MAP2(m, __VA_ARGS__)
+#define __ASCII_MAP4(m, t, a, ...) m(t, a) "," __ASCII_MAP3(m, __VA_ARGS__)
+#define __ASCII_MAP5(m, t, a, ...) m(t, a) "," __ASCII_MAP4(m, __VA_ARGS__)
+#define __ASCII_MAP6(m, t, a, ...) m(t, a) "," __ASCII_MAP5(m, __VA_ARGS__)
+#define __ASCII_MAP(n, ...) __ASCII_MAP##n(__VA_ARGS__)
+
+#ifdef __MINGW32__
+#define SECTION_ATTRS "n0"
+#else
+#define SECTION_ATTRS "a"
+#endif
+
+#define __SYSCALL_DEFINE_ARCH(x, name, ...)				\
+	asm(".section .syscall_defs,\"" SECTION_ATTRS "\"\n"		\
+	    ".ascii \"#ifdef __NR" #name "\\n\"\n"			\
+	    ".ascii \"SYSCALL_DEFINE" #x "(" #name ","			\
+	    __ASCII_MAP(x, __SC_ASCII, __VA_ARGS__) ")\\n\"\n"		\
+	    ".ascii \"#endif\\n\"\n"					\
+	    ".section .text\n");
diff --git a/arch/um/lkl/include/uapi/asm/host_ops.h b/arch/um/lkl/include/uapi/asm/host_ops.h
index 19924fc7c718..1c839d7139f8 100644
--- a/arch/um/lkl/include/uapi/asm/host_ops.h
+++ b/arch/um/lkl/include/uapi/asm/host_ops.h
@@ -36,6 +36,15 @@ struct lkl_jmp_buf {
  * @thread_join - wait for the given thread to terminate. Returns 0
  * for success, -1 otherwise
  *
+ * @tls_alloc - allocate a thread local storage key; returns the key, or NULL
+ * on error; if destructor is not NULL it will be called when a thread
+ * terminates, with its argument set to the current thread local storage value
+ * @tls_free - free a thread local storage key
+ * @tls_set - associate data with the thread local storage key; returns 0 if
+ * successful
+ * @tls_get - return data associated with the thread local storage key or NULL
+ * on error
+ *
  * @mem_alloc - allocate memory
  * @mem_free - free memory
  *
@@ -71,6 +80,11 @@ struct lkl_host_operations {
 	lkl_thread_t (*thread_self)(void);
 	int (*thread_equal)(lkl_thread_t a, lkl_thread_t b);
 
+	struct lkl_tls_key *(*tls_alloc)(void (*destructor)(void *));
+	void (*tls_free)(struct lkl_tls_key *key);
+	int (*tls_set)(struct lkl_tls_key *key, void *data);
+	void *(*tls_get)(struct lkl_tls_key *key);
+
 	void *(*mem_alloc)(unsigned long mem);
 	void (*mem_free)(void *mem);
 
diff --git a/arch/um/lkl/include/uapi/asm/unistd.h b/arch/um/lkl/include/uapi/asm/unistd.h
new file mode 100644
index 000000000000..561a7036821e
--- /dev/null
+++ b/arch/um/lkl/include/uapi/asm/unistd.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#define __ARCH_WANT_SYSCALL_NO_AT
+#define __ARCH_WANT_SYSCALL_DEPRECATED
+#define __ARCH_WANT_SYSCALL_NO_FLAGS
+#define __ARCH_WANT_RENAMEAT
+#define __ARCH_WANT_NEW_STAT
+#define __ARCH_WANT_SET_GET_RLIMIT
+#define __ARCH_WANT_TIME32_SYSCALLS
+
+#include <asm/bitsperlong.h>
+
+#if __BITS_PER_LONG == 64
+#define __ARCH_WANT_SYS_NEWFSTATAT
+#endif
+
+#include <asm-generic/unistd.h>
+
+#define __NR_virtio_mmio_device_add		(__NR_arch_specific_syscall + 0)
diff --git a/arch/um/lkl/kernel/syscalls.c b/arch/um/lkl/kernel/syscalls.c
new file mode 100644
index 000000000000..ce3923baa655
--- /dev/null
+++ b/arch/um/lkl/kernel/syscalls.c
@@ -0,0 +1,246 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/stat.h>
+#include <linux/irq.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/jhash.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/net.h>
+#include <linux/task_work.h>
+#include <linux/syscalls.h>
+#include <linux/kthread.h>
+#include <linux/platform_device.h>
+#include <asm/host_ops.h>
+#include <asm/syscalls.h>
+#include <asm/syscalls_32.h>
+#include <asm/cpu.h>
+#include <asm/sched.h>
+
+static asmlinkage long sys_virtio_mmio_device_add(long base, long size,
+						  unsigned int irq);
+
+typedef long (*syscall_handler_t)(long arg1, ...);
+
+#undef __SYSCALL
+#define __SYSCALL(nr, sym)[nr] = (syscall_handler_t)sym,
+
+static syscall_handler_t syscall_table[__NR_syscalls] = {
+	[0 ... __NR_syscalls - 1] = (syscall_handler_t)sys_ni_syscall,
+#include <asm/unistd.h>
+
+#if __BITS_PER_LONG == 32
+#include <asm/unistd_32.h>
+#endif
+};
+
+static long run_syscall(long no, long *params)
+{
+	long ret;
+
+	if (no < 0 || no >= __NR_syscalls)
+		return -ENOSYS;
+
+	ret = syscall_table[no](params[0], params[1], params[2], params[3],
+				params[4], params[5]);
+
+	task_work_run();
+
+	return ret;
+}
+
+
+#define CLONE_FLAGS (CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_THREAD |	\
+		     CLONE_SIGHAND | SIGCHLD)
+
+static int host_task_id;
+static struct task_struct *host0;
+
+static int new_host_task(struct task_struct **task)
+{
+	pid_t pid;
+
+	switch_to_host_task(host0);
+
+	pid = kernel_thread(host_task_stub, NULL, CLONE_FLAGS);
+	if (pid < 0)
+		return pid;
+
+	rcu_read_lock();
+	*task = find_task_by_pid_ns(pid, &init_pid_ns);
+	rcu_read_unlock();
+
+	host_task_id++;
+
+	snprintf((*task)->comm, sizeof((*task)->comm), "host%d", host_task_id);
+
+	return 0;
+}
+static void exit_task(void)
+{
+	do_exit(0);
+}
+
+static void del_host_task(void *arg)
+{
+	struct task_struct *task = (struct task_struct *)arg;
+	struct thread_info *ti = task_thread_info(task);
+
+	if (lkl_cpu_get() < 0)
+		return;
+
+	switch_to_host_task(task);
+	host_task_id--;
+	set_ti_thread_flag(ti, TIF_SCHED_JB);
+	lkl_ops->jmp_buf_set(&ti->sched_jb, exit_task);
+}
+
+static struct lkl_tls_key *task_key;
+
+long lkl_syscall(long no, long *params)
+{
+	struct task_struct *task = host0;
+	long ret;
+
+	ret = lkl_cpu_get();
+	if (ret < 0)
+		return ret;
+
+	if (lkl_ops->tls_get) {
+		task = lkl_ops->tls_get(task_key);
+		if (!task) {
+			ret = new_host_task(&task);
+			if (ret)
+				goto out;
+			lkl_ops->tls_set(task_key, task);
+		}
+	}
+
+	switch_to_host_task(task);
+
+	ret = run_syscall(no, params);
+
+	if (no == __NR_reboot) {
+		thread_sched_jb();
+		return ret;
+	}
+
+out:
+	lkl_cpu_put();
+
+	return ret;
+}
+
+static struct task_struct *idle_host_task;
+
+/* called from idle; must not fail and must not block */
+void wakeup_idle_host_task(void)
+{
+	if (!need_resched() && idle_host_task)
+		wake_up_process(idle_host_task);
+}
+
+static int idle_host_task_loop(void *unused)
+{
+	struct thread_info *ti = task_thread_info(current);
+
+	snprintf(current->comm, sizeof(current->comm), "idle_host_task");
+	set_thread_flag(TIF_HOST_THREAD);
+	idle_host_task = current;
+
+	for (;;) {
+		lkl_cpu_put();
+		lkl_ops->sem_down(ti->sched_sem);
+		if (idle_host_task == NULL) {
+			lkl_ops->thread_exit();
+			return 0;
+		}
+		schedule_tail(ti->prev_sched);
+	}
+}
+
+int syscalls_init(void)
+{
+	snprintf(current->comm, sizeof(current->comm), "host0");
+	set_thread_flag(TIF_HOST_THREAD);
+	host0 = current;
+
+	if (lkl_ops->tls_alloc) {
+		task_key = lkl_ops->tls_alloc(del_host_task);
+		if (!task_key)
+			return -1;
+	}
+
+	if (kernel_thread(idle_host_task_loop, NULL, CLONE_FLAGS) < 0) {
+		if (lkl_ops->tls_free)
+			lkl_ops->tls_free(task_key);
+		return -1;
+	}
+
+	return 0;
+}
+
+void syscalls_cleanup(void)
+{
+	if (idle_host_task) {
+		struct thread_info *ti = task_thread_info(idle_host_task);
+
+		idle_host_task = NULL;
+		lkl_ops->sem_up(ti->sched_sem);
+		lkl_ops->thread_join(ti->tid);
+	}
+
+	if (lkl_ops->tls_free)
+		lkl_ops->tls_free(task_key);
+}
+
+SYSCALL_DEFINE3(virtio_mmio_device_add, long, base, long, size, unsigned int,
+		irq)
+{
+	struct platform_device *pdev;
+	int ret;
+
+	struct resource res[] = {
+		[0] = {
+				.start = base,
+				.end = base + size - 1,
+				.flags = IORESOURCE_MEM,
+			},
+		[1] = {
+				.start = irq,
+				.end = irq,
+				.flags = IORESOURCE_IRQ,
+			},
+	};
+
+	pdev = platform_device_alloc("virtio-mmio", PLATFORM_DEVID_AUTO);
+	if (!pdev) {
+		/* pdev is NULL here, so dev_err(&pdev->dev, ...) is not safe */
+		pr_err("%s: unable to allocate virtio-mmio device\n",
+		       __func__);
+		return -ENOMEM;
+	}
+
+	ret = platform_device_add_resources(pdev, res, ARRAY_SIZE(res));
+	if (ret) {
+		dev_err(&pdev->dev, "%s: Unable to add resources for %s%d\n",
+			__func__, pdev->name, pdev->id);
+		goto exit_device_put;
+	}
+
+	ret = platform_device_add(pdev);
+	if (ret < 0) {
+		dev_err(&pdev->dev, "%s: Unable to add %s%d\n", __func__,
+			pdev->name, pdev->id);
+		goto exit_release_pdev;
+	}
+
+	return pdev->id;
+
+exit_release_pdev:
+	platform_device_del(pdev);
+exit_device_put:
+	platform_device_put(pdev);
+
+	return ret;
+}
diff --git a/arch/um/lkl/kernel/syscalls_32.c b/arch/um/lkl/kernel/syscalls_32.c
new file mode 100644
index 000000000000..a4271593c338
--- /dev/null
+++ b/arch/um/lkl/kernel/syscalls_32.c
@@ -0,0 +1,159 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * sys_ia32.c: Conversion between 32bit and 64bit native syscalls. Based on
+ *             sys_sparc32
+ *
+ * Copyright (C) 2000		VA Linux Co
+ * Copyright (C) 2000		Don Dugger <n0ano@valinux.com>
+ * Copyright (C) 1999		Arun Sharma <arun.sharma@intel.com>
+ * Copyright (C) 1997,1998	Jakub Jelinek (jj@sunsite.mff.cuni.cz)
+ * Copyright (C) 1997		David S. Miller (davem@caip.rutgers.edu)
+ * Copyright (C) 2000		Hewlett-Packard Co.
+ * Copyright (C) 2000		David Mosberger-Tang <davidm@hpl.hp.com>
+ * Copyright (C) 2000,2001,2002	Andi Kleen, SuSE Labs (x86-64 port)
+ *
+ * These routines maintain argument size conversion between 32bit and 64bit
+ * environment. In 2.5 most of this should be moved to a generic directory.
+ *
+ * This file assumes that there is a hole at the end of user address space.
+ *
+ * Some of the functions are LE specific currently. These are
+ * hopefully all marked.  This should be fixed.
+ */
+
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/fs.h>
+#include <linux/file.h>
+#include <linux/signal.h>
+#include <linux/syscalls.h>
+#include <linux/times.h>
+#include <linux/utsname.h>
+#include <linux/mm.h>
+#include <linux/uio.h>
+#include <linux/poll.h>
+#include <linux/personality.h>
+#include <linux/stat.h>
+#include <linux/rwsem.h>
+#include <linux/compat.h>
+#include <linux/vfs.h>
+#include <linux/ptrace.h>
+#include <linux/highuid.h>
+#include <linux/sysctl.h>
+#include <linux/slab.h>
+#include <asm/types.h>
+#include <linux/atomic.h>
+#include <asm/syscalls_32.h>
+
+#define AA(__x)		((unsigned long)(__x))
+
+#if __BITS_PER_LONG == 32
+
+asmlinkage long sys32_truncate64(const char __user *filename,
+				 unsigned long offset_low,
+				 unsigned long offset_high)
+{
+	return sys_truncate64(filename,
+			      ((loff_t)offset_high << 32) | offset_low);
+}
+
+asmlinkage long sys32_ftruncate64(unsigned int fd, unsigned long offset_low,
+				  unsigned long offset_high)
+{
+	return sys_ftruncate64(fd, ((loff_t)offset_high << 32) | offset_low);
+}
+
+#ifdef CONFIG_MMU
+/*
+ * Linux/i386 didn't use to be able to handle more than
+ * 4 system call parameters, so these system calls used a memory
+ * block for parameter passing..
+ */
+
+struct mmap_arg_struct32 {
+	unsigned int addr;
+	unsigned int len;
+	unsigned int prot;
+	unsigned int flags;
+	unsigned int fd;
+	unsigned int offset;
+};
+
+asmlinkage long sys32_mmap(struct mmap_arg_struct32 __user *arg)
+{
+	struct mmap_arg_struct32 a;
+
+	if (copy_from_user(&a, arg, sizeof(a)))
+		return -EFAULT;
+
+	if (a.offset & ~PAGE_MASK)
+		return -EINVAL;
+
+	return sys_mmap_pgoff(a.addr, a.len, a.prot, a.flags, a.fd,
+			      a.offset >> PAGE_SHIFT);
+}
+#endif
+
+asmlinkage long sys32_wait4(pid_t pid, unsigned int __user *stat_addr,
+			    int options, struct rusage __user *ru)
+{
+	return sys_wait4(pid, stat_addr, options, ru);
+}
+
+asmlinkage long sys32_pread64(unsigned int fd, char __user *ubuf, u32 count,
+			      u32 poslo, u32 poshi)
+{
+	return sys_pread64(fd, ubuf, count,
+			   ((loff_t)AA(poshi) << 32) | AA(poslo));
+}
+
+asmlinkage long sys32_pwrite64(unsigned int fd, const char __user *ubuf,
+			       u32 count, u32 poslo, u32 poshi)
+{
+	return sys_pwrite64(fd, ubuf, count,
+			    ((loff_t)AA(poshi) << 32) | AA(poslo));
+}
+
+/*
+ * Some system calls that need sign extended arguments. This could be
+ * done by a generic wrapper.
+ */
+long sys32_fadvise64_64(int fd, __u32 offset_low, __u32 offset_high,
+			__u32 len_low, __u32 len_high, int advice)
+{
+	return sys_fadvise64_64(fd, (((u64)offset_high) << 32) | offset_low,
+				(((u64)len_high) << 32) | len_low, advice);
+}
+
+asmlinkage ssize_t sys32_readahead(int fd, unsigned int off_lo,
+				   unsigned int off_hi, size_t count)
+{
+	return sys_readahead(fd, ((u64)off_hi << 32) | off_lo, count);
+}
+
+asmlinkage long sys32_sync_file_range(int fd, unsigned int off_low,
+				      unsigned int off_hi, unsigned int n_low,
+				      unsigned int n_hi, unsigned int flags)
+{
+	return sys_sync_file_range(fd, ((u64)off_hi << 32) | off_low,
+				   ((u64)n_hi << 32) | n_low, flags);
+}
+
+asmlinkage long sys32_sync_file_range2(int fd, unsigned int flags,
+				       unsigned int off_low,
+				       unsigned int off_hi, unsigned int n_low,
+				       unsigned int n_hi)
+{
+	return sys_sync_file_range(fd, ((u64)off_hi << 32) | off_low,
+				   ((u64)n_hi << 32) | n_low, flags);
+}
+
+asmlinkage long sys32_fallocate(int fd, int mode, unsigned int offset_lo,
+				unsigned int offset_hi, unsigned int len_lo,
+				unsigned int len_hi)
+{
+	return sys_fallocate(fd, mode, ((u64)offset_hi << 32) | offset_lo,
+			     ((u64)len_hi << 32) | len_lo);
+}
+
+#endif
diff --git a/arch/um/lkl/scripts/headers_install.py b/arch/um/lkl/scripts/headers_install.py
new file mode 100755
index 000000000000..17a4d2b00681
--- /dev/null
+++ b/arch/um/lkl/scripts/headers_install.py
@@ -0,0 +1,195 @@
+#!/usr/bin/env python
+# SPDX-License-Identifier: GPL-2.0
+import re, os, sys, argparse, multiprocessing, fnmatch
+
+srctree = os.environ["srctree"]
+objtree = os.environ["objtree"]
+header_paths = [ "include/uapi/", "arch/um/lkl/include/uapi/",
+                 "arch/um/lkl/include/generated/uapi/", "include/generated/" ]
+
+headers = set()
+includes = set()
+
+def relpath2abspath(relpath):
+    if "generated" in relpath:
+        return objtree + "/" + relpath
+    else:
+        return srctree + "/" + relpath
+
+def find_headers(path):
+    headers.add(path)
+    f = open(relpath2abspath(path))
+    for l in f.readlines():
+        m = re.search("#include <(.*)>", l)
+        try:
+            i = m.group(1)
+            for p in header_paths:
+                if os.access(relpath2abspath(p + i), os.R_OK):
+                    if p + i not in headers:
+                        includes.add(i)
+                        headers.add(p + i)
+                        find_headers(p + i)
+        except:
+            pass
+    f.close()
+
+def has_lkl_prefix(w):
+  return w.startswith("lkl") or w.startswith("_lkl") or w.startswith("__lkl") \
+         or w.startswith("LKL") or w.startswith("_LKL") or w.startswith("__LKL")
+
+def find_symbols(regexp, store):
+    for h in headers:
+        f = open(h)
+        for l in f.readlines():
+            m = regexp.search(l)
+            if not m:
+                continue
+            for e in reversed(m.groups()):
+                if e:
+                    if not has_lkl_prefix(e):
+                        store.add(e)
+                    break
+        f.close()
+
+def find_ml_symbols(regexp, store):
+    for h in headers:
+        for i in regexp.finditer(open(h).read()):
+            for j in reversed(i.groups()):
+                if j:
+                    if not has_lkl_prefix(j):
+                        store.add(j)
+                    break
+
+def find_enums(block_regexp, symbol_regexp, store):
+    for h in headers:
+        # remove comments
+        content = re.sub(re.compile("(\/\*(\*(?!\/)|[^*])*\*\/)", re.S|re.M), " ", open(h).read())
+        # remove preprocessor lines
+        clean_content = ""
+        for l in content.split("\n"):
+            if re.match("\s*#", l):
+                continue
+            clean_content += l + "\n"
+        for i in block_regexp.finditer(clean_content):
+            for j in reversed(i.groups()):
+                if j:
+                    for k in symbol_regexp.finditer(j):
+                        for l in k.groups():
+                            if l:
+                                if not has_lkl_prefix(l):
+                                    store.add(l)
+                                break
+
+def lkl_prefix(w):
+    r = ""
+
+    if w.startswith("__"):
+        r = "__"
+    elif w.startswith("_"):
+        r = "_"
+
+    if w.isupper():
+        r += "LKL"
+    else:
+        r += "lkl"
+
+    if not w.startswith("_"):
+        r += "_"
+
+    r += w
+
+    return r
+
+def replace(h):
+    content = open(h).read()
+    for i in includes:
+        search_str = "(#[ \t]*include[ \t]*[<\"][ \t]*)" + i + "([ \t]*[>\"])"
+        replace_str = "\\1" + "lkl/" + i + "\\2"
+        content = re.sub(search_str, replace_str, content)
+    tmp = ""
+    for w in re.split("(\W+)", content):
+        if w in defines:
+            w = lkl_prefix(w)
+        tmp += w
+    content = tmp
+    for s in structs:
+        search_str = "(\W?struct\s+)" + s + "(\W)"
+        replace_str = "\\1" + lkl_prefix(s) + "\\2"
+        content = re.sub(search_str, replace_str, content, flags = re.MULTILINE)
+    for s in unions:
+        search_str = "(\W?union\s+)" + s + "(\W)"
+        replace_str = "\\1" + lkl_prefix(s) + "\\2"
+        content = re.sub(search_str, replace_str, content, flags = re.MULTILINE)
+    open(h, 'w').write(content)
+
+parser = argparse.ArgumentParser(description='install lkl headers')
+parser.add_argument('path', help='path to install to', )
+parser.add_argument('-j', '--jobs', help='number of parallel jobs', default=1, type=int)
+args = parser.parse_args()
+
+find_headers("arch/um/lkl/include/uapi/asm/syscalls.h")
+headers.add("arch/um/lkl/include/uapi/asm/host_ops.h")
+
+if 'LKL_INSTALL_ADDITIONAL_HEADERS' in os.environ:
+    with open(os.environ['LKL_INSTALL_ADDITIONAL_HEADERS'], 'r') as f:
+        for line in f.readlines():
+            line = line.split('#', 1)[0].strip()
+            if line != '':
+                headers.add(line)
+
+new_headers = set()
+
+for h in headers:
+    dir = os.path.dirname(h)
+    out_dir = args.path + "/" + re.sub("(arch/um/lkl/include/uapi/|arch/um/lkl/include/generated/uapi/|include/uapi/|include/generated/uapi/|include/generated)(.*)", "lkl/\\2", dir)
+    try:
+        os.makedirs(out_dir)
+    except:
+        pass
+    print("  INSTALL\t%s" % (out_dir + "/" + os.path.basename(h)))
+    os.system(srctree+"/scripts/headers_install.sh %s %s" % (os.path.abspath(h),
+                                                       out_dir + "/" + os.path.basename(h)))
+    new_headers.add(out_dir + "/" + os.path.basename(h))
+
+headers = new_headers
+
+defines = set()
+structs = set()
+unions = set()
+
+p = re.compile("#[ \t]*define[ \t]*(\w+)")
+find_symbols(p, defines)
+p = re.compile("typedef.*(\(\*(\w+)\)\(.*\)\s*|\W+(\w+)\s*|\s+(\w+)\(.*\)\s*);")
+find_symbols(p, defines)
+p = re.compile("typedef\s+(struct|union)\s+\w*\s*{[^\\{\}]*}\W*(\w+)\s*;", re.M|re.S)
+find_ml_symbols(p, defines)
+defines.add("siginfo_t")
+defines.add("sigevent_t")
+p = re.compile("struct\s+(\w+)\s*\{")
+find_symbols(p, structs)
+structs.add("iovec")
+p = re.compile("union\s+(\w+)\s*\{")
+find_symbols(p, unions)
+p = re.compile("static\s+__inline__(\s+\w+)+\s+(\w+)\([^)]*\)\s")
+find_symbols(p, defines)
+p = re.compile("static\s+__always_inline(\s+\w+)+\s+(\w+)\([^)]*\)\s")
+find_symbols(p, defines)
+p = re.compile("enum\s+(\w*)\s*{([^}]*)}", re.M|re.S)
+q = re.compile("(\w+)\s*(,|=[^,]*|$)", re.M|re.S)
+find_enums(p, q, defines)
+
+# needed for i386
+defines.add("__NR_stime")
+
+def process_header(h):
+    print("  REPLACE\t%s" % h)
+    replace(h)
+
+p = multiprocessing.Pool(args.jobs)
+try:
+    p.map_async(process_header, headers).wait(999999)
+    p.close()
+except:
+    p.terminate()
+finally:
+    p.join()
-- 
2.20.1 (Apple Git-117)


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um


^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 09/37] lkl: timers, time and delay support
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Hajime Tazaki, Michael Zimmermann

From: Octavian Purdila <tavi.purdila@gmail.com>

Add a clockevent driver based on the host timer operations, plus a
clocksource driver and udelay support based on the host time operation.
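
For readers unfamiliar with the host side, a minimal sketch of a "time"
operation and a busy-wait delay built on it could look like this. It
assumes a POSIX host with clock_gettime(); the choice of
CLOCK_MONOTONIC and the demo_* names are assumptions made for the
example, not something this patch defines.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Hypothetical host "time" operation: monotonic time in nanoseconds. */
unsigned long long demo_time_ns(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);
	(void)rc;
	return (unsigned long long)ts.tv_sec * 1000000000ULL + ts.tv_nsec;
}

/* Busy-wait for at least nsecs, the same way __ndelay() does in time.c. */
void demo_ndelay(unsigned long nsecs)
{
	unsigned long long start = demo_time_ns();

	while (demo_time_ns() < start + nsecs)
		;
}
```

Because the clocksource and the delay loop both read the same host
clock, no delay calibration is needed, which is why calibrate_delay()
can be left empty.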

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 arch/um/lkl/include/uapi/asm/host_ops.h |  13 +++
 arch/um/lkl/kernel/time.c               | 145 ++++++++++++++++++++++++
 2 files changed, 158 insertions(+)
 create mode 100644 arch/um/lkl/kernel/time.c

diff --git a/arch/um/lkl/include/uapi/asm/host_ops.h b/arch/um/lkl/include/uapi/asm/host_ops.h
index 1c839d7139f8..c9f77dd7fbe7 100644
--- a/arch/um/lkl/include/uapi/asm/host_ops.h
+++ b/arch/um/lkl/include/uapi/asm/host_ops.h
@@ -48,6 +48,13 @@ struct lkl_jmp_buf {
  * @mem_alloc - allocate memory
  * @mem_free - free memory
  *
+ * @time - returns the current host time in nanoseconds
+ *
+ * @timer_alloc - allocate a host timer that runs fn(arg) when the timer
+ * fires.
+ * @timer_free - disarm and free the timer
+ * @timer_set_oneshot - arm the timer to fire once, after delta ns.
+ *
  * @gettid - returns the host thread id of the caller, which need not
  * be the same as the handle returned by thread_create
  *
@@ -88,6 +95,12 @@ struct lkl_host_operations {
 	void *(*mem_alloc)(unsigned long mem);
 	void (*mem_free)(void *mem);
 
+	unsigned long long (*time)(void);
+
+	void *(*timer_alloc)(void (*fn)(void *), void *arg);
+	int (*timer_set_oneshot)(void *timer, unsigned long delta);
+	void (*timer_free)(void *timer);
+
 	long (*gettid)(void);
 
 	void (*jmp_buf_set)(struct lkl_jmp_buf *jmpb, void (*f)(void));
diff --git a/arch/um/lkl/kernel/time.c b/arch/um/lkl/kernel/time.c
new file mode 100644
index 000000000000..b8320e1bfa53
--- /dev/null
+++ b/arch/um/lkl/kernel/time.c
@@ -0,0 +1,145 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/clocksource.h>
+#include <linux/clockchips.h>
+#include <linux/jiffies.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <asm/host_ops.h>
+
+static unsigned long long boot_time;
+
+void __ndelay(unsigned long nsecs)
+{
+	unsigned long long start = lkl_ops->time();
+
+	while (lkl_ops->time() < start + nsecs)
+		;
+}
+
+void __udelay(unsigned long usecs)
+{
+	__ndelay(usecs * NSEC_PER_USEC);
+}
+
+void __const_udelay(unsigned long xloops)
+{
+	__udelay(xloops / 0x10c7ul);
+}
+
+void calibrate_delay(void)
+{
+}
+
+void read_persistent_clock(struct timespec *ts)
+{
+	*ts = ns_to_timespec(lkl_ops->time());
+}
+
+/*
+ * Scheduler clock - returns current time in nanosec units.
+ *
+ */
+unsigned long long sched_clock(void)
+{
+	if (!boot_time)
+		return 0;
+
+	return lkl_ops->time() - boot_time;
+}
+
+static u64 clock_read(struct clocksource *cs)
+{
+	return lkl_ops->time();
+}
+
+static struct clocksource clocksource = {
+	.name	= "lkl",
+	.rating = 499,
+	.read	= clock_read,
+	.flags	= CLOCK_SOURCE_IS_CONTINUOUS,
+	.mask	= CLOCKSOURCE_MASK(64),
+};
+
+static void *timer;
+
+static int timer_irq;
+
+static void timer_fn(void *arg)
+{
+	lkl_trigger_irq(timer_irq);
+}
+
+static int clockevent_set_state_shutdown(struct clock_event_device *evt)
+{
+	if (timer) {
+		lkl_ops->timer_free(timer);
+		timer = NULL;
+	}
+
+	return 0;
+}
+
+static int clockevent_set_state_oneshot(struct clock_event_device *evt)
+{
+	timer = lkl_ops->timer_alloc(timer_fn, NULL);
+	if (!timer)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static irqreturn_t timer_irq_handler(int irq, void *dev_id)
+{
+	struct clock_event_device *dev = (struct clock_event_device *)dev_id;
+
+	dev->event_handler(dev);
+
+	return IRQ_HANDLED;
+}
+
+static int clockevent_next_event(unsigned long ns,
+				 struct clock_event_device *evt)
+{
+	return lkl_ops->timer_set_oneshot(timer, ns);
+}
+
+static struct clock_event_device clockevent = {
+	.name			= "lkl",
+	.features		= CLOCK_EVT_FEAT_ONESHOT,
+	.set_state_oneshot	= clockevent_set_state_oneshot,
+	.set_next_event		= clockevent_next_event,
+	.set_state_shutdown	= clockevent_set_state_shutdown,
+};
+
+static struct irqaction irq0  = {
+	.handler	= timer_irq_handler,
+	.flags		= IRQF_NOBALANCING | IRQF_TIMER,
+	.dev_id		= &clockevent,
+	.name		= "timer"
+};
+
+void __init time_init(void)
+{
+	int ret;
+
+	if (!lkl_ops->timer_alloc || !lkl_ops->timer_free ||
+	    !lkl_ops->timer_set_oneshot || !lkl_ops->time) {
+		pr_err("lkl: no time or timer support provided by host\n");
+		return;
+	}
+
+	timer_irq = lkl_get_free_irq("timer");
+	setup_irq(timer_irq, &irq0);
+
+	ret = clocksource_register_khz(&clocksource, 1000000);
+	if (ret)
+		pr_err("lkl: unable to register clocksource\n");
+
+	clockevents_config_and_register(&clockevent, NSEC_PER_SEC, 1,
+					ULONG_MAX);
+
+	boot_time = lkl_ops->time();
+	pr_info("lkl: time and timers initialized (irq%d)\n", timer_irq);
+}
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 10/37] lkl: memory mapped I/O support
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch

From: Octavian Purdila <tavi.purdila@gmail.com>

All memory mapped I/O access is redirected to the host via the
iomem_access host operation. The host can set up the memory mapped I/O
region via the ioremap operation.

This allows the host to implement support for various devices, such as
block or network devices.

Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
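Note: the following is a minimal sketch of how a host might back the two
new operations. It is not code from this series. The demo_* names, the
DEMO_BASE address and the 16-byte register file are made up for
illustration; only the two function signatures are taken from host_ops.h
in this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical device: a 16-byte register file at a made-up address. */
#define DEMO_BASE 0x1000L
#define DEMO_SIZE 16

static uint8_t demo_regs[DEMO_SIZE];

/* Same signature as the ioremap host operation: look up the region. */
static void *demo_ioremap(long addr, int size)
{
	if (size <= 0 || size > DEMO_SIZE || addr < DEMO_BASE ||
	    addr - DEMO_BASE > DEMO_SIZE - size)
		return NULL;
	return demo_regs + (addr - DEMO_BASE);
}

/*
 * Same signature as the iomem_access host operation. Returns 0 on
 * success and non-zero on error, which is what the WARN() in the
 * __raw_read and __raw_write helpers checks for.
 */
static int demo_iomem_access(const volatile void *addr, void *val, int size,
			     int write)
{
	uintptr_t start = (uintptr_t)demo_regs;
	uintptr_t a = (uintptr_t)addr;
	uint8_t *reg;

	assert(val != NULL);
	if (size <= 0 || size > DEMO_SIZE || a < start ||
	    a - start > (uintptr_t)(DEMO_SIZE - size))
		return -1;

	reg = demo_regs + (a - start);
	if (write)
		memcpy(reg, val, (size_t)size);	/* kernel -> device */
	else
		memcpy(val, reg, (size_t)size);	/* device -> kernel */
	return 0;
}
```

A host would assign these to the .ioremap and .iomem_access members of
struct lkl_host_operations. A real device model would most likely act on
the register being accessed instead of copying bytes blindly.
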
 arch/um/lkl/include/asm/io.h            | 104 ++++++++++++++++++++++++
 arch/um/lkl/include/uapi/asm/host_ops.h |  10 +++
 2 files changed, 114 insertions(+)
 create mode 100644 arch/um/lkl/include/asm/io.h

diff --git a/arch/um/lkl/include/asm/io.h b/arch/um/lkl/include/asm/io.h
new file mode 100644
index 000000000000..33d4e1a7feb2
--- /dev/null
+++ b/arch/um/lkl/include/asm/io.h
@@ -0,0 +1,104 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_IO_H
+#define _ASM_LKL_IO_H
+
+#include <asm/bug.h>
+#include <asm/host_ops.h>
+
+#define __raw_readb __raw_readb
+static inline u8 __raw_readb(const volatile void __iomem *addr)
+{
+	int ret;
+	u8 value;
+
+	ret = lkl_ops->iomem_access(addr, &value, sizeof(value), 0);
+	WARN(ret, "error reading iomem %p", addr);
+
+	return value;
+}
+
+#define __raw_readw __raw_readw
+static inline u16 __raw_readw(const volatile void __iomem *addr)
+{
+	int ret;
+	u16 value;
+
+	ret = lkl_ops->iomem_access(addr, &value, sizeof(value), 0);
+	WARN(ret, "error reading iomem %p", addr);
+
+	return value;
+}
+
+#define __raw_readl __raw_readl
+static inline u32 __raw_readl(const volatile void __iomem *addr)
+{
+	int ret;
+	u32 value;
+
+	ret = lkl_ops->iomem_access(addr, &value, sizeof(value), 0);
+	WARN(ret, "error reading iomem %p", addr);
+
+	return value;
+}
+
+#ifdef CONFIG_64BIT
+#define __raw_readq __raw_readq
+static inline u64 __raw_readq(const volatile void __iomem *addr)
+{
+	int ret;
+	u64 value;
+
+	ret = lkl_ops->iomem_access(addr, &value, sizeof(value), 0);
+	WARN(ret, "error reading iomem %p", addr);
+
+	return value;
+}
+#endif /* CONFIG_64BIT */
+
+#define __raw_writeb __raw_writeb
+static inline void __raw_writeb(u8 value, volatile void __iomem *addr)
+{
+	int ret;
+
+	ret = lkl_ops->iomem_access(addr, &value, sizeof(value), 1);
+	WARN(ret, "error writing iomem %p", addr);
+}
+
+#define __raw_writew __raw_writew
+static inline void __raw_writew(u16 value, volatile void __iomem *addr)
+{
+	int ret;
+
+	ret = lkl_ops->iomem_access(addr, &value, sizeof(value), 1);
+	WARN(ret, "error writing iomem %p", addr);
+}
+
+#define __raw_writel __raw_writel
+static inline void __raw_writel(u32 value, volatile void __iomem *addr)
+{
+	int ret;
+
+	ret = lkl_ops->iomem_access(addr, &value, sizeof(value), 1);
+	WARN(ret, "error writing iomem %p", addr);
+}
+
+#ifdef CONFIG_64BIT
+#define __raw_writeq __raw_writeq
+static inline void __raw_writeq(u64 value, volatile void __iomem *addr)
+{
+	int ret;
+
+	ret = lkl_ops->iomem_access(addr, &value, sizeof(value), 1);
+	WARN(ret, "error writing iomem %p", addr);
+}
+#endif /* CONFIG_64BIT */
+
+#define ioremap ioremap
+static inline void __iomem *ioremap(phys_addr_t offset, size_t size)
+{
+	return (void __iomem *)lkl_ops->ioremap(offset, size);
+}
+
+#include <asm-generic/io.h>
+
+#endif /* _ASM_LKL_IO_H */
diff --git a/arch/um/lkl/include/uapi/asm/host_ops.h b/arch/um/lkl/include/uapi/asm/host_ops.h
index c9f77dd7fbe7..7c0c0967c44a 100644
--- a/arch/um/lkl/include/uapi/asm/host_ops.h
+++ b/arch/um/lkl/include/uapi/asm/host_ops.h
@@ -55,6 +55,12 @@ struct lkl_jmp_buf {
  * @timer_set_periodic - arm the timer to fire periodically, with a period of
  * delta ns.
  *
+ * @ioremap - searches for an I/O memory region identified by addr and size and
+ * returns a pointer to the start of the address range that can be used by
+ * iomem_access
+ * @iomem_access - reads or writes to an I/O memory region; addr must be in the
+ * range returned by ioremap
+ *
  * @gettid - returns the host thread id of the caller, which need not
  * be the same as the handle returned by thread_create
  *
@@ -101,6 +107,10 @@ struct lkl_host_operations {
 	int (*timer_set_oneshot)(void *timer, unsigned long delta);
 	void (*timer_free)(void *timer);
 
+	void *(*ioremap)(long addr, int size);
+	int (*iomem_access)(const volatile void *addr, void *val, int size,
+			    int write);
+
 	long (*gettid)(void);
 
 	void (*jmp_buf_set)(struct lkl_jmp_buf *jmpb, void (*f)(void));
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 11/37] lkl: basic kernel console support
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch

From: Octavian Purdila <tavi.purdila@gmail.com>

Console write operations are forwarded to the host print operation.

Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
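Note: a minimal sketch of a host-side print operation, assuming a host
that wants the messages on stderr and also keeps a copy in memory. The
demo_* names and the log buffer are hypothetical; only the
(const char *str, int len) signature comes from this patch. The string
is passed together with its length, so the sketch does not rely on a
terminating NUL.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_LOG_SIZE 256

static char demo_log[DEMO_LOG_SIZE];	/* hypothetical in-memory copy */
static int demo_log_len;

/* Same signature as the print host operation. */
static void demo_print(const char *str, int len)
{
	int room = DEMO_LOG_SIZE - 1 - demo_log_len;

	assert(str != NULL);
	if (len <= 0)
		return;

	/* Use len, not strlen(): do not assume NUL termination. */
	fwrite(str, 1, (size_t)len, stderr);

	/* Keep as much as still fits, always NUL-terminated. */
	if (len > room)
		len = room;
	memcpy(demo_log + demo_log_len, str, (size_t)len);
	demo_log_len += len;
	demo_log[demo_log_len] = '\0';
}
```

Since console_write() checks lkl_ops->print before calling it, a host
that does not care about console output can leave the member NULL.
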
 arch/um/lkl/include/uapi/asm/host_ops.h |  4 +++
 arch/um/lkl/kernel/console.c            | 42 +++++++++++++++++++++++++
 2 files changed, 46 insertions(+)
 create mode 100644 arch/um/lkl/kernel/console.c

diff --git a/arch/um/lkl/include/uapi/asm/host_ops.h b/arch/um/lkl/include/uapi/asm/host_ops.h
index 7c0c0967c44a..6ae781419ce6 100644
--- a/arch/um/lkl/include/uapi/asm/host_ops.h
+++ b/arch/um/lkl/include/uapi/asm/host_ops.h
@@ -17,6 +17,8 @@ struct lkl_jmp_buf {
  * These operations must be provided by a host library or by the application
  * itself.
  *
+ * @print - optional operation that receives console messages
+ *
  * @sem_alloc - allocate a host semaphore an initialize it to count
  * @sem_free - free a host semaphore
  * @sem_up - perform an up operation on the semaphore
@@ -76,6 +78,8 @@ struct lkl_jmp_buf {
  * @jmp_buf_longjmp - perform a jump back to the saved jump buffer
  */
 struct lkl_host_operations {
+	void (*print)(const char *str, int len);
+
 	struct lkl_sem *(*sem_alloc)(int count);
 	void (*sem_free)(struct lkl_sem *sem);
 	void (*sem_up)(struct lkl_sem *sem);
diff --git a/arch/um/lkl/kernel/console.c b/arch/um/lkl/kernel/console.c
new file mode 100644
index 000000000000..54d7f756c6da
--- /dev/null
+++ b/arch/um/lkl/kernel/console.c
@@ -0,0 +1,42 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/console.h>
+#include <asm/host_ops.h>
+
+static void console_write(struct console *con, const char *str,
+			  unsigned int len)
+{
+	if (lkl_ops->print)
+		lkl_ops->print(str, len);
+}
+
+#ifdef CONFIG_LKL_EARLY_CONSOLE
+static struct console lkl_boot_console = {
+	.name	= "lkl_boot_console",
+	.write	= console_write,
+	.flags	= CON_PRINTBUFFER | CON_BOOT,
+	.index	= -1,
+};
+
+int __init lkl_boot_console_init(void)
+{
+	register_console(&lkl_boot_console);
+	return 0;
+}
+early_initcall(lkl_boot_console_init);
+#endif
+
+static struct console lkl_console = {
+	.name	= "lkl_console",
+	.write	= console_write,
+	.flags	= CON_PRINTBUFFER,
+	.index	= -1,
+};
+
+static int __init lkl_console_init(void)
+{
+	register_console(&lkl_console);
+	return 0;
+}
+core_initcall(lkl_console_init);
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 12/37] lkl: initialization and cleanup
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Hajime Tazaki, Michael Zimmermann, Patrick Collins, Yuan Liu

From: Octavian Purdila <tavi.purdila@gmail.com>

Add the lkl_start_kernel and lkl_sys_halt APIs that start and stop the
Linux kernel, respectively.

lkl_start_kernel creates a separate thread that will run the initial
and idle kernel thread. It waits for the kernel to complete
initialization before returning, to avoid races with system calls
issued by the host application.

During the setup phase, we create "/init" in the initial ramfs root
filesystem to avoid mounting the "real" rootfs, since ramfs is good
enough for now.

lkl_sys_halt shuts down the kernel, terminates all threads and
frees all host resources used by the kernel before returning.

This patch also introduces idle CPU handling since it is closely
related to the shutdown process. A host semaphore is used to wait for
new interrupts when the kernel switches the CPU to idle, to avoid
wasting host CPU cycles. When the kernel is shut down we terminate the
idle thread at the first CPU idle event.

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
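Note: the startup handshake described above (start a boot thread, then
block on a semaphore until initialization is done) can be sketched with
plain POSIX threads and semaphores. This is only an illustration under
stated assumptions: the demo_* names are hypothetical, the "boot work"
is a single assignment, and the boot thread is joined right away, whereas
in this patch the join happens later in lkl_sys_halt. It assumes a host
with unnamed POSIX semaphores, such as Linux.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t demo_init_sem;	/* plays the role of init_sem */
static int demo_is_running;

/* Boot thread: stand-in for the thread that runs start_kernel(). */
static void *demo_run_kernel(void *arg)
{
	int *booted = arg;

	assert(booted != NULL);
	*booted = 1;			/* stand-in for the kernel boot work */
	sem_post(&demo_init_sem);	/* boot finished: release the caller */
	return NULL;
}

/* Returns 0 once the boot thread has signalled completion. */
static int demo_start_kernel(void)
{
	pthread_t boot_thread;
	int booted = 0;

	if (sem_init(&demo_init_sem, 0, 0))
		return -1;
	if (pthread_create(&boot_thread, NULL, demo_run_kernel, &booted)) {
		sem_destroy(&demo_init_sem);
		return -1;
	}

	/* Block until boot is done, so callers cannot race with it. */
	while (sem_wait(&demo_init_sem) == -1 && errno == EINTR)
		;

	pthread_join(boot_thread, NULL);
	sem_destroy(&demo_init_sem);
	demo_is_running = booted;
	return booted ? 0 : -1;
}
```

The semaphore starts at 0, so the caller cannot get past sem_wait()
before the boot thread posts. lkl_start_kernel relies on the same
property when it calls sem_alloc(0) followed by sem_down().
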
 arch/um/lkl/include/asm/setup.h         |   7 +
 arch/um/lkl/include/uapi/asm/host_ops.h |  26 ++++
 arch/um/lkl/kernel/setup.c              | 193 ++++++++++++++++++++++++
 3 files changed, 226 insertions(+)
 create mode 100644 arch/um/lkl/include/asm/setup.h
 create mode 100644 arch/um/lkl/kernel/setup.c
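
Note: lkl_start_kernel formats the command line with vsnprintf and then
appends ops->virtio_devices at the returned offset. The sketch below
shows the same two steps in userspace C with the offset clamped first.
vsnprintf returns the length the output would have had without
truncation, so the offset can exceed the buffer size for a very long
format result. The demo_* names, the 64-byte buffer and the example
strings are hypothetical, and this is not a patch against setup.c.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define DEMO_CMDLINE_SIZE 64

static char demo_cmdline[DEMO_CMDLINE_SIZE];

/* Format the command line, then append an optional extra string. */
static int demo_build_cmdline(const char *extra, const char *fmt, ...)
{
	va_list ap;
	int len;

	assert(fmt != NULL);
	va_start(ap, fmt);
	len = vsnprintf(demo_cmdline, sizeof(demo_cmdline), fmt, ap);
	va_end(ap);
	if (len < 0)
		return -1;

	/* Clamp: len is the untruncated length, not what was stored. */
	if ((size_t)len >= sizeof(demo_cmdline))
		len = (int)sizeof(demo_cmdline) - 1;

	if (extra)
		snprintf(demo_cmdline + len, sizeof(demo_cmdline) - (size_t)len,
			 "%s", extra);

	return (int)strlen(demo_cmdline);
}
```

With the clamp in place, the remaining-space argument passed to the
append step can never go negative.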

diff --git a/arch/um/lkl/include/asm/setup.h b/arch/um/lkl/include/asm/setup.h
new file mode 100644
index 000000000000..b40955208cc6
--- /dev/null
+++ b/arch/um/lkl/include/asm/setup.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LKL_SETUP_H
+#define _ASM_LKL_SETUP_H
+
+#define COMMAND_LINE_SIZE 4096
+
+#endif
diff --git a/arch/um/lkl/include/uapi/asm/host_ops.h b/arch/um/lkl/include/uapi/asm/host_ops.h
index 6ae781419ce6..5f26e61f4b18 100644
--- a/arch/um/lkl/include/uapi/asm/host_ops.h
+++ b/arch/um/lkl/include/uapi/asm/host_ops.h
@@ -17,8 +17,14 @@ struct lkl_jmp_buf {
  * These operations must be provided by a host library or by the application
  * itself.
  *
+ * @virtio_devices - string containing the list of virtio devices in virtio mmio
+ * command line format. This string is appended to the kernel command line and
+ * is provided here for convenience to be implemented by the host library.
+ *
  * @print - optional operation that receives console messages
  *
+ * @panic - called during a kernel panic
+ *
  * @sem_alloc - allocate a host semaphore an initialize it to count
  * @sem_free - free a host semaphore
  * @sem_up - perform an up operation on the semaphore
@@ -78,7 +84,10 @@ struct lkl_jmp_buf {
  * @jmp_buf_longjmp - perform a jump back to the saved jump buffer
  */
 struct lkl_host_operations {
+	const char *virtio_devices;
+
 	void (*print)(const char *str, int len);
+	void (*panic)(void);
 
 	struct lkl_sem *(*sem_alloc)(int count);
 	void (*sem_free)(struct lkl_sem *sem);
@@ -121,6 +130,23 @@ struct lkl_host_operations {
 	void (*jmp_buf_longjmp)(struct lkl_jmp_buf *jmpb, int val);
 };
 
+/**
+ * lkl_start_kernel - registers the host operations and starts the kernel
+ *
+ * The function returns once the kernel has finished initializing.
+ *
+ * @lkl_ops - pointer to host operations
+ * @cmd_line - format for command line string that is going to be used to
+ * generate the Linux kernel command line
+ */
+int lkl_start_kernel(struct lkl_host_operations *lkl_ops, const char *cmd_line,
+		     ...);
+
+/**
+ * lkl_is_running - returns 1 if the kernel is currently running
+ */
+int lkl_is_running(void);
+
 int lkl_printf(const char *fmt, ...);
 void lkl_bug(const char *fmt, ...);
 
diff --git a/arch/um/lkl/kernel/setup.c b/arch/um/lkl/kernel/setup.c
new file mode 100644
index 000000000000..1bf973d36307
--- /dev/null
+++ b/arch/um/lkl/kernel/setup.c
@@ -0,0 +1,193 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/binfmts.h>
+#include <linux/init.h>
+#include <linux/init_task.h>
+#include <linux/personality.h>
+#include <linux/reboot.h>
+#include <linux/fs.h>
+#include <linux/start_kernel.h>
+#include <linux/syscalls.h>
+#include <linux/tick.h>
+#include <asm/host_ops.h>
+#include <asm/irq.h>
+#include <asm/unistd.h>
+#include <asm/syscalls.h>
+#include <asm/cpu.h>
+
+struct lkl_host_operations *lkl_ops;
+static char cmd_line[COMMAND_LINE_SIZE];
+static void *init_sem;
+static int is_running;
+void (*pm_power_off)(void) = NULL;
+static unsigned long mem_size = 64 * 1024 * 1024;
+
+static long lkl_panic_blink(int state)
+{
+	lkl_ops->panic();
+	return 0;
+}
+
+static int __init setup_mem_size(char *str)
+{
+	mem_size = memparse(str, NULL);
+	return 0;
+}
+early_param("mem", setup_mem_size);
+
+void __init setup_arch(char **cl)
+{
+	*cl = cmd_line;
+	panic_blink = lkl_panic_blink;
+	parse_early_param();
+	bootmem_init(mem_size);
+}
+
+static void __init lkl_run_kernel(void *arg)
+{
+	threads_init();
+	lkl_cpu_get();
+	start_kernel();
+}
+
+int __init lkl_start_kernel(struct lkl_host_operations *ops, const char *fmt,
+			    ...)
+{
+	va_list ap;
+	int ret;
+
+	lkl_ops = ops;
+
+	va_start(ap, fmt);
+	ret = vsnprintf(boot_command_line, COMMAND_LINE_SIZE, fmt, ap);
+	va_end(ap);
+
+	if (ops->virtio_devices)
+		strscpy(boot_command_line + ret, ops->virtio_devices,
+			COMMAND_LINE_SIZE - ret);
+
+	memcpy(cmd_line, boot_command_line, COMMAND_LINE_SIZE);
+
+	init_sem = lkl_ops->sem_alloc(0);
+	if (!init_sem)
+		return -ENOMEM;
+
+	ret = lkl_cpu_init();
+	if (ret)
+		goto out_free_init_sem;
+
+	ret = lkl_ops->thread_create(lkl_run_kernel, NULL);
+	if (!ret) {
+		ret = -ENOMEM;
+		goto out_free_init_sem;
+	}
+
+	lkl_ops->sem_down(init_sem);
+	lkl_ops->sem_free(init_sem);
+	current_thread_info()->tid = lkl_ops->thread_self();
+	lkl_cpu_change_owner(current_thread_info()->tid);
+
+	lkl_cpu_put();
+	is_running = 1;
+
+	return 0;
+
+out_free_init_sem:
+	lkl_ops->sem_free(init_sem);
+
+	return ret;
+}
+
+int lkl_is_running(void)
+{
+	return is_running;
+}
+
+void machine_halt(void)
+{
+	lkl_cpu_shutdown();
+}
+
+void machine_power_off(void)
+{
+	machine_halt();
+}
+
+void machine_restart(char *unused)
+{
+	machine_halt();
+}
+
+long lkl_sys_halt(void)
+{
+	long err;
+	long params[6] = {
+		LINUX_REBOOT_MAGIC1,
+		LINUX_REBOOT_MAGIC2,
+		LINUX_REBOOT_CMD_RESTART,
+	};
+
+	err = lkl_syscall(__NR_reboot, params);
+	if (err < 0)
+		return err;
+
+	is_running = false;
+
+	lkl_cpu_wait_shutdown();
+
+	syscalls_cleanup();
+	threads_cleanup();
+	/* Shutdown the clockevents source. */
+	tick_suspend_local();
+	free_mem();
+	lkl_ops->thread_join(current_thread_info()->tid);
+
+	return 0;
+}
+
+static int lkl_run_init(struct linux_binprm *bprm);
+
+static struct linux_binfmt lkl_run_init_binfmt = {
+	.module		= THIS_MODULE,
+	.load_binary	= lkl_run_init,
+};
+
+static int lkl_run_init(struct linux_binprm *bprm)
+{
+	int ret;
+
+	if (strcmp("/init", bprm->filename) != 0)
+		return -EINVAL;
+
+	ret = flush_old_exec(bprm);
+	if (ret)
+		return ret;
+	set_personality(PER_LINUX);
+	setup_new_exec(bprm);
+	install_exec_creds(bprm);
+
+	set_binfmt(&lkl_run_init_binfmt);
+
+	init_pid_ns.child_reaper = NULL;
+
+	syscalls_init();
+
+	lkl_ops->sem_up(init_sem);
+	lkl_ops->thread_exit();
+
+	return 0;
+}
+
+/* skip mounting the "real" rootfs. ramfs is good enough. */
+static int __init fs_setup(void)
+{
+	int fd;
+
+	fd = sys_open("/init", O_CREAT, 0700);
+	WARN_ON(fd < 0);
+	sys_close(fd);
+
+	register_binfmt(&lkl_run_init_binfmt);
+
+	return 0;
+}
+late_initcall(fs_setup);
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 13/37] lkl: plug in the build system
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Hajime Tazaki

From: Octavian Purdila <tavi.purdila@gmail.com>

Basic Makefiles for building LKL. Add a new architecture specific
target for installing the resulting library files and headers.

To build LKL binaries, this patch introduces a UMMODE make variable
that selects the build output: kernel (default) or library (LKL).  The
two modes are mutually exclusive.

To build in library mode, run:

  make defconfig ARCH=um UMMODE=library
  make ARCH=um UMMODE=library

Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
Signed-off-by: Akira Moroo <retrage01@gmail.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
---
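Note (illustration only, not part of the patch): the UMMODE dispatch
described in the commit message can be sketched in shell as below.  The
function name um_mode_makefile is hypothetical; the two paths and the
"kernel" default are taken from arch/um/Makefile in this patch.  Unlike
the real Makefile, which silently includes nothing for any other value,
the sketch rejects unknown modes; that behaviour is an assumption made
for clarity.

```shell
#!/bin/sh
# Sketch only: mirrors the UMMODE dispatch in arch/um/Makefile.
# um_mode_makefile is a hypothetical name, not part of this series.
um_mode_makefile() {
	mode="${1:-kernel}"	# same default as "UMMODE ?= kernel"
	case "$mode" in
	kernel)
		echo "arch/um/Makefile.um"
		;;
	library)
		echo "arch/um/lkl/Makefile"
		;;
	*)
		# Assumption: report unknown modes instead of ignoring them.
		echo "unknown UMMODE: $mode" >&2
		return 1
		;;
	esac
}
```

Calling um_mode_makefile with no argument corresponds to a plain
"make ARCH=um"; passing "library" corresponds to "UMMODE=library".
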
 arch/um/Kconfig             |  26 +++++++
 arch/um/Makefile            | 151 +++---------------------------------
 arch/um/Makefile.um         | 149 +++++++++++++++++++++++++++++++++++
 arch/um/lkl/Kconfig         |  17 ++--
 arch/um/lkl/Makefile        |  69 ++++++++++++++++
 arch/um/lkl/auto.conf       |   1 +
 arch/um/lkl/kernel/Makefile |   4 +
 arch/um/lkl/mm/Makefile     |   1 +
 8 files changed, 264 insertions(+), 154 deletions(-)
 create mode 100644 arch/um/Makefile.um
 create mode 100644 arch/um/lkl/Makefile
 create mode 100644 arch/um/lkl/auto.conf
 create mode 100644 arch/um/lkl/kernel/Makefile
 create mode 100644 arch/um/lkl/mm/Makefile

diff --git a/arch/um/Kconfig b/arch/um/Kconfig
index 3c3adfc486f2..c46bdb2987ce 100644
--- a/arch/um/Kconfig
+++ b/arch/um/Kconfig
@@ -5,6 +5,10 @@ menu "UML-specific options"
 config UML
 	bool
 	default y
+
+config UMMODE_KERN
+	bool "UML mode: kernel mode"
+	default y if "$(UMMODE)" = "kernel"
 	select ARCH_HAS_KCOV
 	select ARCH_NO_PREEMPT
 	select HAVE_ARCH_AUDITSYSCALL
@@ -18,7 +22,25 @@ config UML
 	select GENERIC_CLOCKEVENTS
 	select HAVE_GCC_PLUGINS
 	select TTY # Needed for line.c
+	help
+	  Build UML as a regular kernel executable (the traditional
+	  mode of UML).
+
+config UMMODE_LIB
+	bool "UML mode: library mode"
+	depends on !UMMODE_KERN
+	select LKL
+	default y if "$(UMMODE)" = "library"
+	help
+	  Build UML as a library (Linux Kernel Library/LKL).  This mode
+	  is mutually exclusive with "kernel mode", which is the
+	  traditional mode of UML.
+
+	  For more detail about LKL, see
+	  <file:Documentation/virt/uml/lkl.txt>.
 
+
+if UMMODE_KERN
 config MMU
 	bool
 	default y
@@ -196,6 +218,10 @@ config UML_TIME_TRAVEL_SUPPORT
 
 	  It is safe to say Y, but you probably don't need this.
 
+endif #UMMODE_KERN
+
 endmenu
 
 source "arch/um/drivers/Kconfig"
+
+source "arch/um/lkl/Kconfig"
diff --git a/arch/um/Makefile b/arch/um/Makefile
index d2daa206872d..d8cb874c8a53 100644
--- a/arch/um/Makefile
+++ b/arch/um/Makefile
@@ -1,148 +1,15 @@
-#
-# This file is included by the global makefile so that you can add your own
-# architecture-specific flags and dependencies.
-#
-# Copyright (C) 2002 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
-# Licensed under the GPL
-#
-
-# select defconfig based on actual architecture
-ifeq ($(SUBARCH),x86)
-  ifeq ($(shell uname -m),x86_64)
-        KBUILD_DEFCONFIG := x86_64_defconfig
-  else
-        KBUILD_DEFCONFIG := i386_defconfig
-  endif
-else
-        KBUILD_DEFCONFIG := $(SUBARCH)_defconfig
-endif
+# SPDX-License-Identifier: GPL-2.0
 
 ARCH_DIR := arch/um
-OS := $(shell uname -s)
-# We require bash because the vmlinux link and loader script cpp use bash
-# features.
-SHELL := /bin/bash
-
-core-y			+= $(ARCH_DIR)/kernel/		\
-			   $(ARCH_DIR)/drivers/		\
-			   $(ARCH_DIR)/os-$(OS)/
-
-MODE_INCLUDE	+= -I$(srctree)/$(ARCH_DIR)/include/shared/skas
 
-HEADER_ARCH 	:= $(SUBARCH)
+# select mode of UML build
+UMMODE ?= kernel
+include $(ARCH_DIR)/lkl/auto.conf
 
-ifneq ($(filter $(SUBARCH),x86 x86_64 i386),)
-	HEADER_ARCH := x86
+ifeq ($(UMMODE),kernel)
+	include $(ARCH_DIR)/Makefile.um
+else ifeq ($(UMMODE),library)
+	include $(ARCH_DIR)/lkl/Makefile
 endif
 
-ifdef CONFIG_64BIT
-	KBUILD_CFLAGS += -mcmodel=large
-endif
-
-HOST_DIR := arch/$(HEADER_ARCH)
-
-include $(ARCH_DIR)/Makefile-skas
-include $(HOST_DIR)/Makefile.um
-
-core-y += $(HOST_DIR)/um/
-
-SHARED_HEADERS	:= $(ARCH_DIR)/include/shared
-ARCH_INCLUDE	:= -I$(srctree)/$(SHARED_HEADERS)
-ARCH_INCLUDE	+= -I$(srctree)/$(HOST_DIR)/um/shared
-KBUILD_CPPFLAGS += -I$(srctree)/$(HOST_DIR)/um
-
-# -Dvmap=kernel_vmap prevents anything from referencing the libpcap.o symbol so
-# named - it's a common symbol in libpcap, so we get a binary which crashes.
-#
-# Same things for in6addr_loopback and mktime - found in libc. For these two we
-# only get link-time error, luckily.
-#
-# -Dlongjmp=kernel_longjmp prevents anything from referencing the libpthread.a
-# embedded copy of longjmp, same thing for setjmp.
-#
-# These apply to USER_CFLAGS to.
-
-KBUILD_CFLAGS += $(CFLAGS) $(CFLAGS-y) -D__arch_um__ \
-	$(ARCH_INCLUDE) $(MODE_INCLUDE) -Dvmap=kernel_vmap	\
-	-Dlongjmp=kernel_longjmp -Dsetjmp=kernel_setjmp \
-	-Din6addr_loopback=kernel_in6addr_loopback \
-	-Din6addr_any=kernel_in6addr_any -Dstrrchr=kernel_strrchr
-
-KBUILD_AFLAGS += $(ARCH_INCLUDE)
-
-USER_CFLAGS = $(patsubst $(KERNEL_DEFINES),,$(patsubst -I%,,$(KBUILD_CFLAGS))) \
-		$(ARCH_INCLUDE) $(MODE_INCLUDE) $(filter -I%,$(CFLAGS)) \
-		-D_FILE_OFFSET_BITS=64 -idirafter $(srctree)/include \
-		-idirafter $(objtree)/include -D__KERNEL__ -D__UM_HOST__
-
-#This will adjust *FLAGS accordingly to the platform.
-include $(ARCH_DIR)/Makefile-os-$(OS)
-
-KBUILD_CPPFLAGS += -I$(srctree)/$(HOST_DIR)/include \
-		   -I$(srctree)/$(HOST_DIR)/include/uapi \
-		   -I$(objtree)/$(HOST_DIR)/include/generated \
-		   -I$(objtree)/$(HOST_DIR)/include/generated/uapi
-
-# -Derrno=kernel_errno - This turns all kernel references to errno into
-# kernel_errno to separate them from the libc errno.  This allows -fno-common
-# in KBUILD_CFLAGS.  Otherwise, it would cause ld to complain about the two different
-# errnos.
-# These apply to kernelspace only.
-#
-# strip leading and trailing whitespace to make the USER_CFLAGS removal of these
-# defines more robust
-
-KERNEL_DEFINES = $(strip -Derrno=kernel_errno -Dsigprocmask=kernel_sigprocmask \
-			 -Dmktime=kernel_mktime $(ARCH_KERNEL_DEFINES))
-KBUILD_CFLAGS += $(KERNEL_DEFINES)
-
-PHONY += linux
-
-all: linux
-
-linux: vmlinux
-	@echo '  LINK $@'
-	$(Q)ln -f $< $@
-
-define archhelp
-  echo '* linux		- Binary kernel image (./linux) - for backward'
-  echo '		   compatibility only, this creates a hard link to the'
-  echo '		   real kernel binary, the "vmlinux" binary you'
-  echo '		   find in the kernel root.'
-endef
-
-archheaders:
-	$(Q)$(MAKE) -f $(srctree)/Makefile ARCH=$(HEADER_ARCH) asm-generic archheaders
-
-archprepare:
-	$(Q)$(MAKE) $(build)=$(HOST_DIR)/um include/generated/user_constants.h
-
-LINK-$(CONFIG_LD_SCRIPT_STATIC) += -static
-LINK-$(CONFIG_LD_SCRIPT_DYN) += -Wl,-rpath,/lib $(call cc-option, -no-pie)
-
-CFLAGS_NO_HARDENING := $(call cc-option, -fno-PIC,) $(call cc-option, -fno-pic,) \
-	$(call cc-option, -fno-stack-protector,) \
-	$(call cc-option, -fno-stack-protector-all,)
-
-# Options used by linker script
-export LDS_START      := $(START)
-export LDS_ELF_ARCH   := $(ELF_ARCH)
-export LDS_ELF_FORMAT := $(ELF_FORMAT)
-
-# The wrappers will select whether using "malloc" or the kernel allocator.
-LINK_WRAPS = -Wl,--wrap,malloc -Wl,--wrap,free -Wl,--wrap,calloc
-
-LD_FLAGS_CMDLINE = $(foreach opt,$(KBUILD_LDFLAGS),-Wl,$(opt))
-
-# Used by link-vmlinux.sh which has special support for um link
-export CFLAGS_vmlinux := $(LINK-y) $(LINK_WRAPS) $(LD_FLAGS_CMDLINE)
-
-# When cleaning we don't include .config, so we don't include
-# TT or skas makefiles and don't clean skas_ptregs.h.
-CLEAN_FILES += linux x.i gmon.out
-
-archclean:
-	@find . \( -name '*.bb' -o -name '*.bbg' -o -name '*.da' \
-		-o -name '*.gcov' \) -type f -print | xargs rm -f
-
-export HEADER_ARCH SUBARCH USER_CFLAGS CFLAGS_NO_HARDENING OS DEV_NULL_PATH
+export UMMODE HEADER_ARCH HOST_DIR SRCARCH
diff --git a/arch/um/Makefile.um b/arch/um/Makefile.um
new file mode 100644
index 000000000000..d54fd387a16f
--- /dev/null
+++ b/arch/um/Makefile.um
@@ -0,0 +1,149 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# This file is included by the global makefile so that you can add your own
+# architecture-specific flags and dependencies.
+#
+# Copyright (C) 2002 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
+# Licensed under the GPL
+#
+
+# select defconfig based on actual architecture
+ifeq ($(SUBARCH),x86)
+  ifeq ($(shell uname -m),x86_64)
+        KBUILD_DEFCONFIG := x86_64_defconfig
+  else
+        KBUILD_DEFCONFIG := i386_defconfig
+  endif
+else
+        KBUILD_DEFCONFIG := $(SUBARCH)_defconfig
+endif
+
+ARCH_DIR := arch/um
+OS := $(shell uname -s)
+# We require bash because the vmlinux link and loader script cpp use bash
+# features.
+SHELL := /bin/bash
+
+core-y			+= $(ARCH_DIR)/kernel/		\
+			   $(ARCH_DIR)/drivers/		\
+			   $(ARCH_DIR)/os-$(OS)/
+
+MODE_INCLUDE	+= -I$(srctree)/$(ARCH_DIR)/include/shared/skas
+
+HEADER_ARCH 	:= $(SUBARCH)
+
+ifneq ($(filter $(SUBARCH),x86 x86_64 i386),)
+	HEADER_ARCH := x86
+endif
+
+ifdef CONFIG_64BIT
+	KBUILD_CFLAGS += -mcmodel=large
+endif
+
+HOST_DIR := arch/$(HEADER_ARCH)
+
+include $(ARCH_DIR)/Makefile-skas
+include $(HOST_DIR)/Makefile.um
+
+core-y += $(HOST_DIR)/um/
+
+SHARED_HEADERS	:= $(ARCH_DIR)/include/shared
+ARCH_INCLUDE	:= -I$(srctree)/$(SHARED_HEADERS)
+ARCH_INCLUDE	+= -I$(srctree)/$(HOST_DIR)/um/shared
+KBUILD_CPPFLAGS += -I$(srctree)/$(HOST_DIR)/um
+
+# -Dvmap=kernel_vmap prevents anything from referencing the libpcap.o symbol so
+# named - it's a common symbol in libpcap, so we get a binary which crashes.
+#
+# Same things for in6addr_loopback and mktime - found in libc. For these two we
+# only get link-time error, luckily.
+#
+# -Dlongjmp=kernel_longjmp prevents anything from referencing the libpthread.a
+# embedded copy of longjmp, same thing for setjmp.
+#
+# These apply to USER_CFLAGS to.
+
+KBUILD_CFLAGS += $(CFLAGS) $(CFLAGS-y) -D__arch_um__ \
+	$(ARCH_INCLUDE) $(MODE_INCLUDE) -Dvmap=kernel_vmap	\
+	-Dlongjmp=kernel_longjmp -Dsetjmp=kernel_setjmp \
+	-Din6addr_loopback=kernel_in6addr_loopback \
+	-Din6addr_any=kernel_in6addr_any -Dstrrchr=kernel_strrchr
+
+KBUILD_AFLAGS += $(ARCH_INCLUDE)
+
+USER_CFLAGS = $(patsubst $(KERNEL_DEFINES),,$(patsubst -I%,,$(KBUILD_CFLAGS))) \
+		$(ARCH_INCLUDE) $(MODE_INCLUDE) $(filter -I%,$(CFLAGS)) \
+		-D_FILE_OFFSET_BITS=64 -idirafter $(srctree)/include \
+		-idirafter $(objtree)/include -D__KERNEL__ -D__UM_HOST__
+
+#This will adjust *FLAGS accordingly to the platform.
+include $(ARCH_DIR)/Makefile-os-$(OS)
+
+KBUILD_CPPFLAGS += -I$(srctree)/$(HOST_DIR)/include \
+		   -I$(srctree)/$(HOST_DIR)/include/uapi \
+		   -I$(objtree)/$(HOST_DIR)/include/generated \
+		   -I$(objtree)/$(HOST_DIR)/include/generated/uapi
+
+# -Derrno=kernel_errno - This turns all kernel references to errno into
+# kernel_errno to separate them from the libc errno.  This allows -fno-common
+# in KBUILD_CFLAGS.  Otherwise, it would cause ld to complain about the two different
+# errnos.
+# These apply to kernelspace only.
+#
+# strip leading and trailing whitespace to make the USER_CFLAGS removal of these
+# defines more robust
+
+KERNEL_DEFINES = $(strip -Derrno=kernel_errno -Dsigprocmask=kernel_sigprocmask \
+			 -Dmktime=kernel_mktime $(ARCH_KERNEL_DEFINES))
+KBUILD_CFLAGS += $(KERNEL_DEFINES)
+
+PHONY += linux
+
+all: linux
+
+linux: vmlinux
+	@echo '  LINK $@'
+	$(Q)ln -f $< $@
+
+define archhelp
+  echo '* linux		- Binary kernel image (./linux) - for backward'
+  echo '		   compatibility only, this creates a hard link to the'
+  echo '		   real kernel binary, the "vmlinux" binary you'
+  echo '		   find in the kernel root.'
+endef
+
+archheaders:
+	$(Q)$(MAKE) -f $(srctree)/Makefile ARCH=$(HEADER_ARCH) asm-generic archheaders
+
+archprepare:
+	$(Q)$(MAKE) $(build)=$(HOST_DIR)/um include/generated/user_constants.h
+
+LINK-$(CONFIG_LD_SCRIPT_STATIC) += -static
+LINK-$(CONFIG_LD_SCRIPT_DYN) += -Wl,-rpath,/lib $(call cc-option, -no-pie)
+
+CFLAGS_NO_HARDENING := $(call cc-option, -fno-PIC,) $(call cc-option, -fno-pic,) \
+	$(call cc-option, -fno-stack-protector,) \
+	$(call cc-option, -fno-stack-protector-all,)
+
+# Options used by linker script
+export LDS_START      := $(START)
+export LDS_ELF_ARCH   := $(ELF_ARCH)
+export LDS_ELF_FORMAT := $(ELF_FORMAT)
+
+# The wrappers will select whether using "malloc" or the kernel allocator.
+LINK_WRAPS = -Wl,--wrap,malloc -Wl,--wrap,free -Wl,--wrap,calloc
+
+LD_FLAGS_CMDLINE = $(foreach opt,$(KBUILD_LDFLAGS),-Wl,$(opt))
+
+# Used by link-vmlinux.sh which has special support for um link
+export CFLAGS_vmlinux := $(LINK-y) $(LINK_WRAPS) $(LD_FLAGS_CMDLINE)
+
+# When cleaning we don't include .config, so we don't include
+# TT or skas makefiles and don't clean skas_ptregs.h.
+CLEAN_FILES += linux x.i gmon.out
+
+archclean:
+	@find . \( -name '*.bb' -o -name '*.bbg' -o -name '*.da' \
+		-o -name '*.gcov' \) -type f -print | xargs rm -f
+
+export HEADER_ARCH SUBARCH USER_CFLAGS CFLAGS_NO_HARDENING OS DEV_NULL_PATH
diff --git a/arch/um/lkl/Kconfig b/arch/um/lkl/Kconfig
index 1dae70f16c43..07b3699095ae 100644
--- a/arch/um/lkl/Kconfig
+++ b/arch/um/lkl/Kconfig
@@ -1,6 +1,8 @@
 # SPDX-License-Identifier: GPL-2.0
 
-config UML_LKL
+menu "LKL-specific options"
+
+config LKL
        def_bool y
        depends on !SMP && !MMU && !COREDUMP && !SECCOMP && !UPROBES && !COMPAT && !USER_RETURN_NOTIFIER
        select ARCH_THREAD_STACK_ALLOCATOR
@@ -59,7 +61,7 @@ config BIG_ENDIAN
        def_bool n
 
 config GENERIC_CSUM
-       def_bool y
+       def_bool LKL
 
 config GENERIC_HWEIGHT
        def_bool y
@@ -83,13 +85,4 @@ config HZ
         int
         default 100
 
-config CONSOLE_LOGLEVEL_QUIET
-	int "quiet console loglevel (1-15)"
-	range 1 15
-	default "4"
-	help
-	  loglevel to use when "quiet" is passed on the kernel commandline.
-
-	  When "quiet" is passed on the kernel commandline this loglevel
-	  will be used as the loglevel. IOW passing "quiet" will be the
-	  equivalent of passing "loglevel=<CONSOLE_LOGLEVEL_QUIET>"
+endmenu
diff --git a/arch/um/lkl/Makefile b/arch/um/lkl/Makefile
new file mode 100644
index 000000000000..45af83c3825a
--- /dev/null
+++ b/arch/um/lkl/Makefile
@@ -0,0 +1,69 @@
+# SPDX-License-Identifier: GPL-2.0
+
+HOST_DIR := $(ARCH_DIR)/lkl
+include $(HOST_DIR)/auto.conf
+
+SRCARCH := um/lkl
+ARCH_INCLUDE += -I$(srctree)/$(HOST_DIR)/um/include
+LINUXINCLUDE := $(subst $(ARCH_DIR),$(HOST_DIR),$(LINUXINCLUDE)) $(ARCH_INCLUDE)
+KBUILD_CFLAGS += -fno-builtin -D__arch_um__
+KBUILD_DEFCONFIG := lkl_defconfig
+
+ifneq (,$(filter $(OUTPUT_FORMAT),elf64-x86-64 elf32-i386 elf64-x86-64-freebsd elf32-littlearm elf64-littleaarch64))
+KBUILD_CFLAGS += -fPIC
+else ifneq (,$(filter $(OUTPUT_FORMAT),pe-i386 pe-x86-64 ))
+ifneq ($(OUTPUT_FORMAT),pe-x86-64)
+prefix=_
+endif
+# workaround for #include_next<stdarg.h> errors
+LINUXINCLUDE := -isystem $(HOST_DIR)/include/system $(LINUXINCLUDE)
+# workaround for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52991
+KBUILD_CFLAGS += -mno-ms-bitfields
+else
+$(error Unrecognized platform: $(OUTPUT_FORMAT))
+endif
+
+ifeq ($(shell uname -s), Linux)
+NPROC=$(shell nproc)
+else # e.g., FreeBSD
+NPROC=$(shell sysctl -n hw.ncpu)
+endif
+
+LDFLAGS_vmlinux += -r
+LKL_ENTRY_POINTS := lkl_start_kernel lkl_sys_halt lkl_syscall lkl_trigger_irq \
+	lkl_get_free_irq lkl_put_irq lkl_is_running lkl_bug lkl_printf
+
+ifeq ($(OUTPUT_FORMAT),elf32-i386)
+LKL_ENTRY_POINTS += \
+	__x86.get_pc_thunk.bx __x86.get_pc_thunk.dx __x86.get_pc_thunk.ax \
+	__x86.get_pc_thunk.cx __x86.get_pc_thunk.si __x86.get_pc_thunk.di
+endif
+
+core-y += $(HOST_DIR)/kernel/
+core-y += $(HOST_DIR)/mm/
+
+all: lkl.o
+
+lkl.o: vmlinux
+	$(OBJCOPY) -R .eh_frame -R .syscall_defs $(foreach sym,$(LKL_ENTRY_POINTS),-G$(prefix)$(sym)) vmlinux lkl.o
+
+$(HOST_DIR)/include/generated/uapi/asm/syscall_defs.h: vmlinux
+	$(OBJCOPY) -j .syscall_defs -O binary --set-section-flags .syscall_defs=alloc $< $@
+	$(Q) export tmpfile=$(shell mktemp); \
+	sed 's/\x0//g' $@ > $$tmpfile; mv $$tmpfile $@ ; rm -f $$tmpfile
+
+install: lkl.o headers $(HOST_DIR)/include/generated/uapi/asm/syscall_defs.h
+	@echo "  INSTALL	$(INSTALL_PATH)/lib/lkl.o"
+	@mkdir -p $(INSTALL_PATH)/lib/
+	@cp lkl.o $(INSTALL_PATH)/lib/
+	@$(srctree)/$(HOST_DIR)/scripts/headers_install.py \
+		$(subst -j,-j$(NPROC),$(findstring -j,$(MAKEFLAGS))) \
+		$(INSTALL_PATH)/include
+
+archclean:
+	$(Q)rm -rf $(srctree)/$(HOST_DIR)/include/generated
+	$(Q)$(MAKE) $(clean)=$(boot)
+
+define archhelp
+  echo '  install	- Install library and headers to INSTALL_PATH/{lib,include}'
+endef
diff --git a/arch/um/lkl/auto.conf b/arch/um/lkl/auto.conf
new file mode 100644
index 000000000000..4bfd65a02d73
--- /dev/null
+++ b/arch/um/lkl/auto.conf
@@ -0,0 +1 @@
+export OUTPUT_FORMAT=$(shell $(LD) -r -print-output-format)
diff --git a/arch/um/lkl/kernel/Makefile b/arch/um/lkl/kernel/Makefile
new file mode 100644
index 000000000000..ef489f2f7176
--- /dev/null
+++ b/arch/um/lkl/kernel/Makefile
@@ -0,0 +1,4 @@
+extra-y := vmlinux.lds
+
+obj-y = setup.o threads.o irq.o time.o syscalls.o misc.o console.o \
+	syscalls_32.o cpu.o
diff --git a/arch/um/lkl/mm/Makefile b/arch/um/lkl/mm/Makefile
new file mode 100644
index 000000000000..2af6e3051897
--- /dev/null
+++ b/arch/um/lkl/mm/Makefile
@@ -0,0 +1 @@
+obj-y = bootmem.o
-- 
2.20.1 (Apple Git-117)
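
Note (illustration only, not part of the patch): the lkl.o rule in
arch/um/lkl/Makefile passes one -G option per entry point to objcopy, so
that only those symbols stay global, and it prepends "_" to each name
for pe-i386 only.  A minimal shell sketch of that expansion follows.  The
helper names lkl_symbol_prefix and lkl_keep_global_flags are
hypothetical; the output-format lists are copied from the Makefile above.

```shell
#!/bin/sh
# Sketch only: reproduces the prefix choice and the -G flag expansion
# from arch/um/lkl/Makefile. Helper names are hypothetical.

# Print the symbol prefix for a linker output format, or fail for
# formats the Makefile does not recognize.
lkl_symbol_prefix() {
	case "$1" in
	elf64-x86-64|elf32-i386|elf64-x86-64-freebsd|elf32-littlearm|elf64-littleaarch64)
		echo ""
		;;
	pe-x86-64)
		echo ""
		;;
	pe-i386)
		echo "_"
		;;
	*)
		echo "Unrecognized platform: $1" >&2
		return 1
		;;
	esac
}

# Usage: lkl_keep_global_flags <output-format> <symbol>...
# Prints the objcopy options that keep only the given symbols global.
lkl_keep_global_flags() {
	fmt="$1"
	shift
	prefix="$(lkl_symbol_prefix "$fmt")" || return 1
	flags=""
	for sym in "$@"; do
		flags="${flags:+$flags }-G${prefix}${sym}"
	done
	echo "$flags"
}
```

For example, "lkl_keep_global_flags pe-i386 lkl_syscall lkl_bug" yields
the options for two of the entry points listed in LKL_ENTRY_POINTS.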


* [RFC v2 13/37] lkl: plug in the build system
@ 2019-11-08  5:02     ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, linux-kernel-library, linux-arch,
	Hajime Tazaki, Akira Moroo

From: Octavian Purdila <tavi.purdila@gmail.com>

Add the basic Makefiles for building LKL, plus a new architecture-specific
target for installing the resulting library files and headers.

To build LKL binaries, UML gains an additional variable, UMMODE, which
selects the build output: kernel (the default) or library (LKL).  The two
modes are mutually exclusive.

To build in library mode, run:

  make defconfig ARCH=um UMMODE=library
  make ARCH=um UMMODE=library

Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
Signed-off-by: Akira Moroo <retrage01@gmail.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
---
 arch/um/Kconfig             |  26 +++++++
 arch/um/Makefile            | 151 +++---------------------------------
 arch/um/Makefile.um         | 149 +++++++++++++++++++++++++++++++++++
 arch/um/lkl/Kconfig         |  17 ++--
 arch/um/lkl/Makefile        |  69 ++++++++++++++++
 arch/um/lkl/auto.conf       |   1 +
 arch/um/lkl/kernel/Makefile |   4 +
 arch/um/lkl/mm/Makefile     |   1 +
 8 files changed, 264 insertions(+), 154 deletions(-)
 create mode 100644 arch/um/Makefile.um
 create mode 100644 arch/um/lkl/Makefile
 create mode 100644 arch/um/lkl/auto.conf
 create mode 100644 arch/um/lkl/kernel/Makefile
 create mode 100644 arch/um/lkl/mm/Makefile

diff --git a/arch/um/Kconfig b/arch/um/Kconfig
index 3c3adfc486f2..c46bdb2987ce 100644
--- a/arch/um/Kconfig
+++ b/arch/um/Kconfig
@@ -5,6 +5,10 @@ menu "UML-specific options"
 config UML
 	bool
 	default y
+
+config UMMODE_KERN
+	bool "UML mode: kernel mode"
+	default y if "$(UMMODE)" = "kernel"
 	select ARCH_HAS_KCOV
 	select ARCH_NO_PREEMPT
 	select HAVE_ARCH_AUDITSYSCALL
@@ -18,7 +22,25 @@ config UML
 	select GENERIC_CLOCKEVENTS
 	select HAVE_GCC_PLUGINS
 	select TTY # Needed for line.c
+	help
+	  Build UML as a regular kernel executable (the traditional
+	  mode of UML).
+
+config UMMODE_LIB
+	bool "UML mode: library mode"
+	depends on !UMMODE_KERN
+	select LKL
+	default y if "$(UMMODE)" = "library"
+	help
+	  Build UML as a library (Linux Kernel Library/LKL).  This mode
+	  is mutually exclusive with "kernel mode", which is the
+	  traditional mode of UML.
+
+	  For more detail about LKL, see
+	  <file:Documentation/virt/uml/lkl.txt>.
 
+
+if UMMODE_KERN
 config MMU
 	bool
 	default y
@@ -196,6 +218,10 @@ config UML_TIME_TRAVEL_SUPPORT
 
 	  It is safe to say Y, but you probably don't need this.
 
+endif #UMMODE_KERN
+
 endmenu
 
 source "arch/um/drivers/Kconfig"
+
+source "arch/um/lkl/Kconfig"
diff --git a/arch/um/Makefile b/arch/um/Makefile
index d2daa206872d..d8cb874c8a53 100644
--- a/arch/um/Makefile
+++ b/arch/um/Makefile
@@ -1,148 +1,15 @@
-#
-# This file is included by the global makefile so that you can add your own
-# architecture-specific flags and dependencies.
-#
-# Copyright (C) 2002 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
-# Licensed under the GPL
-#
-
-# select defconfig based on actual architecture
-ifeq ($(SUBARCH),x86)
-  ifeq ($(shell uname -m),x86_64)
-        KBUILD_DEFCONFIG := x86_64_defconfig
-  else
-        KBUILD_DEFCONFIG := i386_defconfig
-  endif
-else
-        KBUILD_DEFCONFIG := $(SUBARCH)_defconfig
-endif
+# SPDX-License-Identifier: GPL-2.0
 
 ARCH_DIR := arch/um
-OS := $(shell uname -s)
-# We require bash because the vmlinux link and loader script cpp use bash
-# features.
-SHELL := /bin/bash
-
-core-y			+= $(ARCH_DIR)/kernel/		\
-			   $(ARCH_DIR)/drivers/		\
-			   $(ARCH_DIR)/os-$(OS)/
-
-MODE_INCLUDE	+= -I$(srctree)/$(ARCH_DIR)/include/shared/skas
 
-HEADER_ARCH 	:= $(SUBARCH)
+# select mode of UML build
+UMMODE ?= kernel
+include $(ARCH_DIR)/lkl/auto.conf
 
-ifneq ($(filter $(SUBARCH),x86 x86_64 i386),)
-	HEADER_ARCH := x86
+ifeq ($(UMMODE),kernel)
+	include $(ARCH_DIR)/Makefile.um
+else ifeq ($(UMMODE),library)
+	include $(ARCH_DIR)/lkl/Makefile
 endif
 
-ifdef CONFIG_64BIT
-	KBUILD_CFLAGS += -mcmodel=large
-endif
-
-HOST_DIR := arch/$(HEADER_ARCH)
-
-include $(ARCH_DIR)/Makefile-skas
-include $(HOST_DIR)/Makefile.um
-
-core-y += $(HOST_DIR)/um/
-
-SHARED_HEADERS	:= $(ARCH_DIR)/include/shared
-ARCH_INCLUDE	:= -I$(srctree)/$(SHARED_HEADERS)
-ARCH_INCLUDE	+= -I$(srctree)/$(HOST_DIR)/um/shared
-KBUILD_CPPFLAGS += -I$(srctree)/$(HOST_DIR)/um
-
-# -Dvmap=kernel_vmap prevents anything from referencing the libpcap.o symbol so
-# named - it's a common symbol in libpcap, so we get a binary which crashes.
-#
-# Same things for in6addr_loopback and mktime - found in libc. For these two we
-# only get link-time error, luckily.
-#
-# -Dlongjmp=kernel_longjmp prevents anything from referencing the libpthread.a
-# embedded copy of longjmp, same thing for setjmp.
-#
-# These apply to USER_CFLAGS to.
-
-KBUILD_CFLAGS += $(CFLAGS) $(CFLAGS-y) -D__arch_um__ \
-	$(ARCH_INCLUDE) $(MODE_INCLUDE) -Dvmap=kernel_vmap	\
-	-Dlongjmp=kernel_longjmp -Dsetjmp=kernel_setjmp \
-	-Din6addr_loopback=kernel_in6addr_loopback \
-	-Din6addr_any=kernel_in6addr_any -Dstrrchr=kernel_strrchr
-
-KBUILD_AFLAGS += $(ARCH_INCLUDE)
-
-USER_CFLAGS = $(patsubst $(KERNEL_DEFINES),,$(patsubst -I%,,$(KBUILD_CFLAGS))) \
-		$(ARCH_INCLUDE) $(MODE_INCLUDE) $(filter -I%,$(CFLAGS)) \
-		-D_FILE_OFFSET_BITS=64 -idirafter $(srctree)/include \
-		-idirafter $(objtree)/include -D__KERNEL__ -D__UM_HOST__
-
-#This will adjust *FLAGS accordingly to the platform.
-include $(ARCH_DIR)/Makefile-os-$(OS)
-
-KBUILD_CPPFLAGS += -I$(srctree)/$(HOST_DIR)/include \
-		   -I$(srctree)/$(HOST_DIR)/include/uapi \
-		   -I$(objtree)/$(HOST_DIR)/include/generated \
-		   -I$(objtree)/$(HOST_DIR)/include/generated/uapi
-
-# -Derrno=kernel_errno - This turns all kernel references to errno into
-# kernel_errno to separate them from the libc errno.  This allows -fno-common
-# in KBUILD_CFLAGS.  Otherwise, it would cause ld to complain about the two different
-# errnos.
-# These apply to kernelspace only.
-#
-# strip leading and trailing whitespace to make the USER_CFLAGS removal of these
-# defines more robust
-
-KERNEL_DEFINES = $(strip -Derrno=kernel_errno -Dsigprocmask=kernel_sigprocmask \
-			 -Dmktime=kernel_mktime $(ARCH_KERNEL_DEFINES))
-KBUILD_CFLAGS += $(KERNEL_DEFINES)
-
-PHONY += linux
-
-all: linux
-
-linux: vmlinux
-	@echo '  LINK $@'
-	$(Q)ln -f $< $@
-
-define archhelp
-  echo '* linux		- Binary kernel image (./linux) - for backward'
-  echo '		   compatibility only, this creates a hard link to the'
-  echo '		   real kernel binary, the "vmlinux" binary you'
-  echo '		   find in the kernel root.'
-endef
-
-archheaders:
-	$(Q)$(MAKE) -f $(srctree)/Makefile ARCH=$(HEADER_ARCH) asm-generic archheaders
-
-archprepare:
-	$(Q)$(MAKE) $(build)=$(HOST_DIR)/um include/generated/user_constants.h
-
-LINK-$(CONFIG_LD_SCRIPT_STATIC) += -static
-LINK-$(CONFIG_LD_SCRIPT_DYN) += -Wl,-rpath,/lib $(call cc-option, -no-pie)
-
-CFLAGS_NO_HARDENING := $(call cc-option, -fno-PIC,) $(call cc-option, -fno-pic,) \
-	$(call cc-option, -fno-stack-protector,) \
-	$(call cc-option, -fno-stack-protector-all,)
-
-# Options used by linker script
-export LDS_START      := $(START)
-export LDS_ELF_ARCH   := $(ELF_ARCH)
-export LDS_ELF_FORMAT := $(ELF_FORMAT)
-
-# The wrappers will select whether using "malloc" or the kernel allocator.
-LINK_WRAPS = -Wl,--wrap,malloc -Wl,--wrap,free -Wl,--wrap,calloc
-
-LD_FLAGS_CMDLINE = $(foreach opt,$(KBUILD_LDFLAGS),-Wl,$(opt))
-
-# Used by link-vmlinux.sh which has special support for um link
-export CFLAGS_vmlinux := $(LINK-y) $(LINK_WRAPS) $(LD_FLAGS_CMDLINE)
-
-# When cleaning we don't include .config, so we don't include
-# TT or skas makefiles and don't clean skas_ptregs.h.
-CLEAN_FILES += linux x.i gmon.out
-
-archclean:
-	@find . \( -name '*.bb' -o -name '*.bbg' -o -name '*.da' \
-		-o -name '*.gcov' \) -type f -print | xargs rm -f
-
-export HEADER_ARCH SUBARCH USER_CFLAGS CFLAGS_NO_HARDENING OS DEV_NULL_PATH
+export UMMODE HEADER_ARCH HOST_DIR SRCARCH
diff --git a/arch/um/Makefile.um b/arch/um/Makefile.um
new file mode 100644
index 000000000000..d54fd387a16f
--- /dev/null
+++ b/arch/um/Makefile.um
@@ -0,0 +1,149 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# This file is included by the global makefile so that you can add your own
+# architecture-specific flags and dependencies.
+#
+# Copyright (C) 2002 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
+# Licensed under the GPL
+#
+
+# select defconfig based on actual architecture
+ifeq ($(SUBARCH),x86)
+  ifeq ($(shell uname -m),x86_64)
+        KBUILD_DEFCONFIG := x86_64_defconfig
+  else
+        KBUILD_DEFCONFIG := i386_defconfig
+  endif
+else
+        KBUILD_DEFCONFIG := $(SUBARCH)_defconfig
+endif
+
+ARCH_DIR := arch/um
+OS := $(shell uname -s)
+# We require bash because the vmlinux link and loader script cpp use bash
+# features.
+SHELL := /bin/bash
+
+core-y			+= $(ARCH_DIR)/kernel/		\
+			   $(ARCH_DIR)/drivers/		\
+			   $(ARCH_DIR)/os-$(OS)/
+
+MODE_INCLUDE	+= -I$(srctree)/$(ARCH_DIR)/include/shared/skas
+
+HEADER_ARCH 	:= $(SUBARCH)
+
+ifneq ($(filter $(SUBARCH),x86 x86_64 i386),)
+	HEADER_ARCH := x86
+endif
+
+ifdef CONFIG_64BIT
+	KBUILD_CFLAGS += -mcmodel=large
+endif
+
+HOST_DIR := arch/$(HEADER_ARCH)
+
+include $(ARCH_DIR)/Makefile-skas
+include $(HOST_DIR)/Makefile.um
+
+core-y += $(HOST_DIR)/um/
+
+SHARED_HEADERS	:= $(ARCH_DIR)/include/shared
+ARCH_INCLUDE	:= -I$(srctree)/$(SHARED_HEADERS)
+ARCH_INCLUDE	+= -I$(srctree)/$(HOST_DIR)/um/shared
+KBUILD_CPPFLAGS += -I$(srctree)/$(HOST_DIR)/um
+
+# -Dvmap=kernel_vmap prevents anything from referencing the libpcap.o symbol so
+# named - it's a common symbol in libpcap, so we get a binary which crashes.
+#
+# Same things for in6addr_loopback and mktime - found in libc. For these two we
+# only get link-time error, luckily.
+#
+# -Dlongjmp=kernel_longjmp prevents anything from referencing the libpthread.a
+# embedded copy of longjmp, same thing for setjmp.
+#
+# These apply to USER_CFLAGS to.
+
+KBUILD_CFLAGS += $(CFLAGS) $(CFLAGS-y) -D__arch_um__ \
+	$(ARCH_INCLUDE) $(MODE_INCLUDE) -Dvmap=kernel_vmap	\
+	-Dlongjmp=kernel_longjmp -Dsetjmp=kernel_setjmp \
+	-Din6addr_loopback=kernel_in6addr_loopback \
+	-Din6addr_any=kernel_in6addr_any -Dstrrchr=kernel_strrchr
+
+KBUILD_AFLAGS += $(ARCH_INCLUDE)
+
+USER_CFLAGS = $(patsubst $(KERNEL_DEFINES),,$(patsubst -I%,,$(KBUILD_CFLAGS))) \
+		$(ARCH_INCLUDE) $(MODE_INCLUDE) $(filter -I%,$(CFLAGS)) \
+		-D_FILE_OFFSET_BITS=64 -idirafter $(srctree)/include \
+		-idirafter $(objtree)/include -D__KERNEL__ -D__UM_HOST__
+
+#This will adjust *FLAGS accordingly to the platform.
+include $(ARCH_DIR)/Makefile-os-$(OS)
+
+KBUILD_CPPFLAGS += -I$(srctree)/$(HOST_DIR)/include \
+		   -I$(srctree)/$(HOST_DIR)/include/uapi \
+		   -I$(objtree)/$(HOST_DIR)/include/generated \
+		   -I$(objtree)/$(HOST_DIR)/include/generated/uapi
+
+# -Derrno=kernel_errno - This turns all kernel references to errno into
+# kernel_errno to separate them from the libc errno.  This allows -fno-common
+# in KBUILD_CFLAGS.  Otherwise, it would cause ld to complain about the two different
+# errnos.
+# These apply to kernelspace only.
+#
+# strip leading and trailing whitespace to make the USER_CFLAGS removal of these
+# defines more robust
+
+KERNEL_DEFINES = $(strip -Derrno=kernel_errno -Dsigprocmask=kernel_sigprocmask \
+			 -Dmktime=kernel_mktime $(ARCH_KERNEL_DEFINES))
+KBUILD_CFLAGS += $(KERNEL_DEFINES)
+
+PHONY += linux
+
+all: linux
+
+linux: vmlinux
+	@echo '  LINK $@'
+	$(Q)ln -f $< $@
+
+define archhelp
+  echo '* linux		- Binary kernel image (./linux) - for backward'
+  echo '		   compatibility only, this creates a hard link to the'
+  echo '		   real kernel binary, the "vmlinux" binary you'
+  echo '		   find in the kernel root.'
+endef
+
+archheaders:
+	$(Q)$(MAKE) -f $(srctree)/Makefile ARCH=$(HEADER_ARCH) asm-generic archheaders
+
+archprepare:
+	$(Q)$(MAKE) $(build)=$(HOST_DIR)/um include/generated/user_constants.h
+
+LINK-$(CONFIG_LD_SCRIPT_STATIC) += -static
+LINK-$(CONFIG_LD_SCRIPT_DYN) += -Wl,-rpath,/lib $(call cc-option, -no-pie)
+
+CFLAGS_NO_HARDENING := $(call cc-option, -fno-PIC,) $(call cc-option, -fno-pic,) \
+	$(call cc-option, -fno-stack-protector,) \
+	$(call cc-option, -fno-stack-protector-all,)
+
+# Options used by linker script
+export LDS_START      := $(START)
+export LDS_ELF_ARCH   := $(ELF_ARCH)
+export LDS_ELF_FORMAT := $(ELF_FORMAT)
+
+# The wrappers will select whether using "malloc" or the kernel allocator.
+LINK_WRAPS = -Wl,--wrap,malloc -Wl,--wrap,free -Wl,--wrap,calloc
+
+LD_FLAGS_CMDLINE = $(foreach opt,$(KBUILD_LDFLAGS),-Wl,$(opt))
+
+# Used by link-vmlinux.sh which has special support for um link
+export CFLAGS_vmlinux := $(LINK-y) $(LINK_WRAPS) $(LD_FLAGS_CMDLINE)
+
+# When cleaning we don't include .config, so we don't include
+# TT or skas makefiles and don't clean skas_ptregs.h.
+CLEAN_FILES += linux x.i gmon.out
+
+archclean:
+	@find . \( -name '*.bb' -o -name '*.bbg' -o -name '*.da' \
+		-o -name '*.gcov' \) -type f -print | xargs rm -f
+
+export HEADER_ARCH SUBARCH USER_CFLAGS CFLAGS_NO_HARDENING OS DEV_NULL_PATH
diff --git a/arch/um/lkl/Kconfig b/arch/um/lkl/Kconfig
index 1dae70f16c43..07b3699095ae 100644
--- a/arch/um/lkl/Kconfig
+++ b/arch/um/lkl/Kconfig
@@ -1,6 +1,8 @@
 # SPDX-License-Identifier: GPL-2.0
 
-config UML_LKL
+menu "LKL-specific options"
+
+config LKL
        def_bool y
        depends on !SMP && !MMU && !COREDUMP && !SECCOMP && !UPROBES && !COMPAT && !USER_RETURN_NOTIFIER
        select ARCH_THREAD_STACK_ALLOCATOR
@@ -59,7 +61,7 @@ config BIG_ENDIAN
        def_bool n
 
 config GENERIC_CSUM
-       def_bool y
+       def_bool LKL
 
 config GENERIC_HWEIGHT
        def_bool y
@@ -83,13 +85,4 @@ config HZ
         int
         default 100
 
-config CONSOLE_LOGLEVEL_QUIET
-	int "quiet console loglevel (1-15)"
-	range 1 15
-	default "4"
-	help
-	  loglevel to use when "quiet" is passed on the kernel commandline.
-
-	  When "quiet" is passed on the kernel commandline this loglevel
-	  will be used as the loglevel. IOW passing "quiet" will be the
-	  equivalent of passing "loglevel=<CONSOLE_LOGLEVEL_QUIET>"
+endmenu
diff --git a/arch/um/lkl/Makefile b/arch/um/lkl/Makefile
new file mode 100644
index 000000000000..45af83c3825a
--- /dev/null
+++ b/arch/um/lkl/Makefile
@@ -0,0 +1,69 @@
+# SPDX-License-Identifier: GPL-2.0
+
+HOST_DIR := $(ARCH_DIR)/lkl
+include $(HOST_DIR)/auto.conf
+
+SRCARCH := um/lkl
+ARCH_INCLUDE += -I$(srctree)/$(HOST_DIR)/um/include
+LINUXINCLUDE := $(subst $(ARCH_DIR),$(HOST_DIR),$(LINUXINCLUDE)) $(ARCH_INCLUDE)
+KBUILD_CFLAGS += -fno-builtin -D__arch_um__
+KBUILD_DEFCONFIG := lkl_defconfig
+
+ifneq (,$(filter $(OUTPUT_FORMAT),elf64-x86-64 elf32-i386 elf64-x86-64-freebsd elf32-littlearm elf64-littleaarch64))
+KBUILD_CFLAGS += -fPIC
+else ifneq (,$(filter $(OUTPUT_FORMAT),pe-i386 pe-x86-64 ))
+ifneq ($(OUTPUT_FORMAT),pe-x86-64)
+prefix=_
+endif
+# workaround for #include_next<stdarg.h> errors
+LINUXINCLUDE := -isystem $(HOST_DIR)/include/system $(LINUXINCLUDE)
+# workaround for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52991
+KBUILD_CFLAGS += -mno-ms-bitfields
+else
+$(error Unrecognized platform: $(OUTPUT_FORMAT))
+endif
+
+ifeq ($(shell uname -s), Linux)
+NPROC=$(shell nproc)
+else # e.g., FreeBSD
+NPROC=$(shell sysctl -n hw.ncpu)
+endif
+
+LDFLAGS_vmlinux += -r
+LKL_ENTRY_POINTS := lkl_start_kernel lkl_sys_halt lkl_syscall lkl_trigger_irq \
+	lkl_get_free_irq lkl_put_irq lkl_is_running lkl_bug lkl_printf
+
+ifeq ($(OUTPUT_FORMAT),elf32-i386)
+LKL_ENTRY_POINTS += \
+	__x86.get_pc_thunk.bx __x86.get_pc_thunk.dx __x86.get_pc_thunk.ax \
+	__x86.get_pc_thunk.cx __x86.get_pc_thunk.si __x86.get_pc_thunk.di
+endif
+
+core-y += $(HOST_DIR)/kernel/
+core-y += $(HOST_DIR)/mm/
+
+all: lkl.o
+
+lkl.o: vmlinux
+	$(OBJCOPY) -R .eh_frame -R .syscall_defs $(foreach sym,$(LKL_ENTRY_POINTS),-G$(prefix)$(sym)) vmlinux lkl.o
+
+$(HOST_DIR)/include/generated/uapi/asm/syscall_defs.h: vmlinux
+	$(OBJCOPY) -j .syscall_defs -O binary --set-section-flags .syscall_defs=alloc $< $@
+	$(Q) export tmpfile=$(shell mktemp); \
+	sed 's/\x0//g' $@ > $$tmpfile; mv $$tmpfile $@ ; rm -f $$tmpfile
+
+install: lkl.o headers $(HOST_DIR)/include/generated/uapi/asm/syscall_defs.h
+	@echo "  INSTALL	$(INSTALL_PATH)/lib/lkl.o"
+	@mkdir -p $(INSTALL_PATH)/lib/
+	@cp lkl.o $(INSTALL_PATH)/lib/
+	@$(srctree)/$(HOST_DIR)/scripts/headers_install.py \
+		$(subst -j,-j$(NPROC),$(findstring -j,$(MAKEFLAGS))) \
+		$(INSTALL_PATH)/include
+
+archclean:
+	$(Q)rm -rf $(srctree)/$(HOST_DIR)/include/generated
+	$(Q)$(MAKE) $(clean)=$(boot)
+
+define archhelp
+  echo '  install	- Install library and headers to INSTALL_PATH/{lib,include}'
+endef
diff --git a/arch/um/lkl/auto.conf b/arch/um/lkl/auto.conf
new file mode 100644
index 000000000000..4bfd65a02d73
--- /dev/null
+++ b/arch/um/lkl/auto.conf
@@ -0,0 +1 @@
+export OUTPUT_FORMAT=$(shell $(LD) -r -print-output-format)
diff --git a/arch/um/lkl/kernel/Makefile b/arch/um/lkl/kernel/Makefile
new file mode 100644
index 000000000000..ef489f2f7176
--- /dev/null
+++ b/arch/um/lkl/kernel/Makefile
@@ -0,0 +1,4 @@
+extra-y := vmlinux.lds
+
+obj-y = setup.o threads.o irq.o time.o syscalls.o misc.o console.o \
+	syscalls_32.o cpu.o
diff --git a/arch/um/lkl/mm/Makefile b/arch/um/lkl/mm/Makefile
new file mode 100644
index 000000000000..2af6e3051897
--- /dev/null
+++ b/arch/um/lkl/mm/Makefile
@@ -0,0 +1 @@
+obj-y = bootmem.o
-- 
2.20.1 (Apple Git-117)



* [RFC v2 14/37] lkl tools: skeleton for host side library, tests and tools
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Ben Wolsieffer, Conrad Meyer, Hajime Tazaki, Luca Dariz,
	Mark Stillwell, Michael Zimmermann, Motomu Utsumi,
	Patrick Collins, Petros Angelatos, Thomas Liebetraut, Xiao Jia,
	Yuan Liu

From: Octavian Purdila <tavi.purdila@gmail.com>

This patch adds the skeleton for the host library, tests and
application examples.

The host library implements the host operations needed by LKL. It is
split into host-dependent parts (tied to a specific kind of host, e.g.
POSIX hosts) and host-independent parts (which work on all supported
hosts).
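
As a rough illustration of that split, the sketch below puts the
host-dependent code behind a table of function pointers and lets the
host-independent code call only through that table. Everything named
demo_* or posix_* here is hypothetical and made up for this example; the
real struct lkl_host_operations is defined elsewhere in the series and
its members are not shown in this patch.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical, reduced operations table. This is NOT the real
 * struct lkl_host_operations; it only illustrates the idea.
 */
struct demo_host_ops {
	void *(*mem_alloc)(size_t size);
	void (*mem_free)(void *ptr);
	void (*print)(const char *str, int len);
};

/* Host-dependent part: a POSIX/libc backed implementation. */
static void posix_print(const char *str, int len)
{
	fwrite(str, 1, (size_t)len, stdout);
}

static struct demo_host_ops posix_ops = {
	.mem_alloc = malloc,
	.mem_free = free,
	.print = posix_print,
};

/*
 * Host-independent part: touches the host only through the ops table,
 * so it builds unchanged for any host that fills in the table.
 * Returns the number of bytes printed, or -1 if allocation failed.
 */
static int demo_log(const struct demo_host_ops *ops, const char *msg)
{
	size_t len = strlen(msg);
	char *copy = ops->mem_alloc(len + 1);

	if (!copy)
		return -1;
	memcpy(copy, msg, len + 1);
	ops->print(copy, (int)len);
	ops->mem_free(copy);
	return (int)len;
}
```

A caller on a POSIX host would pass &posix_ops to demo_log(); another
host would supply its own table with the same three members.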

Signed-off-by: Ben Wolsieffer <benwolsieffer@gmail.com>
Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Luca Dariz <luca.dariz@gmail.com>
Signed-off-by: Mark Stillwell <mark@stillwell.me>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Thomas Liebetraut <thomas@tommie-lie.de>
Signed-off-by: Xiao Jia <xiaoj@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/.gitignore         |   4 +
 tools/lkl/Build              |   0
 tools/lkl/Makefile           | 124 ++++++++++++
 tools/lkl/Makefile.autoconf  |  87 +++++++++
 tools/lkl/Targets            |   3 +
 tools/lkl/include/.gitignore |   1 +
 tools/lkl/include/lkl.h      | 358 +++++++++++++++++++++++++++++++++++
 tools/lkl/include/lkl_host.h |  19 ++
 tools/lkl/lib/.gitignore     |   3 +
 tools/lkl/lib/Build          |   1 +
 10 files changed, 600 insertions(+)
 create mode 100644 tools/lkl/.gitignore
 create mode 100644 tools/lkl/Build
 create mode 100644 tools/lkl/Makefile
 create mode 100644 tools/lkl/Makefile.autoconf
 create mode 100644 tools/lkl/Targets
 create mode 100644 tools/lkl/include/.gitignore
 create mode 100644 tools/lkl/include/lkl.h
 create mode 100644 tools/lkl/include/lkl_host.h
 create mode 100644 tools/lkl/lib/.gitignore
 create mode 100644 tools/lkl/lib/Build

diff --git a/tools/lkl/.gitignore b/tools/lkl/.gitignore
new file mode 100644
index 000000000000..1aed58bfe171
--- /dev/null
+++ b/tools/lkl/.gitignore
@@ -0,0 +1,4 @@
+Makefile.conf
+include/lkl_autoconf.h
+tests/autoconf.sh
+bin/stat
diff --git a/tools/lkl/Build b/tools/lkl/Build
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/tools/lkl/Makefile b/tools/lkl/Makefile
new file mode 100644
index 000000000000..6d6d2cead03f
--- /dev/null
+++ b/tools/lkl/Makefile
@@ -0,0 +1,124 @@
+# Do not use make's built-in rules
+# (this improves performance and avoids hard-to-debug behaviour);
+# also do not print "Entering directory..." messages from make
+.SUFFIXES:
+MAKEFLAGS += -r --no-print-directory
+
+KCONFIG?=defconfig
+
+ifneq ($(silent),1)
+  ifneq ($(V),1)
+	QUIET_AUTOCONF       = @echo '  AUTOCONF '$@;
+	Q = @
+  endif
+endif
+
+PREFIX   := /usr
+
+ifeq (,$(srctree))
+  srctree := $(patsubst %/,%,$(dir $(shell pwd)))
+  srctree := $(patsubst %/,%,$(dir $(srctree)))
+endif
+export srctree
+
+-include $(srctree)/tools/scripts/Makefile.include
+
+# OUTPUT fixup should be *after* include ../scripts/Makefile.include
+ifneq ($(OUTPUT),)
+  OUTPUT := $(OUTPUT)/tools/lkl/
+else
+  OUTPUT := $(CURDIR)/
+endif
+export OUTPUT
+
+
+all:
+
+conf: $(OUTPUT)Makefile.conf
+
+$(OUTPUT)Makefile.conf: Makefile.autoconf
+	$(call QUIET_AUTOCONF, headers)$(MAKE) -f Makefile.autoconf -s
+
+-include $(OUTPUT)Makefile.conf
+
+export CFLAGS += -I$(OUTPUT)/include -Iinclude -Wall -g -O2 -Wextra \
+	 -Wno-unused-parameter \
+	 -Wno-missing-field-initializers -fno-strict-aliasing
+
+-include Targets
+
+TARGETS := $(progs-y:%=$(OUTPUT)%$(EXESUF))
+TARGETS += $(libs-y:%=$(OUTPUT)%$(SOSUF))
+all: $(TARGETS)
+
+# this workaround is for FreeBSD
+bin/stat:
+ifeq ($(LKL_HOST_CONFIG_BSD),y)
+	$(Q)ln -sf `which gnustat` bin/stat
+	$(Q)ln -sf `which gsed` bin/sed
+else
+	$(Q)touch bin/stat
+endif
+
+# rule to build lkl.o
+$(OUTPUT)lib/lkl.o: bin/stat
+	$(Q)$(MAKE) -C $(srctree) ARCH=um UMMODE=library $(KOPT) $(KCONFIG)
+# this workaround is for arm32 linker (ld.gold)
+	$(Q)export PATH=$(shell pwd)/bin/:${PATH} ;\
+	$(MAKE) -C $(srctree) ARCH=um UMMODE=library $(KOPT) install INSTALL_PATH=$(OUTPUT)
+
+# rules to link libs
+$(OUTPUT)%$(SOSUF): LDFLAGS += -shared
+$(OUTPUT)%$(SOSUF): $(OUTPUT)%-in.o $(OUTPUT)liblkl.a
+	$(QUIET_LINK)$(CC) $(LDFLAGS) $(LDFLAGS_$*-y) -o $@ $^ $(LDLIBS) $(LDLIBS_$*-y)
+
+# liblkl is special
+$(OUTPUT)liblkl$(SOSUF): $(OUTPUT)%-in.o $(OUTPUT)lib/lkl.o
+$(OUTPUT)liblkl.a: $(OUTPUT)lib/liblkl-in.o $(OUTPUT)lib/lkl.o
+	$(QUIET_AR)$(AR) -rc $@ $^
+
+# rule to link programs
+$(OUTPUT)%$(EXESUF): $(OUTPUT)%-in.o $(OUTPUT)liblkl.a
+	$(QUIET_LINK)$(CC) $(LDFLAGS) $(LDFLAGS_$*-y) -o $@ $^ $(LDLIBS) $(LDLIBS_$*-y)
+
+# rule to build objects
+$(OUTPUT)%-in.o: $(OUTPUT)lib/lkl.o FORCE
+	$(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(patsubst %/,%,$(dir $*)) obj=$(notdir $*)
+
+
+clean:
+	$(call QUIET_CLEAN, objects)find $(OUTPUT) -name '*.o' -delete -o -name '\.*.cmd'\
+	 -delete -o -name '\.*.d' -delete
+	$(call QUIET_CLEAN, headers)$(RM) -r $(OUTPUT)/include/lkl/
+	$(call QUIET_CLEAN, liblkl.a)$(RM) $(OUTPUT)/liblkl.a
+	$(call QUIET_CLEAN, targets)$(RM) $(TARGETS) bin/stat
+
+clean-conf: clean
+	$(call QUIET_CLEAN, Makefile.conf)$(RM) $(OUTPUT)/Makefile.conf
+
+headers_install: $(TARGETS)
+	$(call QUIET_INSTALL, headers) \
+	    install -d $(DESTDIR)$(PREFIX)/include ; \
+	    install -m 644 include/lkl.h include/lkl_host.h $(OUTPUT)include/lkl_autoconf.h \
+	      include/lkl_config.h $(DESTDIR)$(PREFIX)/include ; \
+	    cp -r $(OUTPUT)include/lkl $(DESTDIR)$(PREFIX)/include
+
+libraries_install: $(libs-y:%=$(OUTPUT)%$(SOSUF)) $(OUTPUT)liblkl.a
+	$(call QUIET_INSTALL, libraries) \
+	    install -d $(DESTDIR)$(PREFIX)/lib ; \
+	    install -m 644 $^ $(DESTDIR)$(PREFIX)/lib
+
+programs_install: $(progs-y:%=$(OUTPUT)%$(EXESUF))
+	$(call QUIET_INSTALL, programs) \
+	    install -d $(DESTDIR)$(PREFIX)/bin ; \
+	    install -m 755 $^ $(DESTDIR)$(PREFIX)/bin
+
+install: headers_install libraries_install programs_install
+
+
+FORCE: ;
+.PHONY: all clean FORCE
+.PHONY: headers_install libraries_install programs_install install
+.NOTPARALLEL : lib/lkl.o
+.SECONDARY:
+
diff --git a/tools/lkl/Makefile.autoconf b/tools/lkl/Makefile.autoconf
new file mode 100644
index 000000000000..fd63b8aa5c77
--- /dev/null
+++ b/tools/lkl/Makefile.autoconf
@@ -0,0 +1,87 @@
+POSIX_HOSTS=elf64-x86-64 elf32-i386 elf64-x86-64-freebsd
+
+define set_autoconf_var
+  $(shell echo "#define LKL_HOST_CONFIG_$(1) $(2)" \
+	  >> $(OUTPUT)/include/lkl_autoconf.h)
+  $(shell echo "LKL_HOST_CONFIG_$(1)=$(2)" >> $(OUTPUT)/tests/autoconf.sh)
+  export LKL_HOST_CONFIG_$(1)=$(2)
+endef
+
+define find_include
+  $(eval include_paths=$(shell $(CC) -E -Wp,-v -xc /dev/null 2>&1 | grep '^ '))
+  $(foreach f, $(include_paths), $(wildcard $(f)/$(1)))
+endef
+
+define is_defined
+$(shell $(CC) -dM -E - </dev/null | grep $(1))
+endef
+
+define bsd_host
+  $(call set_autoconf_var,BSD,y)
+endef
+
+define arm_host
+  $(call set_autoconf_var,ARM,y)
+endef
+
+define aarch64_host
+  $(call set_autoconf_var,AARCH64,y)
+endef
+
+define virtio_net_dpdk
+  $(call set_autoconf_var,VIRTIO_NET_DPDK,y)
+  RTE_SDK ?= $(OUTPUT)/dpdk-17.02
+  RTE_TARGET ?= build
+  DPDK_LIBS = -lrte_pmd_vmxnet3_uio -lrte_pmd_ixgbe -lrte_pmd_e1000
+  DPDK_LIBS += -lrte_pmd_virtio
+  DPDK_LIBS += -lrte_timer -lrte_hash -lrte_mbuf -lrte_ethdev -lrte_eal
+  DPDK_LIBS += -lrte_mempool -lrte_ring -lrte_pmd_ring
+  DPDK_LIBS += -lrte_kvargs -lrte_net
+  CFLAGS += -I$$(RTE_SDK)/$$(RTE_TARGET)/include -msse4.2 -mpopcnt
+  LDFLAGS +=-L$$(RTE_SDK)/$$(RTE_TARGET)/lib
+  LDFLAGS +=-Wl,--whole-archive $$(DPDK_LIBS) -Wl,--no-whole-archive -lm -ldl
+endef
+
+define virtio_net_vde
+  $(call set_autoconf_var,VIRTIO_NET_VDE,y)
+  LDLIBS += $(shell pkg-config --libs vdeplug)
+endef
+
+define posix_host
+  $(call set_autoconf_var,POSIX,y)
+  $(call set_autoconf_var,VIRTIO_NET,y)
+  LDFLAGS += -pie
+  CFLAGS += -fPIC -pthread
+  SOSUF := .so
+  $(if $(filter $(1),elf64-x86-64-freebsd),$(call bsd_host))
+  $(if $(filter $(1),elf32-littlearm),$(call arm_host))
+  $(if $(filter $(1),elf64-littleaarch64),$(call aarch64_host))
+  $(if $(filter yes,$(dpdk)),$(call virtio_net_dpdk))
+  $(if $(filter yes,$(vde)),$(call virtio_net_vde))
+  $(if $(strip $(call find_include,fuse.h)),$(call set_autoconf_var,FUSE,y))
+  $(if $(strip $(call find_include,archive.h)),$(call set_autoconf_var,ARCHIVE,y))
+  $(if $(strip $(call find_include,linux/if_tun.h)),$(call set_autoconf_var,VIRTIO_NET_MACVTAP,y))
+  $(if $(filter $(1),elf64-x86-64-freebsd),$(call set_autoconf_var,NEEDS_LARGP,y))
+  $(if $(filter $(1),elf32-i386),$(call set_autoconf_var,I386,y))
+endef
+
+define do_autoconf
+  export CROSS_COMPILE := $(CROSS_COMPILE)
+  export CC := $(CROSS_COMPILE)gcc
+  export LD := $(CROSS_COMPILE)ld
+  export AR := $(CROSS_COMPILE)ar
+  $(eval LD := $(CROSS_COMPILE)ld)
+  $(eval CC := $(CROSS_COMPILE)gcc)
+  $(eval LD_FMT := $(shell $(LD) -r -print-output-format))
+  $(if $(filter $(LD_FMT),$(POSIX_HOSTS)),$(call posix_host,$(LD_FMT)))
+endef
+
+export do_autoconf
+
+
+$(OUTPUT)Makefile.conf: Makefile.autoconf
+	$(shell mkdir -p $(OUTPUT)/include)
+	$(shell mkdir -p $(OUTPUT)/tests)
+	$(shell echo -n "" > $(OUTPUT)/include/lkl_autoconf.h)
+	$(shell echo -n "" > $(OUTPUT)/tests/autoconf.sh)
+	@echo "$$do_autoconf" > $(OUTPUT)/Makefile.conf
diff --git a/tools/lkl/Targets b/tools/lkl/Targets
new file mode 100644
index 000000000000..24c985e64638
--- /dev/null
+++ b/tools/lkl/Targets
@@ -0,0 +1,3 @@
+libs-y += lib/liblkl
+
+
diff --git a/tools/lkl/include/.gitignore b/tools/lkl/include/.gitignore
new file mode 100644
index 000000000000..c41a463c898d
--- /dev/null
+++ b/tools/lkl/include/.gitignore
@@ -0,0 +1 @@
+lkl/
\ No newline at end of file
diff --git a/tools/lkl/include/lkl.h b/tools/lkl/include/lkl.h
new file mode 100644
index 000000000000..76da534a85f1
--- /dev/null
+++ b/tools/lkl/include/lkl.h
@@ -0,0 +1,358 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_H
+#define _LKL_H
+
+#include "lkl_autoconf.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define _LKL_LIBC_COMPAT_H
+
+#ifdef __cplusplus
+#define class __lkl__class
+#endif
+
+/*
+ * Avoid collisions between Android which defines __unused and
+ * linux/icmp.h which uses __unused as a structure field.
+ */
+#pragma push_macro("__unused")
+#undef __unused
+
+#include <lkl/asm/syscalls.h>
+
+#pragma pop_macro("__unused")
+
+#ifdef __cplusplus
+#undef class
+#endif
+
+#if defined(__MINGW32__)
+#define strtok_r strtok_s
+#define inet_pton lkl_inet_pton
+
+int inet_pton(int af, const char *src, void *dst);
+#endif
+
+#if __LKL__BITS_PER_LONG == 64
+#define lkl_sys_fstatat lkl_sys_newfstatat
+#define lkl_sys_fstat lkl_sys_newfstat
+
+#else
+#define __lkl__NR_fcntl __lkl__NR_fcntl64
+
+#define lkl_stat lkl_stat64
+#define lkl_sys_stat lkl_sys_stat64
+#define lkl_sys_lstat lkl_sys_lstat64
+#define lkl_sys_truncate lkl_sys_truncate64
+#define lkl_sys_ftruncate lkl_sys_ftruncate64
+#define lkl_sys_sendfile lkl_sys_sendfile64
+#define lkl_sys_fstatat lkl_sys_fstatat64
+#define lkl_sys_fstat lkl_sys_fstat64
+#define lkl_sys_fcntl lkl_sys_fcntl64
+
+#define lkl_statfs lkl_statfs64
+
+static inline int lkl_sys_statfs(const char *path, struct lkl_statfs *buf)
+{
+	return lkl_sys_statfs64(path, sizeof(*buf), buf);
+}
+
+static inline int lkl_sys_fstatfs(unsigned int fd, struct lkl_statfs *buf)
+{
+	return lkl_sys_fstatfs64(fd, sizeof(*buf), buf);
+}
+
+#define lkl_sys_nanosleep lkl_sys_nanosleep_time32
+static inline int lkl_sys_nanosleep_time32(struct lkl_timespec *rqtp,
+					   struct lkl_timespec *rmtp)
+{
+	long p[6] = {(long)rqtp, (long)rmtp, 0, 0, 0, 0};
+
+	return lkl_syscall(__lkl__NR_nanosleep, p);
+}
+
+#endif
+
+static inline int lkl_sys_stat(const char *path, struct lkl_stat *buf)
+{
+	return lkl_sys_fstatat(LKL_AT_FDCWD, path, buf, 0);
+}
+
+static inline int lkl_sys_lstat(const char *path, struct lkl_stat *buf)
+{
+	return lkl_sys_fstatat(LKL_AT_FDCWD, path, buf,
+			       LKL_AT_SYMLINK_NOFOLLOW);
+}
+
+#ifdef __lkl__NR_llseek
+/**
+ * lkl_sys_lseek - wrapper for lkl_sys_llseek
+ */
+static inline long long lkl_sys_lseek(unsigned int fd, __lkl__kernel_loff_t off,
+				      unsigned int whence)
+{
+	long long res;
+	long ret = lkl_sys_llseek(fd, off >> 32, off & 0xffffffff, &res,
+				  whence);
+
+	return ret < 0 ? ret : res;
+}
+#endif
+
+static inline void *lkl_sys_mmap(void *addr, size_t length, int prot, int flags,
+				 int fd, off_t offset)
+{
+	return (void *)lkl_sys_mmap_pgoff((long)addr, length, prot, flags, fd,
+					  offset >> 12);
+}
+
+#define lkl_sys_mmap2 lkl_sys_mmap_pgoff
+
+#ifdef __lkl__NR_openat
+/**
+ * lkl_sys_open - wrapper for lkl_sys_openat
+ */
+static inline long lkl_sys_open(const char *file, int flags, int mode)
+{
+	return lkl_sys_openat(LKL_AT_FDCWD, file, flags, mode);
+}
+
+/**
+ * lkl_sys_creat - wrapper for lkl_sys_openat
+ */
+static inline long lkl_sys_creat(const char *file, int mode)
+{
+	return lkl_sys_openat(LKL_AT_FDCWD, file,
+			      LKL_O_CREAT|LKL_O_WRONLY|LKL_O_TRUNC, mode);
+}
+#endif
+
+
+#ifdef __lkl__NR_faccessat
+/**
+ * lkl_sys_access - wrapper for lkl_sys_faccessat
+ */
+static inline long lkl_sys_access(const char *file, int mode)
+{
+	return lkl_sys_faccessat(LKL_AT_FDCWD, file, mode);
+}
+#endif
+
+#ifdef __lkl__NR_fchownat
+/**
+ * lkl_sys_chown - wrapper for lkl_sys_fchownat
+ */
+static inline long lkl_sys_chown(const char *path, lkl_uid_t uid, lkl_gid_t gid)
+{
+	return lkl_sys_fchownat(LKL_AT_FDCWD, path, uid, gid, 0);
+}
+#endif
+
+#ifdef __lkl__NR_fchmodat
+/**
+ * lkl_sys_chmod - wrapper for lkl_sys_fchmodat
+ */
+static inline long lkl_sys_chmod(const char *path, mode_t mode)
+{
+	return lkl_sys_fchmodat(LKL_AT_FDCWD, path, mode);
+}
+#endif
+
+#ifdef __lkl__NR_linkat
+/**
+ * lkl_sys_link - wrapper for lkl_sys_linkat
+ */
+static inline long lkl_sys_link(const char *existing, const char *new)
+{
+	return lkl_sys_linkat(LKL_AT_FDCWD, existing, LKL_AT_FDCWD, new, 0);
+}
+#endif
+
+#ifdef __lkl__NR_unlinkat
+/**
+ * lkl_sys_unlink - wrapper for lkl_sys_unlinkat
+ */
+static inline long lkl_sys_unlink(const char *path)
+{
+	return lkl_sys_unlinkat(LKL_AT_FDCWD, path, 0);
+}
+#endif
+
+#ifdef __lkl__NR_symlinkat
+/**
+ * lkl_sys_symlink - wrapper for lkl_sys_symlinkat
+ */
+static inline long lkl_sys_symlink(const char *existing, const char *new)
+{
+	return lkl_sys_symlinkat(existing, LKL_AT_FDCWD, new);
+}
+#endif
+
+#ifdef __lkl__NR_readlinkat
+/**
+ * lkl_sys_readlink - wrapper for lkl_sys_readlinkat
+ */
+static inline long lkl_sys_readlink(const char *path, char *buf, size_t bufsize)
+{
+	return lkl_sys_readlinkat(LKL_AT_FDCWD, path, buf, bufsize);
+}
+#endif
+
+#ifdef __lkl__NR_renameat
+/**
+ * lkl_sys_rename - wrapper for lkl_sys_renameat
+ */
+static inline long lkl_sys_rename(const char *old, const char *new)
+{
+	return lkl_sys_renameat(LKL_AT_FDCWD, old, LKL_AT_FDCWD, new);
+}
+#endif
+
+#ifdef __lkl__NR_mkdirat
+/**
+ * lkl_sys_mkdir - wrapper for lkl_sys_mkdirat
+ */
+static inline long lkl_sys_mkdir(const char *path, mode_t mode)
+{
+	return lkl_sys_mkdirat(LKL_AT_FDCWD, path, mode);
+}
+#endif
+
+#ifdef __lkl__NR_unlinkat
+/**
+ * lkl_sys_rmdir - wrapper for lkl_sys_unlinkat
+ */
+static inline long lkl_sys_rmdir(const char *path)
+{
+	return lkl_sys_unlinkat(LKL_AT_FDCWD, path, LKL_AT_REMOVEDIR);
+}
+#endif
+
+#ifdef __lkl__NR_mknodat
+/**
+ * lkl_sys_mknod - wrapper for lkl_sys_mknodat
+ */
+static inline long lkl_sys_mknod(const char *path, mode_t mode, dev_t dev)
+{
+	return lkl_sys_mknodat(LKL_AT_FDCWD, path, mode, dev);
+}
+#endif
+
+#ifdef __lkl__NR_pipe2
+/**
+ * lkl_sys_pipe - wrapper for lkl_sys_pipe2
+ */
+static inline long lkl_sys_pipe(int fd[2])
+{
+	return lkl_sys_pipe2(fd, 0);
+}
+#endif
+
+#ifdef __lkl__NR_sendto
+/**
+ * lkl_sys_send - wrapper for lkl_sys_sendto
+ */
+static inline long lkl_sys_send(int fd, void *buf, size_t len, int flags)
+{
+	return lkl_sys_sendto(fd, buf, len, flags, 0, 0);
+}
+#endif
+
+#ifdef __lkl__NR_recvfrom
+/**
+ * lkl_sys_recv - wrapper for lkl_sys_recvfrom
+ */
+static inline long lkl_sys_recv(int fd, void *buf, size_t len, int flags)
+{
+	return lkl_sys_recvfrom(fd, buf, len, flags, 0, 0);
+}
+#endif
+
+#ifdef __lkl__NR_pselect6
+/**
+ * lkl_sys_select - wrapper for lkl_sys_pselect6
+ */
+static inline long lkl_sys_select(int n, lkl_fd_set *rfds, lkl_fd_set *wfds,
+				  lkl_fd_set *efds, struct lkl_timeval *tv)
+{
+	long data[2] = { 0, _LKL_NSIG/8 };
+	struct lkl_timespec ts;
+	lkl_time_t extra_secs;
+	const lkl_time_t max_time = (1ULL << (8 * sizeof(lkl_time_t) - 1)) - 1;
+
+	if (tv) {
+		if (tv->tv_sec < 0 || tv->tv_usec < 0)
+			return -LKL_EINVAL;
+
+		extra_secs = tv->tv_usec / 1000000;
+		ts.tv_nsec = tv->tv_usec % 1000000 * 1000;
+		ts.tv_sec = extra_secs > max_time - tv->tv_sec ?
+			max_time : tv->tv_sec + extra_secs;
+	}
+	return lkl_sys_pselect6(n, rfds, wfds, efds, tv ?
+				(struct __lkl__kernel_timespec *)&ts : 0, data);
+}
+#endif
+
+#ifdef __lkl__NR_ppoll
+/**
+ * lkl_sys_poll - wrapper for lkl_sys_ppoll
+ */
+static inline long lkl_sys_poll(struct lkl_pollfd *fds, int n, int timeout)
+{
+	return lkl_sys_ppoll(fds, n, timeout >= 0 ?
+			     (struct __lkl__kernel_timespec *)
+			     &((struct lkl_timespec){ .tv_sec = timeout/1000,
+				   .tv_nsec = timeout%1000*1000000 }) : 0,
+			     0, _LKL_NSIG/8);
+}
+#endif
+
+#ifdef __lkl__NR_epoll_create1
+/**
+ * lkl_sys_epoll_create - wrapper for lkl_sys_epoll_create1
+ */
+static inline long lkl_sys_epoll_create(int size)
+{
+	return lkl_sys_epoll_create1(0);
+}
+#endif
+
+#ifdef __lkl__NR_epoll_pwait
+/**
+ * lkl_sys_epoll_wait - wrapper for lkl_sys_epoll_pwait
+ */
+static inline long lkl_sys_epoll_wait(int fd, struct lkl_epoll_event *ev,
+				      int cnt, int to)
+{
+	return lkl_sys_epoll_pwait(fd, ev, cnt, to, 0, _LKL_NSIG/8);
+}
+#endif
+
+
+
+/**
+ * lkl_strerror - returns a string describing the given error code
+ *
+ * @err - error code
+ * @returns - string for the given error code
+ */
+const char *lkl_strerror(int err);
+
+/**
+ * lkl_perror - prints a string describing the given error code
+ *
+ * @msg - prefix for the error message
+ * @err - error code
+ */
+void lkl_perror(char *msg, int err);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/tools/lkl/include/lkl_host.h b/tools/lkl/include/lkl_host.h
new file mode 100644
index 000000000000..b5f96096fe69
--- /dev/null
+++ b/tools/lkl/include/lkl_host.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_HOST_H
+#define _LKL_HOST_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <lkl/asm/host_ops.h>
+#include <lkl.h>
+
+extern struct lkl_host_operations lkl_host_ops;
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/tools/lkl/lib/.gitignore b/tools/lkl/lib/.gitignore
new file mode 100644
index 000000000000..427ae0273fdd
--- /dev/null
+++ b/tools/lkl/lib/.gitignore
@@ -0,0 +1,3 @@
+lkl.o
+liblkl.a
+
diff --git a/tools/lkl/lib/Build b/tools/lkl/lib/Build
new file mode 100644
index 000000000000..8b137891791f
--- /dev/null
+++ b/tools/lkl/lib/Build
@@ -0,0 +1 @@
+
-- 
2.20.1 (Apple Git-117)
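
For readers following the lkl_sys_select() wrapper in lkl.h above, the
timeval-to-timespec conversion it performs before calling pselect6 can
be tried standalone. This is only a sketch under stated assumptions: the
demo_* types are hypothetical stand-ins for struct lkl_timeval and
struct lkl_timespec, the seconds field is assumed to be a signed 64-bit
integer (so the clamp value is INT64_MAX), and -1 is returned where the
header returns -LKL_EINVAL.

```c
#include <stdint.h>

/* Hypothetical stand-ins for struct lkl_timeval / struct lkl_timespec. */
struct demo_timeval {
	int64_t tv_sec;
	int64_t tv_usec;
};

struct demo_timespec {
	int64_t tv_sec;
	int64_t tv_nsec;
};

/*
 * Convert a select()-style timeout to a pselect()-style one.
 * Whole seconds hidden in tv_usec are carried into tv_sec, and the
 * result saturates at the largest representable second count instead
 * of overflowing. Returns 0 on success, -1 for a negative field.
 */
static int demo_tv_to_ts(const struct demo_timeval *tv,
			 struct demo_timespec *ts)
{
	const int64_t max_time = INT64_MAX; /* assumption: 64-bit signed */
	int64_t extra_secs;

	if (tv->tv_sec < 0 || tv->tv_usec < 0)
		return -1;

	extra_secs = tv->tv_usec / 1000000;
	ts->tv_nsec = tv->tv_usec % 1000000 * 1000;
	ts->tv_sec = extra_secs > max_time - tv->tv_sec ?
		max_time : tv->tv_sec + extra_secs;
	return 0;
}
```

The overflow check is written as "extra_secs > max_time - tv_sec" rather
than "tv_sec + extra_secs > max_time" so that the comparison itself
cannot overflow a signed type.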


* [RFC v2 14/37] lkl tools: skeleton for host side library, tests and tools
@ 2019-11-08  5:02     ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: linux-arch, Patrick Collins, Xiao Jia, Conrad Meyer,
	Octavian Purdila, Motomu Utsumi, Akira Moroo, Petros Angelatos,
	Yuan Liu, Thomas Liebetraut, Mark Stillwell, Ben Wolsieffer,
	linux-kernel-library, Michael Zimmermann, Luca Dariz,
	Hajime Tazaki

From: Octavian Purdila <tavi.purdila@gmail.com>

This patch adds the skeleton for the host library, tests and
application examples.

The host library implements the host operations needed by LKL. It is
split into host-dependent parts (tied to a specific kind of host, e.g.
POSIX hosts) and host-independent parts (which work on all supported
hosts).

Signed-off-by: Ben Wolsieffer <benwolsieffer@gmail.com>
Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Luca Dariz <luca.dariz@gmail.com>
Signed-off-by: Mark Stillwell <mark@stillwell.me>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Thomas Liebetraut <thomas@tommie-lie.de>
Signed-off-by: Xiao Jia <xiaoj@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/.gitignore         |   4 +
 tools/lkl/Build              |   0
 tools/lkl/Makefile           | 124 ++++++++++++
 tools/lkl/Makefile.autoconf  |  87 +++++++++
 tools/lkl/Targets            |   3 +
 tools/lkl/include/.gitignore |   1 +
 tools/lkl/include/lkl.h      | 358 +++++++++++++++++++++++++++++++++++
 tools/lkl/include/lkl_host.h |  19 ++
 tools/lkl/lib/.gitignore     |   3 +
 tools/lkl/lib/Build          |   1 +
 10 files changed, 600 insertions(+)
 create mode 100644 tools/lkl/.gitignore
 create mode 100644 tools/lkl/Build
 create mode 100644 tools/lkl/Makefile
 create mode 100644 tools/lkl/Makefile.autoconf
 create mode 100644 tools/lkl/Targets
 create mode 100644 tools/lkl/include/.gitignore
 create mode 100644 tools/lkl/include/lkl.h
 create mode 100644 tools/lkl/include/lkl_host.h
 create mode 100644 tools/lkl/lib/.gitignore
 create mode 100644 tools/lkl/lib/Build

diff --git a/tools/lkl/.gitignore b/tools/lkl/.gitignore
new file mode 100644
index 000000000000..1aed58bfe171
--- /dev/null
+++ b/tools/lkl/.gitignore
@@ -0,0 +1,4 @@
+Makefile.conf
+include/lkl_autoconf.h
+tests/autoconf.sh
+bin/stat
diff --git a/tools/lkl/Build b/tools/lkl/Build
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/tools/lkl/Makefile b/tools/lkl/Makefile
new file mode 100644
index 000000000000..6d6d2cead03f
--- /dev/null
+++ b/tools/lkl/Makefile
@@ -0,0 +1,124 @@
+# Do not use make's built-in rules
+# (this improves performance and avoids hard-to-debug behaviour);
+# also do not print "Entering directory..." messages from make
+.SUFFIXES:
+MAKEFLAGS += -r --no-print-directory
+
+KCONFIG?=defconfig
+
+ifneq ($(silent),1)
+  ifneq ($(V),1)
+	QUIET_AUTOCONF       = @echo '  AUTOCONF '$@;
+	Q = @
+  endif
+endif
+
+PREFIX   := /usr
+
+ifeq (,$(srctree))
+  srctree := $(patsubst %/,%,$(dir $(shell pwd)))
+  srctree := $(patsubst %/,%,$(dir $(srctree)))
+endif
+export srctree
+
+-include $(srctree)/tools/scripts/Makefile.include
+
+# OUTPUT fixup should be *after* include ../scripts/Makefile.include
+ifneq ($(OUTPUT),)
+  OUTPUT := $(OUTPUT)/tools/lkl/
+else
+  OUTPUT := $(CURDIR)/
+endif
+export OUTPUT
+
+
+all:
+
+conf: $(OUTPUT)Makefile.conf
+
+$(OUTPUT)Makefile.conf: Makefile.autoconf
+	$(call QUIET_AUTOCONF, headers)$(MAKE) -f Makefile.autoconf -s
+
+-include $(OUTPUT)Makefile.conf
+
+export CFLAGS += -I$(OUTPUT)/include -Iinclude -Wall -g -O2 -Wextra \
+	 -Wno-unused-parameter \
+	 -Wno-missing-field-initializers -fno-strict-aliasing
+
+-include Targets
+
+TARGETS := $(progs-y:%=$(OUTPUT)%$(EXESUF))
+TARGETS += $(libs-y:%=$(OUTPUT)%$(SOSUF))
+all: $(TARGETS)
+
+# this workaround is for FreeBSD
+bin/stat:
+ifeq ($(LKL_HOST_CONFIG_BSD),y)
+	$(Q)ln -sf `which gnustat` bin/stat
+	$(Q)ln -sf `which gsed` bin/sed
+else
+	$(Q)touch bin/stat
+endif
+
+# rule to build lkl.o
+$(OUTPUT)lib/lkl.o: bin/stat
+	$(Q)$(MAKE) -C $(srctree) ARCH=um UMMODE=library $(KOPT) $(KCONFIG)
+# this workaround is for arm32 linker (ld.gold)
+	$(Q)export PATH=$(shell pwd)/bin/:${PATH} ;\
+	$(MAKE) -C $(srctree) ARCH=um UMMODE=library $(KOPT) install INSTALL_PATH=$(OUTPUT)
+
+# rules to link libs
+$(OUTPUT)%$(SOSUF): LDFLAGS += -shared
+$(OUTPUT)%$(SOSUF): $(OUTPUT)%-in.o $(OUTPUT)liblkl.a
+	$(QUIET_LINK)$(CC) $(LDFLAGS) $(LDFLAGS_$*-y) -o $@ $^ $(LDLIBS) $(LDLIBS_$*-y)
+
+# liblkl is special
+$(OUTPUT)liblkl$(SOSUF): $(OUTPUT)%-in.o $(OUTPUT)lib/lkl.o
+$(OUTPUT)liblkl.a: $(OUTPUT)lib/liblkl-in.o $(OUTPUT)lib/lkl.o
+	$(QUIET_AR)$(AR) -rc $@ $^
+
+# rule to link programs
+$(OUTPUT)%$(EXESUF): $(OUTPUT)%-in.o $(OUTPUT)liblkl.a
+	$(QUIET_LINK)$(CC) $(LDFLAGS) $(LDFLAGS_$*-y) -o $@ $^ $(LDLIBS) $(LDLIBS_$*-y)
+
+# rule to build objects
+$(OUTPUT)%-in.o: $(OUTPUT)lib/lkl.o FORCE
+	$(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(patsubst %/,%,$(dir $*)) obj=$(notdir $*)
+
+
+clean:
+	$(call QUIET_CLEAN, objects)find $(OUTPUT) -name '*.o' -delete -o -name '\.*.cmd'\
+	 -delete -o -name '\.*.d' -delete
+	$(call QUIET_CLEAN, headers)$(RM) -r $(OUTPUT)/include/lkl/
+	$(call QUIET_CLEAN, liblkl.a)$(RM) $(OUTPUT)/liblkl.a
+	$(call QUIET_CLEAN, targets)$(RM) $(TARGETS) bin/stat
+
+clean-conf: clean
+	$(call QUIET_CLEAN, Makefile.conf)$(RM) $(OUTPUT)/Makefile.conf
+
+headers_install: $(TARGETS)
+	$(call QUIET_INSTALL, headers) \
+	    install -d $(DESTDIR)$(PREFIX)/include ; \
+	    install -m 644 include/lkl.h include/lkl_host.h $(OUTPUT)include/lkl_autoconf.h \
+	      include/lkl_config.h $(DESTDIR)$(PREFIX)/include ; \
+	    cp -r $(OUTPUT)include/lkl $(DESTDIR)$(PREFIX)/include
+
+libraries_install: $(libs-y:%=$(OUTPUT)%$(SOSUF)) $(OUTPUT)liblkl.a
+	$(call QUIET_INSTALL, libraries) \
+	    install -d $(DESTDIR)$(PREFIX)/lib ; \
+	    install -m 644 $^ $(DESTDIR)$(PREFIX)/lib
+
+programs_install: $(progs-y:%=$(OUTPUT)%$(EXESUF))
+	$(call QUIET_INSTALL, programs) \
+	    install -d $(DESTDIR)$(PREFIX)/bin ; \
+	    install -m 755 $^ $(DESTDIR)$(PREFIX)/bin
+
+install: headers_install libraries_install programs_install
+
+
+FORCE: ;
+.PHONY: all clean FORCE
+.PHONY: headers_install libraries_install programs_install install
+.NOTPARALLEL : lib/lkl.o
+.SECONDARY:
+
diff --git a/tools/lkl/Makefile.autoconf b/tools/lkl/Makefile.autoconf
new file mode 100644
index 000000000000..fd63b8aa5c77
--- /dev/null
+++ b/tools/lkl/Makefile.autoconf
@@ -0,0 +1,87 @@
+POSIX_HOSTS=elf64-x86-64 elf32-i386 elf64-x86-64-freebsd
+
+define set_autoconf_var
+  $(shell echo "#define LKL_HOST_CONFIG_$(1) $(2)" \
+	  >> $(OUTPUT)/include/lkl_autoconf.h)
+  $(shell echo "LKL_HOST_CONFIG_$(1)=$(2)" >> $(OUTPUT)/tests/autoconf.sh)
+  export LKL_HOST_CONFIG_$(1)=$(2)
+endef
+
+define find_include
+  $(eval include_paths=$(shell $(CC) -E -Wp,-v -xc /dev/null 2>&1 | grep '^ '))
+  $(foreach f, $(include_paths), $(wildcard $(f)/$(1)))
+endef
+
+define is_defined
+$(shell $(CC) -dM -E - </dev/null | grep $(1))
+endef
+
+define bsd_host
+  $(call set_autoconf_var,BSD,y)
+endef
+
+define arm_host
+  $(call set_autoconf_var,ARM,y)
+endef
+
+define aarch64_host
+  $(call set_autoconf_var,AARCH64,y)
+endef
+
+define virtio_net_dpdk
+  $(call set_autoconf_var,VIRTIO_NET_DPDK,y)
+  RTE_SDK ?= $(OUTPUT)/dpdk-17.02
+  RTE_TARGET ?= build
+  DPDK_LIBS = -lrte_pmd_vmxnet3_uio -lrte_pmd_ixgbe -lrte_pmd_e1000
+  DPDK_LIBS += -lrte_pmd_virtio
+  DPDK_LIBS += -lrte_timer -lrte_hash -lrte_mbuf -lrte_ethdev -lrte_eal
+  DPDK_LIBS += -lrte_mempool -lrte_ring -lrte_pmd_ring
+  DPDK_LIBS += -lrte_kvargs -lrte_net
+  CFLAGS += -I$$(RTE_SDK)/$$(RTE_TARGET)/include -msse4.2 -mpopcnt
+  LDFLAGS +=-L$$(RTE_SDK)/$$(RTE_TARGET)/lib
+  LDFLAGS +=-Wl,--whole-archive $$(DPDK_LIBS) -Wl,--no-whole-archive -lm -ldl
+endef
+
+define virtio_net_vde
+  $(call set_autoconf_var,VIRTIO_NET_VDE,y)
+  LDLIBS += $(shell pkg-config --libs vdeplug)
+endef
+
+define posix_host
+  $(call set_autoconf_var,POSIX,y)
+  $(call set_autoconf_var,VIRTIO_NET,y)
+  LDFLAGS += -pie
+  CFLAGS += -fPIC -pthread
+  SOSUF := .so
+  $(if $(filter $(1),elf64-x86-64-freebsd),$(call bsd_host))
+  $(if $(filter $(1),elf32-littlearm),$(call arm_host))
+  $(if $(filter $(1),elf64-littleaarch64),$(call aarch64_host))
+  $(if $(filter yes,$(dpdk)),$(call virtio_net_dpdk))
+  $(if $(filter yes,$(vde)),$(call virtio_net_vde))
+  $(if $(strip $(call find_include,fuse.h)),$(call set_autoconf_var,FUSE,y))
+  $(if $(strip $(call find_include,archive.h)),$(call set_autoconf_var,ARCHIVE,y))
+  $(if $(strip $(call find_include,linux/if_tun.h)),$(call set_autoconf_var,VIRTIO_NET_MACVTAP,y))
+  $(if $(filter $(1),elf64-x86-64-freebsd),$(call set_autoconf_var,NEEDS_LARGP,y))
+  $(if $(filter $(1),elf32-i386),$(call set_autoconf_var,I386,y))
+endef
+
+define do_autoconf
+  export CROSS_COMPILE := $(CROSS_COMPILE)
+  export CC := $(CROSS_COMPILE)gcc
+  export LD := $(CROSS_COMPILE)ld
+  export AR := $(CROSS_COMPILE)ar
+  $(eval LD := $(CROSS_COMPILE)ld)
+  $(eval CC := $(CROSS_COMPILE)gcc)
+  $(eval LD_FMT := $(shell $(LD) -r -print-output-format))
+  $(if $(filter $(LD_FMT),$(POSIX_HOSTS)),$(call posix_host,$(LD_FMT)))
+endef
+
+export do_autoconf
+
+
+$(OUTPUT)Makefile.conf: Makefile.autoconf
+	$(shell mkdir -p $(OUTPUT)/include)
+	$(shell mkdir -p $(OUTPUT)/tests)
+	$(shell echo -n "" > $(OUTPUT)/include/lkl_autoconf.h)
+	$(shell echo -n "" > $(OUTPUT)/tests/autoconf.sh)
+	@echo "$$do_autoconf" > $(OUTPUT)/Makefile.conf
diff --git a/tools/lkl/Targets b/tools/lkl/Targets
new file mode 100644
index 000000000000..24c985e64638
--- /dev/null
+++ b/tools/lkl/Targets
@@ -0,0 +1,3 @@
+libs-y += lib/liblkl
+
+
diff --git a/tools/lkl/include/.gitignore b/tools/lkl/include/.gitignore
new file mode 100644
index 000000000000..c41a463c898d
--- /dev/null
+++ b/tools/lkl/include/.gitignore
@@ -0,0 +1 @@
+lkl/
\ No newline at end of file
diff --git a/tools/lkl/include/lkl.h b/tools/lkl/include/lkl.h
new file mode 100644
index 000000000000..76da534a85f1
--- /dev/null
+++ b/tools/lkl/include/lkl.h
@@ -0,0 +1,358 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_H
+#define _LKL_H
+
+#include "lkl_autoconf.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define _LKL_LIBC_COMPAT_H
+
+#ifdef __cplusplus
+#define class __lkl__class
+#endif
+
+/*
+ * Avoid collisions between Android which defines __unused and
+ * linux/icmp.h which uses __unused as a structure field.
+ */
+#pragma push_macro("__unused")
+#undef __unused
+
+#include <lkl/asm/syscalls.h>
+
+#pragma pop_macro("__unused")
+
+#ifdef __cplusplus
+#undef class
+#endif
+
+#if defined(__MINGW32__)
+#define strtok_r strtok_s
+#define inet_pton lkl_inet_pton
+
+int inet_pton(int af, const char *src, void *dst);
+#endif
+
+#if __LKL__BITS_PER_LONG == 64
+#define lkl_sys_fstatat lkl_sys_newfstatat
+#define lkl_sys_fstat lkl_sys_newfstat
+
+#else
+#define __lkl__NR_fcntl __lkl__NR_fcntl64
+
+#define lkl_stat lkl_stat64
+#define lkl_sys_stat lkl_sys_stat64
+#define lkl_sys_lstat lkl_sys_lstat64
+#define lkl_sys_truncate lkl_sys_truncate64
+#define lkl_sys_ftruncate lkl_sys_ftruncate64
+#define lkl_sys_sendfile lkl_sys_sendfile64
+#define lkl_sys_fstatat lkl_sys_fstatat64
+#define lkl_sys_fstat lkl_sys_fstat64
+#define lkl_sys_fcntl lkl_sys_fcntl64
+
+#define lkl_statfs lkl_statfs64
+
+static inline int lkl_sys_statfs(const char *path, struct lkl_statfs *buf)
+{
+	return lkl_sys_statfs64(path, sizeof(*buf), buf);
+}
+
+static inline int lkl_sys_fstatfs(unsigned int fd, struct lkl_statfs *buf)
+{
+	return lkl_sys_fstatfs64(fd, sizeof(*buf), buf);
+}
+
+#define lkl_sys_nanosleep lkl_sys_nanosleep_time32
+static inline int lkl_sys_nanosleep_time32(struct lkl_timespec *rqtp,
+					   struct lkl_timespec *rmtp)
+{
+	long p[6] = {(long)rqtp, (long)rmtp, 0, 0, 0, 0};
+
+	return lkl_syscall(__lkl__NR_nanosleep, p);
+}
+
+#endif
+
+static inline int lkl_sys_stat(const char *path, struct lkl_stat *buf)
+{
+	return lkl_sys_fstatat(LKL_AT_FDCWD, path, buf, 0);
+}
+
+static inline int lkl_sys_lstat(const char *path, struct lkl_stat *buf)
+{
+	return lkl_sys_fstatat(LKL_AT_FDCWD, path, buf,
+			       LKL_AT_SYMLINK_NOFOLLOW);
+}
+
+#ifdef __lkl__NR_llseek
+/**
+ * lkl_sys_lseek - wrapper for lkl_sys_llseek
+ */
+static inline long long lkl_sys_lseek(unsigned int fd, __lkl__kernel_loff_t off,
+				      unsigned int whence)
+{
+	long long res;
+	long ret = lkl_sys_llseek(fd, off >> 32, off & 0xffffffff, &res,
+				  whence);
+
+	return ret < 0 ? ret : res;
+}
+#endif
+
+static inline void *lkl_sys_mmap(void *addr, size_t length, int prot, int flags,
+				 int fd, off_t offset)
+{
+	return (void *)lkl_sys_mmap_pgoff((long)addr, length, prot, flags, fd,
+					  offset >> 12);
+}
+
+#define lkl_sys_mmap2 lkl_sys_mmap_pgoff
+
+#ifdef __lkl__NR_openat
+/**
+ * lkl_sys_open - wrapper for lkl_sys_openat
+ */
+static inline long lkl_sys_open(const char *file, int flags, int mode)
+{
+	return lkl_sys_openat(LKL_AT_FDCWD, file, flags, mode);
+}
+
+/**
+ * lkl_sys_creat - wrapper for lkl_sys_openat
+ */
+static inline long lkl_sys_creat(const char *file, int mode)
+{
+	return lkl_sys_openat(LKL_AT_FDCWD, file,
+			      LKL_O_CREAT|LKL_O_WRONLY|LKL_O_TRUNC, mode);
+}
+#endif
+
+
+#ifdef __lkl__NR_faccessat
+/**
+ * lkl_sys_access - wrapper for lkl_sys_faccessat
+ */
+static inline long lkl_sys_access(const char *file, int mode)
+{
+	return lkl_sys_faccessat(LKL_AT_FDCWD, file, mode);
+}
+#endif
+
+#ifdef __lkl__NR_fchownat
+/**
+ * lkl_sys_chown - wrapper for lkl_sys_fchownat
+ */
+static inline long lkl_sys_chown(const char *path, lkl_uid_t uid, lkl_gid_t gid)
+{
+	return lkl_sys_fchownat(LKL_AT_FDCWD, path, uid, gid, 0);
+}
+#endif
+
+#ifdef __lkl__NR_fchmodat
+/**
+ * lkl_sys_chmod - wrapper for lkl_sys_fchmodat
+ */
+static inline long lkl_sys_chmod(const char *path, mode_t mode)
+{
+	return lkl_sys_fchmodat(LKL_AT_FDCWD, path, mode);
+}
+#endif
+
+#ifdef __lkl__NR_linkat
+/**
+ * lkl_sys_link - wrapper for lkl_sys_linkat
+ */
+static inline long lkl_sys_link(const char *existing, const char *new)
+{
+	return lkl_sys_linkat(LKL_AT_FDCWD, existing, LKL_AT_FDCWD, new, 0);
+}
+#endif
+
+#ifdef __lkl__NR_unlinkat
+/**
+ * lkl_sys_unlink - wrapper for lkl_sys_unlinkat
+ */
+static inline long lkl_sys_unlink(const char *path)
+{
+	return lkl_sys_unlinkat(LKL_AT_FDCWD, path, 0);
+}
+#endif
+
+#ifdef __lkl__NR_symlinkat
+/**
+ * lkl_sys_symlink - wrapper for lkl_sys_symlinkat
+ */
+static inline long lkl_sys_symlink(const char *existing, const char *new)
+{
+	return lkl_sys_symlinkat(existing, LKL_AT_FDCWD, new);
+}
+#endif
+
+#ifdef __lkl__NR_readlinkat
+/**
+ * lkl_sys_readlink - wrapper for lkl_sys_readlinkat
+ */
+static inline long lkl_sys_readlink(const char *path, char *buf, size_t bufsize)
+{
+	return lkl_sys_readlinkat(LKL_AT_FDCWD, path, buf, bufsize);
+}
+#endif
+
+#ifdef __lkl__NR_renameat
+/**
+ * lkl_sys_rename - wrapper for lkl_sys_renameat
+ */
+static inline long lkl_sys_rename(const char *old, const char *new)
+{
+	return lkl_sys_renameat(LKL_AT_FDCWD, old, LKL_AT_FDCWD, new);
+}
+#endif
+
+#ifdef __lkl__NR_mkdirat
+/**
+ * lkl_sys_mkdir - wrapper for lkl_sys_mkdirat
+ */
+static inline long lkl_sys_mkdir(const char *path, mode_t mode)
+{
+	return lkl_sys_mkdirat(LKL_AT_FDCWD, path, mode);
+}
+#endif
+
+#ifdef __lkl__NR_unlinkat
+/**
+ * lkl_sys_rmdir - wrapper for lkl_sys_unlinkat
+ */
+static inline long lkl_sys_rmdir(const char *path)
+{
+	return lkl_sys_unlinkat(LKL_AT_FDCWD, path, LKL_AT_REMOVEDIR);
+}
+#endif
+
+#ifdef __lkl__NR_mknodat
+/**
+ * lkl_sys_mknod - wrapper for lkl_sys_mknodat
+ */
+static inline long lkl_sys_mknod(const char *path, mode_t mode, dev_t dev)
+{
+	return lkl_sys_mknodat(LKL_AT_FDCWD, path, mode, dev);
+}
+#endif
+
+#ifdef __lkl__NR_pipe2
+/**
+ * lkl_sys_pipe - wrapper for lkl_sys_pipe2
+ */
+static inline long lkl_sys_pipe(int fd[2])
+{
+	return lkl_sys_pipe2(fd, 0);
+}
+#endif
+
+#ifdef __lkl__NR_sendto
+/**
+ * lkl_sys_send - wrapper for lkl_sys_sendto
+ */
+static inline long lkl_sys_send(int fd, void *buf, size_t len, int flags)
+{
+	return lkl_sys_sendto(fd, buf, len, flags, 0, 0);
+}
+#endif
+
+#ifdef __lkl__NR_recvfrom
+/**
+ * lkl_sys_recv - wrapper for lkl_sys_recvfrom
+ */
+static inline long lkl_sys_recv(int fd, void *buf, size_t len, int flags)
+{
+	return lkl_sys_recvfrom(fd, buf, len, flags, 0, 0);
+}
+#endif
+
+#ifdef __lkl__NR_pselect6
+/**
+ * lkl_sys_select - wrapper for lkl_sys_pselect6
+ */
+static inline long lkl_sys_select(int n, lkl_fd_set *rfds, lkl_fd_set *wfds,
+				  lkl_fd_set *efds, struct lkl_timeval *tv)
+{
+	long data[2] = { 0, _LKL_NSIG/8 };
+	struct lkl_timespec ts;
+	lkl_time_t extra_secs;
+	const lkl_time_t max_time = (1ULL << (8 * sizeof(time_t) - 1)) - 1;
+
+	if (tv) {
+		if (tv->tv_sec < 0 || tv->tv_usec < 0)
+			return -LKL_EINVAL;
+
+		extra_secs = tv->tv_usec / 1000000;
+		ts.tv_nsec = tv->tv_usec % 1000000 * 1000;
+		ts.tv_sec = extra_secs > max_time - tv->tv_sec ?
+			max_time : tv->tv_sec + extra_secs;
+	}
+	return lkl_sys_pselect6(n, rfds, wfds, efds, tv ?
+				(struct __lkl__kernel_timespec *)&ts : 0, data);
+}
+#endif
+
+#ifdef __lkl__NR_ppoll
+/**
+ * lkl_sys_poll - wrapper for lkl_sys_ppoll
+ */
+static inline long lkl_sys_poll(struct lkl_pollfd *fds, int n, int timeout)
+{
+	return lkl_sys_ppoll(fds, n, timeout >= 0 ?
+			     (struct __lkl__kernel_timespec *)
+			     &((struct lkl_timespec){ .tv_sec = timeout/1000,
+				   .tv_nsec = timeout%1000*1000000 }) : 0,
+			     0, _LKL_NSIG/8);
+}
+#endif
+
+#ifdef __lkl__NR_epoll_create1
+/**
+ * lkl_sys_epoll_create - wrapper for lkl_sys_epoll_create1
+ */
+static inline long lkl_sys_epoll_create(int size)
+{
+	return lkl_sys_epoll_create1(0);
+}
+#endif
+
+#ifdef __lkl__NR_epoll_pwait
+/**
+ * lkl_sys_epoll_wait - wrapper for lkl_sys_epoll_pwait
+ */
+static inline long lkl_sys_epoll_wait(int fd, struct lkl_epoll_event *ev,
+				      int cnt, int to)
+{
+	return lkl_sys_epoll_pwait(fd, ev, cnt, to, 0, _LKL_NSIG/8);
+}
+#endif
+
+
+
+/**
+ * lkl_strerror - returns a string describing the given error code
+ *
+ * @err - error code
+ * @returns - string for the given error code
+ */
+const char *lkl_strerror(int err);
+
+/**
+ * lkl_perror - prints a string describing the given error code
+ *
+ * @msg - prefix for the error message
+ * @err - error code
+ */
+void lkl_perror(char *msg, int err);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/tools/lkl/include/lkl_host.h b/tools/lkl/include/lkl_host.h
new file mode 100644
index 000000000000..b5f96096fe69
--- /dev/null
+++ b/tools/lkl/include/lkl_host.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_HOST_H
+#define _LKL_HOST_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <lkl/asm/host_ops.h>
+#include <lkl.h>
+
+extern struct lkl_host_operations lkl_host_ops;
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/tools/lkl/lib/.gitignore b/tools/lkl/lib/.gitignore
new file mode 100644
index 000000000000..427ae0273fdd
--- /dev/null
+++ b/tools/lkl/lib/.gitignore
@@ -0,0 +1,3 @@
+lkl.o
+liblkl.a
+
diff --git a/tools/lkl/lib/Build b/tools/lkl/lib/Build
new file mode 100644
index 000000000000..8b137891791f
--- /dev/null
+++ b/tools/lkl/lib/Build
@@ -0,0 +1 @@
+
-- 
2.20.1 (Apple Git-117)



* [RFC v2 15/37] lkl tools: host lib: add utilities functions
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Conrad Meyer, Hajime Tazaki, Patrick Collins, Yuan Liu,
	Motomu Utsumi

From: Octavian Purdila <tavi.purdila@gmail.com>

Add basic utility functions for getting a string from a kernel error
code, and an fprintf-like function that uses the host print
operation. The latter is useful for informing the user about errors
that occur in the host library.
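
As a rough illustration of the fprintf-like helper described above, the
sketch below formats into a temporary buffer and hands the result to a
host-supplied print hook. It is not the code from this patch: the names
host_ops_sketch, capture_print and sketch_printf are hypothetical, and
only the idea of routing output through a host print operation is taken
from the description.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the host operations table; only a print
 * hook is modelled here.
 */
struct host_ops_sketch {
	void (*print)(const char *str, int len);
};

static char captured[256];

/* Example hook: appends to a buffer so the output can be inspected. */
static void capture_print(const char *str, int len)
{
	size_t used = strlen(captured);

	if (used + (size_t)len < sizeof(captured)) {
		memcpy(captured + used, str, (size_t)len);
		captured[used + len] = '\0';
	}
}

static struct host_ops_sketch host_ops = { .print = capture_print };

/* printf-like helper: measure, format into a heap buffer, call the hook. */
static int sketch_printf(const char *fmt, ...)
{
	va_list ap;
	char *buf;
	int n;

	assert(fmt != NULL);
	if (!host_ops.print)
		return 0;

	va_start(ap, fmt);
	n = vsnprintf(NULL, 0, fmt, ap);	/* length without the NUL */
	va_end(ap);
	if (n < 0)
		return n;

	buf = malloc((size_t)n + 1);
	if (!buf)
		return -1;

	va_start(ap, fmt);
	vsnprintf(buf, (size_t)n + 1, fmt, ap);
	va_end(ap);

	host_ops.print(buf, n);
	free(buf);
	return n;
}
```

Going through a hook rather than calling printf() directly lets a host
without stdio (or with its own logging channel) still receive library
error messages.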

Other configuration and debug utilities are also added.
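
One of the configuration helpers turns a "aa:bb:cc:dd:ee:ff" string
into six octets, returning 1 when an address was parsed, 0 when no
string was given and -1 when it is malformed. The sketch below follows
that return convention but is an alternative, non-destructive take
based on strtoul() rather than the strtok_r() loop in config.c; the
names sketch_parse_mac and SKETCH_ETH_ALEN are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_ETH_ALEN 6	/* assumed to match LKL_ETH_ALEN */

/* Returns 1 on success, 0 if mac_str is NULL, -1 if malformed. */
static int sketch_parse_mac(const char *mac_str,
			    unsigned char mac[SKETCH_ETH_ALEN])
{
	const char *p = mac_str;
	char *end;
	unsigned long v;
	int i;

	assert(mac != NULL);
	if (!mac_str)
		return 0;

	for (i = 0; i < SKETCH_ETH_ALEN; i++) {
		v = strtoul(p, &end, 16);
		if (end == p || v > 0xff)
			return -1;	/* empty or out-of-range octet */
		mac[i] = (unsigned char)v;

		if (i < SKETCH_ETH_ALEN - 1) {
			if (*end != ':')
				return -1;	/* too short, bad separator */
			p = end + 1;
		} else if (*end != '\0') {
			return -1;	/* trailing data: too long */
		}
	}
	return 1;
}
```

Unlike a strtok_r()-based loop, this leaves the input string untouched,
so the same configuration value can be printed or parsed again later.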

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/include/lkl_config.h |  61 +++
 tools/lkl/include/lkl_host.h   |   7 +
 tools/lkl/lib/Build            |   7 +
 tools/lkl/lib/config.c         | 793 +++++++++++++++++++++++++++++++++
 tools/lkl/lib/dbg.c            | 300 +++++++++++++
 tools/lkl/lib/dbg_handler.c    |  44 ++
 tools/lkl/lib/endian.h         |  31 ++
 tools/lkl/lib/jmp_buf.c        |  14 +
 tools/lkl/lib/jmp_buf.h        |   8 +
 tools/lkl/lib/utils.c          | 266 +++++++++++
 10 files changed, 1531 insertions(+)
 create mode 100644 tools/lkl/include/lkl_config.h
 create mode 100644 tools/lkl/lib/config.c
 create mode 100644 tools/lkl/lib/dbg.c
 create mode 100644 tools/lkl/lib/dbg_handler.c
 create mode 100644 tools/lkl/lib/endian.h
 create mode 100644 tools/lkl/lib/jmp_buf.c
 create mode 100644 tools/lkl/lib/jmp_buf.h
 create mode 100644 tools/lkl/lib/utils.c

diff --git a/tools/lkl/include/lkl_config.h b/tools/lkl/include/lkl_config.h
new file mode 100644
index 000000000000..d3edf8b414cf
--- /dev/null
+++ b/tools/lkl/include/lkl_config.h
@@ -0,0 +1,61 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_LIB_CONFIG_H
+#define _LKL_LIB_CONFIG_H
+
+#define LKL_CONFIG_JSON_TOKEN_MAX 300
+
+struct lkl_config_iface {
+	struct lkl_config_iface *next;
+	struct lkl_netdev *nd;
+
+	/* OBSOLETE: should use IFTYPE and IFPARAMS */
+	char *iftap;
+	char *iftype;
+	char *ifparams;
+	char *ifmtu_str;
+	char *ifip;
+	char *ifipv6;
+	char *ifgateway;
+	char *ifgateway6;
+	char *ifmac_str;
+	char *ifnetmask_len;
+	char *ifnetmask6_len;
+	char *ifoffload_str;
+	char *ifneigh_entries;
+	char *ifqdisc_entries;
+};
+
+struct lkl_config {
+	int ifnum;
+	struct lkl_config_iface *ifaces;
+
+	char *gateway;
+	char *gateway6;
+	char *debug;
+	char *mount;
+	/* single_cpu mode:
+	 * 0: Don't pin to a single CPU (default).
+	 * 1: Pin only LKL kernel threads to a single CPU.
+	 * 2: Pin all LKL threads to a single CPU, including all LKL kernel
+	 * threads and device polling threads. Avoid this mode if any thread
+	 * busy-polls.
+	 *
+	 * mode 2 can achieve better TCP_RR but worse TCP_STREAM than mode 1.
+	 * You should choose the best for your application and virtio device
+	 * type.
+	 */
+	char *single_cpu;
+	char *sysctls;
+	char *boot_cmdline;
+	char *dump;
+	char *delay_main;
+};
+
+int lkl_load_config_json(struct lkl_config *cfg, char *jstr);
+int lkl_load_config_env(struct lkl_config *cfg);
+void lkl_show_config(struct lkl_config *cfg);
+int lkl_load_config_pre(struct lkl_config *cfg);
+int lkl_load_config_post(struct lkl_config *cfg);
+int lkl_unload_config(struct lkl_config *cfg);
+
+#endif /* _LKL_LIB_CONFIG_H */
diff --git a/tools/lkl/include/lkl_host.h b/tools/lkl/include/lkl_host.h
index b5f96096fe69..85e80eb4ad0d 100644
--- a/tools/lkl/include/lkl_host.h
+++ b/tools/lkl/include/lkl_host.h
@@ -11,6 +11,13 @@ extern "C" {
 
 extern struct lkl_host_operations lkl_host_ops;
 
+/**
+ * lkl_printf - print a message via the host print operation
+ *
+ * @fmt: printf like format string
+ */
+int lkl_printf(const char *fmt, ...);
+
 
 #ifdef __cplusplus
 }
diff --git a/tools/lkl/lib/Build b/tools/lkl/lib/Build
index 8b137891791f..658bfa865b9c 100644
--- a/tools/lkl/lib/Build
+++ b/tools/lkl/lib/Build
@@ -1 +1,8 @@
+CFLAGS_config.o += -I$(srctree)/tools/perf/pmu-events
 
+liblkl-y += jmp_buf.o
+liblkl-y += utils.o
+liblkl-y += dbg.o
+liblkl-y += dbg_handler.o
+liblkl-y += ../../perf/pmu-events/jsmn.o
+liblkl-y += config.o
diff --git a/tools/lkl/lib/config.c b/tools/lkl/lib/config.c
new file mode 100644
index 000000000000..0e77a997348a
--- /dev/null
+++ b/tools/lkl/lib/config.c
@@ -0,0 +1,793 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdlib.h>
+#define _HAVE_STRING_ARCH_strtok_r
+#include <string.h>
+#include <lkl_host.h>
+#include <lkl_config.h>
+
+#include "jsmn.h"
+
+static int jsoneq(const char *json, jsmntok_t *tok, const char *s)
+{
+	if (tok->type == JSMN_STRING &&
+		(int) strlen(s) == tok->end - tok->start &&
+		strncmp(json + tok->start, s, tok->end - tok->start) == 0) {
+		return 0;
+	}
+	return -1;
+}
+
+static int cfgcpy(char **to, char *from)
+{
+	if (!from)
+		return 0;
+	if (*to)
+		free(*to);
+	*to = (char *)malloc((strlen(from) + 1) * sizeof(char));
+	if (*to == NULL) {
+		lkl_printf("malloc failed\n");
+		return -1;
+	}
+	strcpy(*to, from);
+	return 0;
+}
+
+static int cfgncpy(char **to, char *from, int len)
+{
+	if (!from)
+		return 0;
+	if (*to)
+		free(*to);
+	*to = (char *)malloc((len + 1) * sizeof(char));
+	if (*to == NULL) {
+		lkl_printf("malloc failed\n");
+		return -1;
+	}
+	strncpy(*to, from, len + 1);
+	(*to)[len] = '\0';
+	return 0;
+}
+
+static int parse_ifarr(struct lkl_config *cfg,
+		jsmntok_t *toks, char *jstr, int startpos)
+{
+	int ifidx, pos, posend, ret;
+	char **cfgptr;
+	struct lkl_config_iface *iface, *prev = NULL;
+
+	if (!cfg || !toks || !jstr)
+		return -1;
+	pos = startpos;
+	pos++;
+	if (toks[pos].type != JSMN_ARRAY) {
+		lkl_printf("unexpected json type, json array expected\n");
+		return -1;
+	}
+
+	cfg->ifnum = toks[pos].size;
+	pos++;
+	iface = cfg->ifaces;
+
+	for (ifidx = 0; ifidx < cfg->ifnum; ifidx++) {
+		if (toks[pos].type != JSMN_OBJECT) {
+			lkl_printf("object json type expected\n");
+			return -1;
+		}
+
+		posend = pos + toks[pos].size;
+		pos++;
+		iface = malloc(sizeof(struct lkl_config_iface));
+		memset(iface, 0, sizeof(struct lkl_config_iface));
+
+		if (prev)
+			prev->next = iface;
+		else
+			cfg->ifaces = iface;
+		prev = iface;
+
+		for (; pos < posend; pos += 2) {
+			if (toks[pos].type != JSMN_STRING) {
+				lkl_printf("string json type expected\n");
+				return -1;
+			}
+			if (jsoneq(jstr, &toks[pos], "type") == 0) {
+				cfgptr = &iface->iftype;
+			} else if (jsoneq(jstr, &toks[pos], "param") == 0) {
+				cfgptr = &iface->ifparams;
+			} else if (jsoneq(jstr, &toks[pos], "mtu") == 0) {
+				cfgptr = &iface->ifmtu_str;
+			} else if (jsoneq(jstr, &toks[pos], "ip") == 0) {
+				cfgptr = &iface->ifip;
+			} else if (jsoneq(jstr, &toks[pos], "ipv6") == 0) {
+				cfgptr = &iface->ifipv6;
+			} else if (jsoneq(jstr, &toks[pos], "ifgateway") == 0) {
+				cfgptr = &iface->ifgateway;
+			} else if (jsoneq(jstr, &toks[pos],
+							"ifgateway6") == 0) {
+				cfgptr = &iface->ifgateway6;
+			} else if (jsoneq(jstr, &toks[pos], "mac") == 0) {
+				cfgptr = &iface->ifmac_str;
+			} else if (jsoneq(jstr, &toks[pos], "masklen") == 0) {
+				cfgptr = &iface->ifnetmask_len;
+			} else if (jsoneq(jstr, &toks[pos], "masklen6") == 0) {
+				cfgptr = &iface->ifnetmask6_len;
+			} else if (jsoneq(jstr, &toks[pos], "neigh") == 0) {
+				cfgptr = &iface->ifneigh_entries;
+			} else if (jsoneq(jstr, &toks[pos], "qdisc") == 0) {
+				cfgptr = &iface->ifqdisc_entries;
+			} else if (jsoneq(jstr, &toks[pos], "offload") == 0) {
+				cfgptr = &iface->ifoffload_str;
+			} else {
+				lkl_printf("unexpected key: %.*s\n",
+						toks[pos].end-toks[pos].start,
+						jstr + toks[pos].start);
+				return -1;
+			}
+			ret = cfgncpy(cfgptr, jstr + toks[pos+1].start,
+					toks[pos+1].end-toks[pos+1].start);
+			if (ret < 0)
+				return ret;
+		}
+	}
+	return pos - startpos;
+}
+
+#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
+
+int lkl_load_config_json(struct lkl_config *cfg, char *jstr)
+{
+	int pos, ret;
+	char **cfgptr;
+	jsmn_parser jp;
+	jsmntok_t toks[LKL_CONFIG_JSON_TOKEN_MAX];
+
+	if (!cfg || !jstr)
+		return -1;
+	jsmn_init(&jp);
+	ret = jsmn_parse(&jp, jstr, strlen(jstr), toks, ARRAY_SIZE(toks));
+	if (ret != JSMN_SUCCESS) {
+		lkl_printf("failed to parse json\n");
+		return -1;
+	}
+	if (toks[0].type != JSMN_OBJECT) {
+		lkl_printf("object json type expected\n");
+		return -1;
+	}
+	for (pos = 1; pos < jp.toknext; pos++) {
+		if (toks[pos].type != JSMN_STRING) {
+			lkl_printf("string json type expected\n");
+			return -1;
+		}
+		if (jsoneq(jstr, &toks[pos], "interfaces") == 0) {
+			ret = parse_ifarr(cfg, toks, jstr, pos);
+			if (ret < 0)
+				return ret;
+			pos += ret;
+			pos--;
+			continue;
+		}
+		if (jsoneq(jstr, &toks[pos], "gateway") == 0) {
+			cfgptr = &cfg->gateway;
+		} else if (jsoneq(jstr, &toks[pos], "gateway6") == 0) {
+			cfgptr = &cfg->gateway6;
+		} else if (jsoneq(jstr, &toks[pos], "debug") == 0) {
+			cfgptr = &cfg->debug;
+		} else if (jsoneq(jstr, &toks[pos], "mount") == 0) {
+			cfgptr = &cfg->mount;
+		} else if (jsoneq(jstr, &toks[pos], "singlecpu") == 0) {
+			cfgptr = &cfg->single_cpu;
+		} else if (jsoneq(jstr, &toks[pos], "sysctl") == 0) {
+			cfgptr = &cfg->sysctls;
+		} else if (jsoneq(jstr, &toks[pos], "boot_cmdline") == 0) {
+			cfgptr = &cfg->boot_cmdline;
+		} else if (jsoneq(jstr, &toks[pos], "dump") == 0) {
+			cfgptr = &cfg->dump;
+		} else if (jsoneq(jstr, &toks[pos], "delay_main") == 0) {
+			cfgptr = &cfg->delay_main;
+		} else {
+			lkl_printf("unexpected key in json %.*s\n",
+					toks[pos].end-toks[pos].start,
+					jstr + toks[pos].start);
+			return -1;
+		}
+		pos++;
+		ret = cfgncpy(cfgptr, jstr + toks[pos].start,
+				toks[pos].end-toks[pos].start);
+		if (ret < 0)
+			return ret;
+	}
+	return 0;
+}
+
+void lkl_show_config(struct lkl_config *cfg)
+{
+	struct lkl_config_iface *iface;
+	int i = 0;
+
+	if (!cfg)
+		return;
+	lkl_printf("gateway: %s\n", cfg->gateway);
+	lkl_printf("gateway6: %s\n", cfg->gateway6);
+	lkl_printf("debug: %s\n", cfg->debug);
+	lkl_printf("mount: %s\n", cfg->mount);
+	lkl_printf("singlecpu: %s\n", cfg->single_cpu);
+	lkl_printf("sysctl: %s\n", cfg->sysctls);
+	lkl_printf("cmdline: %s\n", cfg->boot_cmdline);
+	lkl_printf("dump: %s\n", cfg->dump);
+	lkl_printf("delay: %s\n", cfg->delay_main);
+
+	for (iface = cfg->ifaces; iface; iface = iface->next, i++) {
+		lkl_printf("ifmac[%d] = %s\n", i, iface->ifmac_str);
+		lkl_printf("ifmtu[%d] = %s\n", i, iface->ifmtu_str);
+		lkl_printf("iftype[%d] = %s\n", i, iface->iftype);
+		lkl_printf("ifparam[%d] = %s\n", i, iface->ifparams);
+		lkl_printf("ifip[%d] = %s\n", i, iface->ifip);
+		lkl_printf("ifmasklen[%d] = %s\n", i, iface->ifnetmask_len);
+		lkl_printf("ifgateway[%d] = %s\n", i, iface->ifgateway);
+		lkl_printf("ifip6[%d] = %s\n", i, iface->ifipv6);
+		lkl_printf("ifmasklen6[%d] = %s\n", i, iface->ifnetmask6_len);
+		lkl_printf("ifgateway6[%d] = %s\n", i, iface->ifgateway6);
+		lkl_printf("ifoffload[%d] = %s\n", i, iface->ifoffload_str);
+		lkl_printf("ifneigh[%d] = %s\n", i, iface->ifneigh_entries);
+		lkl_printf("ifqdisc[%d] = %s\n", i, iface->ifqdisc_entries);
+	}
+}
+
+int lkl_load_config_env(struct lkl_config *cfg)
+{
+	int ret;
+	char *envtap = getenv("LKL_HIJACK_NET_TAP");
+	char *enviftype = getenv("LKL_HIJACK_NET_IFTYPE");
+	char *envifparams = getenv("LKL_HIJACK_NET_IFPARAMS");
+	char *envmtu_str = getenv("LKL_HIJACK_NET_MTU");
+	char *envip = getenv("LKL_HIJACK_NET_IP");
+	char *envipv6 = getenv("LKL_HIJACK_NET_IPV6");
+	char *envifgateway = getenv("LKL_HIJACK_NET_IFGATEWAY");
+	char *envifgateway6 = getenv("LKL_HIJACK_NET_IFGATEWAY6");
+	char *envmac_str = getenv("LKL_HIJACK_NET_MAC");
+	char *envnetmask_len = getenv("LKL_HIJACK_NET_NETMASK_LEN");
+	char *envnetmask6_len = getenv("LKL_HIJACK_NET_NETMASK6_LEN");
+	char *envgateway = getenv("LKL_HIJACK_NET_GATEWAY");
+	char *envgateway6 = getenv("LKL_HIJACK_NET_GATEWAY6");
+	char *envdebug = getenv("LKL_HIJACK_DEBUG");
+	char *envmount = getenv("LKL_HIJACK_MOUNT");
+	char *envneigh_entries = getenv("LKL_HIJACK_NET_NEIGHBOR");
+	char *envqdisc_entries = getenv("LKL_HIJACK_NET_QDISC");
+	char *envsingle_cpu = getenv("LKL_HIJACK_SINGLE_CPU");
+	char *envoffload_str = getenv("LKL_HIJACK_OFFLOAD");
+	char *envsysctls = getenv("LKL_HIJACK_SYSCTL");
+	char *envboot_cmdline = getenv("LKL_HIJACK_BOOT_CMDLINE") ? : "";
+	char *envdump = getenv("LKL_HIJACK_DUMP");
+	struct lkl_config_iface *iface;
+
+	if (!cfg)
+		return -1;
+	if (envtap || enviftype)
+		cfg->ifnum = 1;
+
+	iface = malloc(sizeof(struct lkl_config_iface));
+	memset(iface, 0, sizeof(struct lkl_config_iface));
+
+	ret = cfgcpy(&iface->iftap, envtap);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->iftype, enviftype);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifparams, envifparams);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifmtu_str, envmtu_str);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifip, envip);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifipv6, envipv6);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifgateway, envifgateway);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifgateway6, envifgateway6);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifmac_str, envmac_str);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifnetmask_len, envnetmask_len);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifnetmask6_len, envnetmask6_len);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifoffload_str, envoffload_str);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifneigh_entries, envneigh_entries);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifqdisc_entries, envqdisc_entries);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->gateway, envgateway);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->gateway6, envgateway6);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->debug, envdebug);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->mount, envmount);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->single_cpu, envsingle_cpu);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->sysctls, envsysctls);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->boot_cmdline, envboot_cmdline);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->dump, envdump);
+	if (ret < 0)
+		return ret;
+	return 0;
+}
+
+static int parse_mac_str(char *mac_str, __lkl__u8 mac[LKL_ETH_ALEN])
+{
+	char delim[] = ":";
+	char *saveptr = NULL, *token = NULL;
+	int i = 0;
+
+	if (!mac_str)
+		return 0;
+
+	for (token = strtok_r(mac_str, delim, &saveptr);
+	     i < LKL_ETH_ALEN; i++) {
+		if (!token) {
+			/* The address is too short */
+			return -1;
+		}
+
+		mac[i] = (__lkl__u8) strtol(token, NULL, 16);
+		token = strtok_r(NULL, delim, &saveptr);
+	}
+
+	if (strtok_r(NULL, delim, &saveptr)) {
+		/* The address is too long */
+		return -1;
+	}
+
+	return 1;
+}
+
+/* Add permanent neighbor entries in the form of "ip|mac;ip|mac;..." */
+static void add_neighbor(int ifindex, char *entries)
+{
+	char *saveptr = NULL, *token = NULL;
+	char *ip = NULL, *mac_str = NULL;
+	int ret = 0;
+	__lkl__u8 mac[LKL_ETH_ALEN];
+	char ip_addr[16];
+	int af;
+
+	for (token = strtok_r(entries, ";", &saveptr); token;
+	     token = strtok_r(NULL, ";", &saveptr)) {
+		ip = strtok(token, "|");
+		mac_str = strtok(NULL, "|");
+		if (ip == NULL || mac_str == NULL || strtok(NULL, "|") != NULL)
+			return;
+
+		af = LKL_AF_INET;
+		ret = inet_pton(LKL_AF_INET, ip, ip_addr);
+		if (ret == 0) {
+			ret = inet_pton(LKL_AF_INET6, ip, ip_addr);
+			af = LKL_AF_INET6;
+		}
+		if (ret != 1) {
+			lkl_printf("Bad ip address: %s\n", ip);
+			return;
+		}
+
+		ret = parse_mac_str(mac_str, mac);
+		if (ret != 1) {
+			lkl_printf("Failed to parse mac: %s\n", mac_str);
+			return;
+		}
+		ret = lkl_add_neighbor(ifindex, af, ip_addr, mac);
+		if (ret) {
+			lkl_printf("Failed to add neighbor entry: %s\n",
+				   lkl_strerror(ret));
+			return;
+		}
+	}
+}
+
+/* We don't have an easy way to make FILE*s out of our fds, so we
+ * can't use e.g. fgets
+ */
+static int dump_file(char *path)
+{
+	int ret = -1, bytes_read = 0;
+	char str[1024] = { 0 };
+	int fd;
+
+	fd = lkl_sys_open(path, LKL_O_RDONLY, 0);
+
+	if (fd < 0) {
+		lkl_printf("%s lkl_sys_open %s: %s\n",
+			   __func__, path, lkl_strerror(fd));
+		return -1;
+	}
+
+	/* Need to print this out in order to make sense of the output */
+	lkl_printf("Reading from %s:\n==========\n", path);
+	while ((ret = lkl_sys_read(fd, str, sizeof(str) - 1)) > 0)
+		bytes_read += lkl_printf("%s", str);
+	lkl_printf("==========\n");
+
+	if (ret) {
+		lkl_printf("%s lkl_sys_read %s: %s\n",
+			   __func__, path, lkl_strerror(ret));
+		return -1;
+	}
+
+	return 0;
+}
+
+static void mount_cmds_exec(char *_cmds, int (*callback)(char *))
+{
+	char *saveptr = NULL, *token;
+	int ret = 0;
+	char *cmds = strdup(_cmds);
+
+	token = strtok_r(cmds, ",", &saveptr);
+
+	while (token && ret >= 0) {
+		ret = callback(token);
+		token = strtok_r(NULL, ",", &saveptr);
+	}
+
+	if (ret < 0)
+		lkl_printf("%s: failed parsing %s\n", __func__, _cmds);
+
+	free(cmds);
+}
+
+static int lkl_config_netdev_create(struct lkl_config *cfg,
+				    struct lkl_config_iface *iface)
+{
+	int ret, offload = 0;
+	struct lkl_netdev_args nd_args;
+	__lkl__u8 mac[LKL_ETH_ALEN] = {0};
+	struct lkl_netdev *nd = NULL;
+
+	if (iface->ifoffload_str)
+		offload = strtol(iface->ifoffload_str, NULL, 0);
+	memset(&nd_args, 0, sizeof(struct lkl_netdev_args));
+
+	if (iface->iftap) {
+		lkl_printf("WARN: LKL_HIJACK_NET_TAP is now obsolete.\n");
+		lkl_printf("use LKL_HIJACK_NET_IFTYPE and PARAMS\n");
+		nd = lkl_netdev_tap_create(iface->iftap, offload);
+	}
+
+	if (!nd && iface->iftype && iface->ifparams) {
+		if ((strcmp(iface->iftype, "tap") == 0)) {
+			nd = lkl_netdev_tap_create(iface->ifparams, offload);
+		} else if ((strcmp(iface->iftype, "macvtap") == 0)) {
+			nd = lkl_netdev_macvtap_create(iface->ifparams,
+						       offload);
+		} else if ((strcmp(iface->iftype, "dpdk") == 0)) {
+			nd = lkl_netdev_dpdk_create(iface->ifparams, offload,
+						    mac);
+		} else if ((strcmp(iface->iftype, "pipe") == 0)) {
+			nd = lkl_netdev_pipe_create(iface->ifparams, offload);
+		} else {
+			if (offload) {
+				lkl_printf("WARN: %s isn't supported on %s\n",
+					   "LKL_HIJACK_OFFLOAD",
+					   iface->iftype);
+				lkl_printf(
+					"WARN: Disabling offload features.\n");
+			}
+			offload = 0;
+		}
+		if (strcmp(iface->iftype, "vde") == 0)
+			nd = lkl_netdev_vde_create(iface->ifparams);
+		if (strcmp(iface->iftype, "raw") == 0)
+			nd = lkl_netdev_raw_create(iface->ifparams);
+	}
+
+	if (nd) {
+		if ((mac[0] != 0) || (mac[1] != 0) ||
+				(mac[2] != 0) || (mac[3] != 0) ||
+				(mac[4] != 0) || (mac[5] != 0)) {
+			nd_args.mac = mac;
+		} else {
+			ret = parse_mac_str(iface->ifmac_str, mac);
+
+			if (ret < 0) {
+				lkl_printf("failed to parse mac\n");
+				return -1;
+			} else if (ret > 0) {
+				nd_args.mac = mac;
+			} else {
+				nd_args.mac = NULL;
+			}
+		}
+
+		nd_args.offload = offload;
+		ret = lkl_netdev_add(nd, &nd_args);
+		if (ret < 0) {
+			lkl_printf("failed to add netdev: %s\n",
+				   lkl_strerror(ret));
+			return -1;
+		}
+		nd->id = ret;
+		iface->nd = nd;
+	}
+	return 0;
+}
+
+static int lkl_config_netdev_configure(struct lkl_config *cfg,
+				       struct lkl_config_iface *iface)
+{
+	int ret, nd_ifindex = -1;
+	struct lkl_netdev *nd = iface->nd;
+
+	if (!nd) {
+		lkl_printf("no netdev available %s\n", iface ? iface->ifparams
+			   : "(null)");
+		return -1;
+	}
+
+	if (nd->id >= 0) {
+		nd_ifindex = lkl_netdev_get_ifindex(nd->id);
+		if (nd_ifindex > 0)
+			lkl_if_up(nd_ifindex);
+		else
+			lkl_printf(
+				"failed to get ifindex for netdev id %d: %s\n",
+				nd->id, lkl_strerror(nd_ifindex));
+	}
+
+	if (nd_ifindex >= 0 && iface->ifmtu_str) {
+		int mtu = atoi(iface->ifmtu_str);
+
+		ret = lkl_if_set_mtu(nd_ifindex, mtu);
+		if (ret < 0)
+			lkl_printf("failed to set MTU: %s\n",
+				   lkl_strerror(ret));
+	}
+
+	if (nd_ifindex >= 0 && iface->ifip && iface->ifnetmask_len) {
+		unsigned int addr;
+
+		if (inet_pton(LKL_AF_INET, iface->ifip,
+			      (struct lkl_in_addr *)&addr) != 1)
+			lkl_printf("Invalid ipv4 address: %s\n", iface->ifip);
+
+		int nmlen = atoi(iface->ifnetmask_len);
+
+		if (addr != LKL_INADDR_NONE && nmlen > 0 && nmlen < 32) {
+			ret = lkl_if_set_ipv4(nd_ifindex, addr, nmlen);
+			if (ret < 0)
+				lkl_printf("failed to set IPv4 address: %s\n",
+					   lkl_strerror(ret));
+		}
+		if (iface->ifgateway) {
+			unsigned int gwaddr;
+
+			if (inet_pton(LKL_AF_INET, iface->ifgateway,
+				      (struct lkl_in_addr *)&gwaddr) != 1)
+				lkl_printf("Invalid ipv4 gateway: %s\n",
+					   iface->ifgateway);
+
+			if (gwaddr != LKL_INADDR_NONE) {
+				ret = lkl_if_set_ipv4_gateway(nd_ifindex,
+						addr, nmlen, gwaddr);
+				if (ret < 0)
+					lkl_printf(
+						"failed to set v4 if gw: %s\n",
+						lkl_strerror(ret));
+			}
+		}
+	}
+
+	if (nd_ifindex >= 0 && iface->ifipv6 &&
+			iface->ifnetmask6_len) {
+		struct lkl_in6_addr addr;
+		unsigned int pflen = atoi(iface->ifnetmask6_len);
+
+		if (inet_pton(LKL_AF_INET6, iface->ifipv6,
+			      (struct lkl_in6_addr *)&addr) != 1) {
+			lkl_printf("Invalid ipv6 addr: %s\n",
+				   iface->ifipv6);
+		}  else {
+			ret = lkl_if_set_ipv6(nd_ifindex, &addr, pflen);
+			if (ret < 0)
+				lkl_printf("failed to set IPv6 address: %s\n",
+					   lkl_strerror(ret));
+		}
+		if (iface->ifgateway6) {
+			char gwaddr[16];
+
+			if (inet_pton(LKL_AF_INET6, iface->ifgateway6,
+								gwaddr) != 1) {
+				lkl_printf("Invalid ipv6 gateway: %s\n",
+					   iface->ifgateway6);
+			} else {
+				ret = lkl_if_set_ipv6_gateway(nd_ifindex,
+						&addr, pflen, gwaddr);
+				if (ret < 0)
+					lkl_printf(
+						"failed to set v6 if gw: %s\n",
+						lkl_strerror(ret));
+			}
+		}
+	}
+
+	if (nd_ifindex >= 0 && iface->ifneigh_entries)
+		add_neighbor(nd_ifindex, iface->ifneigh_entries);
+
+	if (nd_ifindex >= 0 && iface->ifqdisc_entries)
+		lkl_qdisc_parse_add(nd_ifindex, iface->ifqdisc_entries);
+
+	return 0;
+}
+
+static void free_cfgparam(char *cfgparam)
+{
+	if (cfgparam)
+		free(cfgparam);
+}
+
+static int lkl_clean_config(struct lkl_config *cfg)
+{
+	struct lkl_config_iface *iface;
+
+	if (!cfg)
+		return -1;
+
+	for (iface = cfg->ifaces; iface; iface = iface->next) {
+		free_cfgparam(iface->iftap);
+		free_cfgparam(iface->iftype);
+		free_cfgparam(iface->ifparams);
+		free_cfgparam(iface->ifmtu_str);
+		free_cfgparam(iface->ifip);
+		free_cfgparam(iface->ifipv6);
+		free_cfgparam(iface->ifgateway);
+		free_cfgparam(iface->ifgateway6);
+		free_cfgparam(iface->ifmac_str);
+		free_cfgparam(iface->ifnetmask_len);
+		free_cfgparam(iface->ifnetmask6_len);
+		free_cfgparam(iface->ifoffload_str);
+		free_cfgparam(iface->ifneigh_entries);
+		free_cfgparam(iface->ifqdisc_entries);
+	}
+	free_cfgparam(cfg->gateway);
+	free_cfgparam(cfg->gateway6);
+	free_cfgparam(cfg->debug);
+	free_cfgparam(cfg->mount);
+	free_cfgparam(cfg->single_cpu);
+	free_cfgparam(cfg->sysctls);
+	free_cfgparam(cfg->boot_cmdline);
+	free_cfgparam(cfg->dump);
+	free_cfgparam(cfg->delay_main);
+	return 0;
+}
+
+
+int lkl_load_config_pre(struct lkl_config *cfg)
+{
+	int lkl_debug, ret;
+	struct lkl_config_iface *iface;
+
+	if (!cfg)
+		return 0;
+
+	if (cfg->debug)
+		lkl_debug = strtol(cfg->debug, NULL, 0);
+
+	if (!cfg->debug || (lkl_debug == 0))
+		lkl_host_ops.print = NULL;
+
+	for (iface = cfg->ifaces; iface; iface = iface->next) {
+		ret = lkl_config_netdev_create(cfg, iface);
+		if (ret < 0)
+			return -1;
+	}
+
+	return 0;
+}
+
+int lkl_load_config_post(struct lkl_config *cfg)
+{
+	int ret;
+	struct lkl_config_iface *iface;
+
+	if (!cfg)
+		return 0;
+
+	if (cfg->mount)
+		mount_cmds_exec(cfg->mount, lkl_mount_fs);
+
+	for (iface = cfg->ifaces; iface; iface = iface->next) {
+		ret = lkl_config_netdev_configure(cfg, iface);
+		if (ret < 0)
+			break;
+	}
+
+	if (cfg->gateway) {
+		unsigned int gwaddr = LKL_INADDR_NONE;
+
+		if (inet_pton(LKL_AF_INET, cfg->gateway,
+			      (struct lkl_in_addr *)&gwaddr) != 1)
+			lkl_printf("Invalid ipv4 gateway: %s\n", cfg->gateway);
+
+		if (gwaddr != LKL_INADDR_NONE) {
+			ret = lkl_set_ipv4_gateway(gwaddr);
+			if (ret < 0)
+				lkl_printf("failed to set IPv4 gateway: %s\n",
+					   lkl_strerror(ret));
+		}
+	}
+
+	if (cfg->gateway6) {
+		char gw[16];
+
+		if (inet_pton(LKL_AF_INET6, cfg->gateway6, gw) != 1) {
+			lkl_printf("Invalid ipv6 gateway: %s\n", cfg->gateway6);
+		} else {
+			ret = lkl_set_ipv6_gateway(gw);
+			if (ret < 0)
+				lkl_printf("failed to set IPv6 gateway: %s\n",
+					   lkl_strerror(ret));
+		}
+	}
+
+	if (cfg->sysctls)
+		lkl_sysctl_parse_write(cfg->sysctls);
+
+	/* put a delay before calling main() */
+	if (cfg->delay_main) {
+		unsigned long delay = strtoul(cfg->delay_main, NULL, 10);
+
+		if (delay == ~0UL)
+			lkl_printf("got invalid delay_main value (%s)\n",
+				   cfg->delay_main);
+		else {
+			lkl_printf("sleeping %lu usec\n", delay);
+			usleep(delay);
+		}
+	}
+
+	return 0;
+}
+
+int lkl_unload_config(struct lkl_config *cfg)
+{
+	struct lkl_config_iface *iface;
+
+	if (cfg) {
+		if (cfg->dump)
+			mount_cmds_exec(cfg->dump, dump_file);
+
+		for (iface = cfg->ifaces; iface; iface = iface->next) {
+			if (iface->nd) {
+				if (iface->nd->id >= 0)
+					lkl_netdev_remove(iface->nd->id);
+				lkl_netdev_free(iface->nd);
+			}
+		}
+
+		lkl_clean_config(cfg);
+	}
+
+	return 0;
+}
diff --git a/tools/lkl/lib/dbg.c b/tools/lkl/lib/dbg.c
new file mode 100644
index 000000000000..b613353bce5c
--- /dev/null
+++ b/tools/lkl/lib/dbg.c
@@ -0,0 +1,300 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <errno.h>
+#include <lkl.h>
+#include <limits.h>
+#include <string.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+static const char *PROMOTE = "$";
+#define str(x) #x
+#define xstr(s) str(s)
+#define MAX_BUF 100
+static char cmd[MAX_BUF];
+static char argv[10][MAX_BUF];
+static int argc;
+static char cur_dir[MAX_BUF] = "/";
+
+static char *normalize_path(const char *src, size_t src_len)
+{
+	char *res;
+	unsigned int res_len;
+	const char *ptr = src;
+	const char *end = &src[src_len];
+	const char *next;
+
+	res = malloc(src_len + 2);	/* worst case: added leading '/' + NUL */
+	res_len = 0;
+
+	for (ptr = src; ptr < end; ptr = next+1) {
+		size_t len;
+
+		next = memchr(ptr, '/', end-ptr);
+		if (next == NULL)
+			next = end;
+
+		len = next-ptr;
+		switch (len) {
+		case 2:
+			if (ptr[0] == '.' && ptr[1] == '.') {
+				/* res is not NUL-terminated yet; scan by length */
+				while (res_len > 0 &&
+				       res[--res_len] != '/')
+					;
+				continue;
+			}
+			break;
+		case 1:
+			if (ptr[0] == '.')
+				continue;
+			break;
+		case 0:
+			continue;
+		}
+		res[res_len++] = '/';
+		memcpy(&res[res_len], ptr, len);
+		res_len += len;
+	}
+	if (res_len == 0)
+		res[res_len++] = '/';
+	res[res_len] = '\0';
+	return res;
+}
+
+static void build_path(char *path)
+{
+	char *npath;
+
+	strcpy(path, cur_dir);
+	if (argc >= 1) {
+		if (argv[0][0] == '/')
+			strncpy(path, argv[0], LKL_PATH_MAX);
+		else {
+			strncat(path, "/", LKL_PATH_MAX - strlen(path) - 1);
+			strncat(path, argv[0], LKL_PATH_MAX - strlen(path) - 1);
+		}
+	}
+	npath = normalize_path(path, strlen(path));
+	strcpy(path, npath);
+	free(npath);
+}
+
+static void help(void)
+{
+	const char *msg =
+		"cat FILE\n"
+		"\tShow content of FILE\n"
+		"cd [DIR]\n"
+		"\tChange directory to DIR\n"
+		"exit\n"
+		"\tExit the debug session\n"
+		"help\n"
+		"\tShow this message\n"
+		"ls [DIR]\n"
+		"\tList files in DIR\n"
+		"mount FSTYPE\n"
+		"\tMount FSTYPE as /FSTYPE\n"
+		"overwrite FILE\n"
+		"\tOverwrite content of FILE from stdin\n"
+		"pwd\n"
+		"\tShow current directory\n"
+		;
+	printf("%s", msg);
+}
+
+static void ls(void)
+{
+	char path[LKL_PATH_MAX];
+	struct lkl_dir *dir;
+	struct lkl_linux_dirent64 *de;
+	int err;
+
+	build_path(path);
+	dir = lkl_opendir(path, &err);
+	if (dir) {
+		do {
+			de = lkl_readdir(dir);
+			if (de) {
+				printf("%s\n", de->d_name);
+			} else {
+				err = lkl_errdir(dir);
+				if (err != 0) {
+					fprintf(stderr, "%s\n",
+						lkl_strerror(err));
+				}
+				break;
+			}
+		} while (1);
+		lkl_closedir(dir);
+	} else {
+		fprintf(stderr, "%s: %s\n", path, lkl_strerror(err));
+	}
+}
+
+static void cd(void)
+{
+	char path[LKL_PATH_MAX];
+	struct lkl_dir *dir;
+	int err;
+
+	build_path(path);
+	dir = lkl_opendir(path, &err);
+	if (dir) {
+		strcpy(cur_dir, path);
+		lkl_closedir(dir);
+	} else {
+		fprintf(stderr, "%s: %s\n", path, lkl_strerror(err));
+	}
+}
+
+static void mount(void)
+{
+	char *fstype;
+	int ret = 0;
+
+	if (argc != 1) {
+		fprintf(stderr, "%s\n", "One argument is needed.");
+		return;
+	}
+
+	fstype = argv[0];
+	ret = lkl_mount_fs(fstype);
+	if (ret == 1)
+		fprintf(stderr, "%s is already mounted.\n", fstype);
+}
+
+static void cat(void)
+{
+	char path[LKL_PATH_MAX];
+	int ret;
+	char buf[1024];
+	int fd;
+
+	if (argc != 1) {
+		fprintf(stderr, "%s\n", "One argument is needed.");
+		return;
+	}
+
+	build_path(path);
+	fd = lkl_sys_open(path, LKL_O_RDONLY, 0);
+
+	if (fd < 0) {
+		fprintf(stderr, "lkl_sys_open %s: %s\n",
+			path, lkl_strerror(fd));
+		return;
+	}
+
+	while ((ret = lkl_sys_read(fd, buf, sizeof(buf) - 1)) > 0) {
+		buf[ret] = '\0';
+		printf("%s", buf);
+	}
+
+	if (ret) {
+		fprintf(stderr, "lkl_sys_read %s: %s\n",
+			path, lkl_strerror(ret));
+	}
+	lkl_sys_close(fd);
+}
+
+static void overwrite(void)
+{
+	char path[LKL_PATH_MAX];
+	int ret;
+	int fd;
+	char buf[1024];
+
+	build_path(path);
+	fd = lkl_sys_open(path, LKL_O_WRONLY | LKL_O_CREAT, 0);
+	if (fd < 0) {
+		fprintf(stderr, "lkl_sys_open %s: %s\n",
+			path, lkl_strerror(fd));
+		return;
+	}
+	printf("Input the content and stop by hitting Ctrl-D:\n");
+	while (fgets(buf, 1023, stdin)) {
+		ret = lkl_sys_write(fd, buf, strlen(buf));
+		if (ret < 0) {
+			fprintf(stderr, "lkl_sys_write %s: %s\n",
+				path, lkl_strerror(ret));
+		}
+	}
+	lkl_sys_close(fd);
+}
+
+static void pwd(void)
+{
+	printf("%s\n", cur_dir);
+}
+
+static int parse_cmd(char *input)
+{
+	char *token;
+
+	token = strtok(input, " ");
+	if (token)
+		strcpy(cmd, token);
+	else
+		return -1;
+
+	argc = 0;
+	token = strtok(NULL, " ");
+	while (token) {
+		if (argc >= 10) {
+			fprintf(stderr, "Too many args > 10\n");
+			return -1;
+		}
+		strcpy(argv[argc++], token);
+		token = strtok(NULL, " ");
+	}
+	return 0;
+}
+
+static void run_cmd(void)
+{
+	if (strcmp(cmd, "cat") == 0)
+		cat();
+	else if (strcmp(cmd, "cd") == 0)
+		cd();
+	else if (strcmp(cmd, "help") == 0)
+		help();
+	else if (strcmp(cmd, "ls") == 0)
+		ls();
+	else if (strcmp(cmd, "mount") == 0)
+		mount();
+	else if (strcmp(cmd, "overwrite") == 0)
+		overwrite();
+	else if (strcmp(cmd, "pwd") == 0)
+		pwd();
+	else
+		fprintf(stderr, "Unknown command: %s\n", cmd);
+}
+
+void dbg_entrance(void)
+{
+	char input[MAX_BUF + 1];
+	int ret;
+	int c;
+
+	printf("Type help to see a list of commands\n");
+	do {
+		printf("%s ", PROMOTE);
+		ret = scanf("%" xstr(MAX_BUF) "[^\n]s", input);
+		while ((c = getchar()) != '\n' && c != EOF)
+			;
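+		if (ret == 0 || (ret < 0 && errno == EINTR && !feof(stdin)))
+			continue;	/* empty line or interrupted read */
+		if (ret != 1) {
+			perror("scanf");
+			break;	/* EOF or read error: leave the shell */
+		}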
+		if (strlen(input) == MAX_BUF) {
+			fprintf(stderr, "Too long input > %d\n", MAX_BUF - 1);
+			continue;
+		}
+		if (parse_cmd(input))
+			continue;
+		if (strcmp(cmd, "exit") == 0)
+			break;
+		run_cmd();
+	} while (1);
+}
diff --git a/tools/lkl/lib/dbg_handler.c b/tools/lkl/lib/dbg_handler.c
new file mode 100644
index 000000000000..01d165a5fc1e
--- /dev/null
+++ b/tools/lkl/lib/dbg_handler.c
@@ -0,0 +1,44 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <lkl_host.h>
+
+extern void dbg_entrance(void);
+static int dbg_running;
+
+static void dbg_thread(void *arg)
+{
+	lkl_host_ops.thread_detach();
+	printf("======Enter Debug======\n");
+	dbg_entrance();
+	printf("======Exit Debug======\n");
+	dbg_running = 0;
+}
+
+void dbg_handler(int signum)
+{
+	/* We don't care about the possible race on dbg_running. */
+	if (dbg_running) {
+		fprintf(stderr, "A debug lib is running\n");
+		return;
+	}
+	dbg_running = 1;
+	lkl_host_ops.thread_create(&dbg_thread, NULL);
+}
+
+#ifndef __MINGW32__
+#include <signal.h>
+void lkl_register_dbg_handler(void)
+{
+	struct sigaction sa = { .sa_flags = 0 };
+
+	sigemptyset(&sa.sa_mask);
+	sa.sa_handler = dbg_handler;
+	if (sigaction(SIGTSTP, &sa, NULL) == -1)
+		perror("sigaction");
+}
+#else
+void lkl_register_dbg_handler(void)
+{
+	fprintf(stderr, "%s is not implemented.\n", __func__);
+}
+#endif
diff --git a/tools/lkl/lib/endian.h b/tools/lkl/lib/endian.h
new file mode 100644
index 000000000000..aaccfa0edb65
--- /dev/null
+++ b/tools/lkl/lib/endian.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_LIB_ENDIAN_H
+#define _LKL_LIB_ENDIAN_H
+
+#if defined(__FreeBSD__)
+#include <sys/endian.h>
+#elif defined(__ANDROID__)
+#include <sys/endian.h>
+#elif defined(__MINGW32__)
+#include <winsock.h>
+#define le32toh(x) (x)
+#define le16toh(x) (x)
+#define htole32(x) (x)
+#define htole16(x) (x)
+#define le64toh(x) (x)
+#define htobe32(x) htonl(x)
+#define htobe16(x) htons(x)
+#define be32toh(x) ntohl(x)
+#define be16toh(x) ntohs(x)
+#else
+#include <endian.h>
+#endif
+
+#ifndef htonl
+#define htonl(x) htobe32(x)
+#define htons(x) htobe16(x)
+#define ntohl(x) be32toh(x)
+#define ntohs(x) be16toh(x)
+#endif
+
+#endif /* _LKL_LIB_ENDIAN_H */
diff --git a/tools/lkl/lib/jmp_buf.c b/tools/lkl/lib/jmp_buf.c
new file mode 100644
index 000000000000..f6bdd7e4bd83
--- /dev/null
+++ b/tools/lkl/lib/jmp_buf.c
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <setjmp.h>
+#include <lkl_host.h>
+
+void jmp_buf_set(struct lkl_jmp_buf *jmpb, void (*f)(void))
+{
+	if (!setjmp(*((jmp_buf *)jmpb->buf)))
+		f();
+}
+
+void jmp_buf_longjmp(struct lkl_jmp_buf *jmpb, int val)
+{
+	longjmp(*((jmp_buf *)jmpb->buf), val);
+}
diff --git a/tools/lkl/lib/jmp_buf.h b/tools/lkl/lib/jmp_buf.h
new file mode 100644
index 000000000000..8782cbaaf51f
--- /dev/null
+++ b/tools/lkl/lib/jmp_buf.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_LIB_JMP_BUF_H
+#define _LKL_LIB_JMP_BUF_H
+
+void jmp_buf_set(struct lkl_jmp_buf *jmpb, void (*f)(void));
+void jmp_buf_longjmp(struct lkl_jmp_buf *jmpb, int val);
+
+#endif
diff --git a/tools/lkl/lib/utils.c b/tools/lkl/lib/utils.c
new file mode 100644
index 000000000000..7de92bbe5475
--- /dev/null
+++ b/tools/lkl/lib/utils.c
@@ -0,0 +1,266 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdarg.h>
+#include <stdio.h>
+#include <string.h>
+#include <lkl_host.h>
+
+static const char * const lkl_err_strings[] = {
+	"Success",
+	"Operation not permitted",
+	"No such file or directory",
+	"No such process",
+	"Interrupted system call",
+	"I/O error",
+	"No such device or address",
+	"Argument list too long",
+	"Exec format error",
+	"Bad file number",
+	"No child processes",
+	"Try again",
+	"Out of memory",
+	"Permission denied",
+	"Bad address",
+	"Block device required",
+	"Device or resource busy",
+	"File exists",
+	"Cross-device link",
+	"No such device",
+	"Not a directory",
+	"Is a directory",
+	"Invalid argument",
+	"File table overflow",
+	"Too many open files",
+	"Not a typewriter",
+	"Text file busy",
+	"File too large",
+	"No space left on device",
+	"Illegal seek",
+	"Read-only file system",
+	"Too many links",
+	"Broken pipe",
+	"Math argument out of domain of func",
+	"Math result not representable",
+	"Resource deadlock would occur",
+	"File name too long",
+	"No record locks available",
+	"Invalid system call number",
+	"Directory not empty",
+	"Too many symbolic links encountered",
+	"Bad error code", /* EWOULDBLOCK is EAGAIN */
+	"No message of desired type",
+	"Identifier removed",
+	"Channel number out of range",
+	"Level 2 not synchronized",
+	"Level 3 halted",
+	"Level 3 reset",
+	"Link number out of range",
+	"Protocol driver not attached",
+	"No CSI structure available",
+	"Level 2 halted",
+	"Invalid exchange",
+	"Invalid request descriptor",
+	"Exchange full",
+	"No anode",
+	"Invalid request code",
+	"Invalid slot",
+	"Bad error code", /* EDEADLOCK is EDEADLK */
+	"Bad font file format",
+	"Device not a stream",
+	"No data available",
+	"Timer expired",
+	"Out of streams resources",
+	"Machine is not on the network",
+	"Package not installed",
+	"Object is remote",
+	"Link has been severed",
+	"Advertise error",
+	"Srmount error",
+	"Communication error on send",
+	"Protocol error",
+	"Multihop attempted",
+	"RFS specific error",
+	"Not a data message",
+	"Value too large for defined data type",
+	"Name not unique on network",
+	"File descriptor in bad state",
+	"Remote address changed",
+	"Can not access a needed shared library",
+	"Accessing a corrupted shared library",
+	".lib section in a.out corrupted",
+	"Attempting to link in too many shared libraries",
+	"Cannot exec a shared library directly",
+	"Illegal byte sequence",
+	"Interrupted system call should be restarted",
+	"Streams pipe error",
+	"Too many users",
+	"Socket operation on non-socket",
+	"Destination address required",
+	"Message too long",
+	"Protocol wrong type for socket",
+	"Protocol not available",
+	"Protocol not supported",
+	"Socket type not supported",
+	"Operation not supported on transport endpoint",
+	"Protocol family not supported",
+	"Address family not supported by protocol",
+	"Address already in use",
+	"Cannot assign requested address",
+	"Network is down",
+	"Network is unreachable",
+	"Network dropped connection because of reset",
+	"Software caused connection abort",
+	"Connection reset by peer",
+	"No buffer space available",
+	"Transport endpoint is already connected",
+	"Transport endpoint is not connected",
+	"Cannot send after transport endpoint shutdown",
+	"Too many references: cannot splice",
+	"Connection timed out",
+	"Connection refused",
+	"Host is down",
+	"No route to host",
+	"Operation already in progress",
+	"Operation now in progress",
+	"Stale file handle",
+	"Structure needs cleaning",
+	"Not a XENIX named type file",
+	"No XENIX semaphores available",
+	"Is a named type file",
+	"Remote I/O error",
+	"Quota exceeded",
+	"No medium found",
+	"Wrong medium type",
+	"Operation Canceled",
+	"Required key not available",
+	"Key has expired",
+	"Key has been revoked",
+	"Key was rejected by service",
+	"Owner died",
+	"State not recoverable",
+	"Operation not possible due to RF-kill",
+	"Memory page has hardware error",
+};
+
+const char *lkl_strerror(int err)
+{
+	if (err < 0)
+		err = -err;
+
+	if ((size_t)err >= sizeof(lkl_err_strings) / sizeof(const char *))
+		return "Bad error code";
+
+	return lkl_err_strings[err];
+}
+
+void lkl_perror(char *msg, int err)
+{
+	const char *err_msg = lkl_strerror(err);
+	/* lkl_printf() goes through lkl_host_ops.print, so this prints
+	 * nothing when the print hook has been turned off.
+	 */
+	lkl_printf("%s: %s\n", msg, err_msg);
+}
+
+static int lkl_vprintf(const char *fmt, va_list args)
+{
+	int n;
+	char *buffer;
+	va_list copy;
+
+	if (!lkl_host_ops.print)
+		return 0;
+
+	va_copy(copy, args);
+	n = vsnprintf(NULL, 0, fmt, copy);
+	va_end(copy);
+
+	buffer = lkl_host_ops.mem_alloc(n + 1);
+	if (!buffer)
+		return -1;
+
+	vsnprintf(buffer, n + 1, fmt, args);
+
+	lkl_host_ops.print(buffer, n);
+	lkl_host_ops.mem_free(buffer);
+
+	return n;
+}
+
+int lkl_printf(const char *fmt, ...)
+{
+	int n;
+	va_list args;
+
+	va_start(args, fmt);
+	n = lkl_vprintf(fmt, args);
+	va_end(args);
+
+	return n;
+}
+
+void lkl_bug(const char *fmt, ...)
+{
+	va_list args;
+
+	va_start(args, fmt);
+	lkl_vprintf(fmt, args);
+	va_end(args);
+
+	lkl_host_ops.panic();
+}
+#ifndef __arch_um__
+int lkl_sysctl(const char *path, const char *value)
+{
+	int ret;
+	int fd;
+	char *delim, *p;
+	char full_path[256];
+
+	lkl_mount_fs("proc");
+
+	snprintf(full_path, sizeof(full_path), "/proc/sys/%s", path);
+	p = full_path;
+	while ((delim = strstr(p, "."))) {
+		*delim = '/';
+		p = delim + 1;
+	}
+
+	fd = lkl_sys_open(full_path, LKL_O_WRONLY | LKL_O_CREAT, 0);
+	if (fd < 0) {
+		lkl_printf("lkl_sys_open %s: %s\n",
+			   full_path, lkl_strerror(fd));
+		return -1;
+	}
+	ret = lkl_sys_write(fd, value, strlen(value));
+	if (ret < 0) {
+		lkl_printf("lkl_sys_write %s: %s\n",
+			full_path, lkl_strerror(ret));
+	}
+
+	lkl_sys_close(fd);
+
+	return 0;
+}
+
+/* Configure sysctl parameters as the form of "key=value;key=value;..." */
+void lkl_sysctl_parse_write(const char *sysctls)
+{
+	char *saveptr = NULL, *token = NULL;
+	char *key = NULL, *value = NULL;
+	char strings[256];
+	int ret = 0;
+
+	snprintf(strings, sizeof(strings), "%s", sysctls);
+	for (token = strtok_r(strings, ";", &saveptr); token;
+	     token = strtok_r(NULL, ";", &saveptr)) {
+		key = strtok(token, "=");
+		value = strtok(NULL, "=");
+		ret = (key && value) ? lkl_sysctl(key, value) : -1;
+		if (ret) {
+			lkl_printf("Failed to configure sysctl entries: %s\n",
+				   lkl_strerror(ret));
+			return;
+		}
+	}
+}
+#endif
-- 
2.20.1 (Apple Git-117)


* [RFC v2 15/37] lkl tools: host lib: add utilities functions
@ 2019-11-08  5:02     ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: linux-arch, Conrad Meyer, Octavian Purdila, Motomu Utsumi,
	Akira Moroo, Yuan Liu, Patrick Collins, linux-kernel-library,
	Hajime Tazaki

From: Octavian Purdila <tavi.purdila@gmail.com>

Add basic utility functions: one returns a string for a kernel error
code, and a printf-like one prints through the host print
operation. The latter is useful for informing the user about errors
that occur in the host library.

Other configuration and debug utilities are also added.

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/include/lkl_config.h |  61 +++
 tools/lkl/include/lkl_host.h   |   7 +
 tools/lkl/lib/Build            |   7 +
 tools/lkl/lib/config.c         | 793 +++++++++++++++++++++++++++++++++
 tools/lkl/lib/dbg.c            | 300 +++++++++++++
 tools/lkl/lib/dbg_handler.c    |  44 ++
 tools/lkl/lib/endian.h         |  31 ++
 tools/lkl/lib/jmp_buf.c        |  14 +
 tools/lkl/lib/jmp_buf.h        |   8 +
 tools/lkl/lib/utils.c          | 266 +++++++++++
 10 files changed, 1531 insertions(+)
 create mode 100644 tools/lkl/include/lkl_config.h
 create mode 100644 tools/lkl/lib/config.c
 create mode 100644 tools/lkl/lib/dbg.c
 create mode 100644 tools/lkl/lib/dbg_handler.c
 create mode 100644 tools/lkl/lib/endian.h
 create mode 100644 tools/lkl/lib/jmp_buf.c
 create mode 100644 tools/lkl/lib/jmp_buf.h
 create mode 100644 tools/lkl/lib/utils.c

diff --git a/tools/lkl/include/lkl_config.h b/tools/lkl/include/lkl_config.h
new file mode 100644
index 000000000000..d3edf8b414cf
--- /dev/null
+++ b/tools/lkl/include/lkl_config.h
@@ -0,0 +1,61 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_LIB_CONFIG_H
+#define _LKL_LIB_CONFIG_H
+
+#define LKL_CONFIG_JSON_TOKEN_MAX 300
+
+struct lkl_config_iface {
+	struct lkl_config_iface *next;
+	struct lkl_netdev *nd;
+
+	/* OBSOLETE: should use IFTYPE and IFPARAMS */
+	char *iftap;
+	char *iftype;
+	char *ifparams;
+	char *ifmtu_str;
+	char *ifip;
+	char *ifipv6;
+	char *ifgateway;
+	char *ifgateway6;
+	char *ifmac_str;
+	char *ifnetmask_len;
+	char *ifnetmask6_len;
+	char *ifoffload_str;
+	char *ifneigh_entries;
+	char *ifqdisc_entries;
+};
+
+struct lkl_config {
+	int ifnum;
+	struct lkl_config_iface *ifaces;
+
+	char *gateway;
+	char *gateway6;
+	char *debug;
+	char *mount;
+	/* single_cpu mode:
+	 * 0: Don't pin to single CPU (default).
+	 * 1: Pin only LKL kernel threads to single CPU.
+	 * 2: Pin all LKL threads to single CPU including all LKL kernel threads
+	 * and device polling threads. Avoid this mode if having busy polling
+	 * threads.
+	 *
+	 * mode 2 can achieve better TCP_RR but worse TCP_STREAM than mode 1.
+	 * You should choose the best for your application and virtio device
+	 * type.
+	 */
+	char *single_cpu;
+	char *sysctls;
+	char *boot_cmdline;
+	char *dump;
+	char *delay_main;
+};
+
+int lkl_load_config_json(struct lkl_config *cfg, char *jstr);
+int lkl_load_config_env(struct lkl_config *cfg);
+void lkl_show_config(struct lkl_config *cfg);
+int lkl_load_config_pre(struct lkl_config *cfg);
+int lkl_load_config_post(struct lkl_config *cfg);
+int lkl_unload_config(struct lkl_config *cfg);
+
+#endif /* _LKL_LIB_CONFIG_H */
diff --git a/tools/lkl/include/lkl_host.h b/tools/lkl/include/lkl_host.h
index b5f96096fe69..85e80eb4ad0d 100644
--- a/tools/lkl/include/lkl_host.h
+++ b/tools/lkl/include/lkl_host.h
@@ -11,6 +11,13 @@ extern "C" {
 
 extern struct lkl_host_operations lkl_host_ops;
 
+/**
+ * lkl_printf - print a message via the host print operation
+ *
+ * @fmt: printf like format string
+ */
+int lkl_printf(const char *fmt, ...);
+
 
 #ifdef __cplusplus
 }
diff --git a/tools/lkl/lib/Build b/tools/lkl/lib/Build
index 8b137891791f..658bfa865b9c 100644
--- a/tools/lkl/lib/Build
+++ b/tools/lkl/lib/Build
@@ -1 +1,8 @@
+CFLAGS_config.o += -I$(srctree)/tools/perf/pmu-events
 
+liblkl-y += jmp_buf.o
+liblkl-y += utils.o
+liblkl-y += dbg.o
+liblkl-y += dbg_handler.o
+liblkl-y += ../../perf/pmu-events/jsmn.o
+liblkl-y += config.o
diff --git a/tools/lkl/lib/config.c b/tools/lkl/lib/config.c
new file mode 100644
index 000000000000..0e77a997348a
--- /dev/null
+++ b/tools/lkl/lib/config.c
@@ -0,0 +1,793 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdlib.h>
+#define _HAVE_STRING_ARCH_strtok_r
+#include <string.h>
+#include <lkl_host.h>
+#include <lkl_config.h>
+
+#include "jsmn.h"
+
+static int jsoneq(const char *json, jsmntok_t *tok, const char *s)
+{
+	if (tok->type == JSMN_STRING &&
+		(int) strlen(s) == tok->end - tok->start &&
+		strncmp(json + tok->start, s, tok->end - tok->start) == 0) {
+		return 0;
+	}
+	return -1;
+}
+
+static int cfgcpy(char **to, char *from)
+{
+	if (!from)
+		return 0;
+	if (*to)
+		free(*to);
+	*to = (char *)malloc((strlen(from) + 1) * sizeof(char));
+	if (*to == NULL) {
+		lkl_printf("malloc failed\n");
+		return -1;
+	}
+	strcpy(*to, from);
+	return 0;
+}
+
+static int cfgncpy(char **to, char *from, int len)
+{
+	if (!from)
+		return 0;
+	if (*to)
+		free(*to);
+	*to = (char *)malloc((len + 1) * sizeof(char));
+	if (*to == NULL) {
+		lkl_printf("malloc failed\n");
+		return -1;
+	}
+	strncpy(*to, from, len + 1);
+	(*to)[len] = '\0';
+	return 0;
+}
+
+static int parse_ifarr(struct lkl_config *cfg,
+		jsmntok_t *toks, char *jstr, int startpos)
+{
+	int ifidx, pos, posend, ret;
+	char **cfgptr;
+	struct lkl_config_iface *iface, *prev = NULL;
+
+	if (!cfg || !toks || !jstr)
+		return -1;
+	pos = startpos;
+	pos++;
+	if (toks[pos].type != JSMN_ARRAY) {
+		lkl_printf("unexpected json type, json array expected\n");
+		return -1;
+	}
+
+	cfg->ifnum = toks[pos].size;
+	pos++;
+	iface = cfg->ifaces;
+
+	for (ifidx = 0; ifidx < cfg->ifnum; ifidx++) {
+		if (toks[pos].type != JSMN_OBJECT) {
+			lkl_printf("object json type expected\n");
+			return -1;
+		}
+
+		posend = pos + toks[pos].size;
+		pos++;
+		iface = calloc(1, sizeof(*iface));
+		if (!iface)
+			return -1;
+		if (prev)
+			prev->next = iface;
+		else
+			cfg->ifaces = iface;
+		prev = iface;
+
+		for (; pos < posend; pos += 2) {
+			if (toks[pos].type != JSMN_STRING) {
+				lkl_printf("string json type expected\n");
+				return -1;
+			}
+			if (jsoneq(jstr, &toks[pos], "type") == 0) {
+				cfgptr = &iface->iftype;
+			} else if (jsoneq(jstr, &toks[pos], "param") == 0) {
+				cfgptr = &iface->ifparams;
+			} else if (jsoneq(jstr, &toks[pos], "mtu") == 0) {
+				cfgptr = &iface->ifmtu_str;
+			} else if (jsoneq(jstr, &toks[pos], "ip") == 0) {
+				cfgptr = &iface->ifip;
+			} else if (jsoneq(jstr, &toks[pos], "ipv6") == 0) {
+				cfgptr = &iface->ifipv6;
+			} else if (jsoneq(jstr, &toks[pos], "ifgateway") == 0) {
+				cfgptr = &iface->ifgateway;
+			} else if (jsoneq(jstr, &toks[pos],
+							"ifgateway6") == 0) {
+				cfgptr = &iface->ifgateway6;
+			} else if (jsoneq(jstr, &toks[pos], "mac") == 0) {
+				cfgptr = &iface->ifmac_str;
+			} else if (jsoneq(jstr, &toks[pos], "masklen") == 0) {
+				cfgptr = &iface->ifnetmask_len;
+			} else if (jsoneq(jstr, &toks[pos], "masklen6") == 0) {
+				cfgptr = &iface->ifnetmask6_len;
+			} else if (jsoneq(jstr, &toks[pos], "neigh") == 0) {
+				cfgptr = &iface->ifneigh_entries;
+			} else if (jsoneq(jstr, &toks[pos], "qdisc") == 0) {
+				cfgptr = &iface->ifqdisc_entries;
+			} else if (jsoneq(jstr, &toks[pos], "offload") == 0) {
+				cfgptr = &iface->ifoffload_str;
+			} else {
+				lkl_printf("unexpected key: %.*s\n",
+						toks[pos].end-toks[pos].start,
+						jstr + toks[pos].start);
+				return -1;
+			}
+			ret = cfgncpy(cfgptr, jstr + toks[pos+1].start,
+					toks[pos+1].end-toks[pos+1].start);
+			if (ret < 0)
+				return ret;
+		}
+	}
+	return pos - startpos;
+}
+
+#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
+
+int lkl_load_config_json(struct lkl_config *cfg, char *jstr)
+{
+	int pos, ret;
+	char **cfgptr;
+	jsmn_parser jp;
+	jsmntok_t toks[LKL_CONFIG_JSON_TOKEN_MAX];
+
+	if (!cfg || !jstr)
+		return -1;
+	jsmn_init(&jp);
+	ret = jsmn_parse(&jp, jstr, strlen(jstr), toks, ARRAY_SIZE(toks));
+	if (ret != JSMN_SUCCESS) {
+		lkl_printf("failed to parse json\n");
+		return -1;
+	}
+	if (toks[0].type != JSMN_OBJECT) {
+		lkl_printf("object json type expected\n");
+		return -1;
+	}
+	for (pos = 1; pos < jp.toknext; pos++) {
+		if (toks[pos].type != JSMN_STRING) {
+			lkl_printf("string json type expected\n");
+			return -1;
+		}
+		if (jsoneq(jstr, &toks[pos], "interfaces") == 0) {
+			ret = parse_ifarr(cfg, toks, jstr, pos);
+			if (ret < 0)
+				return ret;
+			pos += ret;
+			pos--;
+			continue;
+		}
+		if (jsoneq(jstr, &toks[pos], "gateway") == 0) {
+			cfgptr = &cfg->gateway;
+		} else if (jsoneq(jstr, &toks[pos], "gateway6") == 0) {
+			cfgptr = &cfg->gateway6;
+		} else if (jsoneq(jstr, &toks[pos], "debug") == 0) {
+			cfgptr = &cfg->debug;
+		} else if (jsoneq(jstr, &toks[pos], "mount") == 0) {
+			cfgptr = &cfg->mount;
+		} else if (jsoneq(jstr, &toks[pos], "singlecpu") == 0) {
+			cfgptr = &cfg->single_cpu;
+		} else if (jsoneq(jstr, &toks[pos], "sysctl") == 0) {
+			cfgptr = &cfg->sysctls;
+		} else if (jsoneq(jstr, &toks[pos], "boot_cmdline") == 0) {
+			cfgptr = &cfg->boot_cmdline;
+		} else if (jsoneq(jstr, &toks[pos], "dump") == 0) {
+			cfgptr = &cfg->dump;
+		} else if (jsoneq(jstr, &toks[pos], "delay_main") == 0) {
+			cfgptr = &cfg->delay_main;
+		} else {
+			lkl_printf("unexpected key in json %.*s\n",
+					toks[pos].end-toks[pos].start,
+					jstr + toks[pos].start);
+			return -1;
+		}
+		pos++;
+		ret = cfgncpy(cfgptr, jstr + toks[pos].start,
+				toks[pos].end-toks[pos].start);
+		if (ret < 0)
+			return ret;
+	}
+	return 0;
+}
+
+void lkl_show_config(struct lkl_config *cfg)
+{
+	struct lkl_config_iface *iface;
+	int i = 0;
+
+	if (!cfg)
+		return;
+	lkl_printf("gateway: %s\n", cfg->gateway);
+	lkl_printf("gateway6: %s\n", cfg->gateway6);
+	lkl_printf("debug: %s\n", cfg->debug);
+	lkl_printf("mount: %s\n", cfg->mount);
+	lkl_printf("singlecpu: %s\n", cfg->single_cpu);
+	lkl_printf("sysctl: %s\n", cfg->sysctls);
+	lkl_printf("cmdline: %s\n", cfg->boot_cmdline);
+	lkl_printf("dump: %s\n", cfg->dump);
+	lkl_printf("delay: %s\n", cfg->delay_main);
+
+	for (iface = cfg->ifaces; iface; iface = iface->next, i++) {
+		lkl_printf("ifmac[%d] = %s\n", i, iface->ifmac_str);
+		lkl_printf("ifmtu[%d] = %s\n", i, iface->ifmtu_str);
+		lkl_printf("iftype[%d] = %s\n", i, iface->iftype);
+		lkl_printf("ifparam[%d] = %s\n", i, iface->ifparams);
+		lkl_printf("ifip[%d] = %s\n", i, iface->ifip);
+		lkl_printf("ifmasklen[%d] = %s\n", i, iface->ifnetmask_len);
+		lkl_printf("ifgateway[%d] = %s\n", i, iface->ifgateway);
+		lkl_printf("ifip6[%d] = %s\n", i, iface->ifipv6);
+		lkl_printf("ifmasklen6[%d] = %s\n", i, iface->ifnetmask6_len);
+		lkl_printf("ifgateway6[%d] = %s\n", i, iface->ifgateway6);
+		lkl_printf("ifoffload[%d] = %s\n", i, iface->ifoffload_str);
+		lkl_printf("ifneigh[%d] = %s\n", i, iface->ifneigh_entries);
+		lkl_printf("ifqdisc[%d] = %s\n", i, iface->ifqdisc_entries);
+	}
+}
+
+int lkl_load_config_env(struct lkl_config *cfg)
+{
+	int ret;
+	char *envtap = getenv("LKL_HIJACK_NET_TAP");
+	char *enviftype = getenv("LKL_HIJACK_NET_IFTYPE");
+	char *envifparams = getenv("LKL_HIJACK_NET_IFPARAMS");
+	char *envmtu_str = getenv("LKL_HIJACK_NET_MTU");
+	char *envip = getenv("LKL_HIJACK_NET_IP");
+	char *envipv6 = getenv("LKL_HIJACK_NET_IPV6");
+	char *envifgateway = getenv("LKL_HIJACK_NET_IFGATEWAY");
+	char *envifgateway6 = getenv("LKL_HIJACK_NET_IFGATEWAY6");
+	char *envmac_str = getenv("LKL_HIJACK_NET_MAC");
+	char *envnetmask_len = getenv("LKL_HIJACK_NET_NETMASK_LEN");
+	char *envnetmask6_len = getenv("LKL_HIJACK_NET_NETMASK6_LEN");
+	char *envgateway = getenv("LKL_HIJACK_NET_GATEWAY");
+	char *envgateway6 = getenv("LKL_HIJACK_NET_GATEWAY6");
+	char *envdebug = getenv("LKL_HIJACK_DEBUG");
+	char *envmount = getenv("LKL_HIJACK_MOUNT");
+	char *envneigh_entries = getenv("LKL_HIJACK_NET_NEIGHBOR");
+	char *envqdisc_entries = getenv("LKL_HIJACK_NET_QDISC");
+	char *envsingle_cpu = getenv("LKL_HIJACK_SINGLE_CPU");
+	char *envoffload_str = getenv("LKL_HIJACK_OFFLOAD");
+	char *envsysctls = getenv("LKL_HIJACK_SYSCTL");
+	char *envboot_cmdline = getenv("LKL_HIJACK_BOOT_CMDLINE") ? : "";
+	char *envdump = getenv("LKL_HIJACK_DUMP");
+	struct lkl_config_iface *iface;
+
+	if (!cfg)
+		return -1;
+	if (envtap || enviftype)
+		cfg->ifnum = 1;
+
+	iface = malloc(sizeof(struct lkl_config_iface));
+	memset(iface, 0, sizeof(struct lkl_config_iface));
+
+	ret = cfgcpy(&iface->iftap, envtap);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->iftype, enviftype);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifparams, envifparams);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifmtu_str, envmtu_str);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifip, envip);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifipv6, envipv6);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifgateway, envifgateway);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifgateway6, envifgateway6);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifmac_str, envmac_str);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifnetmask_len, envnetmask_len);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifnetmask6_len, envnetmask6_len);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifoffload_str, envoffload_str);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifneigh_entries, envneigh_entries);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&iface->ifqdisc_entries, envqdisc_entries);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->gateway, envgateway);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->gateway6, envgateway6);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->debug, envdebug);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->mount, envmount);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->single_cpu, envsingle_cpu);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->sysctls, envsysctls);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->boot_cmdline, envboot_cmdline);
+	if (ret < 0)
+		return ret;
+	ret = cfgcpy(&cfg->dump, envdump);
+	if (ret < 0)
+		return ret;
+	return 0;
+}
+
+static int parse_mac_str(char *mac_str, __lkl__u8 mac[LKL_ETH_ALEN])
+{
+	char delim[] = ":";
+	char *saveptr = NULL, *token = NULL;
+	int i = 0;
+
+	if (!mac_str)
+		return 0;
+
+	for (token = strtok_r(mac_str, delim, &saveptr);
+	     i < LKL_ETH_ALEN; i++) {
+		if (!token) {
+			/* The address is too short */
+			return -1;
+		}
+
+		mac[i] = (__lkl__u8) strtol(token, NULL, 16);
+		token = strtok_r(NULL, delim, &saveptr);
+	}
+
+	if (strtok_r(NULL, delim, &saveptr)) {
+		/* The address is too long */
+		return -1;
+	}
+
+	return 1;
+}
+
+/* Add permanent neighbor entries in the form of "ip|mac;ip|mac;..." */
+static void add_neighbor(int ifindex, char *entries)
+{
+	char *saveptr = NULL, *token = NULL;
+	char *ip = NULL, *mac_str = NULL;
+	int ret = 0;
+	__lkl__u8 mac[LKL_ETH_ALEN];
+	char ip_addr[16];
+	int af;
+
+	for (token = strtok_r(entries, ";", &saveptr); token;
+	     token = strtok_r(NULL, ";", &saveptr)) {
+		ip = strtok(token, "|");
+		mac_str = strtok(NULL, "|");
+		if (ip == NULL || mac_str == NULL || strtok(NULL, "|") != NULL)
+			return;
+
+		af = LKL_AF_INET;
+		ret = inet_pton(LKL_AF_INET, ip, ip_addr);
+		if (ret == 0) {
+			ret = inet_pton(LKL_AF_INET6, ip, ip_addr);
+			af = LKL_AF_INET6;
+		}
+		if (ret != 1) {
+			lkl_printf("Bad ip address: %s\n", ip);
+			return;
+		}
+
+		ret = parse_mac_str(mac_str, mac);
+		if (ret != 1) {
+			lkl_printf("Failed to parse mac: %s\n", mac_str);
+			return;
+		}
+		ret = lkl_add_neighbor(ifindex, af, ip_addr, mac);
+		if (ret) {
+			lkl_printf("Failed to add neighbor entry: %s\n",
+				   lkl_strerror(ret));
+			return;
+		}
+	}
+}
+
+/* We don't have an easy way to make FILE*s out of our fds, so we
+ * can't use e.g. fgets
+ */
+static int dump_file(char *path)
+{
+	int ret = -1, bytes_read = 0;
+	char str[1024] = { 0 };
+	int fd;
+
+	fd = lkl_sys_open(path, LKL_O_RDONLY, 0);
+
+	if (fd < 0) {
+		lkl_printf("%s lkl_sys_open %s: %s\n",
+			   __func__, path, lkl_strerror(fd));
+		return -1;
+	}
+
+	/* Need to print this out in order to make sense of the output */
+	lkl_printf("Reading from %s:\n==========\n", path);
+	while ((ret = lkl_sys_read(fd, str, sizeof(str) - 1)) > 0)
+		bytes_read += lkl_printf("%.*s", ret, str);
+	lkl_printf("==========\n");
+
+	if (ret) {
+		lkl_printf("%s lkl_sys_read %s: %s\n",
+			   __func__, path, lkl_strerror(ret));
+		return -1;
+	}
+
+	return 0;
+}
+
+static void mount_cmds_exec(char *_cmds, int (*callback)(char *))
+{
+	char *saveptr = NULL, *token;
+	int ret = 0;
+	char *cmds = strdup(_cmds);
+
+	token = strtok_r(cmds, ",", &saveptr);
+
+	while (token && ret >= 0) {
+		ret = callback(token);
+		token = strtok_r(NULL, ",", &saveptr);
+	}
+
+	if (ret < 0)
+		lkl_printf("%s: failed parsing %s\n", __func__, _cmds);
+
+	free(cmds);
+}
+
+static int lkl_config_netdev_create(struct lkl_config *cfg,
+				    struct lkl_config_iface *iface)
+{
+	int ret, offload = 0;
+	struct lkl_netdev_args nd_args;
+	__lkl__u8 mac[LKL_ETH_ALEN] = {0};
+	struct lkl_netdev *nd = NULL;
+
+	if (iface->ifoffload_str)
+		offload = strtol(iface->ifoffload_str, NULL, 0);
+	memset(&nd_args, 0, sizeof(struct lkl_netdev_args));
+
+	if (iface->iftap) {
+		lkl_printf("WARN: LKL_HIJACK_NET_TAP is now obsolete.\n");
+		lkl_printf("use LKL_HIJACK_NET_IFTYPE and _IFPARAMS instead\n");
+		nd = lkl_netdev_tap_create(iface->iftap, offload);
+	}
+
+	if (!nd && iface->iftype && iface->ifparams) {
+		if ((strcmp(iface->iftype, "tap") == 0)) {
+			nd = lkl_netdev_tap_create(iface->ifparams, offload);
+		} else if ((strcmp(iface->iftype, "macvtap") == 0)) {
+			nd = lkl_netdev_macvtap_create(iface->ifparams,
+						       offload);
+		} else if ((strcmp(iface->iftype, "dpdk") == 0)) {
+			nd = lkl_netdev_dpdk_create(iface->ifparams, offload,
+						    mac);
+		} else if ((strcmp(iface->iftype, "pipe") == 0)) {
+			nd = lkl_netdev_pipe_create(iface->ifparams, offload);
+		} else {
+			if (offload) {
+				lkl_printf("WARN: %s isn't supported on %s\n",
+					   "LKL_HIJACK_OFFLOAD",
+					   iface->iftype);
+				lkl_printf(
+					"WARN: Disabling offload features.\n");
+			}
+			offload = 0;
+		}
+		if (strcmp(iface->iftype, "vde") == 0)
+			nd = lkl_netdev_vde_create(iface->ifparams);
+		if (strcmp(iface->iftype, "raw") == 0)
+			nd = lkl_netdev_raw_create(iface->ifparams);
+	}
+
+	if (nd) {
+		if ((mac[0] != 0) || (mac[1] != 0) ||
+				(mac[2] != 0) || (mac[3] != 0) ||
+				(mac[4] != 0) || (mac[5] != 0)) {
+			nd_args.mac = mac;
+		} else {
+			ret = parse_mac_str(iface->ifmac_str, mac);
+
+			if (ret < 0) {
+				lkl_printf("failed to parse mac\n");
+				return -1;
+			} else if (ret > 0) {
+				nd_args.mac = mac;
+			} else {
+				nd_args.mac = NULL;
+			}
+		}
+
+		nd_args.offload = offload;
+		ret = lkl_netdev_add(nd, &nd_args);
+		if (ret < 0) {
+			lkl_printf("failed to add netdev: %s\n",
+				   lkl_strerror(ret));
+			return -1;
+		}
+		nd->id = ret;
+		iface->nd = nd;
+	}
+	return 0;
+}
+
+static int lkl_config_netdev_configure(struct lkl_config *cfg,
+				       struct lkl_config_iface *iface)
+{
+	int ret, nd_ifindex = -1;
+	struct lkl_netdev *nd = iface->nd;
+
+	if (!nd) {
+		lkl_printf("no netdev available %s\n", iface ? iface->ifparams
+			   : "(null)");
+		return -1;
+	}
+
+	if (nd->id >= 0) {
+		nd_ifindex = lkl_netdev_get_ifindex(nd->id);
+		if (nd_ifindex > 0)
+			lkl_if_up(nd_ifindex);
+		else
+			lkl_printf(
+				"failed to get ifindex for netdev id %d: %s\n",
+				nd->id, lkl_strerror(nd_ifindex));
+	}
+
+	if (nd_ifindex >= 0 && iface->ifmtu_str) {
+		int mtu = atoi(iface->ifmtu_str);
+
+		ret = lkl_if_set_mtu(nd_ifindex, mtu);
+		if (ret < 0)
+			lkl_printf("failed to set MTU: %s\n",
+				   lkl_strerror(ret));
+	}
+
+	if (nd_ifindex >= 0 && iface->ifip && iface->ifnetmask_len) {
+		unsigned int addr = LKL_INADDR_NONE;
+
+		if (inet_pton(LKL_AF_INET, iface->ifip,
+			      (struct lkl_in_addr *)&addr) != 1)
+			lkl_printf("Invalid ipv4 address: %s\n", iface->ifip);
+
+		int nmlen = atoi(iface->ifnetmask_len);
+
+		if (addr != LKL_INADDR_NONE && nmlen > 0 && nmlen < 32) {
+			ret = lkl_if_set_ipv4(nd_ifindex, addr, nmlen);
+			if (ret < 0)
+				lkl_printf("failed to set IPv4 address: %s\n",
+					   lkl_strerror(ret));
+		}
+		if (iface->ifgateway) {
+			unsigned int gwaddr = LKL_INADDR_NONE;
+
+			if (inet_pton(LKL_AF_INET, iface->ifgateway,
+				      (struct lkl_in_addr *)&gwaddr) != 1)
+				lkl_printf("Invalid ipv4 gateway: %s\n",
+					   iface->ifgateway);
+
+			if (gwaddr != LKL_INADDR_NONE) {
+				ret = lkl_if_set_ipv4_gateway(nd_ifindex,
+						addr, nmlen, gwaddr);
+				if (ret < 0)
+					lkl_printf(
+						"failed to set v4 if gw: %s\n",
+						lkl_strerror(ret));
+			}
+		}
+	}
+
+	if (nd_ifindex >= 0 && iface->ifipv6 &&
+			iface->ifnetmask6_len) {
+		struct lkl_in6_addr addr;
+		unsigned int pflen = atoi(iface->ifnetmask6_len);
+
+		if (inet_pton(LKL_AF_INET6, iface->ifipv6,
+			      (struct lkl_in6_addr *)&addr) != 1) {
+			lkl_printf("Invalid ipv6 addr: %s\n",
+				   iface->ifipv6);
+		}  else {
+			ret = lkl_if_set_ipv6(nd_ifindex, &addr, pflen);
+			if (ret < 0)
+				lkl_printf("failed to set IPv6 address: %s\n",
+					   lkl_strerror(ret));
+		}
+		if (iface->ifgateway6) {
+			char gwaddr[16];
+
+			if (inet_pton(LKL_AF_INET6, iface->ifgateway6,
+								gwaddr) != 1) {
+				lkl_printf("Invalid ipv6 gateway: %s\n",
+					   iface->ifgateway6);
+			} else {
+				ret = lkl_if_set_ipv6_gateway(nd_ifindex,
+						&addr, pflen, gwaddr);
+				if (ret < 0)
+					lkl_printf(
+						"failed to set v6 if gw: %s\n",
+						lkl_strerror(ret));
+			}
+		}
+	}
+
+	if (nd_ifindex >= 0 && iface->ifneigh_entries)
+		add_neighbor(nd_ifindex, iface->ifneigh_entries);
+
+	if (nd_ifindex >= 0 && iface->ifqdisc_entries)
+		lkl_qdisc_parse_add(nd_ifindex, iface->ifqdisc_entries);
+
+	return 0;
+}
+
+static void free_cfgparam(char *cfgparam)
+{
+	if (cfgparam)
+		free(cfgparam);
+}
+
+static int lkl_clean_config(struct lkl_config *cfg)
+{
+	struct lkl_config_iface *iface;
+
+	if (!cfg)
+		return -1;
+
+	for (iface = cfg->ifaces; iface; iface = iface->next) {
+		free_cfgparam(iface->iftap);
+		free_cfgparam(iface->iftype);
+		free_cfgparam(iface->ifparams);
+		free_cfgparam(iface->ifmtu_str);
+		free_cfgparam(iface->ifip);
+		free_cfgparam(iface->ifipv6);
+		free_cfgparam(iface->ifgateway);
+		free_cfgparam(iface->ifgateway6);
+		free_cfgparam(iface->ifmac_str);
+		free_cfgparam(iface->ifnetmask_len);
+		free_cfgparam(iface->ifnetmask6_len);
+		free_cfgparam(iface->ifoffload_str);
+		free_cfgparam(iface->ifneigh_entries);
+		free_cfgparam(iface->ifqdisc_entries);
+	}
+	free_cfgparam(cfg->gateway);
+	free_cfgparam(cfg->gateway6);
+	free_cfgparam(cfg->debug);
+	free_cfgparam(cfg->mount);
+	free_cfgparam(cfg->single_cpu);
+	free_cfgparam(cfg->sysctls);
+	free_cfgparam(cfg->boot_cmdline);
+	free_cfgparam(cfg->dump);
+	free_cfgparam(cfg->delay_main);
+	return 0;
+}
+
+
+int lkl_load_config_pre(struct lkl_config *cfg)
+{
+	int lkl_debug, ret;
+	struct lkl_config_iface *iface;
+
+	if (!cfg)
+		return 0;
+
+	if (cfg->debug)
+		lkl_debug = strtol(cfg->debug, NULL, 0);
+
+	if (!cfg->debug || (lkl_debug == 0))
+		lkl_host_ops.print = NULL;
+
+	for (iface = cfg->ifaces; iface; iface = iface->next) {
+		ret = lkl_config_netdev_create(cfg, iface);
+		if (ret < 0)
+			return -1;
+	}
+
+	return 0;
+}
+
+int lkl_load_config_post(struct lkl_config *cfg)
+{
+	int ret;
+	struct lkl_config_iface *iface;
+
+	if (!cfg)
+		return 0;
+
+	if (cfg->mount)
+		mount_cmds_exec(cfg->mount, lkl_mount_fs);
+
+	for (iface = cfg->ifaces; iface; iface = iface->next) {
+		ret = lkl_config_netdev_configure(cfg, iface);
+		if (ret < 0)
+			break;
+	}
+
+	if (cfg->gateway) {
+		unsigned int gwaddr = LKL_INADDR_NONE;
+
+		if (inet_pton(LKL_AF_INET, cfg->gateway,
+			      (struct lkl_in_addr *)&gwaddr) != 1)
+			lkl_printf("Invalid ipv4 gateway: %s\n", cfg->gateway);
+
+		if (gwaddr != LKL_INADDR_NONE) {
+			ret = lkl_set_ipv4_gateway(gwaddr);
+			if (ret < 0)
+				lkl_printf("failed to set IPv4 gateway: %s\n",
+					   lkl_strerror(ret));
+		}
+	}
+
+	if (cfg->gateway6) {
+		char gw[16];
+
+		if (inet_pton(LKL_AF_INET6, cfg->gateway6, gw) != 1) {
+			lkl_printf("Invalid ipv6 gateway: %s\n", cfg->gateway6);
+		} else {
+			ret = lkl_set_ipv6_gateway(gw);
+			if (ret < 0)
+				lkl_printf("failed to set IPv6 gateway: %s\n",
+					   lkl_strerror(ret));
+		}
+	}
+
+	if (cfg->sysctls)
+		lkl_sysctl_parse_write(cfg->sysctls);
+
+	/* put a delay before calling main() */
+	if (cfg->delay_main) {
+		unsigned long delay = strtoul(cfg->delay_main, NULL, 10);
+
+		if (delay == ~0UL)
+			lkl_printf("got invalid delay_main value (%s)\n",
+				   cfg->delay_main);
+		else {
+			lkl_printf("sleeping %lu usec\n", delay);
+			usleep(delay);
+		}
+	}
+
+	return 0;
+}
+
+int lkl_unload_config(struct lkl_config *cfg)
+{
+	struct lkl_config_iface *iface;
+
+	if (cfg) {
+		if (cfg->dump)
+			mount_cmds_exec(cfg->dump, dump_file);
+
+		for (iface = cfg->ifaces; iface; iface = iface->next) {
+			if (iface->nd) {
+				if (iface->nd->id >= 0)
+					lkl_netdev_remove(iface->nd->id);
+				lkl_netdev_free(iface->nd);
+			}
+		}
+
+		lkl_clean_config(cfg);
+	}
+
+	return 0;
+}
diff --git a/tools/lkl/lib/dbg.c b/tools/lkl/lib/dbg.c
new file mode 100644
index 000000000000..b613353bce5c
--- /dev/null
+++ b/tools/lkl/lib/dbg.c
@@ -0,0 +1,300 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <errno.h>
+#include <lkl.h>
+#include <limits.h>
+#include <string.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+static const char *PROMOTE = "$";
+#define str(x) #x
+#define xstr(s) str(s)
+#define MAX_BUF 100
+static char cmd[MAX_BUF];
+static char argv[10][MAX_BUF];
+static int argc;
+static char cur_dir[MAX_BUF] = "/";
+
+static char *normalize_path(const char *src, size_t src_len)
+{
+	char *res;
+	unsigned int res_len;
+	const char *ptr = src;
+	const char *end = &src[src_len];
+	const char *next;
+
+	res = malloc((src_len > 0 ? src_len : 1) + 1);
+	res_len = 0;
+
+	for (ptr = src; ptr < end; ptr = next+1) {
+		size_t len;
+
+		next = memchr(ptr, '/', end-ptr);
+		if (next == NULL)
+			next = end;
+
+		len = next-ptr;
+		switch (len) {
+		case 2:
+			if (ptr[0] == '.' && ptr[1] == '.') {
+				const char *slash = strrchr(res, '/');
+
+				if (slash != NULL)
+					res_len = slash - res;
+				continue;
+			}
+			break;
+		case 1:
+			if (ptr[0] == '.')
+				continue;
+			break;
+		case 0:
+			continue;
+		}
+		res[res_len++] = '/';
+		memcpy(&res[res_len], ptr, len);
+		res_len += len;
+	}
+	if (res_len == 0)
+		res[res_len++] = '/';
+	res[res_len] = '\0';
+	return res;
+}
+
+static void build_path(char *path)
+{
+	char *npath;
+
+	strcpy(path, cur_dir);
+	if (argc >= 1) {
+		if (argv[0][0] == '/')
+			strncpy(path, argv[0], LKL_PATH_MAX);
+		else {
+			strncat(path, "/", LKL_PATH_MAX - strlen(path) - 1);
+			strncat(path, argv[0], LKL_PATH_MAX - strlen(path) - 1);
+		}
+	}
+	npath = normalize_path(path, strlen(path));
+	strcpy(path, npath);
+	free(npath);
+}
+
+static void help(void)
+{
+	const char *msg =
+		"cat FILE\n"
+		"\tShow content of FILE\n"
+		"cd [DIR]\n"
+		"\tChange directory to DIR\n"
+		"exit\n"
+		"\tExit the debug session\n"
+		"help\n"
+		"\tShow this message\n"
+		"ls [DIR]\n"
+		"\tList files in DIR\n"
+		"mount FSTYPE\n"
+		"\tMount FSTYPE as /FSTYPE\n"
+		"overwrite FILE\n"
+		"\tOverwrite content of FILE from stdin\n"
+		"pwd\n"
+		"\tShow current directory\n"
+		;
+	printf("%s", msg);
+}
+
+static void ls(void)
+{
+	char path[LKL_PATH_MAX];
+	struct lkl_dir *dir;
+	struct lkl_linux_dirent64 *de;
+	int err;
+
+	build_path(path);
+	dir = lkl_opendir(path, &err);
+	if (dir) {
+		do {
+			de = lkl_readdir(dir);
+			if (de) {
+				printf("%s\n", de->d_name);
+			} else {
+				err = lkl_errdir(dir);
+				if (err != 0) {
+					fprintf(stderr, "%s\n",
+						lkl_strerror(err));
+				}
+				break;
+			}
+		} while (1);
+		lkl_closedir(dir);
+	} else {
+		fprintf(stderr, "%s: %s\n", path, lkl_strerror(err));
+	}
+}
+
+static void cd(void)
+{
+	char path[LKL_PATH_MAX];
+	struct lkl_dir *dir;
+	int err;
+
+	build_path(path);
+	dir = lkl_opendir(path, &err);
+	if (dir) {
+		snprintf(cur_dir, sizeof(cur_dir), "%s", path);
+		lkl_closedir(dir);
+	} else {
+		fprintf(stderr, "%s: %s\n", path, lkl_strerror(err));
+	}
+}
+
+static void mount(void)
+{
+	char *fstype;
+	int ret = 0;
+
+	if (argc != 1) {
+		fprintf(stderr, "%s\n", "One argument is needed.");
+		return;
+	}
+
+	fstype = argv[0];
+	ret = lkl_mount_fs(fstype);
+	if (ret == 1)
+		fprintf(stderr, "%s is already mounted.\n", fstype);
+}
+
+static void cat(void)
+{
+	char path[LKL_PATH_MAX];
+	int ret;
+	char buf[1024];
+	int fd;
+
+	if (argc != 1) {
+		fprintf(stderr, "%s\n", "One argument is needed.");
+		return;
+	}
+
+	build_path(path);
+	fd = lkl_sys_open(path, LKL_O_RDONLY, 0);
+
+	if (fd < 0) {
+		fprintf(stderr, "lkl_sys_open %s: %s\n",
+			path, lkl_strerror(fd));
+		return;
+	}
+
+	while ((ret = lkl_sys_read(fd, buf, sizeof(buf) - 1)) > 0) {
+		buf[ret] = '\0';
+		printf("%s", buf);
+	}
+
+	if (ret) {
+		fprintf(stderr, "lkl_sys_read %s: %s\n",
+			path, lkl_strerror(ret));
+	}
+	lkl_sys_close(fd);
+}
+
+static void overwrite(void)
+{
+	char path[LKL_PATH_MAX];
+	int ret;
+	int fd;
+	char buf[1024];
+
+	build_path(path);
+	fd = lkl_sys_open(path, LKL_O_WRONLY | LKL_O_CREAT, 0);
+	if (fd < 0) {
+		fprintf(stderr, "lkl_sys_open %s: %s\n",
+			path, lkl_strerror(fd));
+		return;
+	}
+	printf("Input the content and stop by hitting Ctrl-D:\n");
+	while (fgets(buf, 1023, stdin)) {
+		ret = lkl_sys_write(fd, buf, strlen(buf));
+		if (ret < 0) {
+			fprintf(stderr, "lkl_sys_write %s: %s\n",
+				path, lkl_strerror(ret));
+		}
+	}
+	lkl_sys_close(fd);
+}
+
+static void pwd(void)
+{
+	printf("%s\n", cur_dir);
+}
+
+static int parse_cmd(char *input)
+{
+	char *token;
+
+	token = strtok(input, " ");
+	if (token)
+		strcpy(cmd, token);
+	else
+		return -1;
+
+	argc = 0;
+	token = strtok(NULL, " ");
+	while (token) {
+		if (argc >= 10) {
+			fprintf(stderr, "Too many args > 10\n");
+			return -1;
+		}
+		strcpy(argv[argc++], token);
+		token = strtok(NULL, " ");
+	}
+	return 0;
+}
+
+static void run_cmd(void)
+{
+	if (strcmp(cmd, "cat") == 0)
+		cat();
+	else if (strcmp(cmd, "cd") == 0)
+		cd();
+	else if (strcmp(cmd, "help") == 0)
+		help();
+	else if (strcmp(cmd, "ls") == 0)
+		ls();
+	else if (strcmp(cmd, "mount") == 0)
+		mount();
+	else if (strcmp(cmd, "overwrite") == 0)
+		overwrite();
+	else if (strcmp(cmd, "pwd") == 0)
+		pwd();
+	else
+		fprintf(stderr, "Unknown command: %s\n", cmd);
+}
+
+void dbg_entrance(void)
+{
+	char input[MAX_BUF + 1];
+	int ret;
+	int c;
+
+	printf("Type help to see a list of commands\n");
+	do {
+		printf("%s ", PROMOTE);
+		ret = scanf("%" xstr(MAX_BUF) "[^\n]s", input);
+		while ((c = getchar()) != '\n' && c != EOF)
+			;
+		if (ret == 0)
+			continue;
+		if (ret != 1 && errno != EINTR) {
+			perror("scanf");
+			continue;
+		}
+		if (strlen(input) == MAX_BUF) {
+			fprintf(stderr, "Too long input > %d\n", MAX_BUF - 1);
+			continue;
+		}
+		if (parse_cmd(input))
+			continue;
+		if (strcmp(cmd, "exit") == 0)
+			break;
+		run_cmd();
+	} while (1);
+}
diff --git a/tools/lkl/lib/dbg_handler.c b/tools/lkl/lib/dbg_handler.c
new file mode 100644
index 000000000000..01d165a5fc1e
--- /dev/null
+++ b/tools/lkl/lib/dbg_handler.c
@@ -0,0 +1,44 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <lkl_host.h>
+
+extern void dbg_entrance(void);
+static int dbg_running;
+
+static void dbg_thread(void *arg)
+{
+	lkl_host_ops.thread_detach();
+	printf("======Enter Debug======\n");
+	dbg_entrance();
+	printf("======Exit Debug======\n");
+	dbg_running = 0;
+}
+
+void dbg_handler(int signum)
+{
+	/* We don't care about the possible race on dbg_running. */
+	if (dbg_running) {
+		fprintf(stderr, "A debug lib is running\n");
+		return;
+	}
+	dbg_running = 1;
+	lkl_host_ops.thread_create(&dbg_thread, NULL);
+}
+
+#ifndef __MINGW32__
+#include <signal.h>
+void lkl_register_dbg_handler(void)
+{
+	struct sigaction sa = { 0 };
+
+	sigemptyset(&sa.sa_mask);
+	sa.sa_handler = dbg_handler;
+	if (sigaction(SIGTSTP, &sa, NULL) == -1)
+		perror("sigaction");
+}
+#else
+void lkl_register_dbg_handler(void)
+{
+	fprintf(stderr, "%s is not implemented.\n", __func__);
+}
+#endif
diff --git a/tools/lkl/lib/endian.h b/tools/lkl/lib/endian.h
new file mode 100644
index 000000000000..aaccfa0edb65
--- /dev/null
+++ b/tools/lkl/lib/endian.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_LIB_ENDIAN_H
+#define _LKL_LIB_ENDIAN_H
+
+#if defined(__FreeBSD__)
+#include <sys/endian.h>
+#elif defined(__ANDROID__)
+#include <sys/endian.h>
+#elif defined(__MINGW32__)
+#include <winsock.h>
+#define le32toh(x) (x)
+#define le16toh(x) (x)
+#define htole32(x) (x)
+#define htole16(x) (x)
+#define le64toh(x) (x)
+#define htobe32(x) htonl(x)
+#define htobe16(x) htons(x)
+#define be32toh(x) ntohl(x)
+#define be16toh(x) ntohs(x)
+#else
+#include <endian.h>
+#endif
+
+#ifndef htonl
+#define htonl(x) htobe32(x)
+#define htons(x) htobe16(x)
+#define ntohl(x) be32toh(x)
+#define ntohs(x) be16toh(x)
+#endif
+
+#endif /* _LKL_LIB_ENDIAN_H */
diff --git a/tools/lkl/lib/jmp_buf.c b/tools/lkl/lib/jmp_buf.c
new file mode 100644
index 000000000000..f6bdd7e4bd83
--- /dev/null
+++ b/tools/lkl/lib/jmp_buf.c
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <setjmp.h>
+#include <lkl_host.h>
+
+void jmp_buf_set(struct lkl_jmp_buf *jmpb, void (*f)(void))
+{
+	if (!setjmp(*((jmp_buf *)jmpb->buf)))
+		f();
+}
+
+void jmp_buf_longjmp(struct lkl_jmp_buf *jmpb, int val)
+{
+	longjmp(*((jmp_buf *)jmpb->buf), val);
+}
diff --git a/tools/lkl/lib/jmp_buf.h b/tools/lkl/lib/jmp_buf.h
new file mode 100644
index 000000000000..8782cbaaf51f
--- /dev/null
+++ b/tools/lkl/lib/jmp_buf.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_LIB_JMP_BUF_H
+#define _LKL_LIB_JMP_BUF_H
+
+void jmp_buf_set(struct lkl_jmp_buf *jmpb, void (*f)(void));
+void jmp_buf_longjmp(struct lkl_jmp_buf *jmpb, int val);
+
+#endif
diff --git a/tools/lkl/lib/utils.c b/tools/lkl/lib/utils.c
new file mode 100644
index 000000000000..7de92bbe5475
--- /dev/null
+++ b/tools/lkl/lib/utils.c
@@ -0,0 +1,266 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdarg.h>
+#include <stdio.h>
+#include <string.h>
+#include <lkl_host.h>
+
+static const char * const lkl_err_strings[] = {
+	"Success",
+	"Operation not permitted",
+	"No such file or directory",
+	"No such process",
+	"Interrupted system call",
+	"I/O error",
+	"No such device or address",
+	"Argument list too long",
+	"Exec format error",
+	"Bad file number",
+	"No child processes",
+	"Try again",
+	"Out of memory",
+	"Permission denied",
+	"Bad address",
+	"Block device required",
+	"Device or resource busy",
+	"File exists",
+	"Cross-device link",
+	"No such device",
+	"Not a directory",
+	"Is a directory",
+	"Invalid argument",
+	"File table overflow",
+	"Too many open files",
+	"Not a typewriter",
+	"Text file busy",
+	"File too large",
+	"No space left on device",
+	"Illegal seek",
+	"Read-only file system",
+	"Too many links",
+	"Broken pipe",
+	"Math argument out of domain of func",
+	"Math result not representable",
+	"Resource deadlock would occur",
+	"File name too long",
+	"No record locks available",
+	"Invalid system call number",
+	"Directory not empty",
+	"Too many symbolic links encountered",
+	"Bad error code", /* EWOULDBLOCK is EAGAIN */
+	"No message of desired type",
+	"Identifier removed",
+	"Channel number out of range",
+	"Level 2 not synchronized",
+	"Level 3 halted",
+	"Level 3 reset",
+	"Link number out of range",
+	"Protocol driver not attached",
+	"No CSI structure available",
+	"Level 2 halted",
+	"Invalid exchange",
+	"Invalid request descriptor",
+	"Exchange full",
+	"No anode",
+	"Invalid request code",
+	"Invalid slot",
+	"Bad error code", /* EDEADLOCK is EDEADLK */
+	"Bad font file format",
+	"Device not a stream",
+	"No data available",
+	"Timer expired",
+	"Out of streams resources",
+	"Machine is not on the network",
+	"Package not installed",
+	"Object is remote",
+	"Link has been severed",
+	"Advertise error",
+	"Srmount error",
+	"Communication error on send",
+	"Protocol error",
+	"Multihop attempted",
+	"RFS specific error",
+	"Not a data message",
+	"Value too large for defined data type",
+	"Name not unique on network",
+	"File descriptor in bad state",
+	"Remote address changed",
+	"Can not access a needed shared library",
+	"Accessing a corrupted shared library",
+	".lib section in a.out corrupted",
+	"Attempting to link in too many shared libraries",
+	"Cannot exec a shared library directly",
+	"Illegal byte sequence",
+	"Interrupted system call should be restarted",
+	"Streams pipe error",
+	"Too many users",
+	"Socket operation on non-socket",
+	"Destination address required",
+	"Message too long",
+	"Protocol wrong type for socket",
+	"Protocol not available",
+	"Protocol not supported",
+	"Socket type not supported",
+	"Operation not supported on transport endpoint",
+	"Protocol family not supported",
+	"Address family not supported by protocol",
+	"Address already in use",
+	"Cannot assign requested address",
+	"Network is down",
+	"Network is unreachable",
+	"Network dropped connection because of reset",
+	"Software caused connection abort",
+	"Connection reset by peer",
+	"No buffer space available",
+	"Transport endpoint is already connected",
+	"Transport endpoint is not connected",
+	"Cannot send after transport endpoint shutdown",
+	"Too many references: cannot splice",
+	"Connection timed out",
+	"Connection refused",
+	"Host is down",
+	"No route to host",
+	"Operation already in progress",
+	"Operation now in progress",
+	"Stale file handle",
+	"Structure needs cleaning",
+	"Not a XENIX named type file",
+	"No XENIX semaphores available",
+	"Is a named type file",
+	"Remote I/O error",
+	"Quota exceeded",
+	"No medium found",
+	"Wrong medium type",
+	"Operation Canceled",
+	"Required key not available",
+	"Key has expired",
+	"Key has been revoked",
+	"Key was rejected by service",
+	"Owner died",
+	"State not recoverable",
+	"Operation not possible due to RF-kill",
+	"Memory page has hardware error",
+};
+
+const char *lkl_strerror(int err)
+{
+	if (err < 0)
+		err = -err;
+
+	if ((size_t)err >= sizeof(lkl_err_strings) / sizeof(const char *))
+		return "Bad error code";
+
+	return lkl_err_strings[err];
+}
+
+void lkl_perror(char *msg, int err)
+{
+	const char *err_msg = lkl_strerror(err);
+	/* We need to use 'real' printf because lkl_host_ops.print can
+	 * be turned off when debugging is off.
+	 */
+	lkl_printf("%s: %s\n", msg, err_msg);
+}
+
+static int lkl_vprintf(const char *fmt, va_list args)
+{
+	int n;
+	char *buffer;
+	va_list copy;
+
+	if (!lkl_host_ops.print)
+		return 0;
+
+	va_copy(copy, args);
+	n = vsnprintf(NULL, 0, fmt, copy);
+	va_end(copy);
+
+	buffer = lkl_host_ops.mem_alloc(n + 1);
+	if (!buffer)
+		return -1;
+
+	vsnprintf(buffer, n + 1, fmt, args);
+
+	lkl_host_ops.print(buffer, n);
+	lkl_host_ops.mem_free(buffer);
+
+	return n;
+}
+
+int lkl_printf(const char *fmt, ...)
+{
+	int n;
+	va_list args;
+
+	va_start(args, fmt);
+	n = lkl_vprintf(fmt, args);
+	va_end(args);
+
+	return n;
+}
+
+void lkl_bug(const char *fmt, ...)
+{
+	va_list args;
+
+	va_start(args, fmt);
+	lkl_vprintf(fmt, args);
+	va_end(args);
+
+	lkl_host_ops.panic();
+}
+#ifndef __arch_um__
+int lkl_sysctl(const char *path, const char *value)
+{
+	int ret;
+	int fd;
+	char *delim, *p;
+	char full_path[256];
+
+	lkl_mount_fs("proc");
+
+	snprintf(full_path, sizeof(full_path), "/proc/sys/%s", path);
+	p = full_path;
+	while ((delim = strstr(p, "."))) {
+		*delim = '/';
+		p = delim + 1;
+	}
+
+	fd = lkl_sys_open(full_path, LKL_O_WRONLY | LKL_O_CREAT, 0);
+	if (fd < 0) {
+		lkl_printf("lkl_sys_open %s: %s\n",
+			   full_path, lkl_strerror(fd));
+		return -1;
+	}
+	ret = lkl_sys_write(fd, value, strlen(value));
+	if (ret < 0) {
+		lkl_printf("lkl_sys_write %s: %s\n",
+			full_path, lkl_strerror(ret));
+	}
+
+	lkl_sys_close(fd);
+
+	return 0;
+}
+
+/* Configure sysctl parameters as the form of "key=value;key=value;..." */
+void lkl_sysctl_parse_write(const char *sysctls)
+{
+	char *saveptr = NULL, *token = NULL;
+	char *key = NULL, *value = NULL;
+	char strings[256];
+	int ret = 0;
+
+	snprintf(strings, sizeof(strings), "%s", sysctls);
+	for (token = strtok_r(strings, ";", &saveptr); token;
+	     token = strtok_r(NULL, ";", &saveptr)) {
+		key = strtok(token, "=");
+		value = strtok(NULL, "=");
+		ret = lkl_sysctl(key, value);
+		if (ret) {
+			lkl_printf("Failed to configure sysctl entries: %s\n",
+				   lkl_strerror(ret));
+			return;
+		}
+	}
+}
+#endif
-- 
2.20.1 (Apple Git-117)




* [RFC v2 16/37] lkl tools: host lib: memory mapped I/O helpers
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch

From: Octavian Purdila <tavi.purdila@gmail.com>

This patch adds helpers for implementing the memory mapped I/O host
operations, for use by code that implements host devices. Generic
lkl_ioremap and lkl_iomem_access host operations are provided that
allow multiplexing multiple memory mapped I/O regions.

The host device code can create a new memory mapped I/O region with
register_iomem(). Read and write access functions need to be provided
by the caller.
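
As a rough illustration of the multiplexing scheme (a standalone sketch,
not the code in this patch): the upper bits of a fake I/O address select
a region slot and the lower 24 bits are the offset handed to the region's
read/write callbacks. All demo_* and toy_* names below are hypothetical
and only mirror the shape of register_iomem() and lkl_iomem_access().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define OFFSET_BITS	24	/* low bits: offset inside a region */
#define MAX_REGIONS	256	/* high bits: region slot number */

/* Hypothetical stand-in for struct lkl_iomem_ops. */
struct demo_iomem_ops {
	int (*read)(void *data, int offset, void *res, int size);
	int (*write)(void *data, int offset, void *value, int size);
};

static struct demo_region {
	void *data;
	int size;
	const struct demo_iomem_ops *ops;
} demo_regions[MAX_REGIONS];

/* Claim a free slot; slot 0 stays unused so NULL is never a valid base. */
static void *demo_register(void *data, int size,
			   const struct demo_iomem_ops *ops)
{
	uintptr_t i;

	if (size <= 0 || size > (1 << OFFSET_BITS) - 1)
		return NULL;

	for (i = 1; i < MAX_REGIONS; i++) {
		if (!demo_regions[i].ops) {
			demo_regions[i].data = data;
			demo_regions[i].size = size;
			demo_regions[i].ops = ops;
			return (void *)(i << OFFSET_BITS);
		}
	}
	return NULL;
}

/* Split the address into slot and offset, then call the slot's callback. */
static int demo_access(const volatile void *addr, void *res, int size,
		       int write)
{
	uintptr_t index = (uintptr_t)addr >> OFFSET_BITS;
	int offset = (int)((uintptr_t)addr & ((1u << OFFSET_BITS) - 1));
	struct demo_region *r;

	if (index >= MAX_REGIONS)
		return -1;
	r = &demo_regions[index];
	if (!r->ops || size < 0 || size > r->size - offset)
		return -1;

	return write ? r->ops->write(r->data, offset, res, size)
		     : r->ops->read(r->data, offset, res, size);
}

/* Toy device: its "registers" are just a byte array. */
static int toy_read(void *data, int offset, void *res, int size)
{
	memcpy(res, (char *)data + offset, size);
	return 0;
}

static int toy_write(void *data, int offset, void *value, int size)
{
	memcpy((char *)data + offset, value, size);
	return 0;
}

static const struct demo_iomem_ops toy_ops = {
	.read = toy_read,
	.write = toy_write,
};

/* Register a toy device, write a value at offset 4 and read it back. */
static uint32_t demo_round_trip(uint32_t value)
{
	static unsigned char regs[16];
	void *base = demo_register(regs, sizeof(regs), &toy_ops);
	void *reg4;
	uint32_t out = 0;

	assert(base != NULL);
	reg4 = (void *)((uintptr_t)base + 4);
	demo_access(reg4, &value, sizeof(value), 1);
	demo_access(reg4, &out, sizeof(out), 0);
	return out;
}
```

In this sketch a caller would invoke demo_round_trip() from its own
main(); an access to a slot that was never registered, or one that runs
past the end of a region, is rejected with -1 before any callback runs.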

Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/lib/Build   |  1 +
 tools/lkl/lib/iomem.c | 88 +++++++++++++++++++++++++++++++++++++++++++
 tools/lkl/lib/iomem.h | 15 ++++++++
 3 files changed, 104 insertions(+)
 create mode 100644 tools/lkl/lib/iomem.c
 create mode 100644 tools/lkl/lib/iomem.h

diff --git a/tools/lkl/lib/Build b/tools/lkl/lib/Build
index 658bfa865b9c..8d01982638f8 100644
--- a/tools/lkl/lib/Build
+++ b/tools/lkl/lib/Build
@@ -1,5 +1,6 @@
 CFLAGS_config.o += -I$(srctree)/tools/perf/pmu-events
 
+liblkl-y += iomem.o
 liblkl-y += jmp_buf.o
 liblkl-y += utils.o
 liblkl-y += dbg.o
diff --git a/tools/lkl/lib/iomem.c b/tools/lkl/lib/iomem.c
new file mode 100644
index 000000000000..2301fe4e5ad5
--- /dev/null
+++ b/tools/lkl/lib/iomem.c
@@ -0,0 +1,88 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <string.h>
+#include <stdint.h>
+#include <lkl_host.h>
+
+#include "iomem.h"
+
+#define IOMEM_OFFSET_BITS		24
+#define MAX_IOMEM_REGIONS		256
+
+#define IOMEM_ADDR_TO_INDEX(addr) \
+	(((uintptr_t)(addr)) >> IOMEM_OFFSET_BITS)
+#define IOMEM_ADDR_TO_OFFSET(addr) \
+	(((uintptr_t)(addr)) & ((1 << IOMEM_OFFSET_BITS) - 1))
+#define IOMEM_INDEX_TO_ADDR(i) \
+	((void *)((uintptr_t)(i) << IOMEM_OFFSET_BITS))
+
+static struct iomem_region {
+	void *data;
+	int size;
+	const struct lkl_iomem_ops *ops;
+} iomem_regions[MAX_IOMEM_REGIONS];
+
+void *register_iomem(void *data, int size, const struct lkl_iomem_ops *ops)
+{
+	int i;
+
+	if (size > (1 << IOMEM_OFFSET_BITS) - 1)
+		return NULL;
+
+	for (i = 1; i < MAX_IOMEM_REGIONS; i++)
+		if (!iomem_regions[i].ops)
+			break;
+
+	if (i >= MAX_IOMEM_REGIONS)
+		return NULL;
+
+	iomem_regions[i].data = data;
+	iomem_regions[i].size = size;
+	iomem_regions[i].ops = ops;
+	return IOMEM_INDEX_TO_ADDR(i);
+}
+
+void unregister_iomem(void *base)
+{
+	unsigned int index = IOMEM_ADDR_TO_INDEX(base);
+
+	if (index >= MAX_IOMEM_REGIONS) {
+		lkl_printf("%s: invalid iomem_addr %p\n", __func__, base);
+		return;
+	}
+
+	iomem_regions[index].size = 0;
+	iomem_regions[index].ops = NULL;
+}
+
+void *lkl_ioremap(long addr, int size)
+{
+	int index = IOMEM_ADDR_TO_INDEX(addr);
+	struct iomem_region *iomem = &iomem_regions[index];
+
+	if (index >= MAX_IOMEM_REGIONS)
+		return NULL;
+
+	if (iomem->ops && size <= iomem->size)
+		return IOMEM_INDEX_TO_ADDR(index);
+
+	return NULL;
+}
+
+int lkl_iomem_access(const volatile void *addr, void *res, int size, int write)
+{
+	int index = IOMEM_ADDR_TO_INDEX(addr);
+	struct iomem_region *iomem = &iomem_regions[index];
+	int offset = IOMEM_ADDR_TO_OFFSET(addr);
+	int ret;
+
+	if (index >= MAX_IOMEM_REGIONS || !iomem_regions[index].ops ||
+	    offset + size > iomem_regions[index].size)
+		return -1;
+
+	if (write)
+		ret = iomem->ops->write(iomem->data, offset, res, size);
+	else
+		ret = iomem->ops->read(iomem->data, offset, res, size);
+
+	return ret;
+}
diff --git a/tools/lkl/lib/iomem.h b/tools/lkl/lib/iomem.h
new file mode 100644
index 000000000000..0ad80ccc2626
--- /dev/null
+++ b/tools/lkl/lib/iomem.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_LIB_IOMEM_H
+#define _LKL_LIB_IOMEM_H
+
+struct lkl_iomem_ops {
+	int (*read)(void *data, int offset, void *res, int size);
+	int (*write)(void *data, int offset, void *value, int size);
+};
+
+void *register_iomem(void *data, int size, const struct lkl_iomem_ops *ops);
+void unregister_iomem(void *iomem_base);
+void *lkl_ioremap(long addr, int size);
+int lkl_iomem_access(const volatile void *addr, void *res, int size, int write);
+
+#endif /* _LKL_LIB_IOMEM_H */
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread
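
For readers following the address scheme in iomem.c above: the returned
"address" is not a real pointer but an encoded (region index, offset) pair,
with the low 24 bits holding the offset. The following is a minimal standalone
sketch of that encoding, not the patch's code. The helper names are
hypothetical, and the remark about index 0 is an inference from
register_iomem() starting its search at 1.

```c
#include <stdint.h>

/* Same split as the patch: low 24 bits are the offset, the rest the index. */
#define OFFSET_BITS	24
#define MAX_REGIONS	256

/* Hypothetical helper names, used here for illustration only. */
static inline unsigned int addr_to_index(uintptr_t addr)
{
	return (unsigned int)(addr >> OFFSET_BITS);
}

static inline unsigned int addr_to_offset(uintptr_t addr)
{
	return (unsigned int)(addr & (((uintptr_t)1 << OFFSET_BITS) - 1));
}

/* Widen before shifting so that indices >= 128 do not overflow an int. */
static inline uintptr_t index_to_addr(unsigned int index)
{
	return (uintptr_t)index << OFFSET_BITS;
}

/*
 * Index 0 is treated as invalid here: it would encode to address 0, which
 * callers could not tell apart from a NULL return (assumption).
 */
static inline int index_is_valid(uintptr_t addr)
{
	unsigned int index = addr_to_index(addr);

	return index >= 1 && index < MAX_REGIONS;
}
```

With this layout, region 1 starts at 0x1000000, so an access to 0x1000070
resolves to region 1, offset 0x70.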

* [RFC v2 16/37] lkl tools: host lib: memory mapped I/O helpers
@ 2019-11-08  5:02     ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, linux-kernel-library, linux-arch, Akira Moroo

From: Octavian Purdila <tavi.purdila@gmail.com>

This patch adds helpers for implementing the memory mapped I/O host
operations that can be used by code that implements host
devices. Generic host operations for lkl_ioremap and lkl_iomem_access
are provided that allow multiplexing multiple memory mapped I/O
regions.

The host device code can create a new memory mapped I/O region with
register_iomem(). Read and write access functions need to be provided
by the caller.

Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/lib/Build   |  1 +
 tools/lkl/lib/iomem.c | 88 +++++++++++++++++++++++++++++++++++++++++++
 tools/lkl/lib/iomem.h | 15 ++++++++
 3 files changed, 104 insertions(+)
 create mode 100644 tools/lkl/lib/iomem.c
 create mode 100644 tools/lkl/lib/iomem.h

diff --git a/tools/lkl/lib/Build b/tools/lkl/lib/Build
index 658bfa865b9c..8d01982638f8 100644
--- a/tools/lkl/lib/Build
+++ b/tools/lkl/lib/Build
@@ -1,5 +1,6 @@
 CFLAGS_config.o += -I$(srctree)/tools/perf/pmu-events
 
+liblkl-y += iomem.o
 liblkl-y += jmp_buf.o
 liblkl-y += utils.o
 liblkl-y += dbg.o
diff --git a/tools/lkl/lib/iomem.c b/tools/lkl/lib/iomem.c
new file mode 100644
index 000000000000..2301fe4e5ad5
--- /dev/null
+++ b/tools/lkl/lib/iomem.c
@@ -0,0 +1,88 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <string.h>
+#include <stdint.h>
+#include <lkl_host.h>
+
+#include "iomem.h"
+
+#define IOMEM_OFFSET_BITS		24
+#define MAX_IOMEM_REGIONS		256
+
+#define IOMEM_ADDR_TO_INDEX(addr) \
+	(((uintptr_t)addr) >> IOMEM_OFFSET_BITS)
+#define IOMEM_ADDR_TO_OFFSET(addr) \
+	(((uintptr_t)addr) & ((1 << IOMEM_OFFSET_BITS) - 1))
+#define IOMEM_INDEX_TO_ADDR(i) \
+	((void *)((uintptr_t)(i) << IOMEM_OFFSET_BITS))
+
+static struct iomem_region {
+	void *data;
+	int size;
+	const struct lkl_iomem_ops *ops;
+} iomem_regions[MAX_IOMEM_REGIONS];
+
+void *register_iomem(void *data, int size, const struct lkl_iomem_ops *ops)
+{
+	int i;
+
+	if (size > (1 << IOMEM_OFFSET_BITS) - 1)
+		return NULL;
+
+	for (i = 1; i < MAX_IOMEM_REGIONS; i++)
+		if (!iomem_regions[i].ops)
+			break;
+
+	if (i >= MAX_IOMEM_REGIONS)
+		return NULL;
+
+	iomem_regions[i].data = data;
+	iomem_regions[i].size = size;
+	iomem_regions[i].ops = ops;
+	return IOMEM_INDEX_TO_ADDR(i);
+}
+
+void unregister_iomem(void *base)
+{
+	unsigned int index = IOMEM_ADDR_TO_INDEX(base);
+
+	if (index >= MAX_IOMEM_REGIONS) {
+		lkl_printf("%s: invalid iomem_addr %p\n", __func__, base);
+		return;
+	}
+
+	iomem_regions[index].size = 0;
+	iomem_regions[index].ops = NULL;
+}
+
+void *lkl_ioremap(long addr, int size)
+{
+	int index = IOMEM_ADDR_TO_INDEX(addr);
+	struct iomem_region *iomem = &iomem_regions[index];
+
+	if (index >= MAX_IOMEM_REGIONS)
+		return NULL;
+
+	if (iomem->ops && size <= iomem->size)
+		return IOMEM_INDEX_TO_ADDR(index);
+
+	return NULL;
+}
+
+int lkl_iomem_access(const volatile void *addr, void *res, int size, int write)
+{
+	int index = IOMEM_ADDR_TO_INDEX(addr);
+	struct iomem_region *iomem = &iomem_regions[index];
+	int offset = IOMEM_ADDR_TO_OFFSET(addr);
+	int ret;
+
+	if (index >= MAX_IOMEM_REGIONS || !iomem_regions[index].ops ||
+	    offset + size > iomem_regions[index].size)
+		return -1;
+
+	if (write)
+		ret = iomem->ops->write(iomem->data, offset, res, size);
+	else
+		ret = iomem->ops->read(iomem->data, offset, res, size);
+
+	return ret;
+}
diff --git a/tools/lkl/lib/iomem.h b/tools/lkl/lib/iomem.h
new file mode 100644
index 000000000000..0ad80ccc2626
--- /dev/null
+++ b/tools/lkl/lib/iomem.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_LIB_IOMEM_H
+#define _LKL_LIB_IOMEM_H
+
+struct lkl_iomem_ops {
+	int (*read)(void *data, int offset, void *res, int size);
+	int (*write)(void *data, int offset, void *value, int size);
+};
+
+void *register_iomem(void *data, int size, const struct lkl_iomem_ops *ops);
+void unregister_iomem(void *iomem_base);
+void *lkl_ioremap(long addr, int size);
+int lkl_iomem_access(const volatile void *addr, void *res, int size, int write);
+
+#endif /* _LKL_LIB_IOMEM_H */
-- 
2.20.1 (Apple Git-117)



^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 17/37] lkl tools: host lib: virtio devices
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Conrad Meyer, Hajime Tazaki, Michael Zimmermann, Patrick Collins,
	Yuan Liu

From: Octavian Purdila <tavi.purdila@gmail.com>

Add helpers for implementing host virtio devices. They use the memory
mapped I/O helpers to interact with the Linux MMIO virtio transport
driver and offer support for setting up and adding a new virtio device,
dispatching requests from the incoming queues, and completing
requests.

All added virtio devices are stored in lkl_virtio_devs as strings, per
the Linux MMIO virtio transport driver command line specification.
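
As an illustration of that string format: the virtio-mmio driver accepts
entries of the form virtio_mmio.device=<size>@<baseaddr>:<irq> on the kernel
command line. The sketch below builds one such entry the way
virtio_dev_setup() does. The helper name append_virtio_mmio_dev() is
hypothetical and not part of this patch.

```c
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

/*
 * Hypothetical helper: write one " virtio_mmio.device=<size>@<base>:<irq>"
 * entry into buf. Returns the number of bytes written (excluding the
 * terminating NUL), or -1 if the entry does not fit in avail bytes.
 */
static int append_virtio_mmio_dev(char *buf, size_t avail, int mmio_size,
				  uintptr_t base, int irq)
{
	int n = snprintf(buf, avail, " virtio_mmio.device=%d@0x%" PRIxPTR ":%d",
			 mmio_size, base, irq);

	/* snprintf() reports the untruncated length, so n >= avail means
	 * the output was cut short. */
	if (n < 0 || (size_t)n >= avail)
		return -1;
	return n;
}
```

A device with a 0x108 byte register window at base 0x1000000 and irq 1 would
then appear as " virtio_mmio.device=264@0x1000000:1".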

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/include/lkl_host.h |   1 +
 tools/lkl/lib/Build          |   1 +
 tools/lkl/lib/virtio.c       | 631 +++++++++++++++++++++++++++++++++++
 tools/lkl/lib/virtio.h       |  93 ++++++
 4 files changed, 726 insertions(+)
 create mode 100644 tools/lkl/lib/virtio.c
 create mode 100644 tools/lkl/lib/virtio.h

diff --git a/tools/lkl/include/lkl_host.h b/tools/lkl/include/lkl_host.h
index 85e80eb4ad0d..81239e2b556f 100644
--- a/tools/lkl/include/lkl_host.h
+++ b/tools/lkl/include/lkl_host.h
@@ -18,6 +18,7 @@ extern struct lkl_host_operations lkl_host_ops;
  */
 int lkl_printf(const char *fmt, ...);
 
+extern char lkl_virtio_devs[4096];
 
 #ifdef __cplusplus
 }
diff --git a/tools/lkl/lib/Build b/tools/lkl/lib/Build
index 8d01982638f8..5fd1843b51d1 100644
--- a/tools/lkl/lib/Build
+++ b/tools/lkl/lib/Build
@@ -3,6 +3,7 @@ CFLAGS_config.o += -I$(srctree)/tools/perf/pmu-events
 liblkl-y += iomem.o
 liblkl-y += jmp_buf.o
 liblkl-y += utils.o
+liblkl-y += virtio.o
 liblkl-y += dbg.o
 liblkl-y += dbg_handler.o
 liblkl-y += ../../perf/pmu-events/jsmn.o
diff --git a/tools/lkl/lib/virtio.c b/tools/lkl/lib/virtio.c
new file mode 100644
index 000000000000..c5247665482d
--- /dev/null
+++ b/tools/lkl/lib/virtio.c
@@ -0,0 +1,631 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <string.h>
+#include <stdio.h>
+#include <stdbool.h>
+#include <inttypes.h>
+#include <lkl_host.h>
+#include <lkl/linux/virtio_ring.h>
+#include "iomem.h"
+#include "virtio.h"
+#include "endian.h"
+
+#define VIRTIO_DEV_MAGIC		0x74726976
+#define VIRTIO_DEV_VERSION		2
+
+#define VIRTIO_MMIO_MAGIC_VALUE		0x000
+#define VIRTIO_MMIO_VERSION		0x004
+#define VIRTIO_MMIO_DEVICE_ID		0x008
+#define VIRTIO_MMIO_VENDOR_ID		0x00c
+#define VIRTIO_MMIO_DEVICE_FEATURES	0x010
+#define VIRTIO_MMIO_DEVICE_FEATURES_SEL 0x014
+#define VIRTIO_MMIO_DRIVER_FEATURES	0x020
+#define VIRTIO_MMIO_DRIVER_FEATURES_SEL 0x024
+#define VIRTIO_MMIO_QUEUE_SEL		0x030
+#define VIRTIO_MMIO_QUEUE_NUM_MAX	0x034
+#define VIRTIO_MMIO_QUEUE_NUM		0x038
+#define VIRTIO_MMIO_QUEUE_READY		0x044
+#define VIRTIO_MMIO_QUEUE_NOTIFY	0x050
+#define VIRTIO_MMIO_INTERRUPT_STATUS	0x060
+#define VIRTIO_MMIO_INTERRUPT_ACK	0x064
+#define VIRTIO_MMIO_STATUS		0x070
+#define VIRTIO_MMIO_QUEUE_DESC_LOW	0x080
+#define VIRTIO_MMIO_QUEUE_DESC_HIGH	0x084
+#define VIRTIO_MMIO_QUEUE_AVAIL_LOW	0x090
+#define VIRTIO_MMIO_QUEUE_AVAIL_HIGH	0x094
+#define VIRTIO_MMIO_QUEUE_USED_LOW	0x0a0
+#define VIRTIO_MMIO_QUEUE_USED_HIGH	0x0a4
+#define VIRTIO_MMIO_CONFIG_GENERATION	0x0fc
+#define VIRTIO_MMIO_CONFIG		0x100
+#define VIRTIO_MMIO_INT_VRING		0x01
+#define VIRTIO_MMIO_INT_CONFIG		0x02
+
+#define BIT(x) (1ULL << (x))
+
+#define virtio_panic(msg, ...) do {					\
+		lkl_printf("LKL virtio error: " msg, ##__VA_ARGS__);	\
+		lkl_host_ops.panic();					\
+	} while (0)
+
+struct virtio_queue {
+	uint32_t num_max;
+	uint32_t num;
+	uint32_t ready;
+	uint32_t max_merge_len;
+
+	struct lkl_vring_desc *desc;
+	struct lkl_vring_avail *avail;
+	struct lkl_vring_used *used;
+	uint16_t last_avail_idx;
+	uint16_t last_used_idx_signaled;
+};
+
+struct _virtio_req {
+	struct virtio_req req;
+	struct virtio_dev *dev;
+	struct virtio_queue *q;
+	uint16_t idx;
+};
+
+
+static inline uint16_t virtio_get_used_event(struct virtio_queue *q)
+{
+	return q->avail->ring[q->num];
+}
+
+static inline void virtio_set_avail_event(struct virtio_queue *q, uint16_t val)
+{
+	*((uint16_t *)&q->used->ring[q->num]) = val;
+}
+
+static inline void virtio_deliver_irq(struct virtio_dev *dev)
+{
+	dev->int_status |= VIRTIO_MMIO_INT_VRING;
+	/* Make sure all memory writes before are visible to the driver. */
+	__sync_synchronize();
+	lkl_trigger_irq(dev->irq);
+}
+
+static inline uint16_t virtio_get_used_idx(struct virtio_queue *q)
+{
+	return le16toh(q->used->idx);
+}
+
+static inline void virtio_add_used(struct virtio_queue *q, uint16_t used_idx,
+				   uint16_t avail_idx, uint16_t len)
+{
+	uint16_t desc_idx = q->avail->ring[avail_idx & (q->num - 1)];
+
+	used_idx = used_idx & (q->num - 1);
+	q->used->ring[used_idx].id = desc_idx;
+	q->used->ring[used_idx].len = htole16(len);
+}
+
+/*
+ * Make sure all memory writes before are visible to the driver before updating
+ * the idx.  We need it here even though we already have one in
+ * virtio_deliver_irq(), because there might already be a driver thread reading
+ * the idx and dequeuing used buffers.
+ */
+static inline void virtio_sync_used_idx(struct virtio_queue *q, uint16_t idx)
+{
+	__sync_synchronize();
+	q->used->idx = htole16(idx);
+}
+
+#define min_len(a, b) ((a) < (b) ? (a) : (b))
+
+void virtio_req_complete(struct virtio_req *req, uint32_t len)
+{
+	int send_irq = 0;
+	struct _virtio_req *_req = container_of(req, struct _virtio_req, req);
+	struct virtio_queue *q = _req->q;
+	uint16_t avail_idx = _req->idx;
+	uint16_t used_idx = virtio_get_used_idx(_req->q);
+	int i;
+
+	/*
+	 * We've potentially used up multiple (non-chained) descriptors and have
+	 * to create one "used" entry for each descriptor we've consumed.
+	 */
+	for (i = 0; i < req->buf_count; i++) {
+		uint16_t used_len;
+
+		if (!q->max_merge_len)
+			used_len = len;
+		else
+			used_len = min_len(len,  req->buf[i].iov_len);
+
+		virtio_add_used(q, used_idx++, avail_idx++, used_len);
+
+		len -= used_len;
+		if (!len)
+			break;
+	}
+	virtio_sync_used_idx(q, used_idx);
+	q->last_avail_idx = avail_idx;
+
+	/*
+	 * Triggers the irq whenever there is no available buffer.
+	 */
+	if (q->last_avail_idx == le16toh(q->avail->idx))
+		send_irq = 1;
+
+	/*
+	 * There are two rings: q->avail and q->used for each of the rx and tx
+	 * queues that are used to pass buffers between kernel driver and the
+	 * virtio device implementation.
+	 *
+	 * The kernel maintains the first one and appends buffers to it. In the
+	 * rx queue these are empty buffers offered to store received packets;
+	 * in the tx queue they hold packets to transmit. The kernel notifies
+	 * the device by mmio write (see VIRTIO_MMIO_QUEUE_NOTIFY below).
+	 *
+	 * The virtio device (here in this file) maintains q->used and appends
+	 * each buffer to it after consuming it from q->avail.
+	 *
+	 * The device needs to notify the driver by triggering an irq here.
+	 * LKL_VIRTIO_RING_F_EVENT_IDX is enabled in this implementation so the
+	 * kernel can set virtio_get_used_event(q) to tell the device to "only
+	 * trigger the irq when this item in q->used ring is populated."
+	 *
+	 * Because the driver and the device run in two different threads,
+	 * q->used->idx may already have advanced past the value the driver
+	 * writes to virtio_get_used_event(q). So we need to trigger the irq
+	 * when virtio_get_used_event(q) < q->used->idx.
+	 *
+	 * To avoid unnecessary irqs for each packet after
+	 * virtio_get_used_event(q) < q->used->idx, last_used_idx_signaled is
+	 * stored and the irq is only triggered if
+	 * last_used_idx_signaled <= virtio_get_used_event(q) < q->used->idx
+	 *
+	 * This is what lkl_vring_need_event() checks, and it even covers the
+	 * case when those numbers wrap around.
+	 */
+	if (send_irq || lkl_vring_need_event(le16toh(virtio_get_used_event(q)),
+					     virtio_get_used_idx(q),
+					     q->last_used_idx_signaled)) {
+		q->last_used_idx_signaled = virtio_get_used_idx(q);
+		virtio_deliver_irq(_req->dev);
+	}
+}
+
+/*
+ * Grab the vring_desc from the queue at the appropriate index in the
+ * queue's circular buffer, converting from little-endian to
+ * the host's endianness.
+ */
+static inline
+struct lkl_vring_desc *vring_desc_at_le_idx(struct virtio_queue *q,
+					    __lkl__virtio16 le_idx)
+{
+	return &(q->desc[le16toh(le_idx) & (q->num - 1)]);
+}
+
+static inline
+struct lkl_vring_desc *vring_desc_at_avail_idx(struct virtio_queue *q,
+					       uint16_t idx)
+{
+	uint16_t desc_idx = q->avail->ring[idx & (q->num - 1)];
+
+	return vring_desc_at_le_idx(q, desc_idx);
+}
+
+/* Initialize buf to hold the same info as the vring_desc */
+static void add_dev_buf_from_vring_desc(struct virtio_req *req,
+					struct lkl_vring_desc *vring_desc)
+{
+	struct iovec *buf = &req->buf[req->buf_count++];
+
+	buf->iov_base = (void *)(uintptr_t)le64toh(vring_desc->addr);
+	buf->iov_len = le32toh(vring_desc->len);
+
+	if (!(buf->iov_base && buf->iov_len))
+		virtio_panic("bad vring_desc: %p %d\n",
+			     buf->iov_base, buf->iov_len);
+
+	req->total_len += buf->iov_len;
+}
+
+static struct lkl_vring_desc *get_next_desc(struct virtio_queue *q,
+					    struct lkl_vring_desc *desc,
+					    uint16_t *idx)
+{
+	uint16_t desc_idx;
+
+	if (q->max_merge_len) {
+		if (++(*idx) == le16toh(q->avail->idx))
+			return NULL;
+		desc_idx = q->avail->ring[*idx & (q->num - 1)];
+		return vring_desc_at_le_idx(q, desc_idx);
+	}
+
+	if (!(le16toh(desc->flags) & LKL_VRING_DESC_F_NEXT))
+		return NULL;
+	return vring_desc_at_le_idx(q, desc->next);
+}
+
+/*
+ * Below there are two distinctly different (per packet) buffer allocation
+ * schemes for us to deal with:
+ *
+ * 1. One or more descriptors chained through "next" as indicated by the
+ *    LKL_VRING_DESC_F_NEXT flag,
+ * 2. One or more descriptors from the ring sequentially, as many as are
+ *    available and needed. This is the RX only "mergeable_rx_bufs" mode.
+ *    The mode is entered when the VIRTIO_NET_F_MRG_RXBUF device feature
+ *    is enabled.
+ */
+static int virtio_process_one(struct virtio_dev *dev, int qidx)
+{
+	struct virtio_queue *q = &dev->queue[qidx];
+	uint16_t idx = q->last_avail_idx;
+	struct _virtio_req _req = {
+		.dev = dev,
+		.q = q,
+		.idx = idx,
+	};
+	struct virtio_req *req = &_req.req;
+	struct lkl_vring_desc *desc = vring_desc_at_avail_idx(q, _req.idx);
+
+	do {
+		add_dev_buf_from_vring_desc(req, desc);
+		if (q->max_merge_len && req->total_len > q->max_merge_len)
+			break;
+		desc = get_next_desc(q, desc, &idx);
+	} while (desc && req->buf_count < VIRTIO_REQ_MAX_BUFS);
+
+	if (desc && le16toh(desc->flags) & LKL_VRING_DESC_F_NEXT)
+		virtio_panic("too many chained bufs");
+
+	return dev->ops->enqueue(dev, qidx, req);
+}
+
+/* NB: we can enter this function two different ways in the case of
+ * netdevs --- either through a tx/rx thread poll (which the LKL
+ * scheduler knows nothing about) or through virtio_write called
+ * inside an interrupt handler, so to be safe, it's not enough to
+ * synchronize only the tx/rx polling threads.
+ *
+ * At the moment, it seems like only netdevs require the
+ * synchronization we do here (i.e. locking around operations on a
+ * particular virtqueue, with dev->ops->acquire_queue), since they
+ * have these two different entry points, one of which isn't managed
+ * by the LKL scheduler. So only devs corresponding to netdevs will
+ * have non-NULL acquire/release_queue.
+ *
+ * In the future, this may change. If you see errors thrown in virtio
+ * driver code by block/console devices, you should be suspicious of
+ * the synchronization going on here.
+ */
+void virtio_process_queue(struct virtio_dev *dev, uint32_t qidx)
+{
+	struct virtio_queue *q = &dev->queue[qidx];
+
+	if (!q->ready)
+		return;
+
+	if (dev->ops->acquire_queue)
+		dev->ops->acquire_queue(dev, qidx);
+
+	while (q->last_avail_idx != le16toh(q->avail->idx)) {
+		/*
+		 * Make sure the following loads happen after loading
+		 * q->avail->idx.
+		 */
+		__sync_synchronize();
+		if (virtio_process_one(dev, qidx) < 0)
+			break;
+		if (q->last_avail_idx == le16toh(q->avail->idx))
+			virtio_set_avail_event(q, q->avail->idx);
+	}
+
+	if (dev->ops->release_queue)
+		dev->ops->release_queue(dev, qidx);
+}
+
+static inline uint32_t virtio_read_device_features(struct virtio_dev *dev)
+{
+	if (dev->device_features_sel)
+		return (uint32_t)(dev->device_features >> 32);
+
+	return (uint32_t)dev->device_features;
+}
+
+static inline void virtio_write_driver_features(struct virtio_dev *dev,
+						uint32_t val)
+{
+	uint64_t tmp;
+
+	if (dev->driver_features_sel) {
+		tmp = dev->driver_features & 0xFFFFFFFF;
+		dev->driver_features = tmp | (uint64_t)val << 32;
+	} else {
+		tmp = dev->driver_features & 0xFFFFFFFF00000000;
+		dev->driver_features = tmp | val;
+	}
+}
+
+static int virtio_read(void *data, int offset, void *res, int size)
+{
+	uint32_t val;
+	struct virtio_dev *dev = (struct virtio_dev *)data;
+
+	if (offset >= VIRTIO_MMIO_CONFIG) {
+		offset -= VIRTIO_MMIO_CONFIG;
+		if (offset + size > dev->config_len)
+			return -LKL_EINVAL;
+		memcpy(res, dev->config_data + offset, size);
+		return 0;
+	}
+
+	if (size != sizeof(uint32_t))
+		return -LKL_EINVAL;
+
+	switch (offset) {
+	case VIRTIO_MMIO_MAGIC_VALUE:
+		val = VIRTIO_DEV_MAGIC;
+		break;
+	case VIRTIO_MMIO_VERSION:
+		val = VIRTIO_DEV_VERSION;
+		break;
+	case VIRTIO_MMIO_DEVICE_ID:
+		val = dev->device_id;
+		break;
+	case VIRTIO_MMIO_VENDOR_ID:
+		val = dev->vendor_id;
+		break;
+	case VIRTIO_MMIO_DEVICE_FEATURES:
+		val = virtio_read_device_features(dev);
+		break;
+	case VIRTIO_MMIO_QUEUE_NUM_MAX:
+		val = dev->queue[dev->queue_sel].num_max;
+		break;
+	case VIRTIO_MMIO_QUEUE_READY:
+		val = dev->queue[dev->queue_sel].ready;
+		break;
+	case VIRTIO_MMIO_INTERRUPT_STATUS:
+		val = dev->int_status;
+		break;
+	case VIRTIO_MMIO_STATUS:
+		val = dev->status;
+		break;
+	case VIRTIO_MMIO_CONFIG_GENERATION:
+		val = dev->config_gen;
+		break;
+	default:
+		return -1;
+	}
+
+	*(uint32_t *)res = htole32(val);
+
+	return 0;
+}
+
+static inline void set_ptr_low(void **ptr, uint32_t val)
+{
+	uint64_t tmp = (uintptr_t)*ptr;
+
+	tmp = (tmp & 0xFFFFFFFF00000000) | val;
+	*ptr = (void *)(long)tmp;
+}
+
+static inline void set_ptr_high(void **ptr, uint32_t val)
+{
+	uint64_t tmp = (uintptr_t)*ptr;
+
+	tmp = (tmp & 0x00000000FFFFFFFF) | ((uint64_t)val << 32);
+	*ptr = (void *)(long)tmp;
+}
+
+static inline void set_status(struct virtio_dev *dev, uint32_t val)
+{
+	if ((val & LKL_VIRTIO_CONFIG_S_FEATURES_OK) &&
+	    (!(dev->driver_features & BIT(LKL_VIRTIO_F_VERSION_1)) ||
+	     !(dev->driver_features & BIT(LKL_VIRTIO_RING_F_EVENT_IDX)) ||
+	     dev->ops->check_features(dev)))
+		val &= ~LKL_VIRTIO_CONFIG_S_FEATURES_OK;
+	dev->status = val;
+}
+
+static int virtio_write(void *data, int offset, void *res, int size)
+{
+	struct virtio_dev *dev = (struct virtio_dev *)data;
+	struct virtio_queue *q = &dev->queue[dev->queue_sel];
+	uint32_t val;
+	int ret = 0;
+
+	if (offset >= VIRTIO_MMIO_CONFIG) {
+		offset -= VIRTIO_MMIO_CONFIG;
+
+		if (offset + size > dev->config_len)
+			return -LKL_EINVAL;
+		memcpy(dev->config_data + offset, res, size);
+		return 0;
+	}
+
+	if (size != sizeof(uint32_t))
+		return -LKL_EINVAL;
+
+	val = le32toh(*(uint32_t *)res);
+
+	switch (offset) {
+	case VIRTIO_MMIO_DEVICE_FEATURES_SEL:
+		if (val > 1)
+			return -LKL_EINVAL;
+		dev->device_features_sel = val;
+		break;
+	case VIRTIO_MMIO_DRIVER_FEATURES_SEL:
+		if (val > 1)
+			return -LKL_EINVAL;
+		dev->driver_features_sel = val;
+		break;
+	case VIRTIO_MMIO_DRIVER_FEATURES:
+		virtio_write_driver_features(dev, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_SEL:
+		dev->queue_sel = val;
+		break;
+	case VIRTIO_MMIO_QUEUE_NUM:
+		dev->queue[dev->queue_sel].num = val;
+		break;
+	case VIRTIO_MMIO_QUEUE_READY:
+		dev->queue[dev->queue_sel].ready = val;
+		break;
+	case VIRTIO_MMIO_QUEUE_NOTIFY:
+		virtio_process_queue(dev, val);
+		break;
+	case VIRTIO_MMIO_INTERRUPT_ACK:
+		dev->int_status = 0;
+		break;
+	case VIRTIO_MMIO_STATUS:
+		set_status(dev, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_DESC_LOW:
+		set_ptr_low((void **)&q->desc, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_DESC_HIGH:
+		set_ptr_high((void **)&q->desc, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_AVAIL_LOW:
+		set_ptr_low((void **)&q->avail, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_AVAIL_HIGH:
+		set_ptr_high((void **)&q->avail, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_USED_LOW:
+		set_ptr_low((void **)&q->used, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_USED_HIGH:
+		set_ptr_high((void **)&q->used, val);
+		break;
+	default:
+		ret = -1;
+	}
+
+	return ret;
+}
+
+static const struct lkl_iomem_ops virtio_ops = {
+	.read = virtio_read,
+	.write = virtio_write,
+};
+
+char lkl_virtio_devs[4096];
+static char *devs = lkl_virtio_devs;
+static uint32_t lkl_num_virtio_boot_devs;
+
+void virtio_set_queue_max_merge_len(struct virtio_dev *dev, int q, int len)
+{
+	dev->queue[q].max_merge_len = len;
+}
+
+int virtio_dev_setup(struct virtio_dev *dev, int queues, int num_max)
+{
+	int qsize = queues * sizeof(*dev->queue);
+	int avail, mmio_size;
+	int i;
+	int num_bytes;
+	int ret;
+
+	dev->irq = lkl_get_free_irq("virtio");
+	if (dev->irq < 0)
+		return dev->irq;
+
+	dev->int_status = 0;
+	dev->device_features |= BIT(LKL_VIRTIO_F_VERSION_1) |
+		BIT(LKL_VIRTIO_RING_F_EVENT_IDX);
+	dev->queue = lkl_host_ops.mem_alloc(qsize);
+	if (!dev->queue)
+		return -LKL_ENOMEM;
+
+	memset(dev->queue, 0, qsize);
+	for (i = 0; i < queues; i++)
+		dev->queue[i].num_max = num_max;
+
+	mmio_size = VIRTIO_MMIO_CONFIG + dev->config_len;
+	dev->base = register_iomem(dev, mmio_size, &virtio_ops);
+	if (!dev->base) {
+		lkl_host_ops.mem_free(dev->queue);
+		return -LKL_ENOMEM;
+	}
+
+	if (!lkl_is_running()) {
+		avail = sizeof(lkl_virtio_devs) - (devs - lkl_virtio_devs);
+		num_bytes = snprintf(devs, avail,
+				     " virtio_mmio.device=%d@0x%"PRIxPTR":%d",
+				     mmio_size, (uintptr_t) dev->base,
+				     dev->irq);
+		if (num_bytes < 0 || num_bytes >= avail) {
+			lkl_put_irq(dev->irq, "virtio");
+			unregister_iomem(dev->base);
+			lkl_host_ops.mem_free(dev->queue);
+			return -LKL_ENOMEM;
+		}
+		devs += num_bytes;
+		dev->virtio_mmio_id = lkl_num_virtio_boot_devs++;
+	} else {
+		ret =
+		    lkl_sys_virtio_mmio_device_add((long)dev->base, mmio_size,
+						   dev->irq);
+		if (ret < 0) {
+			lkl_printf("can't register mmio device\n");
+			return -1;
+		}
+		dev->virtio_mmio_id = lkl_num_virtio_boot_devs + ret;
+	}
+
+	return 0;
+}
+
+int virtio_dev_cleanup(struct virtio_dev *dev)
+{
+	char devname[100];
+	long fd, ret;
+	long mount_ret;
+
+	if (!lkl_is_running())
+		goto skip_unbind;
+
+	mount_ret = lkl_mount_fs("sysfs");
+	if (mount_ret < 0)
+		return mount_ret;
+
+	if (dev->virtio_mmio_id >= virtio_get_num_bootdevs())
+		ret = snprintf(devname, sizeof(devname), "virtio-mmio.%d.auto",
+			       dev->virtio_mmio_id - virtio_get_num_bootdevs());
+	else
+		ret = snprintf(devname, sizeof(devname), "virtio-mmio.%d",
+			       dev->virtio_mmio_id);
+	if (ret < 0 || (size_t) ret >= sizeof(devname))
+		return -LKL_ENOMEM;
+
+	fd = lkl_sys_open("/sysfs/bus/platform/drivers/virtio-mmio/unbind",
+			  LKL_O_WRONLY, 0);
+	if (fd < 0)
+		return fd;
+
+	ret = lkl_sys_write(fd, devname, strlen(devname));
+	if (ret < 0)
+		return ret;
+
+	ret = lkl_sys_close(fd);
+	if (ret < 0)
+		return ret;
+
+	if (mount_ret == 0) {
+		ret = lkl_sys_umount("/sysfs", 0);
+		if (ret < 0)
+			return ret;
+	}
+
+skip_unbind:
+	lkl_put_irq(dev->irq, "virtio");
+	unregister_iomem(dev->base);
+	lkl_host_ops.mem_free(dev->queue);
+	return 0;
+}
+
+uint32_t virtio_get_num_bootdevs(void)
+{
+	return lkl_num_virtio_boot_devs;
+}
diff --git a/tools/lkl/lib/virtio.h b/tools/lkl/lib/virtio.h
new file mode 100644
index 000000000000..7427aa8fad79
--- /dev/null
+++ b/tools/lkl/lib/virtio.h
@@ -0,0 +1,93 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_LIB_VIRTIO_H
+#define _LKL_LIB_VIRTIO_H
+
+#include <stdint.h>
+#include <lkl_host.h>
+
+#define PAGE_SIZE		4096
+
+/* The following are copied from skbuff.h */
+#if (65536/PAGE_SIZE + 1) < 16
+#define MAX_SKB_FRAGS 16UL
+#else
+#define MAX_SKB_FRAGS (65536/PAGE_SIZE + 1)
+#endif
+
+#define VIRTIO_REQ_MAX_BUFS	(MAX_SKB_FRAGS + 2)
+
+struct virtio_req {
+	uint16_t buf_count;
+	struct iovec buf[VIRTIO_REQ_MAX_BUFS];
+	uint32_t total_len;
+};
+
+struct virtio_dev;
+
+struct virtio_dev_ops {
+	int (*check_features)(struct virtio_dev *dev);
+	/**
+	 * enqueue - queues the request for processing
+	 *
+	 * Note that the current implementation assumes that the requests are
+	 * processed synchronously and, as such, @virtio_req_complete must be
+	 * called from this function.
+	 *
+	 * @dev - virtio device
+	 * @q	- queue index
+	 *
+	 * @returns a negative value if the request has not been queued for
+	 * processing in which case the virtio device is responsible for
+	 * restarting the queue processing by calling @virtio_process_queue at a
+	 * later time; 0 or a positive value means that the request has been
+	 * queued for processing
+	 */
+	int (*enqueue)(struct virtio_dev *dev, int q, struct virtio_req *req);
+	/*
+	 * Acquire/release a lock on the specified queue. Only implemented by
+	 * netdevs, all other devices have NULL acquire/release function
+	 * pointers.
+	 */
+	void (*acquire_queue)(struct virtio_dev *dev, int queue_idx);
+	void (*release_queue)(struct virtio_dev *dev, int queue_idx);
+};
+
+struct virtio_dev {
+	uint32_t device_id;
+	uint32_t vendor_id;
+	uint64_t device_features;
+	uint32_t device_features_sel;
+	uint64_t driver_features;
+	uint32_t driver_features_sel;
+	uint32_t queue_sel;
+	struct virtio_queue *queue;
+	uint32_t queue_notify;
+	uint32_t int_status;
+	uint32_t status;
+	uint32_t config_gen;
+
+	struct virtio_dev_ops *ops;
+	int irq;
+	void *config_data;
+	int config_len;
+	void *base;
+	uint32_t virtio_mmio_id;
+};
+
+int virtio_dev_setup(struct virtio_dev *dev, int queues, int num_max);
+int virtio_dev_cleanup(struct virtio_dev *dev);
+uint32_t virtio_get_num_bootdevs(void);
+/**
+ * virtio_req_complete - complete a virtio request
+ *
+ * @req - the request to be completed
+ * @len - the total size in bytes of the completed request
+ */
+void virtio_req_complete(struct virtio_req *req, uint32_t len);
+void virtio_process_queue(struct virtio_dev *dev, uint32_t qidx);
+void virtio_set_queue_max_merge_len(struct virtio_dev *dev, int q, int len);
+
+#define container_of(ptr, type, member) \
+	((type *)((char *)(ptr) - __builtin_offsetof(type, member)))
+
+#endif /* _LKL_LIB_VIRTIO_H */
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread
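
A note on the interrupt suppression check described in the long comment in
virtio_req_complete(): lkl_vring_need_event() is presumably the lkl_-prefixed
copy of the kernel's vring_need_event() from include/uapi/linux/virtio_ring.h.
The sketch below reproduces that arithmetic under the hypothetical name
need_event() to show why it survives 16-bit wraparound: all three indices are
compared as unsigned distances from new_idx, not as absolute values.

```c
#include <stdint.h>

/*
 * Returns nonzero when the driver's requested event index lies in the
 * window [old_idx, new_idx), i.e. the used index has just moved past it.
 * The casts to uint16_t make the subtractions wrap modulo 65536.
 */
static inline int need_event(uint16_t event_idx, uint16_t new_idx,
			     uint16_t old_idx)
{
	return (uint16_t)(new_idx - event_idx - 1) <
	       (uint16_t)(new_idx - old_idx);
}
```

For example, with old_idx = 5 and new_idx = 6, an event index of 5 triggers
the irq. With old_idx = 0xFFFE and new_idx = 0x0001, an event index of 0xFFFF
also triggers it, even though the indices have wrapped.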

* [RFC v2 17/37] lkl tools: host lib: virtio devices
@ 2019-11-08  5:02     ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: linux-arch, Conrad Meyer, Octavian Purdila, Akira Moroo,
	Yuan Liu, Patrick Collins, linux-kernel-library,
	Michael Zimmermann, Hajime Tazaki

From: Octavian Purdila <tavi.purdila@gmail.com>

Add helpers for implementing host virtio devices. They use the memory
mapped I/O helpers to interact with the Linux MMIO virtio transport
driver and offer support for setting up and adding a new virtio device,
dispatching requests from the incoming queues, and completing
requests.

All added virtio devices are stored in lkl_virtio_devs as strings, per
the Linux MMIO virtio transport driver command line specification.

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/include/lkl_host.h |   1 +
 tools/lkl/lib/Build          |   1 +
 tools/lkl/lib/virtio.c       | 631 +++++++++++++++++++++++++++++++++++
 tools/lkl/lib/virtio.h       |  93 ++++++
 4 files changed, 726 insertions(+)
 create mode 100644 tools/lkl/lib/virtio.c
 create mode 100644 tools/lkl/lib/virtio.h

diff --git a/tools/lkl/include/lkl_host.h b/tools/lkl/include/lkl_host.h
index 85e80eb4ad0d..81239e2b556f 100644
--- a/tools/lkl/include/lkl_host.h
+++ b/tools/lkl/include/lkl_host.h
@@ -18,6 +18,7 @@ extern struct lkl_host_operations lkl_host_ops;
  */
 int lkl_printf(const char *fmt, ...);
 
+extern char lkl_virtio_devs[4096];
 
 #ifdef __cplusplus
 }
diff --git a/tools/lkl/lib/Build b/tools/lkl/lib/Build
index 8d01982638f8..5fd1843b51d1 100644
--- a/tools/lkl/lib/Build
+++ b/tools/lkl/lib/Build
@@ -3,6 +3,7 @@ CFLAGS_config.o += -I$(srctree)/tools/perf/pmu-events
 liblkl-y += iomem.o
 liblkl-y += jmp_buf.o
 liblkl-y += utils.o
+liblkl-y += virtio.o
 liblkl-y += dbg.o
 liblkl-y += dbg_handler.o
 liblkl-y += ../../perf/pmu-events/jsmn.o
diff --git a/tools/lkl/lib/virtio.c b/tools/lkl/lib/virtio.c
new file mode 100644
index 000000000000..c5247665482d
--- /dev/null
+++ b/tools/lkl/lib/virtio.c
@@ -0,0 +1,631 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <string.h>
+#include <stdio.h>
+#include <stdbool.h>
+#include <inttypes.h>
+#include <lkl_host.h>
+#include <lkl/linux/virtio_ring.h>
+#include "iomem.h"
+#include "virtio.h"
+#include "endian.h"
+
+#define VIRTIO_DEV_MAGIC		0x74726976
+#define VIRTIO_DEV_VERSION		2
+
+#define VIRTIO_MMIO_MAGIC_VALUE		0x000
+#define VIRTIO_MMIO_VERSION		0x004
+#define VIRTIO_MMIO_DEVICE_ID		0x008
+#define VIRTIO_MMIO_VENDOR_ID		0x00c
+#define VIRTIO_MMIO_DEVICE_FEATURES	0x010
+#define VIRTIO_MMIO_DEVICE_FEATURES_SEL 0x014
+#define VIRTIO_MMIO_DRIVER_FEATURES	0x020
+#define VIRTIO_MMIO_DRIVER_FEATURES_SEL 0x024
+#define VIRTIO_MMIO_QUEUE_SEL		0x030
+#define VIRTIO_MMIO_QUEUE_NUM_MAX	0x034
+#define VIRTIO_MMIO_QUEUE_NUM		0x038
+#define VIRTIO_MMIO_QUEUE_READY		0x044
+#define VIRTIO_MMIO_QUEUE_NOTIFY	0x050
+#define VIRTIO_MMIO_INTERRUPT_STATUS	0x060
+#define VIRTIO_MMIO_INTERRUPT_ACK	0x064
+#define VIRTIO_MMIO_STATUS		0x070
+#define VIRTIO_MMIO_QUEUE_DESC_LOW	0x080
+#define VIRTIO_MMIO_QUEUE_DESC_HIGH	0x084
+#define VIRTIO_MMIO_QUEUE_AVAIL_LOW	0x090
+#define VIRTIO_MMIO_QUEUE_AVAIL_HIGH	0x094
+#define VIRTIO_MMIO_QUEUE_USED_LOW	0x0a0
+#define VIRTIO_MMIO_QUEUE_USED_HIGH	0x0a4
+#define VIRTIO_MMIO_CONFIG_GENERATION	0x0fc
+#define VIRTIO_MMIO_CONFIG		0x100
+#define VIRTIO_MMIO_INT_VRING		0x01
+#define VIRTIO_MMIO_INT_CONFIG		0x02
+
+#define BIT(x) (1ULL << (x))
+
+#define virtio_panic(msg, ...) do {					\
+		lkl_printf("LKL virtio error: " msg, ##__VA_ARGS__);	\
+		lkl_host_ops.panic();					\
+	} while (0)
+
+struct virtio_queue {
+	uint32_t num_max;
+	uint32_t num;
+	uint32_t ready;
+	uint32_t max_merge_len;
+
+	struct lkl_vring_desc *desc;
+	struct lkl_vring_avail *avail;
+	struct lkl_vring_used *used;
+	uint16_t last_avail_idx;
+	uint16_t last_used_idx_signaled;
+};
+
+struct _virtio_req {
+	struct virtio_req req;
+	struct virtio_dev *dev;
+	struct virtio_queue *q;
+	uint16_t idx;
+};
+
+
+static inline uint16_t virtio_get_used_event(struct virtio_queue *q)
+{
+	return q->avail->ring[q->num];
+}
+
+static inline void virtio_set_avail_event(struct virtio_queue *q, uint16_t val)
+{
+	*((uint16_t *)&q->used->ring[q->num]) = val;
+}
+
+static inline void virtio_deliver_irq(struct virtio_dev *dev)
+{
+	dev->int_status |= VIRTIO_MMIO_INT_VRING;
+	/* Make sure all memory writes before are visible to the driver. */
+	__sync_synchronize();
+	lkl_trigger_irq(dev->irq);
+}
+
+static inline uint16_t virtio_get_used_idx(struct virtio_queue *q)
+{
+	return le16toh(q->used->idx);
+}
+
+static inline void virtio_add_used(struct virtio_queue *q, uint16_t used_idx,
+				   uint16_t avail_idx, uint32_t len)
+{
+	uint16_t desc_idx = le16toh(q->avail->ring[avail_idx & (q->num - 1)]);
+
+	used_idx = used_idx & (q->num - 1);
+	q->used->ring[used_idx].id = htole32(desc_idx);
+	q->used->ring[used_idx].len = htole32(len);
+}
+
+/*
+ * Make sure all previous memory writes are visible to the driver before
+ * updating the idx.  We need a barrier here even though we already have one
+ * in virtio_deliver_irq(), because there might already be a driver thread
+ * reading the idx and dequeuing used buffers.
+ */
+static inline void virtio_sync_used_idx(struct virtio_queue *q, uint16_t idx)
+{
+	__sync_synchronize();
+	q->used->idx = htole16(idx);
+}
+
+#define min_len(a, b) ((a) < (b) ? (a) : (b))
+
+void virtio_req_complete(struct virtio_req *req, uint32_t len)
+{
+	int send_irq = 0;
+	struct _virtio_req *_req = container_of(req, struct _virtio_req, req);
+	struct virtio_queue *q = _req->q;
+	uint16_t avail_idx = _req->idx;
+	uint16_t used_idx = virtio_get_used_idx(_req->q);
+	int i;
+
+	/*
+	 * We've potentially used up multiple (non-chained) descriptors and have
+	 * to create one "used" entry for each descriptor we've consumed.
+	 */
+	for (i = 0; i < req->buf_count; i++) {
+		uint32_t used_len;
+
+		if (!q->max_merge_len)
+			used_len = len;
+		else
+			used_len = min_len(len,  req->buf[i].iov_len);
+
+		virtio_add_used(q, used_idx++, avail_idx++, used_len);
+
+		len -= used_len;
+		if (!len)
+			break;
+	}
+	virtio_sync_used_idx(q, used_idx);
+	q->last_avail_idx = avail_idx;
+
+	/*
+	 * Triggers the irq whenever there is no available buffer.
+	 */
+	if (q->last_avail_idx == le16toh(q->avail->idx))
+		send_irq = 1;
+
+	/*
+	 * There are two rings, q->avail and q->used, for each of the rx and tx
+	 * queues. They are used to pass buffers between the kernel driver and
+	 * the virtio device implementation.
+	 *
+	 * The kernel maintains the first one and appends buffers to it. In
+	 * the rx queue these are empty buffers offered for received packets;
+	 * in the tx queue they contain packets to transmit. The kernel notifies
+	 * the device by mmio write (see VIRTIO_MMIO_QUEUE_NOTIFY below).
+	 *
+	 * The virtio device (implemented in this file) maintains q->used and
+	 * appends a buffer to it after consuming that buffer from q->avail.
+	 *
+	 * The device needs to notify the driver by triggering an irq here.
+	 * LKL_VIRTIO_RING_F_EVENT_IDX is enabled in this implementation, so
+	 * the kernel can set virtio_get_used_event(q) to tell the device to
+	 * "only trigger the irq when this item in q->used ring is populated."
+	 *
+	 * Because the driver and the device run in two different threads,
+	 * q->used->idx may already have advanced by the time the driver sets
+	 * virtio_get_used_event(q). So we need to trigger the irq when
+	 * virtio_get_used_event(q) < q->used->idx.
+	 *
+	 * To avoid unnecessary irqs for each packet after
+	 * virtio_get_used_event(q) < q->used->idx, last_used_idx_signaled is
+	 * stored and the irq is only triggered if
+	 * last_used_idx_signaled <= virtio_get_used_event(q) < q->used->idx
+	 *
+	 * This is what lkl_vring_need_event() checks, and it also covers the
+	 * case when those numbers wrap around.
+	 */
+	if (send_irq || lkl_vring_need_event(le16toh(virtio_get_used_event(q)),
+					     virtio_get_used_idx(q),
+					     q->last_used_idx_signaled)) {
+		q->last_used_idx_signaled = virtio_get_used_idx(q);
+		virtio_deliver_irq(_req->dev);
+	}
+}
+
+/*
+ * Grab the vring_desc from the queue at the appropriate index in the
+ * queue's circular buffer, converting from little-endian to
+ * the host's endianness.
+ */
+static inline
+struct lkl_vring_desc *vring_desc_at_le_idx(struct virtio_queue *q,
+					    __lkl__virtio16 le_idx)
+{
+	return &(q->desc[le16toh(le_idx) & (q->num - 1)]);
+}
+
+static inline
+struct lkl_vring_desc *vring_desc_at_avail_idx(struct virtio_queue *q,
+					       uint16_t idx)
+{
+	uint16_t desc_idx = q->avail->ring[idx & (q->num - 1)];
+
+	return vring_desc_at_le_idx(q, desc_idx);
+}
+
+/* Initialize buf to hold the same info as the vring_desc */
+static void add_dev_buf_from_vring_desc(struct virtio_req *req,
+					struct lkl_vring_desc *vring_desc)
+{
+	struct iovec *buf = &req->buf[req->buf_count++];
+
+	buf->iov_base = (void *)(uintptr_t)le64toh(vring_desc->addr);
+	buf->iov_len = le32toh(vring_desc->len);
+
+	if (!(buf->iov_base && buf->iov_len))
+		virtio_panic("bad vring_desc: %p %zu\n",
+			     buf->iov_base, buf->iov_len);
+
+	req->total_len += buf->iov_len;
+}
+
+static struct lkl_vring_desc *get_next_desc(struct virtio_queue *q,
+					    struct lkl_vring_desc *desc,
+					    uint16_t *idx)
+{
+	uint16_t desc_idx;
+
+	if (q->max_merge_len) {
+		if (++(*idx) == le16toh(q->avail->idx))
+			return NULL;
+		desc_idx = q->avail->ring[*idx & (q->num - 1)];
+		return vring_desc_at_le_idx(q, desc_idx);
+	}
+
+	if (!(le16toh(desc->flags) & LKL_VRING_DESC_F_NEXT))
+		return NULL;
+	return vring_desc_at_le_idx(q, desc->next);
+}
+
+/*
+ * Below there are two distinctly different (per packet) buffer allocation
+ * schemes for us to deal with:
+ *
+ * 1. One or more descriptors chained through "next" as indicated by the
+ *    LKL_VRING_DESC_F_NEXT flag,
+ * 2. One or more descriptors from the ring sequentially, as many as are
+ *    available and needed. This is the RX only "mergeable_rx_bufs" mode.
+ *    The mode is entered when the VIRTIO_NET_F_MRG_RXBUF device feature
+ *    is enabled.
+ */
+static int virtio_process_one(struct virtio_dev *dev, int qidx)
+{
+	struct virtio_queue *q = &dev->queue[qidx];
+	uint16_t idx = q->last_avail_idx;
+	struct _virtio_req _req = {
+		.dev = dev,
+		.q = q,
+		.idx = idx,
+	};
+	struct virtio_req *req = &_req.req;
+	struct lkl_vring_desc *desc = vring_desc_at_avail_idx(q, _req.idx);
+
+	do {
+		add_dev_buf_from_vring_desc(req, desc);
+		if (q->max_merge_len && req->total_len > q->max_merge_len)
+			break;
+		desc = get_next_desc(q, desc, &idx);
+	} while (desc && req->buf_count < VIRTIO_REQ_MAX_BUFS);
+
+	if (desc && le16toh(desc->flags) & LKL_VRING_DESC_F_NEXT)
+		virtio_panic("too many chained bufs");
+
+	return dev->ops->enqueue(dev, qidx, req);
+}
+
+/* NB: we can enter this function two different ways in the case of
+ * netdevs --- either through a tx/rx thread poll (which the LKL
+ * scheduler knows nothing about) or through virtio_write called
+ * inside an interrupt handler, so to be safe, it's not enough to
+ * synchronize only the tx/rx polling threads.
+ *
+ * At the moment, it seems like only netdevs require the
+ * synchronization we do here (i.e. locking around operations on a
+ * particular virtqueue, with dev->ops->acquire_queue), since they
+ * have these two different entry points, one of which isn't managed
+ * by the LKL scheduler. So only devs corresponding to netdevs will
+ * have non-NULL acquire/release_queue.
+ *
+ * In the future, this may change. If you see errors thrown in virtio
+ * driver code by block/console devices, you should be suspicious of
+ * the synchronization going on here.
+ */
+void virtio_process_queue(struct virtio_dev *dev, uint32_t qidx)
+{
+	struct virtio_queue *q = &dev->queue[qidx];
+
+	if (!q->ready)
+		return;
+
+	if (dev->ops->acquire_queue)
+		dev->ops->acquire_queue(dev, qidx);
+
+	while (q->last_avail_idx != le16toh(q->avail->idx)) {
+		/*
+		 * Make sure the following loads happen after loading
+		 * q->avail->idx.
+		 */
+		__sync_synchronize();
+		if (virtio_process_one(dev, qidx) < 0)
+			break;
+		if (q->last_avail_idx == le16toh(q->avail->idx))
+			virtio_set_avail_event(q, q->avail->idx);
+	}
+
+	if (dev->ops->release_queue)
+		dev->ops->release_queue(dev, qidx);
+}
+
+static inline uint32_t virtio_read_device_features(struct virtio_dev *dev)
+{
+	if (dev->device_features_sel)
+		return (uint32_t)(dev->device_features >> 32);
+
+	return (uint32_t)dev->device_features;
+}
+
+static inline void virtio_write_driver_features(struct virtio_dev *dev,
+						uint32_t val)
+{
+	uint64_t tmp;
+
+	if (dev->driver_features_sel) {
+		tmp = dev->driver_features & 0xFFFFFFFF;
+		dev->driver_features = tmp | (uint64_t)val << 32;
+	} else {
+		tmp = dev->driver_features & 0xFFFFFFFF00000000;
+		dev->driver_features = tmp | val;
+	}
+}
+
+static int virtio_read(void *data, int offset, void *res, int size)
+{
+	uint32_t val;
+	struct virtio_dev *dev = (struct virtio_dev *)data;
+
+	if (offset >= VIRTIO_MMIO_CONFIG) {
+		offset -= VIRTIO_MMIO_CONFIG;
+		if (offset + size > dev->config_len)
+			return -LKL_EINVAL;
+		memcpy(res, dev->config_data + offset, size);
+		return 0;
+	}
+
+	if (size != sizeof(uint32_t))
+		return -LKL_EINVAL;
+
+	switch (offset) {
+	case VIRTIO_MMIO_MAGIC_VALUE:
+		val = VIRTIO_DEV_MAGIC;
+		break;
+	case VIRTIO_MMIO_VERSION:
+		val = VIRTIO_DEV_VERSION;
+		break;
+	case VIRTIO_MMIO_DEVICE_ID:
+		val = dev->device_id;
+		break;
+	case VIRTIO_MMIO_VENDOR_ID:
+		val = dev->vendor_id;
+		break;
+	case VIRTIO_MMIO_DEVICE_FEATURES:
+		val = virtio_read_device_features(dev);
+		break;
+	case VIRTIO_MMIO_QUEUE_NUM_MAX:
+		val = dev->queue[dev->queue_sel].num_max;
+		break;
+	case VIRTIO_MMIO_QUEUE_READY:
+		val = dev->queue[dev->queue_sel].ready;
+		break;
+	case VIRTIO_MMIO_INTERRUPT_STATUS:
+		val = dev->int_status;
+		break;
+	case VIRTIO_MMIO_STATUS:
+		val = dev->status;
+		break;
+	case VIRTIO_MMIO_CONFIG_GENERATION:
+		val = dev->config_gen;
+		break;
+	default:
+		return -1;
+	}
+
+	*(uint32_t *)res = htole32(val);
+
+	return 0;
+}
+
+static inline void set_ptr_low(void **ptr, uint32_t val)
+{
+	uint64_t tmp = (uintptr_t)*ptr;
+
+	tmp = (tmp & 0xFFFFFFFF00000000) | val;
+	*ptr = (void *)(uintptr_t)tmp;
+}
+
+static inline void set_ptr_high(void **ptr, uint32_t val)
+{
+	uint64_t tmp = (uintptr_t)*ptr;
+
+	tmp = (tmp & 0x00000000FFFFFFFF) | ((uint64_t)val << 32);
+	*ptr = (void *)(uintptr_t)tmp;
+}
+
+static inline void set_status(struct virtio_dev *dev, uint32_t val)
+{
+	if ((val & LKL_VIRTIO_CONFIG_S_FEATURES_OK) &&
+	    (!(dev->driver_features & BIT(LKL_VIRTIO_F_VERSION_1)) ||
+	     !(dev->driver_features & BIT(LKL_VIRTIO_RING_F_EVENT_IDX)) ||
+	     dev->ops->check_features(dev)))
+		val &= ~LKL_VIRTIO_CONFIG_S_FEATURES_OK;
+	dev->status = val;
+}
+
+static int virtio_write(void *data, int offset, void *res, int size)
+{
+	struct virtio_dev *dev = (struct virtio_dev *)data;
+	struct virtio_queue *q = &dev->queue[dev->queue_sel];
+	uint32_t val;
+	int ret = 0;
+
+	if (offset >= VIRTIO_MMIO_CONFIG) {
+		offset -= VIRTIO_MMIO_CONFIG;
+
+		if (offset + size > dev->config_len)
+			return -LKL_EINVAL;
+		memcpy(dev->config_data + offset, res, size);
+		return 0;
+	}
+
+	if (size != sizeof(uint32_t))
+		return -LKL_EINVAL;
+
+	val = le32toh(*(uint32_t *)res);
+
+	switch (offset) {
+	case VIRTIO_MMIO_DEVICE_FEATURES_SEL:
+		if (val > 1)
+			return -LKL_EINVAL;
+		dev->device_features_sel = val;
+		break;
+	case VIRTIO_MMIO_DRIVER_FEATURES_SEL:
+		if (val > 1)
+			return -LKL_EINVAL;
+		dev->driver_features_sel = val;
+		break;
+	case VIRTIO_MMIO_DRIVER_FEATURES:
+		virtio_write_driver_features(dev, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_SEL:
+		dev->queue_sel = val;
+		break;
+	case VIRTIO_MMIO_QUEUE_NUM:
+		dev->queue[dev->queue_sel].num = val;
+		break;
+	case VIRTIO_MMIO_QUEUE_READY:
+		dev->queue[dev->queue_sel].ready = val;
+		break;
+	case VIRTIO_MMIO_QUEUE_NOTIFY:
+		virtio_process_queue(dev, val);
+		break;
+	case VIRTIO_MMIO_INTERRUPT_ACK:
+		dev->int_status = 0;
+		break;
+	case VIRTIO_MMIO_STATUS:
+		set_status(dev, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_DESC_LOW:
+		set_ptr_low((void **)&q->desc, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_DESC_HIGH:
+		set_ptr_high((void **)&q->desc, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_AVAIL_LOW:
+		set_ptr_low((void **)&q->avail, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_AVAIL_HIGH:
+		set_ptr_high((void **)&q->avail, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_USED_LOW:
+		set_ptr_low((void **)&q->used, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_USED_HIGH:
+		set_ptr_high((void **)&q->used, val);
+		break;
+	default:
+		ret = -1;
+	}
+
+	return ret;
+}
+
+static const struct lkl_iomem_ops virtio_ops = {
+	.read = virtio_read,
+	.write = virtio_write,
+};
+
+char lkl_virtio_devs[4096];
+static char *devs = lkl_virtio_devs;
+static uint32_t lkl_num_virtio_boot_devs;
+
+void virtio_set_queue_max_merge_len(struct virtio_dev *dev, int q, int len)
+{
+	dev->queue[q].max_merge_len = len;
+}
+
+int virtio_dev_setup(struct virtio_dev *dev, int queues, int num_max)
+{
+	int qsize = queues * sizeof(*dev->queue);
+	int avail, mmio_size;
+	int i;
+	int num_bytes;
+	int ret;
+
+	dev->irq = lkl_get_free_irq("virtio");
+	if (dev->irq < 0)
+		return dev->irq;
+
+	dev->int_status = 0;
+	dev->device_features |= BIT(LKL_VIRTIO_F_VERSION_1) |
+		BIT(LKL_VIRTIO_RING_F_EVENT_IDX);
+	dev->queue = lkl_host_ops.mem_alloc(qsize);
+	if (!dev->queue)
+		return -LKL_ENOMEM;
+
+	memset(dev->queue, 0, qsize);
+	for (i = 0; i < queues; i++)
+		dev->queue[i].num_max = num_max;
+
+	mmio_size = VIRTIO_MMIO_CONFIG + dev->config_len;
+	dev->base = register_iomem(dev, mmio_size, &virtio_ops);
+	if (!dev->base) {
+		lkl_host_ops.mem_free(dev->queue);
+		return -LKL_ENOMEM;
+	}
+
+	if (!lkl_is_running()) {
+		avail = sizeof(lkl_virtio_devs) - (devs - lkl_virtio_devs);
+		num_bytes = snprintf(devs, avail,
+				     " virtio_mmio.device=%d@0x%"PRIxPTR":%d",
+				     mmio_size, (uintptr_t) dev->base,
+				     dev->irq);
+		if (num_bytes < 0 || num_bytes >= avail) {
+			lkl_put_irq(dev->irq, "virtio");
+			unregister_iomem(dev->base);
+			lkl_host_ops.mem_free(dev->queue);
+			return -LKL_ENOMEM;
+		}
+		devs += num_bytes;
+		dev->virtio_mmio_id = lkl_num_virtio_boot_devs++;
+	} else {
+		ret =
+		    lkl_sys_virtio_mmio_device_add((long)dev->base, mmio_size,
+						   dev->irq);
+		if (ret < 0) {
+			lkl_printf("can't register mmio device\n");
+			return -1;
+		}
+		dev->virtio_mmio_id = lkl_num_virtio_boot_devs + ret;
+	}
+
+	return 0;
+}
+
+int virtio_dev_cleanup(struct virtio_dev *dev)
+{
+	char devname[100];
+	long fd, ret;
+	long mount_ret;
+
+	if (!lkl_is_running())
+		goto skip_unbind;
+
+	mount_ret = lkl_mount_fs("sysfs");
+	if (mount_ret < 0)
+		return mount_ret;
+
+	if (dev->virtio_mmio_id >= virtio_get_num_bootdevs())
+		ret = snprintf(devname, sizeof(devname), "virtio-mmio.%d.auto",
+			       dev->virtio_mmio_id - virtio_get_num_bootdevs());
+	else
+		ret = snprintf(devname, sizeof(devname), "virtio-mmio.%d",
+			       dev->virtio_mmio_id);
+	if (ret < 0 || (size_t) ret >= sizeof(devname))
+		return -LKL_ENOMEM;
+
+	fd = lkl_sys_open("/sysfs/bus/platform/drivers/virtio-mmio/unbind",
+			  LKL_O_WRONLY, 0);
+	if (fd < 0)
+		return fd;
+
+	ret = lkl_sys_write(fd, devname, strlen(devname));
+	if (ret < 0)
+		return ret;
+
+	ret = lkl_sys_close(fd);
+	if (ret < 0)
+		return ret;
+
+	if (mount_ret == 0) {
+		ret = lkl_sys_umount("/sysfs", 0);
+		if (ret < 0)
+			return ret;
+	}
+
+skip_unbind:
+	lkl_put_irq(dev->irq, "virtio");
+	unregister_iomem(dev->base);
+	lkl_host_ops.mem_free(dev->queue);
+	return 0;
+}
+
+uint32_t virtio_get_num_bootdevs(void)
+{
+	return lkl_num_virtio_boot_devs;
+}
diff --git a/tools/lkl/lib/virtio.h b/tools/lkl/lib/virtio.h
new file mode 100644
index 000000000000..7427aa8fad79
--- /dev/null
+++ b/tools/lkl/lib/virtio.h
@@ -0,0 +1,93 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_LIB_VIRTIO_H
+#define _LKL_LIB_VIRTIO_H
+
+#include <stdint.h>
+#include <lkl_host.h>
+
+#define PAGE_SIZE		4096
+
+/* The following are copied from skbuff.h */
+#if (65536/PAGE_SIZE + 1) < 16
+#define MAX_SKB_FRAGS 16UL
+#else
+#define MAX_SKB_FRAGS (65536/PAGE_SIZE + 1)
+#endif
+
+#define VIRTIO_REQ_MAX_BUFS	(MAX_SKB_FRAGS + 2)
+
+struct virtio_req {
+	uint16_t buf_count;
+	struct iovec buf[VIRTIO_REQ_MAX_BUFS];
+	uint32_t total_len;
+};
+
+struct virtio_dev;
+
+struct virtio_dev_ops {
+	int (*check_features)(struct virtio_dev *dev);
+	/**
+	 * enqueue - queues the request for processing
+	 *
+	 * Note that the current implementation assumes that requests are
+	 * processed synchronously and, as such, @virtio_req_complete must be
+	 * called from this function.
+	 *
+	 * @dev - virtio device
+	 * @q	- queue index
+	 *
+	 * @returns a negative value if the request has not been queued for
+	 * processing, in which case the virtio device is responsible for
+	 * restarting the queue processing by calling @virtio_process_queue at
+	 * a later time; 0 or a positive value means that the request has been
+	 * queued for processing
+	 */
+	int (*enqueue)(struct virtio_dev *dev, int q, struct virtio_req *req);
+	/*
+	 * Acquire/release a lock on the specified queue. Only implemented by
+	 * netdevs, all other devices have NULL acquire/release function
+	 * pointers.
+	 */
+	void (*acquire_queue)(struct virtio_dev *dev, int queue_idx);
+	void (*release_queue)(struct virtio_dev *dev, int queue_idx);
+};
+
+struct virtio_dev {
+	uint32_t device_id;
+	uint32_t vendor_id;
+	uint64_t device_features;
+	uint32_t device_features_sel;
+	uint64_t driver_features;
+	uint32_t driver_features_sel;
+	uint32_t queue_sel;
+	struct virtio_queue *queue;
+	uint32_t queue_notify;
+	uint32_t int_status;
+	uint32_t status;
+	uint32_t config_gen;
+
+	struct virtio_dev_ops *ops;
+	int irq;
+	void *config_data;
+	int config_len;
+	void *base;
+	uint32_t virtio_mmio_id;
+};
+
+int virtio_dev_setup(struct virtio_dev *dev, int queues, int num_max);
+int virtio_dev_cleanup(struct virtio_dev *dev);
+uint32_t virtio_get_num_bootdevs(void);
+/**
+ * virtio_req_complete - complete a virtio request
+ *
+ * @req - the request to be completed
+ * @len - the total size in bytes of the completed request
+ */
+void virtio_req_complete(struct virtio_req *req, uint32_t len);
+void virtio_process_queue(struct virtio_dev *dev, uint32_t qidx);
+void virtio_set_queue_max_merge_len(struct virtio_dev *dev, int q, int len);
+
+#define container_of(ptr, type, member) \
+	((type *)((char *)(ptr) - __builtin_offsetof(type, member)))
+
+#endif /* _LKL_LIB_VIRTIO_H */
-- 
2.20.1 (Apple Git-117)


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um


^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 18/37] lkl tools: host lib: virtio block device
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Michael Zimmermann, Petros Angelatos

From: Octavian Purdila <tavi.purdila@gmail.com>

Host-independent implementation of virtio block devices. The
host-dependent part of the host library must provide an implementation
of lkl_dev_blk_ops.

Disks can be added to the LKL configuration via lkl_disk_add(), a new
LKL application API.

Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
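Note: to illustrate what a host backend for lkl_dev_blk_ops has to do,
here is a minimal sketch of an in-memory disk. It is standalone and does
not use the LKL headers: every sketch_/SKETCH_ name is a hypothetical
stand-in for the corresponding type or constant in this patch, and the
callbacks take a pointer to the sketch disk instead of a struct lkl_disk
passed by value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for struct iovec and the LKL_DEV_BLK_* values. */
struct sketch_iovec { void *iov_base; size_t iov_len; };

#define SKETCH_BLK_TYPE_READ	0
#define SKETCH_BLK_TYPE_WRITE	1
#define SKETCH_BLK_STATUS_OK	0
#define SKETCH_BLK_STATUS_IOERR	1
#define SKETCH_BLK_STATUS_UNSUP	2

struct sketch_blk_req {
	unsigned int type;
	unsigned long long sector;	/* in units of 512 bytes */
	struct sketch_iovec *buf;
	int count;
};

struct sketch_ramdisk {
	unsigned char *data;
	unsigned long long size;	/* in bytes */
};

/* Report the capacity in bytes; the caller converts to sectors. */
static int ramdisk_get_capacity(struct sketch_ramdisk *d,
				unsigned long long *res)
{
	assert(d && res);
	*res = d->size;
	return 0;
}

/* Serve one request by copying between the iovecs and the backing memory. */
static int ramdisk_request(struct sketch_ramdisk *d,
			   struct sketch_blk_req *req)
{
	unsigned long long off = req->sector * 512ULL;
	int i;

	if (req->type != SKETCH_BLK_TYPE_READ &&
	    req->type != SKETCH_BLK_TYPE_WRITE)
		return SKETCH_BLK_STATUS_UNSUP;

	for (i = 0; i < req->count; i++) {
		size_t len = req->buf[i].iov_len;

		/* Reject any buffer that would run past the end of disk. */
		if (off > d->size || len > d->size - off)
			return SKETCH_BLK_STATUS_IOERR;
		if (req->type == SKETCH_BLK_TYPE_READ)
			memcpy(req->buf[i].iov_base, d->data + off, len);
		else
			memcpy(d->data + off, req->buf[i].iov_base, len);
		off += len;
	}
	return SKETCH_BLK_STATUS_OK;
}
```

The return value of the request callback is what ends up in the status
byte of the virtio request, so a backend should only return one of the
status values.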
 tools/lkl/include/lkl.h      |  39 +++++++++++
 tools/lkl/include/lkl_host.h |  57 +++++++++++++++
 tools/lkl/lib/Build          |   1 +
 tools/lkl/lib/virtio_blk.c   | 132 +++++++++++++++++++++++++++++++++++
 4 files changed, 229 insertions(+)
 create mode 100644 tools/lkl/lib/virtio_blk.c
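
Note: blk_enqueue() treats the first buffer of a request as the block
request header (type, ioprio, sector), the last buffer as a one-byte
status trailer, and everything in between as data. The sketch below
decodes such a header from raw little-endian bytes. It assumes the
standard virtio block header layout (two 32-bit fields followed by a
64-bit sector); sketch_blk_hdr, get_le32(), get_le64() and
decode_blk_hdr() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Host-endian view of the request header. */
struct sketch_blk_hdr {
	uint32_t type;
	uint32_t ioprio;
	uint64_t sector;
};

/* Read a little-endian 32-bit value regardless of host endianness. */
static uint32_t get_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Read a little-endian 64-bit value as two 32-bit halves. */
static uint64_t get_le64(const unsigned char *p)
{
	return (uint64_t)get_le32(p) | (uint64_t)get_le32(p + 4) << 32;
}

/* Decode the 16-byte header found in the first buffer of a request. */
static void decode_blk_hdr(const unsigned char *raw,
			   struct sketch_blk_hdr *out)
{
	assert(raw && out);
	out->type = get_le32(raw);
	out->ioprio = get_le32(raw + 4);
	out->sector = get_le64(raw + 8);
}
```

The sector field needs the full 64-bit conversion: with 512-byte sectors,
a 32-bit sector number can only address the first 2 TiB of a disk.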

diff --git a/tools/lkl/include/lkl.h b/tools/lkl/include/lkl.h
index 76da534a85f1..967fbe4dbc26 100644
--- a/tools/lkl/include/lkl.h
+++ b/tools/lkl/include/lkl.h
@@ -350,6 +350,45 @@ const char *lkl_strerror(int err);
  */
 void lkl_perror(char *msg, int err);
 
+/**
+ * struct lkl_dev_blk_ops - block device host operations, defined in lkl_host.h.
+ */
+struct lkl_dev_blk_ops;
+
+/**
+ * lkl_disk - host disk handle
+ *
+ * @dev - a pointer to 'virtio_blk_dev' structure for this disk
+ * @fd - a POSIX file descriptor that can be used by preadv/pwritev
+ * @handle - an NT file handle that can be used by ReadFile/WriteFile
+ */
+struct lkl_disk {
+	void *dev;
+	union {
+		int fd;
+		void *handle;
+	};
+	struct lkl_dev_blk_ops *ops;
+};
+
+/**
+ * lkl_disk_add - add a new disk
+ *
+ * @disk - the host disk handle
+ * @returns a disk id (0 is valid) or a strictly negative value in case of error
+ */
+int lkl_disk_add(struct lkl_disk *disk);
+
+/**
+ * lkl_disk_remove - remove a disk
+ *
+ * This function cleans up the @disk's virtio_dev structure that was
+ * previously initialized by lkl_disk_add.
+ *
+ * @disk - the host disk handle
+ */
+int lkl_disk_remove(struct lkl_disk disk);
+
 
 #ifdef __cplusplus
 }
diff --git a/tools/lkl/include/lkl_host.h b/tools/lkl/include/lkl_host.h
index 81239e2b556f..a630efc95f0f 100644
--- a/tools/lkl/include/lkl_host.h
+++ b/tools/lkl/include/lkl_host.h
@@ -20,6 +20,63 @@ int lkl_printf(const char *fmt, ...);
 
 extern char lkl_virtio_devs[4096];
 
+#ifdef LKL_HOST_CONFIG_POSIX
+#include <sys/uio.h>
+#else
+struct iovec {
+	void *iov_base;
+	size_t iov_len;
+};
+#endif
+
+extern struct lkl_dev_blk_ops lkl_dev_blk_ops;
+
+/**
+ * struct lkl_blk_req - block device request
+ *
+ * @type: type of request
+ * @prio: priority of request - currently unused
+ * @sector: offset in units of 512 bytes for read / write requests
+ * @buf: an array of buffers to be used for read / write requests
+ * @count: the number of buffers
+ */
+struct lkl_blk_req {
+#define LKL_DEV_BLK_TYPE_READ		0
+#define LKL_DEV_BLK_TYPE_WRITE		1
+#define LKL_DEV_BLK_TYPE_FLUSH		4
+#define LKL_DEV_BLK_TYPE_FLUSH_OUT	5
+	unsigned int type;
+	unsigned int prio;
+	unsigned long long sector;
+	struct iovec *buf;
+	int count;
+};
+
+/**
+ * struct lkl_dev_blk_ops - block device host operations
+ */
+struct lkl_dev_blk_ops {
+	/**
+	 * @get_capacity: returns the disk capacity in bytes
+	 *
+	 * @disk - the disk for which the capacity is requested;
+	 * @res - pointer to receive the capacity, in bytes;
+	 * @returns - 0 in case of success, negative value in case of error
+	 */
+	int (*get_capacity)(struct lkl_disk disk, unsigned long long *res);
+#define LKL_DEV_BLK_STATUS_OK		0
+#define LKL_DEV_BLK_STATUS_IOERR	1
+#define LKL_DEV_BLK_STATUS_UNSUP	2
+	/**
+	 * @request: issue a block request
+	 *
+	 * @disk - the disk the request is issued to;
+	 * @req - a request described by &struct lkl_blk_req
+	 */
+	int (*request)(struct lkl_disk disk, struct lkl_blk_req *req);
+};
+
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/tools/lkl/lib/Build b/tools/lkl/lib/Build
index 5fd1843b51d1..d3154cfa4952 100644
--- a/tools/lkl/lib/Build
+++ b/tools/lkl/lib/Build
@@ -3,6 +3,7 @@ CFLAGS_config.o += -I$(srctree)/tools/perf/pmu-events
 liblkl-y += iomem.o
 liblkl-y += jmp_buf.o
 liblkl-y += utils.o
+liblkl-y += virtio_blk.o
 liblkl-y += virtio.o
 liblkl-y += dbg.o
 liblkl-y += dbg_handler.o
diff --git a/tools/lkl/lib/virtio_blk.c b/tools/lkl/lib/virtio_blk.c
new file mode 100644
index 000000000000..9e23316c5d99
--- /dev/null
+++ b/tools/lkl/lib/virtio_blk.c
@@ -0,0 +1,132 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <lkl_host.h>
+#include "virtio.h"
+#include "endian.h"
+
+struct virtio_blk_dev {
+	struct virtio_dev dev;
+	struct lkl_virtio_blk_config config;
+	struct lkl_dev_blk_ops *ops;
+	struct lkl_disk disk;
+};
+
+struct virtio_blk_req_trailer {
+	uint8_t status;
+};
+
+static int blk_check_features(struct virtio_dev *dev)
+{
+	if (dev->driver_features == dev->device_features)
+		return 0;
+
+	return -LKL_EINVAL;
+}
+
+static int blk_enqueue(struct virtio_dev *dev, int q, struct virtio_req *req)
+{
+	struct virtio_blk_dev *blk_dev;
+	struct lkl_virtio_blk_outhdr *h;
+	struct virtio_blk_req_trailer *t;
+	struct lkl_blk_req lkl_req;
+
+	if (req->buf_count < 3) {
+		lkl_printf("virtio_blk: no status buf\n");
+		goto out;
+	}
+
+	h = req->buf[0].iov_base;
+	t = req->buf[req->buf_count - 1].iov_base;
+	blk_dev = container_of(dev, struct virtio_blk_dev, dev);
+
+	t->status = LKL_DEV_BLK_STATUS_IOERR;
+
+	if (req->buf[0].iov_len != sizeof(*h)) {
+		lkl_printf("virtio_blk: bad header buf\n");
+		goto out;
+	}
+
+	if (req->buf[req->buf_count - 1].iov_len != sizeof(*t)) {
+		lkl_printf("virtio_blk: bad status buf\n");
+		goto out;
+	}
+
+	lkl_req.type = le32toh(h->type);
+	lkl_req.prio = le32toh(h->ioprio);
+	lkl_req.sector = le64toh(h->sector);
+	lkl_req.buf = &req->buf[1];
+	lkl_req.count = req->buf_count - 2;
+
+	t->status = blk_dev->ops->request(blk_dev->disk, &lkl_req);
+
+out:
+	virtio_req_complete(req, 0);
+	return 0;
+}
+
+static struct virtio_dev_ops blk_ops = {
+	.check_features = blk_check_features,
+	.enqueue = blk_enqueue,
+};
+
+
+int lkl_disk_add(struct lkl_disk *disk)
+{
+	struct virtio_blk_dev *dev;
+	unsigned long long capacity;
+	int ret;
+
+	dev = lkl_host_ops.mem_alloc(sizeof(*dev));
+	if (!dev)
+		return -LKL_ENOMEM;
+
+	disk->dev = dev;
+
+	dev->dev.device_id = LKL_VIRTIO_ID_BLOCK;
+	dev->dev.vendor_id = 0;
+	dev->dev.device_features = 0;
+	dev->dev.config_gen = 0;
+	dev->dev.config_data = &dev->config;
+	dev->dev.config_len = sizeof(dev->config);
+	dev->dev.ops = &blk_ops;
+	if (disk->ops)
+		dev->ops = disk->ops;
+	else
+		dev->ops = &lkl_dev_blk_ops;
+	dev->disk = *disk;
+
+	ret = dev->ops->get_capacity(*disk, &capacity);
+	if (ret) {
+		ret = -LKL_ENOMEM;
+		goto out_free;
+	}
+	dev->config.capacity = capacity / 512;
+
+	ret = virtio_dev_setup(&dev->dev, 1, 32);
+	if (ret)
+		goto out_free;
+
+	return dev->dev.virtio_mmio_id;
+
+out_free:
+	lkl_host_ops.mem_free(dev);
+
+	return ret;
+}
+
+int lkl_disk_remove(struct lkl_disk disk)
+{
+	struct virtio_blk_dev *dev;
+	int ret;
+
+	dev = (struct virtio_blk_dev *)disk.dev;
+	if (!dev)
+		return -LKL_EINVAL;
+
+	ret = virtio_dev_cleanup(&dev->dev);
+	if (ret < 0)
+		return ret;
+
+	lkl_host_ops.mem_free(dev);
+
+	return 0;
+}
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 18/37] lkl tools: host lib: virtio block device
@ 2019-11-08  5:02     ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: linux-arch, Octavian Purdila, Akira Moroo, Petros Angelatos,
	linux-kernel-library, Michael Zimmermann

From: Octavian Purdila <tavi.purdila@gmail.com>

Host-independent implementation of virtio block devices. The
host-dependent part of the host library must provide an implementation
of lkl_dev_blk_ops.

Disks can be added to the LKL configuration via lkl_disk_add(), a new
LKL application API.

Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/include/lkl.h      |  39 +++++++++++
 tools/lkl/include/lkl_host.h |  57 +++++++++++++++
 tools/lkl/lib/Build          |   1 +
 tools/lkl/lib/virtio_blk.c   | 132 +++++++++++++++++++++++++++++++++++
 4 files changed, 229 insertions(+)
 create mode 100644 tools/lkl/lib/virtio_blk.c

diff --git a/tools/lkl/include/lkl.h b/tools/lkl/include/lkl.h
index 76da534a85f1..967fbe4dbc26 100644
--- a/tools/lkl/include/lkl.h
+++ b/tools/lkl/include/lkl.h
@@ -350,6 +350,45 @@ const char *lkl_strerror(int err);
  */
 void lkl_perror(char *msg, int err);
 
+/**
+ * struct lkl_dev_blk_ops - block device host operations, defined in lkl_host.h.
+ */
+struct lkl_dev_blk_ops;
+
+/**
+ * lkl_disk - host disk handle
+ *
+ * @dev - a pointer to 'virtio_blk_dev' structure for this disk
+ * @fd - a POSIX file descriptor that can be used by preadv/pwritev
+ * @handle - an NT file handle that can be used by ReadFile/WriteFile
+ */
+struct lkl_disk {
+	void *dev;
+	union {
+		int fd;
+		void *handle;
+	};
+	struct lkl_dev_blk_ops *ops;
+};
+
+/**
+ * lkl_disk_add - add a new disk
+ *
+ * @disk - the host disk handle
+ * @returns a disk id (0 is valid) or a strictly negative value in case of error
+ */
+int lkl_disk_add(struct lkl_disk *disk);
+
+/**
+ * lkl_disk_remove - remove a disk
+ *
+ * This function cleans up the @disk's virtio_dev structure that was
+ * previously initialized by lkl_disk_add.
+ *
+ * @disk - the host disk handle
+ */
+int lkl_disk_remove(struct lkl_disk disk);
+
 
 #ifdef __cplusplus
 }
diff --git a/tools/lkl/include/lkl_host.h b/tools/lkl/include/lkl_host.h
index 81239e2b556f..a630efc95f0f 100644
--- a/tools/lkl/include/lkl_host.h
+++ b/tools/lkl/include/lkl_host.h
@@ -20,6 +20,63 @@ int lkl_printf(const char *fmt, ...);
 
 extern char lkl_virtio_devs[4096];
 
+#ifdef LKL_HOST_CONFIG_POSIX
+#include <sys/uio.h>
+#else
+struct iovec {
+	void *iov_base;
+	size_t iov_len;
+};
+#endif
+
+extern struct lkl_dev_blk_ops lkl_dev_blk_ops;
+
+/**
+ * struct lkl_blk_req - block device request
+ *
+ * @type: type of request
+ * @prio: priority of request - currently unused
+ * @sector: offset in units of 512 bytes for read / write requests
+ * @buf: an array of buffers to be used for read / write requests
+ * @count: the number of buffers
+ */
+struct lkl_blk_req {
+#define LKL_DEV_BLK_TYPE_READ		0
+#define LKL_DEV_BLK_TYPE_WRITE		1
+#define LKL_DEV_BLK_TYPE_FLUSH		4
+#define LKL_DEV_BLK_TYPE_FLUSH_OUT	5
+	unsigned int type;
+	unsigned int prio;
+	unsigned long long sector;
+	struct iovec *buf;
+	int count;
+};
+
+/**
+ * struct lkl_dev_blk_ops - block device host operations
+ */
+struct lkl_dev_blk_ops {
+	/**
+	 * @get_capacity: returns the disk capacity in bytes
+	 *
+	 * @disk - the disk for which the capacity is requested;
+	 * @res - pointer to receive the capacity, in bytes;
+	 * @returns - 0 in case of success, negative value in case of error
+	 */
+	int (*get_capacity)(struct lkl_disk disk, unsigned long long *res);
+#define LKL_DEV_BLK_STATUS_OK		0
+#define LKL_DEV_BLK_STATUS_IOERR	1
+#define LKL_DEV_BLK_STATUS_UNSUP	2
+	/**
+	 * @request: issue a block request
+	 *
+	 * @disk - the disk the request is issued to;
+	 * @req - a request described by &struct lkl_blk_req
+	 */
+	int (*request)(struct lkl_disk disk, struct lkl_blk_req *req);
+};
+
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/tools/lkl/lib/Build b/tools/lkl/lib/Build
index 5fd1843b51d1..d3154cfa4952 100644
--- a/tools/lkl/lib/Build
+++ b/tools/lkl/lib/Build
@@ -3,6 +3,7 @@ CFLAGS_config.o += -I$(srctree)/tools/perf/pmu-events
 liblkl-y += iomem.o
 liblkl-y += jmp_buf.o
 liblkl-y += utils.o
+liblkl-y += virtio_blk.o
 liblkl-y += virtio.o
 liblkl-y += dbg.o
 liblkl-y += dbg_handler.o
diff --git a/tools/lkl/lib/virtio_blk.c b/tools/lkl/lib/virtio_blk.c
new file mode 100644
index 000000000000..9e23316c5d99
--- /dev/null
+++ b/tools/lkl/lib/virtio_blk.c
@@ -0,0 +1,132 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <lkl_host.h>
+#include "virtio.h"
+#include "endian.h"
+
+struct virtio_blk_dev {
+	struct virtio_dev dev;
+	struct lkl_virtio_blk_config config;
+	struct lkl_dev_blk_ops *ops;
+	struct lkl_disk disk;
+};
+
+struct virtio_blk_req_trailer {
+	uint8_t status;
+};
+
+static int blk_check_features(struct virtio_dev *dev)
+{
+	if (dev->driver_features == dev->device_features)
+		return 0;
+
+	return -LKL_EINVAL;
+}
+
+static int blk_enqueue(struct virtio_dev *dev, int q, struct virtio_req *req)
+{
+	struct virtio_blk_dev *blk_dev;
+	struct lkl_virtio_blk_outhdr *h;
+	struct virtio_blk_req_trailer *t;
+	struct lkl_blk_req lkl_req;
+
+	if (req->buf_count < 3) {
+		lkl_printf("virtio_blk: no status buf\n");
+		goto out;
+	}
+
+	h = req->buf[0].iov_base;
+	t = req->buf[req->buf_count - 1].iov_base;
+	blk_dev = container_of(dev, struct virtio_blk_dev, dev);
+
+	t->status = LKL_DEV_BLK_STATUS_IOERR;
+
+	if (req->buf[0].iov_len != sizeof(*h)) {
+		lkl_printf("virtio_blk: bad header buf\n");
+		goto out;
+	}
+
+	if (req->buf[req->buf_count - 1].iov_len != sizeof(*t)) {
+		lkl_printf("virtio_blk: bad status buf\n");
+		goto out;
+	}
+
+	lkl_req.type = le32toh(h->type);
+	lkl_req.prio = le32toh(h->ioprio);
+	lkl_req.sector = le64toh(h->sector);
+	lkl_req.buf = &req->buf[1];
+	lkl_req.count = req->buf_count - 2;
+
+	t->status = blk_dev->ops->request(blk_dev->disk, &lkl_req);
+
+out:
+	virtio_req_complete(req, 0);
+	return 0;
+}
+
+static struct virtio_dev_ops blk_ops = {
+	.check_features = blk_check_features,
+	.enqueue = blk_enqueue,
+};
+
+
+int lkl_disk_add(struct lkl_disk *disk)
+{
+	struct virtio_blk_dev *dev;
+	unsigned long long capacity;
+	int ret;
+
+	dev = lkl_host_ops.mem_alloc(sizeof(*dev));
+	if (!dev)
+		return -LKL_ENOMEM;
+
+	disk->dev = dev;
+
+	dev->dev.device_id = LKL_VIRTIO_ID_BLOCK;
+	dev->dev.vendor_id = 0;
+	dev->dev.device_features = 0;
+	dev->dev.config_gen = 0;
+	dev->dev.config_data = &dev->config;
+	dev->dev.config_len = sizeof(dev->config);
+	dev->dev.ops = &blk_ops;
+	if (disk->ops)
+		dev->ops = disk->ops;
+	else
+		dev->ops = &lkl_dev_blk_ops;
+	dev->disk = *disk;
+
+	ret = dev->ops->get_capacity(*disk, &capacity);
+	if (ret) {
+		ret = -LKL_ENOMEM;
+		goto out_free;
+	}
+	dev->config.capacity = capacity / 512;
+
+	ret = virtio_dev_setup(&dev->dev, 1, 32);
+	if (ret)
+		goto out_free;
+
+	return dev->dev.virtio_mmio_id;
+
+out_free:
+	lkl_host_ops.mem_free(dev);
+
+	return ret;
+}
+
+int lkl_disk_remove(struct lkl_disk disk)
+{
+	struct virtio_blk_dev *dev;
+	int ret;
+
+	dev = (struct virtio_blk_dev *)disk.dev;
+	if (!dev)
+		return -LKL_EINVAL;
+
+	ret = virtio_dev_cleanup(&dev->dev);
+	if (ret < 0)
+		return ret;
+
+	lkl_host_ops.mem_free(dev);
+
+	return 0;
+}
-- 
2.20.1 (Apple Git-117)




^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 19/37] lkl tools: host lib: filesystem helpers
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Conrad Meyer, Hajime Tazaki, Michael Zimmermann, Yuan Liu

From: Octavian Purdila <tavi.purdila@gmail.com>

Add LKL application APIs to mount and unmount a filesystem from a
disk added via lkl_disk_add().

Also add open/close/read directory wrappers on top of
lkl_sys_getdents64.

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/include/lkl.h | 139 +++++++++++++
 tools/lkl/lib/Build     |   1 +
 tools/lkl/lib/fs.c      | 433 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 573 insertions(+)
 create mode 100644 tools/lkl/lib/fs.c

diff --git a/tools/lkl/include/lkl.h b/tools/lkl/include/lkl.h
index 967fbe4dbc26..8bda12d4c6de 100644
--- a/tools/lkl/include/lkl.h
+++ b/tools/lkl/include/lkl.h
@@ -389,6 +389,145 @@ int lkl_disk_add(struct lkl_disk *disk);
  */
 int lkl_disk_remove(struct lkl_disk disk);
 
+/**
+ * lkl_encode_dev_from_sysfs - extract device id from sysfs
+ *
+ * This function returns the device id for the given sysfs dev node.
+ * The content of the node has to be in the form 'MAJOR:MINOR'.
+ * Also, this function expects an absolute path which means that sysfs
+ * already has to be mounted at the given path
+ *
+ * @sysfs_path - absolute path to the sysfs dev node
+ * @pdevid - pointer to memory where dev id will be returned
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_encode_dev_from_sysfs(const char *sysfs_path, uint32_t *pdevid);
+
+/**
+ * lkl_get_virtio_blkdev - get device id of a disk (partition)
+ *
+ * This function returns the device id for the given disk.
+ *
+ * @disk_id - the disk id identifying the disk
+ * @part - disk partition or zero for full disk
+ * @pdevid - pointer to memory where dev id will be returned
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_get_virtio_blkdev(int disk_id, unsigned int part, uint32_t *pdevid);
+
+
+/**
+ * lkl_mount_dev - mount a disk
+ *
+ * This function creates a device file for the given disk, creates a mount
+ * point and mounts the device over the mount point.
+ *
+ * @disk_id - the disk id identifying the disk to be mounted
+ * @part - disk partition or zero for full disk
+ * @fs_type - filesystem type
+ * @flags - mount flags
+ * @opts - additional filesystem specific mount options
+ * @mnt_str - a string that will be filled by this function with the path where
+ * the filesystem has been mounted
+ * @mnt_str_len - size of mnt_str
+ * @returns - 0 on success, a negative value on error
+ */
+long lkl_mount_dev(unsigned int disk_id, unsigned int part, const char *fs_type,
+		   int flags, const char *opts,
+		   char *mnt_str, unsigned int mnt_str_len);
+
+/**
+ * lkl_umount_dev - umount a disk
+ *
+ * This function unmounts the given disk and removes the device file and the
+ * mount point.
+ *
+ * @disk_id - the disk id identifying the disk to be unmounted
+ * @part - disk partition or zero for full disk
+ * @flags - umount flags
+ * @timeout_ms - timeout to wait for the kernel to flush closed files so that
+ * umount can succeed
+ * @returns - 0 on success, a negative value on error
+ */
+long lkl_umount_dev(unsigned int disk_id, unsigned int part, int flags,
+		    long timeout_ms);
+
+/**
+ * lkl_umount_timeout - umount filesystem with timeout
+ *
+ * @path - the path to unmount
+ * @flags - umount flags
+ * @timeout_ms - timeout to wait for the kernel to flush closed files so that
+ * umount can succeed
+ * @returns - 0 on success, a negative value on error
+ */
+long lkl_umount_timeout(char *path, int flags, long timeout_ms);
+
+/**
+ * lkl_opendir - open a directory
+ *
+ * @path - directory path
+ * @err - pointer to store the error in case of failure
+ * @returns - a handle to be used when calling lkl_readdir
+ */
+struct lkl_dir *lkl_opendir(const char *path, int *err);
+
+/**
+ * lkl_fdopendir - open a directory
+ *
+ * @fd - file descriptor
+ * @err - pointer to store the error in case of failure
+ * @returns - a handle to be used when calling lkl_readdir
+ */
+struct lkl_dir *lkl_fdopendir(int fd, int *err);
+
+/**
+ * lkl_rewinddir - reset directory stream
+ *
+ * @dir - the directory handle as returned by lkl_opendir
+ */
+void lkl_rewinddir(struct lkl_dir *dir);
+
+/**
+ * lkl_closedir - close the directory
+ *
+ * @dir - the directory handle as returned by lkl_opendir
+ */
+int lkl_closedir(struct lkl_dir *dir);
+
+/**
+ * lkl_readdir - get the next available entry of the directory
+ *
+ * @dir - the directory handle as returned by lkl_opendir
+ * @returns - a lkl_linux_dirent64 entry, or NULL if the end of the directory
+ * stream is reached or an error occurred; use lkl_errdir() to distinguish an
+ * error from the end of the directory stream
+ */
+struct lkl_linux_dirent64 *lkl_readdir(struct lkl_dir *dir);
+
+/**
+ * lkl_errdir - checks if an error occurred during the last lkl_readdir call
+ *
+ * @dir - the directory handle as returned by lkl_opendir
+ * @returns - 0 if no error occurred, or a negative value otherwise
+ */
+int lkl_errdir(struct lkl_dir *dir);
+
+/**
+ * lkl_dirfd - gets the file descriptor associated with the directory handle
+ *
+ * @dir - the directory handle as returned by lkl_opendir
+ * @returns - a non-negative value, which is the LKL file descriptor associated
+ * with the directory handle, or a negative value otherwise
+ */
+int lkl_dirfd(struct lkl_dir *dir);
+
+/**
+ * lkl_mount_fs - mount a filesystem type such as proc or sysfs
+ * @fstype - filesystem type, e.g. proc or sysfs
+ * @returns - 0 on success, 1 if already mounted, a negative value on failure
+ */
+int lkl_mount_fs(char *fstype);
 
 #ifdef __cplusplus
 }
diff --git a/tools/lkl/lib/Build b/tools/lkl/lib/Build
index d3154cfa4952..f2ee04366464 100644
--- a/tools/lkl/lib/Build
+++ b/tools/lkl/lib/Build
@@ -1,5 +1,6 @@
 CFLAGS_config.o += -I$(srctree)/tools/perf/pmu-events
 
+liblkl-y += fs.o
 liblkl-y += iomem.o
 liblkl-y += jmp_buf.o
 liblkl-y += utils.o
diff --git a/tools/lkl/lib/fs.c b/tools/lkl/lib/fs.c
new file mode 100644
index 000000000000..c6f197aec3fb
--- /dev/null
+++ b/tools/lkl/lib/fs.c
@@ -0,0 +1,433 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdarg.h>
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <lkl_host.h>
+
+#include "virtio.h"
+
+#define MAX_FSTYPE_LEN 50
+int lkl_mount_fs(char *fstype)
+{
+	char dir[MAX_FSTYPE_LEN+2] = "/";
+	int flags = 0, ret = 0;
+
+	strncat(dir, fstype, MAX_FSTYPE_LEN);
+
+	/* Create with regular umask */
+	ret = lkl_sys_mkdir(dir, 0xff);
+	if (ret && ret != -LKL_EEXIST) {
+		lkl_perror("mount_fs mkdir", ret);
+		return ret;
+	}
+
+	/* We have no use for nonzero flags right now */
+	ret = lkl_sys_mount("none", dir, fstype, flags, NULL);
+	if (ret && ret != -LKL_EBUSY) {
+		lkl_sys_rmdir(dir);
+		return ret;
+	}
+
+	if (ret == -LKL_EBUSY)
+		return 1;
+	return 0;
+}
+
+static uint32_t new_encode_dev(unsigned int major, unsigned int minor)
+{
+	return (minor & 0xff) | (major << 8) | ((minor & ~0xff) << 12);
+}
+
+static int startswith(const char *str, const char *pre)
+{
+	return strncmp(pre, str, strlen(pre)) == 0;
+}
+
+static int get_node_with_prefix(const char *path, const char *prefix,
+				char *result, unsigned int result_len)
+{
+	struct lkl_dir *dir = NULL;
+	struct lkl_linux_dirent64 *dirent;
+	int ret;
+
+	dir = lkl_opendir(path, &ret);
+	if (!dir)
+		return ret;
+
+	ret = -LKL_ENOENT;
+
+	while ((dirent = lkl_readdir(dir))) {
+		if (startswith(dirent->d_name, prefix)) {
+			if (strlen(dirent->d_name) + 1 > result_len) {
+				ret = -LKL_ENOMEM;
+				break;
+			}
+			memcpy(result, dirent->d_name, strlen(dirent->d_name));
+			result[strlen(dirent->d_name)] = '\0';
+			ret = 0;
+			break;
+		}
+	}
+
+	lkl_closedir(dir);
+
+	return ret;
+}
+
+int lkl_encode_dev_from_sysfs(const char *sysfs_path, uint32_t *pdevid)
+{
+	int ret;
+	long fd;
+	int major, minor;
+	char buf[16] = { 0, };
+	char *bufptr;
+
+	fd = lkl_sys_open(sysfs_path, LKL_O_RDONLY, 0);
+	if (fd < 0)
+		return fd;
+
+	ret = lkl_sys_read(fd, buf, sizeof(buf));
+	if (ret < 0)
+		goto out_close;
+
+	if (ret == sizeof(buf)) {
+		ret = -LKL_ENOBUFS;
+		goto out_close;
+	}
+
+	bufptr = strchr(buf, ':');
+	if (bufptr == NULL) {
+		ret = -LKL_EINVAL;
+		goto out_close;
+	}
+	bufptr[0] = '\0';
+	bufptr++;
+
+	major = atoi(buf);
+	minor = atoi(bufptr);
+
+	*pdevid = new_encode_dev(major, minor);
+	ret = 0;
+
+out_close:
+	lkl_sys_close(fd);
+
+	return ret;
+}
+
+#define SYSFS_DEV_VIRTIO_PLATFORM_PATH \
+	"/sysfs/devices/platform/virtio-mmio.%d.auto"
+#define SYSFS_DEV_VIRTIO_CMDLINE_PATH \
+	"/sysfs/devices/virtio-mmio-cmdline/virtio-mmio.%d"
+
+struct abuf {
+	char *mem, *ptr;
+	unsigned int len;
+};
+
+static int snprintf_append(struct abuf *buf, const char *fmt, ...)
+{
+	int ret;
+	va_list args;
+
+	if (!buf->ptr)
+		buf->ptr = buf->mem;
+
+	va_start(args, fmt);
+	ret = vsnprintf(buf->ptr, buf->len - (buf->ptr - buf->mem), fmt, args);
+	va_end(args);
+
+	if (ret < 0 || (ret >= (buf->len - (buf->ptr - buf->mem))))
+		return -LKL_ENOMEM;
+
+	buf->ptr += ret;
+
+	return 0;
+}
+
+int lkl_get_virtio_blkdev(int disk_id, unsigned int part, uint32_t *pdevid)
+{
+	char sysfs_path[LKL_PATH_MAX];
+	char virtio_name[LKL_PATH_MAX];
+	char disk_name[LKL_PATH_MAX];
+	struct abuf sysfs_path_buf = {
+		.mem = sysfs_path,
+		.len = sizeof(sysfs_path),
+	};
+	char *fmt;
+	int ret;
+
+	if (disk_id < 0)
+		return -LKL_EINVAL;
+
+	ret = lkl_mount_fs("sysfs");
+	if (ret < 0)
+		return ret;
+
+	if ((uint32_t) disk_id >= virtio_get_num_bootdevs()) {
+		fmt = SYSFS_DEV_VIRTIO_PLATFORM_PATH;
+		disk_id -= virtio_get_num_bootdevs();
+	} else {
+		fmt = SYSFS_DEV_VIRTIO_CMDLINE_PATH;
+	}
+
+	ret = snprintf_append(&sysfs_path_buf, fmt, disk_id);
+	if (ret)
+		return ret;
+
+	ret = get_node_with_prefix(sysfs_path, "virtio", virtio_name,
+				   sizeof(virtio_name));
+	if (ret)
+		return ret;
+
+	ret = snprintf_append(&sysfs_path_buf, "/%s/block", virtio_name);
+	if (ret)
+		return ret;
+
+	ret = get_node_with_prefix(sysfs_path, "vd", disk_name,
+				   sizeof(disk_name));
+	if (ret)
+		return ret;
+
+	if (!part)
+		ret = snprintf_append(&sysfs_path_buf, "/%s/dev", disk_name);
+	else
+		ret = snprintf_append(&sysfs_path_buf, "/%s/%s%d/dev",
+				      disk_name, disk_name, part);
+	if (ret)
+		return ret;
+
+	return lkl_encode_dev_from_sysfs(sysfs_path, pdevid);
+}
+
+long lkl_mount_dev(unsigned int disk_id, unsigned int part,
+		   const char *fs_type, int flags,
+		   const char *data, char *mnt_str, unsigned int mnt_str_len)
+{
+	char dev_str[] = { "/dev/xxxxxxxx" };
+	unsigned int dev;
+	int err;
+	char _data[4096]; /* FIXME: PAGE_SIZE is not exported by LKL */
+
+	if (mnt_str_len < sizeof(dev_str))
+		return -LKL_ENOMEM;
+
+	err = lkl_get_virtio_blkdev(disk_id, part, &dev);
+	if (err < 0)
+		return err;
+
+	snprintf(dev_str, sizeof(dev_str), "/dev/%08x", dev);
+	snprintf(mnt_str, mnt_str_len, "/mnt/%08x", dev);
+
+	err = lkl_sys_access("/dev", LKL_S_IRWXO);
+	if (err < 0) {
+		if (err == -LKL_ENOENT)
+			err = lkl_sys_mkdir("/dev", 0700);
+		if (err < 0)
+			return err;
+	}
+
+	err = lkl_sys_mknod(dev_str, LKL_S_IFBLK | 0600, dev);
+	if (err < 0)
+		return err;
+
+	err = lkl_sys_access("/mnt", LKL_S_IRWXO);
+	if (err < 0) {
+		if (err == -LKL_ENOENT)
+			err = lkl_sys_mkdir("/mnt", 0700);
+		if (err < 0)
+			return err;
+	}
+
+	err = lkl_sys_mkdir(mnt_str, 0700);
+	if (err < 0) {
+		lkl_sys_unlink(dev_str);
+		return err;
+	}
+
+	/* kernel always copies a full page */
+	if (data) {
+		strncpy(_data, data, sizeof(_data));
+		_data[sizeof(_data) - 1] = 0;
+	} else {
+		_data[0] = 0;
+	}
+
+	err = lkl_sys_mount(dev_str, mnt_str, (char *)fs_type, flags, _data);
+	if (err < 0) {
+		lkl_sys_unlink(dev_str);
+		lkl_sys_rmdir(mnt_str);
+		return err;
+	}
+
+	return 0;
+}
+
+long lkl_umount_timeout(char *path, int flags, long timeout_ms)
+{
+	long incr = 10000000; /* 10 ms */
+	struct lkl_timespec ts = {
+		.tv_sec = 0,
+		.tv_nsec = incr,
+	};
+	long err;
+
+	do {
+		err = lkl_sys_umount(path, flags);
+		if (err == -LKL_EBUSY) {
+			lkl_sys_nanosleep((struct __lkl__kernel_timespec *)&ts,
+					  NULL);
+			timeout_ms -= incr / 1000000;
+		}
+	} while (err == -LKL_EBUSY && timeout_ms > 0);
+
+	return err;
+}
+
+long lkl_umount_dev(unsigned int disk_id, unsigned int part, int flags,
+		    long timeout_ms)
+{
+	char dev_str[] = { "/dev/xxxxxxxx" };
+	char mnt_str[] = { "/mnt/xxxxxxxx" };
+	unsigned int dev;
+	int err;
+
+	err = lkl_get_virtio_blkdev(disk_id, part, &dev);
+	if (err < 0)
+		return err;
+
+	snprintf(dev_str, sizeof(dev_str), "/dev/%08x", dev);
+	snprintf(mnt_str, sizeof(mnt_str), "/mnt/%08x", dev);
+
+	err = lkl_umount_timeout(mnt_str, flags, timeout_ms);
+	if (err)
+		return err;
+
+	err = lkl_sys_unlink(dev_str);
+	if (err)
+		return err;
+
+	return lkl_sys_rmdir(mnt_str);
+}
+
+struct lkl_dir {
+	int fd;
+	char buf[1024];
+	char *pos;
+	int len;
+};
+
+static struct lkl_dir *lkl_dir_alloc(int *err)
+{
+	struct lkl_dir *dir = lkl_host_ops.mem_alloc(sizeof(struct lkl_dir));
+
+	if (!dir) {
+		*err = -LKL_ENOMEM;
+		return NULL;
+	}
+
+	dir->len = 0;
+	dir->pos = NULL;
+
+	return dir;
+}
+
+struct lkl_dir *lkl_opendir(const char *path, int *err)
+{
+	struct lkl_dir *dir = lkl_dir_alloc(err);
+
+	if (!dir) {
+		*err = -LKL_ENOMEM;
+		return NULL;
+	}
+
+	dir->fd = lkl_sys_open(path, LKL_O_RDONLY | LKL_O_DIRECTORY, 0);
+	if (dir->fd < 0) {
+		*err = dir->fd;
+		lkl_host_ops.mem_free(dir);
+		return NULL;
+	}
+
+	*err = 0;
+
+	return dir;
+}
+
+struct lkl_dir *lkl_fdopendir(int fd, int *err)
+{
+	struct lkl_dir *dir = lkl_dir_alloc(err);
+
+	if (!dir)
+		return NULL;
+
+	dir->fd = fd;
+
+	return dir;
+}
+
+void lkl_rewinddir(struct lkl_dir *dir)
+{
+	lkl_sys_lseek(dir->fd, 0, LKL_SEEK_SET);
+	dir->len = 0;
+	dir->pos = NULL;
+}
+
+int lkl_closedir(struct lkl_dir *dir)
+{
+	int ret;
+
+	ret = lkl_sys_close(dir->fd);
+	lkl_host_ops.mem_free(dir);
+
+	return ret;
+}
+
+struct lkl_linux_dirent64 *lkl_readdir(struct lkl_dir *dir)
+{
+	struct lkl_linux_dirent64 *de;
+
+	if (dir->len < 0)
+		return NULL;
+
+	if (!dir->pos || dir->pos - dir->buf >= dir->len)
+		goto read_buf;
+
+return_de:
+	de = (struct lkl_linux_dirent64 *)dir->pos;
+	dir->pos += de->d_reclen;
+
+	return de;
+
+read_buf:
+	dir->pos = NULL;
+	de = (struct lkl_linux_dirent64 *)dir->buf;
+	dir->len = lkl_sys_getdents64(dir->fd, de, sizeof(dir->buf));
+	if (dir->len <= 0)
+		return NULL;
+
+	dir->pos = dir->buf;
+	goto return_de;
+}
+
+int lkl_errdir(struct lkl_dir *dir)
+{
+	if (dir->len >= 0)
+		return 0;
+
+	return dir->len;
+}
+
+int lkl_dirfd(struct lkl_dir *dir)
+{
+	return dir->fd;
+}
+
+int lkl_set_fd_limit(unsigned int fd_limit)
+{
+	struct lkl_rlimit rlim = {
+		.rlim_cur = fd_limit,
+		.rlim_max = fd_limit,
+	};
+	return lkl_sys_setrlimit(LKL_RLIMIT_NOFILE, &rlim);
+}
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 19/37] lkl tools: host lib: filesystem helpers
@ 2019-11-08  5:02     ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: linux-arch, Conrad Meyer, Octavian Purdila, Akira Moroo,
	Yuan Liu, linux-kernel-library, Michael Zimmermann,
	Hajime Tazaki

From: Octavian Purdila <tavi.purdila@gmail.com>

Add LKL application APIs to mount and unmount a filesystem from a
disk added via lkl_disk_add().

Also add open/close/read directory wrappers on top of
lkl_sys_getdents64.

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/include/lkl.h | 139 +++++++++++++
 tools/lkl/lib/Build     |   1 +
 tools/lkl/lib/fs.c      | 433 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 573 insertions(+)
 create mode 100644 tools/lkl/lib/fs.c

diff --git a/tools/lkl/include/lkl.h b/tools/lkl/include/lkl.h
index 967fbe4dbc26..8bda12d4c6de 100644
--- a/tools/lkl/include/lkl.h
+++ b/tools/lkl/include/lkl.h
@@ -389,6 +389,145 @@ int lkl_disk_add(struct lkl_disk *disk);
  */
 int lkl_disk_remove(struct lkl_disk disk);
 
+/**
+ * lkl_encode_dev_from_sysfs - extract device id from sysfs
+ *
+ * This function returns the device id for the given sysfs dev node.
+ * The content of the node has to be in the form 'MAJOR:MINOR'.
+ * Also, this function expects an absolute path which means that sysfs
+ * already has to be mounted at the given path
+ *
+ * @sysfs_path - absolute path to the sysfs dev node
+ * @pdevid - pointer to memory where dev id will be returned
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_encode_dev_from_sysfs(const char *sysfs_path, uint32_t *pdevid);
+
+/**
+ * lkl_get_virtio_blkdev - get device id of a disk (partition)
+ *
+ * This function returns the device id for the given disk.
+ *
+ * @disk_id - the disk id identifying the disk
+ * @part - disk partition or zero for full disk
+ * @pdevid - pointer to memory where dev id will be returned
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_get_virtio_blkdev(int disk_id, unsigned int part, uint32_t *pdevid);
+
+
+/**
+ * lkl_mount_dev - mount a disk
+ *
+ * This function creates a device file for the given disk, creates a mount
+ * point and mounts the device over the mount point.
+ *
+ * @disk_id - the disk id identifying the disk to be mounted
+ * @part - disk partition or zero for full disk
+ * @fs_type - filesystem type
+ * @flags - mount flags
+ * @opts - additional filesystem specific mount options
+ * @mnt_str - a string that will be filled by this function with the path where
+ * the filesystem has been mounted
+ * @mnt_str_len - size of mnt_str
+ * @returns - 0 on success, a negative value on error
+ */
+long lkl_mount_dev(unsigned int disk_id, unsigned int part, const char *fs_type,
+		   int flags, const char *opts,
+		   char *mnt_str, unsigned int mnt_str_len);
+
+/**
+ * lkl_umount_dev - umount a disk
+ *
+ * This function unmounts the given disk and removes the device file and the
+ * mount point.
+ *
+ * @disk_id - the disk id identifying the disk to be unmounted
+ * @part - disk partition or zero for full disk
+ * @flags - umount flags
+ * @timeout_ms - timeout to wait for the kernel to flush closed files so that
+ * umount can succeed
+ * @returns - 0 on success, a negative value on error
+ */
+long lkl_umount_dev(unsigned int disk_id, unsigned int part, int flags,
+		    long timeout_ms);
+
+/**
+ * lkl_umount_timeout - umount filesystem with timeout
+ *
+ * @path - the path to unmount
+ * @flags - umount flags
+ * @timeout_ms - timeout to wait for the kernel to flush closed files so that
+ * umount can succeed
+ * @returns - 0 on success, a negative value on error
+ */
+long lkl_umount_timeout(char *path, int flags, long timeout_ms);
+
+/**
+ * lkl_opendir - open a directory
+ *
+ * @path - directory path
+ * @err - pointer to store the error in case of failure
+ * @returns - a handle to be used when calling lkl_readdir
+ */
+struct lkl_dir *lkl_opendir(const char *path, int *err);
+
+/**
+ * lkl_fdopendir - open a directory
+ *
+ * @fd - file descriptor
+ * @err - pointer to store the error in case of failure
+ * @returns - a handle to be used when calling lkl_readdir
+ */
+struct lkl_dir *lkl_fdopendir(int fd, int *err);
+
+/**
+ * lkl_rewinddir - reset directory stream
+ *
+ * @dir - the directory handle as returned by lkl_opendir
+ */
+void lkl_rewinddir(struct lkl_dir *dir);
+
+/**
+ * lkl_closedir - close the directory
+ *
+ * @dir - the directory handle as returned by lkl_opendir
+ */
+int lkl_closedir(struct lkl_dir *dir);
+
+/**
+ * lkl_readdir - get the next available entry of the directory
+ *
+ * @dir - the directory handle as returned by lkl_opendir
+ * @returns - a lkl_linux_dirent64 entry, or NULL if the end of the directory
+ * stream is reached or an error occurred; use lkl_errdir() to distinguish an
+ * error from the end of the directory stream
+ */
+struct lkl_linux_dirent64 *lkl_readdir(struct lkl_dir *dir);
+
+/**
+ * lkl_errdir - checks if an error occurred during the last lkl_readdir call
+ *
+ * @dir - the directory handle as returned by lkl_opendir
+ * @returns - 0 if no error occurred, or a negative value otherwise
+ */
+int lkl_errdir(struct lkl_dir *dir);
+
+/**
+ * lkl_dirfd - gets the file descriptor associated with the directory handle
+ *
+ * @dir - the directory handle as returned by lkl_opendir
+ * @returns - a non-negative value, which is the LKL file descriptor associated
+ * with the directory handle, or a negative value otherwise
+ */
+int lkl_dirfd(struct lkl_dir *dir);
+
+/**
+ * lkl_mount_fs - mount a filesystem type such as proc or sysfs
+ * @fstype - filesystem type, e.g. proc or sysfs
+ * @returns - 0 on success, 1 if already mounted, a negative value on failure
+ */
+int lkl_mount_fs(char *fstype);
 
 #ifdef __cplusplus
 }
diff --git a/tools/lkl/lib/Build b/tools/lkl/lib/Build
index d3154cfa4952..f2ee04366464 100644
--- a/tools/lkl/lib/Build
+++ b/tools/lkl/lib/Build
@@ -1,5 +1,6 @@
 CFLAGS_config.o += -I$(srctree)/tools/perf/pmu-events
 
+liblkl-y += fs.o
 liblkl-y += iomem.o
 liblkl-y += jmp_buf.o
 liblkl-y += utils.o
diff --git a/tools/lkl/lib/fs.c b/tools/lkl/lib/fs.c
new file mode 100644
index 000000000000..c6f197aec3fb
--- /dev/null
+++ b/tools/lkl/lib/fs.c
@@ -0,0 +1,433 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdarg.h>
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <lkl_host.h>
+
+#include "virtio.h"
+
+#define MAX_FSTYPE_LEN 50
+int lkl_mount_fs(char *fstype)
+{
+	char dir[MAX_FSTYPE_LEN+2] = "/";
+	int flags = 0, ret = 0;
+
+	strncat(dir, fstype, MAX_FSTYPE_LEN);
+
+	/* Create with regular umask */
+	ret = lkl_sys_mkdir(dir, 0xff);
+	if (ret && ret != -LKL_EEXIST) {
+		lkl_perror("mount_fs mkdir", ret);
+		return ret;
+	}
+
+	/* We have no use for nonzero flags right now */
+	ret = lkl_sys_mount("none", dir, fstype, flags, NULL);
+	if (ret && ret != -LKL_EBUSY) {
+		lkl_sys_rmdir(dir);
+		return ret;
+	}
+
+	if (ret == -LKL_EBUSY)
+		return 1;
+	return 0;
+}
+
+static uint32_t new_encode_dev(unsigned int major, unsigned int minor)
+{
+	return (minor & 0xff) | (major << 8) | ((minor & ~0xff) << 12);
+}
+
+static int startswith(const char *str, const char *pre)
+{
+	return strncmp(pre, str, strlen(pre)) == 0;
+}
+
+static int get_node_with_prefix(const char *path, const char *prefix,
+				char *result, unsigned int result_len)
+{
+	struct lkl_dir *dir = NULL;
+	struct lkl_linux_dirent64 *dirent;
+	int ret;
+
+	dir = lkl_opendir(path, &ret);
+	if (!dir)
+		return ret;
+
+	ret = -LKL_ENOENT;
+
+	while ((dirent = lkl_readdir(dir))) {
+		if (startswith(dirent->d_name, prefix)) {
+			if (strlen(dirent->d_name) + 1 > result_len) {
+				ret = -LKL_ENOMEM;
+				break;
+			}
+			memcpy(result, dirent->d_name, strlen(dirent->d_name));
+			result[strlen(dirent->d_name)] = '\0';
+			ret = 0;
+			break;
+		}
+	}
+
+	lkl_closedir(dir);
+
+	return ret;
+}
+
+int lkl_encode_dev_from_sysfs(const char *sysfs_path, uint32_t *pdevid)
+{
+	int ret;
+	long fd;
+	int major, minor;
+	char buf[16] = { 0, };
+	char *bufptr;
+
+	fd = lkl_sys_open(sysfs_path, LKL_O_RDONLY, 0);
+	if (fd < 0)
+		return fd;
+
+	ret = lkl_sys_read(fd, buf, sizeof(buf));
+	if (ret < 0)
+		goto out_close;
+
+	if (ret == sizeof(buf)) {
+		ret = -LKL_ENOBUFS;
+		goto out_close;
+	}
+
+	bufptr = strchr(buf, ':');
+	if (bufptr == NULL) {
+		ret = -LKL_EINVAL;
+		goto out_close;
+	}
+	bufptr[0] = '\0';
+	bufptr++;
+
+	major = atoi(buf);
+	minor = atoi(bufptr);
+
+	*pdevid = new_encode_dev(major, minor);
+	ret = 0;
+
+out_close:
+	lkl_sys_close(fd);
+
+	return ret;
+}
+
+#define SYSFS_DEV_VIRTIO_PLATFORM_PATH \
+	"/sysfs/devices/platform/virtio-mmio.%d.auto"
+#define SYSFS_DEV_VIRTIO_CMDLINE_PATH \
+	"/sysfs/devices/virtio-mmio-cmdline/virtio-mmio.%d"
+
+struct abuf {
+	char *mem, *ptr;
+	unsigned int len;
+};
+
+static int snprintf_append(struct abuf *buf, const char *fmt, ...)
+{
+	int ret;
+	va_list args;
+
+	if (!buf->ptr)
+		buf->ptr = buf->mem;
+
+	va_start(args, fmt);
+	ret = vsnprintf(buf->ptr, buf->len - (buf->ptr - buf->mem), fmt, args);
+	va_end(args);
+
+	if (ret < 0 || (ret >= (buf->len - (buf->ptr - buf->mem))))
+		return -LKL_ENOMEM;
+
+	buf->ptr += ret;
+
+	return 0;
+}
+
+int lkl_get_virtio_blkdev(int disk_id, unsigned int part, uint32_t *pdevid)
+{
+	char sysfs_path[LKL_PATH_MAX];
+	char virtio_name[LKL_PATH_MAX];
+	char disk_name[LKL_PATH_MAX];
+	struct abuf sysfs_path_buf = {
+		.mem = sysfs_path,
+		.len = sizeof(sysfs_path),
+	};
+	char *fmt;
+	int ret;
+
+	if (disk_id < 0)
+		return -LKL_EINVAL;
+
+	ret = lkl_mount_fs("sysfs");
+	if (ret < 0)
+		return ret;
+
+	if ((uint32_t) disk_id >= virtio_get_num_bootdevs()) {
+		fmt = SYSFS_DEV_VIRTIO_PLATFORM_PATH;
+		disk_id -= virtio_get_num_bootdevs();
+	} else {
+		fmt = SYSFS_DEV_VIRTIO_CMDLINE_PATH;
+	}
+
+	ret = snprintf_append(&sysfs_path_buf, fmt, disk_id);
+	if (ret)
+		return ret;
+
+	ret = get_node_with_prefix(sysfs_path, "virtio", virtio_name,
+				   sizeof(virtio_name));
+	if (ret)
+		return ret;
+
+	ret = snprintf_append(&sysfs_path_buf, "/%s/block", virtio_name);
+	if (ret)
+		return ret;
+
+	ret = get_node_with_prefix(sysfs_path, "vd", disk_name,
+				   sizeof(disk_name));
+	if (ret)
+		return ret;
+
+	if (!part)
+		ret = snprintf_append(&sysfs_path_buf, "/%s/dev", disk_name);
+	else
+		ret = snprintf_append(&sysfs_path_buf, "/%s/%s%d/dev",
+				      disk_name, disk_name, part);
+	if (ret)
+		return ret;
+
+	return lkl_encode_dev_from_sysfs(sysfs_path, pdevid);
+}
+
+long lkl_mount_dev(unsigned int disk_id, unsigned int part,
+		   const char *fs_type, int flags,
+		   const char *data, char *mnt_str, unsigned int mnt_str_len)
+{
+	char dev_str[] = { "/dev/xxxxxxxx" };
+	unsigned int dev;
+	int err;
+	char _data[4096]; /* FIXME: PAGE_SIZE is not exported by LKL */
+
+	if (mnt_str_len < sizeof(dev_str))
+		return -LKL_ENOMEM;
+
+	err = lkl_get_virtio_blkdev(disk_id, part, &dev);
+	if (err < 0)
+		return err;
+
+	snprintf(dev_str, sizeof(dev_str), "/dev/%08x", dev);
+	snprintf(mnt_str, mnt_str_len, "/mnt/%08x", dev);
+
+	err = lkl_sys_access("/dev", LKL_S_IRWXO);
+	if (err < 0) {
+		if (err == -LKL_ENOENT)
+			err = lkl_sys_mkdir("/dev", 0700);
+		if (err < 0)
+			return err;
+	}
+
+	err = lkl_sys_mknod(dev_str, LKL_S_IFBLK | 0600, dev);
+	if (err < 0)
+		return err;
+
+	err = lkl_sys_access("/mnt", LKL_S_IRWXO);
+	if (err < 0) {
+		if (err == -LKL_ENOENT)
+			err = lkl_sys_mkdir("/mnt", 0700);
+		if (err < 0)
+			return err;
+	}
+
+	err = lkl_sys_mkdir(mnt_str, 0700);
+	if (err < 0) {
+		lkl_sys_unlink(dev_str);
+		return err;
+	}
+
+	/* kernel always copies a full page */
+	if (data) {
+		strncpy(_data, data, sizeof(_data));
+		_data[sizeof(_data) - 1] = 0;
+	} else {
+		_data[0] = 0;
+	}
+
+	err = lkl_sys_mount(dev_str, mnt_str, (char *)fs_type, flags, _data);
+	if (err < 0) {
+		lkl_sys_unlink(dev_str);
+		lkl_sys_rmdir(mnt_str);
+		return err;
+	}
+
+	return 0;
+}
+
+long lkl_umount_timeout(char *path, int flags, long timeout_ms)
+{
+	long incr = 10000000; /* 10 ms */
+	struct lkl_timespec ts = {
+		.tv_sec = 0,
+		.tv_nsec = incr,
+	};
+	long err;
+
+	do {
+		err = lkl_sys_umount(path, flags);
+		if (err == -LKL_EBUSY) {
+			lkl_sys_nanosleep((struct __lkl__kernel_timespec *)&ts,
+					  NULL);
+			timeout_ms -= incr / 1000000;
+		}
+	} while (err == -LKL_EBUSY && timeout_ms > 0);
+
+	return err;
+}
+
+long lkl_umount_dev(unsigned int disk_id, unsigned int part, int flags,
+		    long timeout_ms)
+{
+	char dev_str[] = { "/dev/xxxxxxxx" };
+	char mnt_str[] = { "/mnt/xxxxxxxx" };
+	unsigned int dev;
+	int err;
+
+	err = lkl_get_virtio_blkdev(disk_id, part, &dev);
+	if (err < 0)
+		return err;
+
+	snprintf(dev_str, sizeof(dev_str), "/dev/%08x", dev);
+	snprintf(mnt_str, sizeof(mnt_str), "/mnt/%08x", dev);
+
+	err = lkl_umount_timeout(mnt_str, flags, timeout_ms);
+	if (err)
+		return err;
+
+	err = lkl_sys_unlink(dev_str);
+	if (err)
+		return err;
+
+	return lkl_sys_rmdir(mnt_str);
+}
+
+struct lkl_dir {
+	int fd;
+	char buf[1024];
+	char *pos;
+	int len;
+};
+
+static struct lkl_dir *lkl_dir_alloc(int *err)
+{
+	struct lkl_dir *dir = lkl_host_ops.mem_alloc(sizeof(struct lkl_dir));
+
+	if (!dir) {
+		*err = -LKL_ENOMEM;
+		return NULL;
+	}
+
+	dir->len = 0;
+	dir->pos = NULL;
+
+	return dir;
+}
+
+struct lkl_dir *lkl_opendir(const char *path, int *err)
+{
+	struct lkl_dir *dir = lkl_dir_alloc(err);
+
+	if (!dir) {
+		*err = -LKL_ENOMEM;
+		return NULL;
+	}
+
+	dir->fd = lkl_sys_open(path, LKL_O_RDONLY | LKL_O_DIRECTORY, 0);
+	if (dir->fd < 0) {
+		*err = dir->fd;
+		lkl_host_ops.mem_free(dir);
+		return NULL;
+	}
+
+	*err = 0;
+
+	return dir;
+}
+
+struct lkl_dir *lkl_fdopendir(int fd, int *err)
+{
+	struct lkl_dir *dir = lkl_dir_alloc(err);
+
+	if (!dir)
+		return NULL;
+
+	dir->fd = fd;
+	*err = 0;
+	return dir;
+}
+
+void lkl_rewinddir(struct lkl_dir *dir)
+{
+	lkl_sys_lseek(dir->fd, 0, LKL_SEEK_SET);
+	dir->len = 0;
+	dir->pos = NULL;
+}
+
+int lkl_closedir(struct lkl_dir *dir)
+{
+	int ret;
+
+	ret = lkl_sys_close(dir->fd);
+	lkl_host_ops.mem_free(dir);
+
+	return ret;
+}
+
+struct lkl_linux_dirent64 *lkl_readdir(struct lkl_dir *dir)
+{
+	struct lkl_linux_dirent64 *de;
+
+	if (dir->len < 0)
+		return NULL;
+
+	if (!dir->pos || dir->pos - dir->buf >= dir->len)
+		goto read_buf;
+
+return_de:
+	de = (struct lkl_linux_dirent64 *)dir->pos;
+	dir->pos += de->d_reclen;
+
+	return de;
+
+read_buf:
+	dir->pos = NULL;
+	de = (struct lkl_linux_dirent64 *)dir->buf;
+	dir->len = lkl_sys_getdents64(dir->fd, de, sizeof(dir->buf));
+	if (dir->len <= 0)
+		return NULL;
+
+	dir->pos = dir->buf;
+	goto return_de;
+}
+
+int lkl_errdir(struct lkl_dir *dir)
+{
+	if (dir->len >= 0)
+		return 0;
+
+	return dir->len;
+}
+
+int lkl_dirfd(struct lkl_dir *dir)
+{
+	return dir->fd;
+}
+
+int lkl_set_fd_limit(unsigned int fd_limit)
+{
+	struct lkl_rlimit rlim = {
+		.rlim_cur = fd_limit,
+		.rlim_max = fd_limit,
+	};
+	return lkl_sys_setrlimit(LKL_RLIMIT_NOFILE, &rlim);
+}
-- 
2.20.1 (Apple Git-117)


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um


^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 20/37] lkl tools: host lib: posix host operations
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Conrad Meyer, Hajime Tazaki, Mark Stillwell, Patrick Collins,
	Pierre-Hugues Husson, Thomas Liebetraut, Yuan Liu

From: Octavian Purdila <tavi.purdila@gmail.com>

Implement LKL host operations for POSIX hosts.

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Mark Stillwell <mark@stillwell.me>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Pierre-Hugues Husson <phh@phh.me>
Signed-off-by: Thomas Liebetraut <thomas@tommie-lie.de>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
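Note for readers: on hosts that do not define _POSIX_SEMAPHORES, the
code below falls back to a counting semaphore built from a mutex and a
condition variable. A minimal standalone sketch of that technique
follows. The names csem and csem_* are hypothetical and exist only for
this illustration; they are not part of the patch or of the LKL API.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical names (csem, csem_*): illustration only, not LKL API. */
struct csem {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	int count;
};

/* Returns 0 on success or a positive pthread error number. */
static int csem_init(struct csem *s, int count)
{
	int ret;

	assert(count >= 0);

	ret = pthread_mutex_init(&s->lock, NULL);
	if (ret)
		return ret;

	ret = pthread_cond_init(&s->cond, NULL);
	if (ret) {
		pthread_mutex_destroy(&s->lock);
		return ret;
	}

	s->count = count;
	return 0;
}

/* Release one unit and wake one waiter, if any. */
static void csem_up(struct csem *s)
{
	pthread_mutex_lock(&s->lock);
	s->count++;
	pthread_cond_signal(&s->cond);
	pthread_mutex_unlock(&s->lock);
}

/* Take one unit, blocking while none is available. */
static void csem_down(struct csem *s)
{
	pthread_mutex_lock(&s->lock);
	/* Re-check in a loop: pthread_cond_wait() may wake spuriously. */
	while (s->count <= 0)
		pthread_cond_wait(&s->cond, &s->lock);
	s->count--;
	pthread_mutex_unlock(&s->lock);
}

static void csem_destroy(struct csem *s)
{
	pthread_cond_destroy(&s->cond);
	pthread_mutex_destroy(&s->lock);
}
```

The count is only read or written with the mutex held, so the condition
variable never misses a wakeup between the check and the wait.
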
 Makefile                   |   2 +
 tools/lkl/lib/Build        |   2 +
 tools/lkl/lib/posix-host.c | 435 +++++++++++++++++++++++++++++++++++++
 3 files changed, 439 insertions(+)
 create mode 100644 tools/lkl/lib/posix-host.c

diff --git a/Makefile b/Makefile
index 0cbe8717bdb3..874c0aec0f9c 100644
--- a/Makefile
+++ b/Makefile
@@ -1123,7 +1123,9 @@ archprepare: archheaders archscripts scripts prepare3 outputmakefile \
 	asm-generic $(version_h) $(autoksyms_h) include/generated/utsrelease.h
 
 prepare0: archprepare
+ifeq ($(findstring elf,$(if $(CONFIG_OUTPUT_FORMAT),$(CONFIG_OUTPUT_FORMAT),elf)),elf)
 	$(Q)$(MAKE) $(build)=scripts/mod
+endif
 	$(Q)$(MAKE) $(build)=.
 
 # All the preparing..
diff --git a/tools/lkl/lib/Build b/tools/lkl/lib/Build
index f2ee04366464..a7a3bff27bb1 100644
--- a/tools/lkl/lib/Build
+++ b/tools/lkl/lib/Build
@@ -1,8 +1,10 @@
 CFLAGS_config.o += -I$(srctree)/tools/perf/pmu-events
+CFLAGS_posix-host.o += -D_FILE_OFFSET_BITS=64
 
 liblkl-y += fs.o
 liblkl-y += iomem.o
 liblkl-y += jmp_buf.o
+liblkl-$(LKL_HOST_CONFIG_POSIX) += posix-host.o
 liblkl-y += utils.o
 liblkl-y += virtio_blk.o
 liblkl-y += virtio.o
diff --git a/tools/lkl/lib/posix-host.c b/tools/lkl/lib/posix-host.c
new file mode 100644
index 000000000000..c2b579433b12
--- /dev/null
+++ b/tools/lkl/lib/posix-host.c
@@ -0,0 +1,435 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <pthread.h>
+#include <stdlib.h>
+#include <sys/time.h>
+#include <time.h>
+#include <signal.h>
+#include <assert.h>
+#include <unistd.h>
+#include <errno.h>
+#include <string.h>
+#include <time.h>
+#include <stdint.h>
+#include <sys/uio.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/syscall.h>
+#include <poll.h>
+#include <lkl_host.h>
+#include "iomem.h"
+#include "jmp_buf.h"
+
+/* Let's see if the host has semaphore.h */
+#include <unistd.h>
+
+#ifdef _POSIX_SEMAPHORES
+#include <semaphore.h>
+/* TODO(pscollins): We don't support fork() for now, but maybe one day
+ * we will?
+ */
+#define SHARE_SEM 0
+#endif /* _POSIX_SEMAPHORES */
+
+static void print(const char *str, int len)
+{
+	int ret __attribute__((unused));
+
+	ret = write(STDOUT_FILENO, str, len);
+}
+
+struct lkl_mutex {
+	pthread_mutex_t mutex;
+};
+
+struct lkl_sem {
+#ifdef _POSIX_SEMAPHORES
+	sem_t sem;
+#else
+	pthread_mutex_t lock;
+	int count;
+	pthread_cond_t cond;
+#endif /* _POSIX_SEMAPHORES */
+};
+
+struct lkl_tls_key {
+	pthread_key_t key;
+};
+
+#define WARN_UNLESS(exp) do {						\
+		if ((exp) < 0)						\
+			lkl_printf("%s: %s\n", #exp, strerror(errno));	\
+	} while (0)
+
+static int _warn_pthread(int ret, char *str_exp)
+{
+	if (ret > 0)
+		lkl_printf("%s: %s\n", str_exp, strerror(ret));
+
+	return ret;
+}
+
+
+/* pthread_* functions return the error number instead of setting errno */
+#define WARN_PTHREAD(exp) _warn_pthread(exp, #exp)
+
+static struct lkl_sem *sem_alloc(int count)
+{
+	struct lkl_sem *sem;
+
+	sem = malloc(sizeof(*sem));
+	if (!sem)
+		return NULL;
+
+#ifdef _POSIX_SEMAPHORES
+	if (sem_init(&sem->sem, SHARE_SEM, count) < 0) {
+		lkl_printf("sem_init: %s\n", strerror(errno));
+		free(sem);
+		return NULL;
+	}
+#else
+	pthread_mutex_init(&sem->lock, NULL);
+	sem->count = count;
+	WARN_PTHREAD(pthread_cond_init(&sem->cond, NULL));
+#endif /* _POSIX_SEMAPHORES */
+
+	return sem;
+}
+
+static void sem_free(struct lkl_sem *sem)
+{
+#ifdef _POSIX_SEMAPHORES
+	WARN_UNLESS(sem_destroy(&sem->sem));
+#else
+	WARN_PTHREAD(pthread_cond_destroy(&sem->cond));
+	WARN_PTHREAD(pthread_mutex_destroy(&sem->lock));
+#endif /* _POSIX_SEMAPHORES */
+	free(sem);
+}
+
+static void sem_up(struct lkl_sem *sem)
+{
+#ifdef _POSIX_SEMAPHORES
+	WARN_UNLESS(sem_post(&sem->sem));
+#else
+	WARN_PTHREAD(pthread_mutex_lock(&sem->lock));
+	sem->count++;
+	if (sem->count > 0)
+		WARN_PTHREAD(pthread_cond_signal(&sem->cond));
+	WARN_PTHREAD(pthread_mutex_unlock(&sem->lock));
+#endif /* _POSIX_SEMAPHORES */
+
+}
+
+static void sem_down(struct lkl_sem *sem)
+{
+#ifdef _POSIX_SEMAPHORES
+	int err;
+
+	do {
+		err = sem_wait(&sem->sem);
+	} while (err < 0 && errno == EINTR);
+	if (err < 0 && errno != EINTR)
+		lkl_printf("sem_wait: %s\n", strerror(errno));
+#else
+	WARN_PTHREAD(pthread_mutex_lock(&sem->lock));
+	while (sem->count <= 0)
+		WARN_PTHREAD(pthread_cond_wait(&sem->cond, &sem->lock));
+	sem->count--;
+	WARN_PTHREAD(pthread_mutex_unlock(&sem->lock));
+#endif /* _POSIX_SEMAPHORES */
+}
+
+static struct lkl_mutex *mutex_alloc(int recursive)
+{
+	struct lkl_mutex *_mutex = malloc(sizeof(struct lkl_mutex));
+	pthread_mutex_t *mutex = NULL;
+	pthread_mutexattr_t attr;
+
+	if (!_mutex)
+		return NULL;
+
+	mutex = &_mutex->mutex;
+	WARN_PTHREAD(pthread_mutexattr_init(&attr));
+
+	/* PTHREAD_MUTEX_ERRORCHECK is *very* useful for debugging,
+	 * but has some overhead, so we provide an option to turn it
+	 * off.
+	 */
+#ifdef DEBUG
+	if (!recursive)
+		WARN_PTHREAD(pthread_mutexattr_settype(
+				     &attr, PTHREAD_MUTEX_ERRORCHECK));
+#endif /* DEBUG */
+
+	if (recursive)
+		WARN_PTHREAD(pthread_mutexattr_settype(
+				     &attr, PTHREAD_MUTEX_RECURSIVE));
+
+	WARN_PTHREAD(pthread_mutex_init(mutex, &attr));
+
+	return _mutex;
+}
+
+static void mutex_lock(struct lkl_mutex *mutex)
+{
+	WARN_PTHREAD(pthread_mutex_lock(&mutex->mutex));
+}
+
+static void mutex_unlock(struct lkl_mutex *_mutex)
+{
+	pthread_mutex_t *mutex = &_mutex->mutex;
+
+	WARN_PTHREAD(pthread_mutex_unlock(mutex));
+}
+
+static void mutex_free(struct lkl_mutex *_mutex)
+{
+	pthread_mutex_t *mutex = &_mutex->mutex;
+
+	WARN_PTHREAD(pthread_mutex_destroy(mutex));
+	free(_mutex);
+}
+
+static lkl_thread_t thread_create(void (*fn)(void *), void *arg)
+{
+	pthread_t thread;
+
+	if (WARN_PTHREAD(pthread_create(&thread, NULL, (void* (*)(void *))fn,
+					arg)))
+		return 0;
+	else
+		return (lkl_thread_t) thread;
+}
+
+static void thread_detach(void)
+{
+	WARN_PTHREAD(pthread_detach(pthread_self()));
+}
+
+static void thread_exit(void)
+{
+	pthread_exit(NULL);
+}
+
+static int thread_join(lkl_thread_t tid)
+{
+	if (WARN_PTHREAD(pthread_join((pthread_t)tid, NULL)))
+		return (-1);
+	else
+		return 0;
+}
+
+static lkl_thread_t thread_self(void)
+{
+	return (lkl_thread_t)pthread_self();
+}
+
+static int thread_equal(lkl_thread_t a, lkl_thread_t b)
+{
+	return pthread_equal((pthread_t)a, (pthread_t)b);
+}
+
+static struct lkl_tls_key *tls_alloc(void (*destructor)(void *))
+{
+	struct lkl_tls_key *ret = malloc(sizeof(struct lkl_tls_key));
+
+	if (!ret || WARN_PTHREAD(pthread_key_create(&ret->key, destructor))) {
+		free(ret);
+		return NULL;
+	}
+	return ret;
+}
+
+static void tls_free(struct lkl_tls_key *key)
+{
+	WARN_PTHREAD(pthread_key_delete(key->key));
+	free(key);
+}
+
+static int tls_set(struct lkl_tls_key *key, void *data)
+{
+	if (WARN_PTHREAD(pthread_setspecific(key->key, data)))
+		return (-1);
+	return 0;
+}
+
+static void *tls_get(struct lkl_tls_key *key)
+{
+	return pthread_getspecific(key->key);
+}
+
+static unsigned long long time_ns(void)
+{
+	struct timespec ts;
+
+	clock_gettime(CLOCK_MONOTONIC, &ts);
+
+	return 1000000000ULL * ts.tv_sec + ts.tv_nsec;
+}
+
+static void *timer_alloc(void (*fn)(void *), void *arg)
+{
+	int err;
+	timer_t timer;
+	struct sigevent se =  {
+		.sigev_notify = SIGEV_THREAD,
+		.sigev_value = {
+			.sival_ptr = arg,
+		},
+		.sigev_notify_function = (void (*)(union sigval))fn,
+	};
+
+	err = timer_create(CLOCK_REALTIME, &se, &timer);
+	if (err)
+		return NULL;
+
+	return (void *)(long)timer;
+}
+
+static int timer_set_oneshot(void *_timer, unsigned long ns)
+{
+	timer_t timer = (timer_t)(long)_timer;
+	struct itimerspec ts = {
+		.it_value = {
+			.tv_sec = ns / 1000000000,
+			.tv_nsec = ns % 1000000000,
+		},
+	};
+
+	return timer_settime(timer, 0, &ts, NULL);
+}
+
+static void timer_free(void *_timer)
+{
+	timer_t timer = (timer_t)(long)_timer;
+
+	timer_delete(timer);
+}
+
+static void panic(void)
+{
+	assert(0);
+}
+
+static long _gettid(void)
+{
+#ifdef	__FreeBSD__
+	return (long)pthread_self();
+#else
+	return syscall(SYS_gettid);
+#endif
+}
+
+struct lkl_host_operations lkl_host_ops = {
+	.panic = panic,
+	.thread_create = thread_create,
+	.thread_detach = thread_detach,
+	.thread_exit = thread_exit,
+	.thread_join = thread_join,
+	.thread_self = thread_self,
+	.thread_equal = thread_equal,
+	.sem_alloc = sem_alloc,
+	.sem_free = sem_free,
+	.sem_up = sem_up,
+	.sem_down = sem_down,
+	.mutex_alloc = mutex_alloc,
+	.mutex_free = mutex_free,
+	.mutex_lock = mutex_lock,
+	.mutex_unlock = mutex_unlock,
+	.tls_alloc = tls_alloc,
+	.tls_free = tls_free,
+	.tls_set = tls_set,
+	.tls_get = tls_get,
+	.time = time_ns,
+	.timer_alloc = timer_alloc,
+	.timer_set_oneshot = timer_set_oneshot,
+	.timer_free = timer_free,
+	.print = print,
+	.mem_alloc = malloc,
+	.mem_free = free,
+	.ioremap = lkl_ioremap,
+	.iomem_access = lkl_iomem_access,
+	.virtio_devices = lkl_virtio_devs,
+	.gettid = _gettid,
+	.jmp_buf_set = jmp_buf_set,
+	.jmp_buf_longjmp = jmp_buf_longjmp,
+};
+
+static int fd_get_capacity(struct lkl_disk disk, unsigned long long *res)
+{
+	off_t off;
+
+	off = lseek(disk.fd, 0, SEEK_END);
+	if (off < 0)
+		return (-1);
+
+	*res = off;
+	return 0;
+}
+
+static int do_rw(ssize_t (*fn)(), struct lkl_disk disk, struct lkl_blk_req *req)
+{
+	off_t off = req->sector * 512;
+	void *addr;
+	int len;
+	int i;
+	int ret = 0;
+
+	for (i = 0; i < req->count; i++) {
+
+		addr = req->buf[i].iov_base;
+		len = req->buf[i].iov_len;
+
+		do {
+			ret = fn(disk.fd, addr, len, off);
+
+			if (ret <= 0) {
+				ret = -1;
+				goto out;
+			}
+
+			addr += ret;
+			len -= ret;
+			off += ret;
+
+		} while (len);
+	}
+
+out:
+	return ret;
+}
+
+static int blk_request(struct lkl_disk disk, struct lkl_blk_req *req)
+{
+	int err = 0;
+
+	switch (req->type) {
+	case LKL_DEV_BLK_TYPE_READ:
+		err = do_rw(pread, disk, req);
+		break;
+	case LKL_DEV_BLK_TYPE_WRITE:
+		err = do_rw(pwrite, disk, req);
+		break;
+	case LKL_DEV_BLK_TYPE_FLUSH:
+	case LKL_DEV_BLK_TYPE_FLUSH_OUT:
+#ifdef __linux__
+		err = fdatasync(disk.fd);
+#else
+		err = fsync(disk.fd);
+#endif
+		break;
+	default:
+		return LKL_DEV_BLK_STATUS_UNSUP;
+	}
+
+	if (err < 0)
+		return LKL_DEV_BLK_STATUS_IOERR;
+
+	return LKL_DEV_BLK_STATUS_OK;
+}
+
+struct lkl_dev_blk_ops lkl_dev_blk_ops = {
+	.get_capacity = fd_get_capacity,
+	.request = blk_request,
+};
+
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 21/37] lkl tools: "boot" test
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Conrad Meyer, David Disseldorp, Hajime Tazaki, Lai Jiangshan,
	Luca Dariz, Mark Stillwell, Motomu Utsumi, Patrick Collins,
	Petros Angelatos, Thomas Liebetraut, Yuan Liu

From: Octavian Purdila <tavi.purdila@gmail.com>

Add a simple LKL test application that starts the kernel and performs
simple tests that minimally exercise the LKL API.

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Luca Dariz <luca.dariz@gmail.com>
Signed-off-by: Mark Stillwell <mark@stillwell.me>
Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Thomas Liebetraut <thomas@tommie-lie.de>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
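Note for readers: the syscall latency test in boot.c times a call
repeatedly and reports the average, minimum and maximum. A minimal
standalone sketch of that measurement loop follows, assuming a POSIX
host with clock_gettime(). The names now_ns(), struct latency,
measure_latency() and noop() are hypothetical and exist only for this
illustration; CLOCK_MONOTONIC stands in for lkl_host_ops.time().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <limits.h>
#include <time.h>

/* Hypothetical stand-in for lkl_host_ops.time(): monotonic time in ns. */
static unsigned long long now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	/* Integer arithmetic keeps full nanosecond precision. */
	return 1000000000ULL * ts.tv_sec + ts.tv_nsec;
}

/* Hypothetical result type: all values are in nanoseconds. */
struct latency {
	long min, max, avg;
};

/* Call f() count times and record the per-call duration. */
static struct latency measure_latency(long (*f)(void), int count)
{
	struct latency l = { .min = LONG_MAX, .max = -1, .avg = 0 };
	unsigned long long start, sum = 0;
	long delta;
	int i;

	assert(count > 0);

	for (i = 0; i < count; i++) {
		start = now_ns();
		f();
		delta = (long)(now_ns() - start);
		if (delta < l.min)
			l.min = delta;
		if (delta > l.max)
			l.max = delta;
		sum += delta;
	}
	l.avg = (long)(sum / count);

	return l;
}

/* Trivial callee, used only to exercise the loop. */
static long noop(void)
{
	return 0;
}
```

Comparing the figures for an LKL call against those for a native call,
as the test does with getpid, gives a rough idea of the per-call
overhead; the absolute numbers depend on the host and are not asserted.
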
 tools/lkl/.gitignore              |   5 +
 tools/lkl/Makefile                |   5 +-
 tools/lkl/Targets                 |   3 +
 tools/lkl/tests/Build             |   3 +
 tools/lkl/tests/boot.c            | 562 ++++++++++++++++++++++++++++++
 tools/lkl/tests/boot.sh           |   9 +
 tools/lkl/tests/cla.c             | 159 +++++++++
 tools/lkl/tests/cla.h             |  33 ++
 tools/lkl/tests/disk.c            | 189 ++++++++++
 tools/lkl/tests/disk.sh           |  61 ++++
 tools/lkl/tests/run.py            | 182 ++++++++++
 tools/lkl/tests/tap13.py          | 209 +++++++++++
 tools/lkl/tests/test.c            | 126 +++++++
 tools/lkl/tests/test.h            |  72 ++++
 tools/lkl/tests/test.sh           | 179 ++++++++++
 tools/lkl/tests/valgrind.supp     |  85 +++++
 tools/lkl/tests/valgrind2xunit.py |  69 ++++
 17 files changed, 1950 insertions(+), 1 deletion(-)
 create mode 100644 tools/lkl/tests/Build
 create mode 100644 tools/lkl/tests/boot.c
 create mode 100755 tools/lkl/tests/boot.sh
 create mode 100644 tools/lkl/tests/cla.c
 create mode 100644 tools/lkl/tests/cla.h
 create mode 100644 tools/lkl/tests/disk.c
 create mode 100755 tools/lkl/tests/disk.sh
 create mode 100755 tools/lkl/tests/run.py
 create mode 100644 tools/lkl/tests/tap13.py
 create mode 100644 tools/lkl/tests/test.c
 create mode 100644 tools/lkl/tests/test.h
 create mode 100644 tools/lkl/tests/test.sh
 create mode 100644 tools/lkl/tests/valgrind.supp
 create mode 100755 tools/lkl/tests/valgrind2xunit.py

diff --git a/tools/lkl/.gitignore b/tools/lkl/.gitignore
index 1aed58bfe171..4e08254dbd46 100644
--- a/tools/lkl/.gitignore
+++ b/tools/lkl/.gitignore
@@ -2,3 +2,8 @@ Makefile.conf
 include/lkl_autoconf.h
 tests/autoconf.sh
 bin/stat
+tests/net-test
+tests/disk
+tests/boot
+tests/valgrind*.xml
+*.pyc
diff --git a/tools/lkl/Makefile b/tools/lkl/Makefile
index 6d6d2cead03f..9a55df5064e4 100644
--- a/tools/lkl/Makefile
+++ b/tools/lkl/Makefile
@@ -116,8 +116,11 @@ programs_install: $(progs-y:%=$(OUTPUT)%$(EXESUF))
 install: headers_install libraries_install programs_install
 
 
+run-tests:
+	./tests/run.py $(tests)
+
 FORCE: ;
-.PHONY: all clean FORCE
+.PHONY: all clean FORCE run-tests
 .PHONY: headers_install libraries_install programs_install install
 .NOTPARALLEL : lib/lkl.o
 .SECONDARY:
diff --git a/tools/lkl/Targets b/tools/lkl/Targets
index 24c985e64638..a9f74c3cc8fb 100644
--- a/tools/lkl/Targets
+++ b/tools/lkl/Targets
@@ -1,3 +1,6 @@
 libs-y += lib/liblkl
 
+progs-y += tests/boot
+progs-y += tests/disk
+progs-y += tests/net-test
 
diff --git a/tools/lkl/tests/Build b/tools/lkl/tests/Build
new file mode 100644
index 000000000000..ace86a3d3438
--- /dev/null
+++ b/tools/lkl/tests/Build
@@ -0,0 +1,3 @@
+boot-y += boot.o test.o
+disk-y += disk.o cla.o test.o
+net-test-y += net-test.o cla.o test.o
diff --git a/tools/lkl/tests/boot.c b/tools/lkl/tests/boot.c
new file mode 100644
index 000000000000..b021e9540147
--- /dev/null
+++ b/tools/lkl/tests/boot.c
@@ -0,0 +1,562 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <unistd.h>
+#include <string.h>
+#include <time.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <lkl.h>
+#include <lkl_host.h>
+
+#include <sys/stat.h>
+#include <fcntl.h>
+#if defined(__FreeBSD__)
+#include <net/if.h>
+#include <sys/ioctl.h>
+#elif __linux
+#include <sys/epoll.h>
+#include <sys/ioctl.h>
+#elif __MINGW32__
+#include <windows.h>
+#endif
+
+#include "test.h"
+
+#ifndef __MINGW32__
+#define sleep_ns 87654321
+int lkl_test_nanosleep(void)
+{
+	struct lkl_timespec ts = {
+		.tv_sec = 0,
+		.tv_nsec = sleep_ns,
+	};
+	struct timespec start, stop;
+	long delta;
+	long ret;
+
+	clock_gettime(CLOCK_MONOTONIC, &start);
+	ret = lkl_sys_nanosleep((struct __lkl__kernel_timespec *)&ts, NULL);
+	clock_gettime(CLOCK_MONOTONIC, &stop);
+
+	delta = 1e9*(stop.tv_sec - start.tv_sec) +
+		(stop.tv_nsec - start.tv_nsec);
+
+	lkl_test_logf("sleep %ld, expected sleep %d\n", delta, sleep_ns);
+
+	if (ret == 0 && delta > sleep_ns * 0.9)
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+#endif
+
+LKL_TEST_CALL(getpid, lkl_sys_getpid, 1)
+
+void check_latency(long (*f)(void), long *min, long *max, long *avg)
+{
+	int i;
+	unsigned long long start, stop, sum = 0;
+	static const int count = 1000;
+	long delta;
+
+	*min = 1000000000;
+	*max = -1;
+
+	for (i = 0; i < count; i++) {
+		start = lkl_host_ops.time();
+		f();
+		stop = lkl_host_ops.time();
+		delta = stop - start;
+		if (*min > delta)
+			*min = delta;
+		if (*max < delta)
+			*max = delta;
+		sum += delta;
+	}
+	*avg = sum / count;
+}
+
+static long native_getpid(void)
+{
+#ifdef __MINGW32__
+	GetCurrentProcessId();
+#else
+	getpid();
+#endif
+	return 0;
+}
+
+int lkl_test_syscall_latency(void)
+{
+	long min, max, avg;
+
+	lkl_test_logf("avg/min/max: ");
+
+	check_latency(lkl_sys_getpid, &min, &max, &avg);
+
+	lkl_test_logf("lkl:%ld/%ld/%ld ", avg, min, max);
+
+	check_latency(native_getpid, &min, &max, &avg);
+
+	lkl_test_logf("native:%ld/%ld/%ld\n", avg, min, max);
+
+	return TEST_SUCCESS;
+}
+
+#define access_rights 0721
+
+LKL_TEST_CALL(creat, lkl_sys_creat, 0, "/file", access_rights)
+LKL_TEST_CALL(close, lkl_sys_close, 0, 0);
+LKL_TEST_CALL(failopen, lkl_sys_open, -LKL_ENOENT, "/file2", 0, 0);
+LKL_TEST_CALL(umask, lkl_sys_umask, 022,  0777);
+LKL_TEST_CALL(umask2, lkl_sys_umask, 0777, 0);
+LKL_TEST_CALL(open, lkl_sys_open, 0, "/file", LKL_O_RDWR, 0);
+static const char wrbuf[] = "test";
+LKL_TEST_CALL(write, lkl_sys_write, sizeof(wrbuf), 0, wrbuf, sizeof(wrbuf));
+LKL_TEST_CALL(lseek_cur, lkl_sys_lseek, sizeof(wrbuf), 0, 0, LKL_SEEK_CUR);
+LKL_TEST_CALL(lseek_end, lkl_sys_lseek, sizeof(wrbuf), 0, 0, LKL_SEEK_END);
+LKL_TEST_CALL(lseek_set, lkl_sys_lseek, 0, 0, 0, LKL_SEEK_SET);
+
+int lkl_test_read(void)
+{
+	char buf[10] = { 0, };
+	long ret;
+
+	ret = lkl_sys_read(0, buf, sizeof(buf));
+
+	lkl_test_logf("lkl_sys_read=%ld buf=%s\n", ret, buf);
+
+	if (ret == sizeof(wrbuf) && !strcmp(wrbuf, buf))
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+int lkl_test_fstat(void)
+{
+	struct lkl_stat stat;
+	long ret;
+
+	ret = lkl_sys_fstat(0, &stat);
+
+	lkl_test_logf("lkl_sys_fstat=%ld mode=%o size=%zd\n", ret, stat.st_mode,
+		      stat.st_size);
+
+	if (ret == 0 && stat.st_size == sizeof(wrbuf) &&
+	    stat.st_mode == (access_rights | LKL_S_IFREG))
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+LKL_TEST_CALL(mkdir, lkl_sys_mkdir, 0, "/mnt", access_rights)
+
+int lkl_test_stat(void)
+{
+	struct lkl_stat stat;
+	long ret;
+
+	ret = lkl_sys_stat("/mnt", &stat);
+
+	lkl_test_logf("lkl_sys_stat(\"/mnt\")=%ld mode=%o\n", ret,
+		      stat.st_mode);
+
+	if (ret == 0 && stat.st_mode == (access_rights | LKL_S_IFDIR))
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+static int lkl_test_pipe2(void)
+{
+	int pipe_fds[2];
+	int READ_IDX = 0, WRITE_IDX = 1;
+	static const char msg[] = "Hello world!";
+	char str[20];
+	int msg_len_bytes = strlen(msg) + 1;
+	int cmp_res;
+	long ret;
+
+	ret = lkl_sys_pipe2(pipe_fds, LKL_O_NONBLOCK);
+	if (ret) {
+		lkl_test_logf("pipe2: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_write(pipe_fds[WRITE_IDX], msg, msg_len_bytes);
+	if (ret != msg_len_bytes) {
+		if (ret < 0)
+			lkl_test_logf("write error: %s\n", lkl_strerror(ret));
+		else
+			lkl_test_logf("short write: %ld\n", ret);
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_read(pipe_fds[READ_IDX], str, msg_len_bytes);
+	if (ret != msg_len_bytes) {
+		if (ret < 0)
+			lkl_test_logf("read error: %s\n", lkl_strerror(ret));
+		else
+			lkl_test_logf("short read: %ld\n", ret);
+		return TEST_FAILURE;
+	}
+
+	cmp_res = memcmp(msg, str, msg_len_bytes);
+	if (cmp_res) {
+		lkl_test_logf("memcmp failed: %d\n", cmp_res);
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_close(pipe_fds[0]);
+	if (ret) {
+		lkl_test_logf("close error: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_close(pipe_fds[1]);
+	if (ret) {
+		lkl_test_logf("close error: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static int lkl_test_epoll(void)
+{
+	int epoll_fd, pipe_fds[2];
+	int READ_IDX = 0, WRITE_IDX = 1;
+	struct lkl_epoll_event wait_on, read_result;
+	static const char msg[] = "Hello world!";
+	long ret;
+
+	memset(&wait_on, 0, sizeof(wait_on));
+	memset(&read_result, 0, sizeof(read_result));
+
+	ret = lkl_sys_pipe2(pipe_fds, LKL_O_NONBLOCK);
+	if (ret) {
+		lkl_test_logf("pipe2 error: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	epoll_fd = lkl_sys_epoll_create(1);
+	if (epoll_fd < 0) {
+		lkl_test_logf("epoll_create error: %s\n", lkl_strerror(epoll_fd));
+		return TEST_FAILURE;
+	}
+
+	wait_on.events = LKL_POLLIN | LKL_POLLOUT;
+	wait_on.data = pipe_fds[READ_IDX];
+
+	ret = lkl_sys_epoll_ctl(epoll_fd, LKL_EPOLL_CTL_ADD, pipe_fds[READ_IDX],
+				&wait_on);
+	if (ret < 0) {
+		lkl_test_logf("epoll_ctl error: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	/* Shouldn't be ready before we have written something */
+	ret = lkl_sys_epoll_wait(epoll_fd, &read_result, 1, 0);
+	if (ret != 0) {
+		if (ret < 0)
+			lkl_test_logf("epoll_wait error: %s\n",
+				      lkl_strerror(ret));
+		else
+			lkl_test_logf("epoll_wait: bad event: 0x%lx\n", ret);
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_write(pipe_fds[WRITE_IDX], msg, strlen(msg) + 1);
+	if (ret < 0) {
+		lkl_test_logf("write error: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	/* We expect exactly 1 fd to be ready immediately */
+	ret = lkl_sys_epoll_wait(epoll_fd, &read_result, 1, 0);
+	if (ret != 1) {
+		if (ret < 0)
+			lkl_test_logf("epoll_wait error: %s\n",
+				      lkl_strerror(ret));
+		else
+			lkl_test_logf("epoll_wait: bad ev no %ld\n", ret);
+		return TEST_FAILURE;
+	}
+
+	/* Already tested reading from pipe2 so no need to do it
+	 * here
+	 */
+
+	return TEST_SUCCESS;
+}
+
+LKL_TEST_CALL(chdir_proc, lkl_sys_chdir, 0, "proc");
+
+static int dir_fd;
+
+static int lkl_test_open_cwd(void)
+{
+	dir_fd = lkl_sys_open(".", LKL_O_RDONLY | LKL_O_DIRECTORY, 0);
+	if (dir_fd < 0) {
+		lkl_test_logf("failed to open current directory: %s\n",
+			      lkl_strerror(dir_fd));
+		return TEST_FAILURE;
+	}
+
+	return TEST_SUCCESS;
+}
+
+/* Column at which to insert a line break in the directory listing test below. */
+#define COL_LINE_BREAK 70
+
+static int lkl_test_getdents64(void)
+{
+	long ret;
+	char buf[1024], *pos;
+	struct lkl_linux_dirent64 *de;
+	int wr;
+
+	de = (struct lkl_linux_dirent64 *)buf;
+	ret = lkl_sys_getdents64(dir_fd, de, sizeof(buf));
+
+	wr = lkl_test_logf("%d ", dir_fd);
+
+	if (ret < 0)
+		return TEST_FAILURE;
+
+	for (pos = buf; pos - buf < ret; pos += de->d_reclen) {
+		de = (struct lkl_linux_dirent64 *)pos;
+
+		wr += lkl_test_logf("%s ", de->d_name);
+		if (wr >= COL_LINE_BREAK) {
+			lkl_test_logf("\n");
+			wr = 0;
+		}
+	}
+
+	return TEST_SUCCESS;
+}
+
+LKL_TEST_CALL(close_dir_fd, lkl_sys_close, 0, dir_fd);
+LKL_TEST_CALL(chdir_root, lkl_sys_chdir, 0, "/");
+LKL_TEST_CALL(mount_fs_proc, lkl_mount_fs, 0, "proc");
+LKL_TEST_CALL(umount_fs_proc, lkl_umount_timeout, 0, "proc", 0, 1000);
+LKL_TEST_CALL(lo_ifup, lkl_if_up, 0, 1);
+
+static int lkl_test_mutex(void)
+{
+	long ret = TEST_SUCCESS;
+	/*
+	 * Can't do much to verify that this works, so we'll just let Valgrind
+	 * warn us on CI if we've made bad memory accesses.
+	 */
+
+	struct lkl_mutex *mutex;
+
+	mutex = lkl_host_ops.mutex_alloc(0);
+	lkl_host_ops.mutex_lock(mutex);
+	lkl_host_ops.mutex_unlock(mutex);
+	lkl_host_ops.mutex_free(mutex);
+
+	mutex = lkl_host_ops.mutex_alloc(1);
+	lkl_host_ops.mutex_lock(mutex);
+	lkl_host_ops.mutex_lock(mutex);
+	lkl_host_ops.mutex_unlock(mutex);
+	lkl_host_ops.mutex_unlock(mutex);
+	lkl_host_ops.mutex_free(mutex);
+
+	return ret;
+}
+
+static int lkl_test_semaphore(void)
+{
+	long ret = TEST_SUCCESS;
+	/*
+	 * Can't do much to verify that this works, so we'll just let Valgrind
+	 * warn us on CI if we've made bad memory accesses.
+	 */
+
+	struct lkl_sem *sem = lkl_host_ops.sem_alloc(1);
+
+	lkl_host_ops.sem_down(sem);
+	lkl_host_ops.sem_up(sem);
+	lkl_host_ops.sem_free(sem);
+
+	return ret;
+}
+
+static int lkl_test_gettid(void)
+{
+	long tid = lkl_host_ops.gettid();
+
+	lkl_test_logf("%ld", tid);
+
+	/* As far as I know, thread IDs are non-zero on all reasonable
+	 * systems.
+	 */
+	if (tid)
+		return TEST_SUCCESS;
+	else
+		return TEST_FAILURE;
+}
+
+static void test_thread(void *data)
+{
+	int *pipe_fds = (int *) data;
+	char tmp[LKL_PIPE_BUF+1];
+	int ret;
+
+	ret = lkl_sys_read(pipe_fds[0], tmp, sizeof(tmp));
+	if (ret < 0)
+		lkl_test_logf("%s: %s\n", __func__, lkl_strerror(ret));
+}
+
+static int lkl_test_syscall_thread(void)
+{
+	int pipe_fds[2];
+	char tmp[LKL_PIPE_BUF+1];
+	long ret;
+	lkl_thread_t tid;
+
+	ret = lkl_sys_pipe2(pipe_fds, 0);
+	if (ret) {
+		lkl_test_logf("pipe2: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_fcntl(pipe_fds[0], LKL_F_SETPIPE_SZ, 1);
+	if (ret < 0) {
+		lkl_test_logf("fcntl setpipe_sz: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	tid = lkl_host_ops.thread_create(test_thread, pipe_fds);
+	if (!tid) {
+		lkl_test_logf("failed to create thread\n");
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_write(pipe_fds[1], tmp, sizeof(tmp));
+	if (ret != sizeof(tmp)) {
+		if (ret < 0)
+			lkl_test_logf("write error: %s\n", lkl_strerror(ret));
+		else
+			lkl_test_logf("short write: %ld\n", ret);
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_host_ops.thread_join(tid);
+	if (ret) {
+		lkl_test_logf("failed to join thread\n");
+		return TEST_FAILURE;
+	}
+
+	return TEST_SUCCESS;
+}
+
+#ifndef __MINGW32__
+static void thread_get_pid(void *unused)
+{
+	lkl_sys_getpid();
+}
+
+static int lkl_test_many_syscall_threads(void)
+{
+	lkl_thread_t tid;
+	int count = 65, ret;
+
+	while (--count > 0) {
+		tid = lkl_host_ops.thread_create(thread_get_pid, NULL);
+		if (!tid) {
+			lkl_test_logf("failed to create thread\n");
+			return TEST_FAILURE;
+		}
+
+		ret = lkl_host_ops.thread_join(tid);
+		if (ret) {
+			lkl_test_logf("failed to join thread\n");
+			return TEST_FAILURE;
+		}
+	}
+
+	return TEST_SUCCESS;
+}
+#endif
+
+static void thread_quit_immediately(void *unused)
+{
+}
+
+static int lkl_test_join(void)
+{
+	lkl_thread_t tid = lkl_host_ops.thread_create(thread_quit_immediately,
+						      NULL);
+	int ret = lkl_host_ops.thread_join(tid);
+
+	if (ret == 0) {
+		lkl_test_logf("joined %ld\n", tid);
+		return TEST_SUCCESS;
+	}
+
+	lkl_test_logf("failed joining %ld\n", tid);
+	return TEST_FAILURE;
+}
+
+LKL_TEST_CALL(start_kernel, lkl_start_kernel, 0, &lkl_host_ops,
+	     "mem=16M loglevel=8");
+LKL_TEST_CALL(stop_kernel, lkl_sys_halt, 0);
+
+struct lkl_test tests[] = {
+	LKL_TEST(mutex),
+	LKL_TEST(semaphore),
+	LKL_TEST(join),
+	LKL_TEST(start_kernel),
+	LKL_TEST(getpid),
+	LKL_TEST(syscall_latency),
+	LKL_TEST(umask),
+	LKL_TEST(umask2),
+	LKL_TEST(creat),
+	LKL_TEST(close),
+	LKL_TEST(failopen),
+	LKL_TEST(open),
+	LKL_TEST(write),
+	LKL_TEST(lseek_cur),
+	LKL_TEST(lseek_end),
+	LKL_TEST(lseek_set),
+	LKL_TEST(read),
+	LKL_TEST(fstat),
+	LKL_TEST(mkdir),
+	LKL_TEST(stat),
+#ifndef __MINGW32__
+	LKL_TEST(nanosleep),
+#endif
+	LKL_TEST(pipe2),
+	LKL_TEST(epoll),
+	LKL_TEST(mount_fs_proc),
+	LKL_TEST(chdir_proc),
+	LKL_TEST(open_cwd),
+	LKL_TEST(getdents64),
+	LKL_TEST(close_dir_fd),
+	LKL_TEST(chdir_root),
+	LKL_TEST(umount_fs_proc),
+	LKL_TEST(lo_ifup),
+	LKL_TEST(gettid),
+	LKL_TEST(syscall_thread),
+	/*
+	 * Wine has an issue where the FlsCallback is not called when
+	 * the thread terminates which makes testing the automatic
+	 * syscall threads cleanup impossible under wine.
+	 */
+#ifndef __MINGW32__
+	LKL_TEST(many_syscall_threads),
+#endif
+	LKL_TEST(stop_kernel),
+};
+
+int main(int argc, const char **argv)
+{
+	lkl_host_ops.print = lkl_test_log;
+
+	return lkl_test_run(tests, sizeof(tests)/sizeof(struct lkl_test),
+			    "boot");
+}
diff --git a/tools/lkl/tests/boot.sh b/tools/lkl/tests/boot.sh
new file mode 100755
index 000000000000..d985c04b0ac1
--- /dev/null
+++ b/tools/lkl/tests/boot.sh
@@ -0,0 +1,9 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+source $script_dir/test.sh
+
+lkl_test_plan 1 "boot"
+lkl_test_run 1
+lkl_test_exec $script_dir/boot
diff --git a/tools/lkl/tests/cla.c b/tools/lkl/tests/cla.c
new file mode 100644
index 000000000000..a34badeb5f06
--- /dev/null
+++ b/tools/lkl/tests/cla.c
@@ -0,0 +1,159 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <string.h>
+#include <errno.h>
+#include <stdlib.h>
+#ifdef __MINGW32__
+#include <winsock2.h>
+#else
+#include <sys/socket.h>
+#include <netinet/in.h>
+#include <arpa/inet.h>
+#endif
+
+#include "cla.h"
+
+static int cl_arg_parse_bool(struct cl_arg *arg, const char *value)
+{
+	*((int *)arg->store) = 1;
+	return 0;
+}
+
+static int cl_arg_parse_str(struct cl_arg *arg, const char *value)
+{
+	*((const char **)arg->store) = value;
+	return 0;
+}
+
+static int cl_arg_parse_int(struct cl_arg *arg, const char *value)
+{
+	errno = 0;
+	*((int *)arg->store) = strtol(value, NULL, 0);
+	return errno ? -1 : 0;
+}
+
+static int cl_arg_parse_str_set(struct cl_arg *arg, const char *value)
+{
+	const char **set = arg->set;
+	int i;
+
+	for (i = 0; set[i] != NULL; i++) {
+		if (strcmp(set[i], value) == 0) {
+			*((int *)arg->store) = i;
+			return 0;
+		}
+	}
+
+	return -1;
+}
+
+static int cl_arg_parse_ipv4(struct cl_arg *arg, const char *value)
+{
+	unsigned int addr;
+
+	if (!value)
+		return -1;
+
+	addr = inet_addr(value);
+	if (addr == INADDR_NONE)
+		return -1;
+	*((unsigned int *)arg->store) = addr;
+	return 0;
+}
+
+static cl_arg_parser_t parsers[] = {
+	[CL_ARG_BOOL] = cl_arg_parse_bool,
+	[CL_ARG_INT] = cl_arg_parse_int,
+	[CL_ARG_STR] = cl_arg_parse_str,
+	[CL_ARG_STR_SET] = cl_arg_parse_str_set,
+	[CL_ARG_IPV4] = cl_arg_parse_ipv4,
+};
+
+static struct cl_arg *find_short_arg(char name, struct cl_arg *args)
+{
+	struct cl_arg *arg;
+
+	for (arg = args; arg->short_name != 0; arg++) {
+		if (arg->short_name == name)
+			return arg;
+	}
+
+	return NULL;
+}
+
+static struct cl_arg *find_long_arg(const char *name, struct cl_arg *args)
+{
+	struct cl_arg *arg;
+
+	for (arg = args; arg->long_name; arg++) {
+		if (strcmp(arg->long_name, name) == 0)
+			return arg;
+	}
+
+	return NULL;
+}
+
+static void print_help(struct cl_arg *args)
+{
+	struct cl_arg *arg;
+
+	fprintf(stderr, "usage:\n");
+	for (arg = args; arg->long_name; arg++) {
+		fprintf(stderr, "-%c, --%-20s %s", arg->short_name,
+			arg->long_name, arg->help);
+		if (arg->type == CL_ARG_STR_SET) {
+			const char **set = arg->set;
+
+			fprintf(stderr, " [ ");
+			while (*set != NULL)
+				fprintf(stderr, "%s ", *(set++));
+			fprintf(stderr, "]");
+		}
+		fprintf(stderr, "\n");
+	}
+}
+
+int parse_args(int argc, const char **argv, struct cl_arg *args)
+{
+	int i;
+
+	for (i = 1; i < argc; i++) {
+		struct cl_arg *arg = NULL;
+		cl_arg_parser_t parser;
+
+		if (argv[i][0] == '-') {
+			if (argv[i][1] != '-')
+				arg = find_short_arg(argv[i][1], args);
+			else
+				arg = find_long_arg(&argv[i][2], args);
+		}
+
+		if (!arg) {
+			fprintf(stderr, "unknown option '%s'\n", argv[i]);
+			print_help(args);
+			return -1;
+		}
+
+		if (arg->type == CL_ARG_USER || arg->type >= CL_ARG_END)
+			parser = arg->parser;
+		else
+			parser = parsers[arg->type];
+
+		if (!parser) {
+			fprintf(stderr, "can't parse --'%s'/-'%c'\n",
+				arg->long_name, arg->short_name);
+			return -1;
+		}
+
+		if (parser(arg, argv[i + 1]) < 0) {
+			fprintf(stderr, "can't parse '%s'\n", argv[i]);
+			print_help(args);
+			return -1;
+		}
+
+		if (arg->has_arg)
+			i++;
+	}
+
+	return 0;
+}
diff --git a/tools/lkl/tests/cla.h b/tools/lkl/tests/cla.h
new file mode 100644
index 000000000000..f8369be02e5a
--- /dev/null
+++ b/tools/lkl/tests/cla.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_TEST_CLA_H
+#define _LKL_TEST_CLA_H
+
+enum cl_arg_type {
+	CL_ARG_USER = 0,
+	CL_ARG_BOOL,
+	CL_ARG_INT,
+	CL_ARG_STR,
+	CL_ARG_STR_SET,
+	CL_ARG_IPV4,
+	CL_ARG_END,
+};
+
+struct cl_arg;
+
+typedef int (*cl_arg_parser_t)(struct cl_arg *arg, const char *value);
+
+struct cl_arg {
+	const char *long_name;
+	char short_name;
+	const char *help;
+	int has_arg;
+	enum cl_arg_type type;
+	void *store;
+	void *set;
+	cl_arg_parser_t parser;
+};
+
+int parse_args(int argc, const char **argv, struct cl_arg *args);
+
+
+#endif /* _LKL_TEST_CLA_H */
diff --git a/tools/lkl/tests/disk.c b/tools/lkl/tests/disk.c
new file mode 100644
index 000000000000..0aa039876b54
--- /dev/null
+++ b/tools/lkl/tests/disk.c
@@ -0,0 +1,189 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <unistd.h>
+#include <string.h>
+#include <time.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <lkl.h>
+#include <lkl_host.h>
+#ifndef __MINGW32__
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#else
+#include <windows.h>
+#endif
+
+#include "test.h"
+#include "cla.h"
+
+static struct {
+	int printk;
+	const char *disk;
+	const char *fstype;
+	int partition;
+} cla;
+
+struct cl_arg args[] = {
+	{"disk", 'd', "disk file to use", 1, CL_ARG_STR, &cla.disk},
+	{"partition", 'P', "partition to mount", 1, CL_ARG_INT, &cla.partition},
+	{"type", 't', "filesystem type", 1, CL_ARG_STR, &cla.fstype},
+	{0},
+};
+
+
+static struct lkl_disk disk;
+static int disk_id = -1;
+
+int lkl_test_disk_add(void)
+{
+#ifdef __MINGW32__
+	disk.handle = CreateFile(cla.disk, GENERIC_READ | GENERIC_WRITE,
+			       0, NULL, OPEN_EXISTING, 0, NULL);
+	if (disk.handle == INVALID_HANDLE_VALUE)
+#else
+	disk.fd = open(cla.disk, O_RDWR);
+	if (disk.fd < 0)
+#endif
+		goto out_unlink;
+
+	disk.ops = NULL;
+
+	disk_id = lkl_disk_add(&disk);
+	if (disk_id < 0)
+		goto out_close;
+
+	goto out;
+
+out_close:
+#ifdef __MINGW32__
+	CloseHandle(disk.handle);
+#else
+	close(disk.fd);
+#endif
+
+out_unlink:
+#ifdef __MINGW32__
+	DeleteFile(cla.disk);
+#else
+	unlink(cla.disk);
+#endif
+
+out:
+	lkl_test_logf("disk fd/handle %x disk_id %d", disk.fd, disk_id);
+
+	if (disk_id >= 0)
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+int lkl_test_disk_remove(void)
+{
+	int ret;
+
+	ret = lkl_disk_remove(disk);
+
+#ifdef __MINGW32__
+	CloseHandle(disk.handle);
+#else
+	close(disk.fd);
+#endif
+
+	if (ret == 0)
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+
+static char mnt_point[32];
+
+LKL_TEST_CALL(mount_dev, lkl_mount_dev, 0, disk_id, cla.partition, cla.fstype,
+	      0, NULL, mnt_point, sizeof(mnt_point))
+
+static int lkl_test_umount_dev(void)
+{
+	long ret, ret2;
+
+	ret = lkl_sys_chdir("/");
+
+	ret2 = lkl_umount_dev(disk_id, cla.partition, 0, 1000);
+
+	lkl_test_logf("%ld %ld", ret, ret2);
+
+	if (!ret && !ret2)
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+struct lkl_dir *dir;
+
+static int lkl_test_opendir(void)
+{
+	int err;
+
+	dir = lkl_opendir(mnt_point, &err);
+
+	lkl_test_logf("lkl_opendir(%s) = %d %s\n", mnt_point, err,
+		      lkl_strerror(err));
+
+	if (err == 0)
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+static int lkl_test_readdir(void)
+{
+	struct lkl_linux_dirent64 *de = lkl_readdir(dir);
+	int wr = 0;
+
+	while (de) {
+		wr += lkl_test_logf("%s ", de->d_name);
+		if (wr >= 70) {
+			lkl_test_logf("\n");
+			wr = 0;
+			break;
+		}
+		de = lkl_readdir(dir);
+	}
+
+	if (lkl_errdir(dir) == 0)
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+LKL_TEST_CALL(closedir, lkl_closedir, 0, dir);
+LKL_TEST_CALL(chdir_mnt_point, lkl_sys_chdir, 0, mnt_point);
+LKL_TEST_CALL(start_kernel, lkl_start_kernel, 0, &lkl_host_ops,
+	     "mem=16M loglevel=8");
+LKL_TEST_CALL(stop_kernel, lkl_sys_halt, 0);
+
+struct lkl_test tests[] = {
+	LKL_TEST(disk_add),
+	LKL_TEST(start_kernel),
+	LKL_TEST(mount_dev),
+	LKL_TEST(chdir_mnt_point),
+	LKL_TEST(opendir),
+	LKL_TEST(readdir),
+	LKL_TEST(closedir),
+	LKL_TEST(umount_dev),
+	LKL_TEST(stop_kernel),
+	LKL_TEST(disk_remove),
+
+};
+
+int main(int argc, const char **argv)
+{
+	if (parse_args(argc, argv, args) < 0)
+		return -1;
+
+	lkl_host_ops.print = lkl_test_log;
+
+	return lkl_test_run(tests, sizeof(tests)/sizeof(struct lkl_test),
+			    "disk %s", cla.fstype);
+}
diff --git a/tools/lkl/tests/disk.sh b/tools/lkl/tests/disk.sh
new file mode 100755
index 000000000000..9bdcb16f2d5c
--- /dev/null
+++ b/tools/lkl/tests/disk.sh
@@ -0,0 +1,61 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+
+source $script_dir/test.sh
+
+function prepfs()
+{
+    set -e
+
+    file=`mktemp`
+
+    dd if=/dev/zero of=$file bs=1024 count=204800
+
+    yes | mkfs.$1 $file
+
+    if ! [ -z $BSD_WDIR ]; then
+        $MYSSH mkdir -p $BSD_WDIR
+        ssh_copy $file $BSD_WDIR
+        rm $file
+        file=$BSD_WDIR/$(basename $file)
+    fi
+
+    export_vars file
+}
+
+function cleanfs()
+{
+    set -e
+
+    if ! [ -z $BSD_WDIR ]; then
+        $MYSSH rm $1
+        $MYSSH rm $BSD_WDIR/disk
+    else
+        rm $1
+    fi
+}
+
+if [ "$1" = "-t" ]; then
+    shift
+    fstype=$1
+    shift
+fi
+
+if [ -z "$fstype" ]; then
+    fstype="ext4"
+fi
+
+if [ -z $(which mkfs.$fstype) ]; then
+    lkl_test_plan 0 "disk $fstype"
+    echo "no mkfs.$fstype command"
+    exit 0
+fi
+
+lkl_test_plan 1 "disk $fstype"
+lkl_test_run 1 prepfs $fstype
+lkl_test_exec $script_dir/disk -d $file -t $fstype $@
+lkl_test_plan 1 "disk $fstype"
+lkl_test_run 1 cleanfs $file
+
diff --git a/tools/lkl/tests/run.py b/tools/lkl/tests/run.py
new file mode 100755
index 000000000000..8fea72686a7a
--- /dev/null
+++ b/tools/lkl/tests/run.py
@@ -0,0 +1,182 @@
+#!/usr/bin/env python
+# SPDX-License-Identifier: GPL-2.0
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; version 2 of the License
+#
+# Author: Octavian Purdila <tavi@cs.pub.ro>
+#
+
+from __future__ import print_function
+
+import argparse
+import os
+import subprocess
+import sys
+import tap13
+import xml.etree.ElementTree as ET
+
+from junit_xml import TestSuite, TestCase
+
+
+class Reporter(tap13.Reporter):
+    def start(self, obj):
+        if type(obj) is tap13.Test:
+            if obj.result == "*":
+                end='\r'
+            else:
+                end='\n'
+            print("  TEST       %-8s %.50s" %
+                  (obj.result, obj.description + " " + obj.comment), end=end)
+
+        elif type(obj) is tap13.Suite:
+            if obj.tests_planned == 0:
+                status = "skip"
+            else:
+                status = ""
+            print("  SUITE      %-8s %s" % (status, obj.name))
+
+    def end(self, obj):
+        if type(obj) is tap13.Test:
+            if obj.result != "ok":
+                try:
+                    print(obj.yaml["log"], end='')
+                except:
+                    None
+
+
+mydir=os.path.dirname(os.path.realpath(__file__))
+
+tests = [
+    'boot.sh',
+    'disk.sh -t ext4',
+    'disk.sh -t vfat',
+    'net.sh -b loopback',
+    'net.sh -b tap',
+    'net.sh -b pipe',
+    'net.sh -b raw',
+    'net.sh -b macvtap',
+    'lklfuse.sh -t ext4',
+    'lklfuse.sh -t vfat',
+    'hijack-test.sh'
+]
+
+parser = argparse.ArgumentParser(description='LKL test runner')
+parser.add_argument('tests', nargs='?', action='append',
+                    help='tests to run %s' % tests)
+parser.add_argument('--junit-dir',
+                    help='directory where to store the junit suites')
+parser.add_argument('--gdb', action='store_true', default=False,
+                    help='run simple tests under gdb; implies --pass-through')
+parser.add_argument('--pass-through', action='store_true',  default=False,
+                    help='run the test without interpreting the test output')
+parser.add_argument('--valgrind', action='store_true', default=False,
+                    help='run simple tests under valgrind')
+
+args = parser.parse_args()
+if args.tests == [None]:
+    args.tests = tests
+
+if args.gdb:
+    args.pass_through=True
+    os.environ['GDB']="yes"
+
+if args.valgrind:
+    os.environ['VALGRIND']="yes"
+
+tap = tap13.Parser(Reporter())
+
+os.environ['PATH'] += ":" + mydir
+
+exit_code = 0
+
+for t in args.tests:
+    if not t:
+        continue
+    if args.pass_through:
+        print(t)
+        if subprocess.call(t, shell=True) != 0:
+            exit_code = 1
+    else:
+        p = subprocess.Popen(t, shell=True, stdout=subprocess.PIPE)
+        tap.parse(p.stdout)
+
+if args.pass_through:
+    sys.exit(exit_code)
+
+suites_count = 0
+tests_total = 0
+tests_not_ok = 0
+tests_ok = 0
+tests_skip = 0
+val_errs = 0
+val_fails = 0
+val_skips = 0
+
+for s in tap.run.suites:
+
+    junit_tests = []
+    suites_count += 1
+
+    for t in s.tests:
+        try:
+            secs = t.yaml["time_us"] / 1000000.0
+        except:
+            secs = 0
+        try:
+            log = t.yaml['log']
+        except:
+            log = ""
+
+        jt = TestCase(t.description, elapsed_sec=secs, stdout=log)
+        if t.result == 'skip':
+            jt.add_skipped_info(output=log)
+        elif t.result == 'not ok':
+            jt.add_error_info(output=log)
+
+        junit_tests.append(jt)
+
+        tests_total += 1
+        if t.result == "ok":
+            tests_ok += 1
+        elif t.result == "not ok":
+            tests_not_ok += 1
+            exit_code = 1
+        elif t.result == "skip":
+            tests_skip += 1
+
+    if args.junit_dir:
+        js = TestSuite(s.name, junit_tests)
+        with open(os.path.join(args.junit_dir, os.path.basename(s.name) + '.xml'), 'w') as f:
+            js.to_file(f, [js])
+
+        if os.getenv('VALGRIND') is not None:
+            val_xml = 'valgrind-%s.xml' % os.path.basename(s.name).replace(' ','-')
+            # skipped tests don't generate xml file
+            if os.path.exists(val_xml) is False:
+                continue
+
+            cmd = 'mv %s %s' % (val_xml, args.junit_dir)
+            subprocess.call(cmd, shell=True, )
+
+            cmd = mydir + '/valgrind2xunit.py ' + val_xml
+            subprocess.call(cmd, shell=True, cwd=args.junit_dir)
+
+            # count valgrind results
+            doc = ET.parse(os.path.join(args.junit_dir, 'valgrind-%s_xunit.xml' \
+                                        % (os.path.basename(s.name).replace(' ','-'))))
+            ts = doc.getroot()
+            val_errs += int(ts.get('errors'))
+            val_fails += int(ts.get('failures'))
+            val_skips += int(ts.get('skip'))
+
+print("Summary: %d suites run, %d tests, %d ok, %d not ok, %d skipped" %
+      (suites_count, tests_total, tests_ok, tests_not_ok, tests_skip))
+
+if os.getenv('VALGRIND') is not None:
+    print(" valgrind (memcheck): %d failures, %d skipped" % (val_fails, val_skips))
+    if val_errs or val_fails:
+        exit_code = 1
+
+sys.exit(exit_code)
diff --git a/tools/lkl/tests/tap13.py b/tools/lkl/tests/tap13.py
new file mode 100644
index 000000000000..65c73cda7ca1
--- /dev/null
+++ b/tools/lkl/tests/tap13.py
@@ -0,0 +1,209 @@
+#!/usr/bin/env python
+# SPDX-License-Identifier: GPL-2.0
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; version 2 of the License
+#
+# Author: Octavian Purdila <tavi@cs.pub.ro>
+#
+# Based on TAP13:
+#
+# Copyright 2013, Red Hat, Inc.
+# Author: Josef Skladanka <jskladan@redhat.com>
+#
+from __future__ import print_function
+
+import re
+import sys
+import yamlish
+
+
+class Reporter(object):
+
+    def start(self, obj):
+        None
+
+    def end(self, obj):
+        None
+
+
+class Test(object):
+    def __init__(self, reporter, result, id, description=None, directive=None,
+                 comment=None):
+        self.reporter = reporter
+        self.result = result
+        if directive:
+            self.result = directive.lower()
+        if id:
+            self.id = int(id)
+        else:
+            self.id = None
+        if description:
+            self.description = description
+        else:
+            self.description = ""
+        if comment:
+            self.comment = "# " + comment
+        else:
+            self.comment = ""
+        self.yaml = None
+        self._yaml_buffer = None
+        self.diagnostics = []
+
+        self.reporter.start(self)
+
+    def end(self):
+        if not self.yaml:
+            self.yaml = yamlish.load(self._yaml_buffer)
+            self.reporter.end(self)
+
+
+class Suite(object):
+    def __init__(self, reporter, start, end, explanation):
+        self.reporter = reporter
+        self.tests = []
+        self.name = explanation
+        self.tests_planned = int(end)
+
+        self.__tests_counter = 0
+        self.__tests_base = 0
+
+        self.reporter.start(self)
+
+    def newTest(self, args):
+        try:
+            self.tests[-1].end()
+        except IndexError:
+            None
+
+        if 'id' not in args or not args['id']:
+            args['id'] = self.__tests_counter
+        else:
+            args['id'] = int(args['id']) + self.__tests_base
+
+        if args['id'] < self.__tests_counter:
+            print("error: bad test id %d, fixing it" % (args['id']))
+            args['id'] = self.__tests_counter
+        # according to TAP13 specs, missing tests must be handled as 'not ok'
+        # here we add the missing tests in sequence
+        while args['id'] > (self.__tests_counter + 1):
+            comment = 'test %d not present' % self.__tests_counter
+            self.tests.append(Test(self.reporter, 'not ok',
+                                   self.__tests_counter, comment=comment))
+            self.__tests_counter += 1
+
+        if args['id'] == self.__tests_counter:
+            if args['directive']:
+                self.test().result = args['directive'].lower()
+            else:
+                self.test().result = args['result']
+            self.reporter.start(self.test())
+        else:
+            self.tests.append(Test(self.reporter, **args))
+            self.__tests_counter += 1
+
+    def test(self):
+        return self.tests[-1]
+
+    def end(self, name, planned):
+        if name == self.name:
+            self.tests_planned += int(planned)
+            self.__tests_base = self.__tests_counter
+            return False
+        try:
+            self.test().end()
+        except IndexError:
+            None
+        if len(self.tests) != self.tests_planned:
+            for i in range(len(self.tests), self.tests_planned):
+                self.tests.append(Test(self.reporter, 'not ok', i+1,
+                                       comment='test not present'))
+        return True
+
+
+class Run(object):
+
+    def __init__(self, reporter):
+        self.reporter = reporter
+        self.suites = []
+
+    def suite(self):
+        return self.suites[-1]
+
+    def test(self):
+        return self.suites[-1].tests[-1]
+
+    def newSuite(self, args):
+        new = False
+        try:
+            if self.suite().end(args['explanation'], args['end']):
+                new = True
+        except IndexError:
+            new = True
+        if new:
+            self.suites.append(Suite(self.reporter, **args))
+
+    def newTest(self, args):
+        self.suite().newTest(args)
+
+
+class Parser(object):
+    RE_PLAN = re.compile(r"^\s*(?P<start>\d+)\.\.(?P<end>\d+)\s*(#\s*(?P<explanation>.*))?\s*$")
+    RE_TEST_LINE = re.compile(r"^\s*(?P<result>(not\s+)?ok|[*]+)\s*(?P<id>\d+)?\s*(?P<description>[^#]+)?\s*(#\s*(?P<directive>TODO|SKIP)?\s*(?P<comment>.+)?)?\s*$",  re.IGNORECASE)
+    RE_EXPLANATION = re.compile(r"^\s*#\s*(?P<explanation>.+)?\s*$")
+    RE_YAMLISH_START = re.compile(r"^\s*---.*$")
+    RE_YAMLISH_END = re.compile(r"^\s*\.\.\.\s*$")
+
+    def __init__(self, reporter):
+        self.seek_test = False
+        self.in_test = False
+        self.in_yaml = False
+        self.run = Run(reporter)
+
+    def parse(self, source):
+        # to avoid input buffering
+        while True:
+            line = source.readline()
+            if not line:
+                break
+
+            if self.in_yaml:
+                if Parser.RE_YAMLISH_END.match(line):
+                    self.run.test()._yaml_buffer.append(line.strip())
+                    self.in_yaml = False
+                else:
+                    self.run.test()._yaml_buffer.append(line.rstrip())
+                continue
+
+            line = line.strip()
+
+            if self.in_test:
+                if Parser.RE_EXPLANATION.match(line):
+                    self.run.test().diagnostics.append(line)
+                    continue
+                if Parser.RE_YAMLISH_START.match(line):
+                    self.run.test()._yaml_buffer = [line.strip()]
+                    self.in_yaml = True
+                    continue
+
+            m = Parser.RE_PLAN.match(line)
+            if m:
+                self.seek_test = True
+                args = m.groupdict()
+                self.run.newSuite(args)
+                continue
+
+            if self.seek_test:
+                m = Parser.RE_TEST_LINE.match(line)
+                if m:
+                    args = m.groupdict()
+                    self.run.newTest(args)
+                    self.in_test = True
+                    continue
+
+            print(line)
+        try:
+            self.run.suite().end(None, 0)
+        except IndexError:
+            pass
diff --git a/tools/lkl/tests/test.c b/tools/lkl/tests/test.c
new file mode 100644
index 000000000000..3e334d106c48
--- /dev/null
+++ b/tools/lkl/tests/test.c
@@ -0,0 +1,126 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <stdarg.h>
+#include <time.h>
+
+#include "test.h"
+
+/* circular log buffer */
+
+static char log_buf[0x10000];
+static char *head = log_buf, *tail = log_buf;
+
+static inline void advance(char **ptr)
+{
+	if ((unsigned int)(*ptr - log_buf) >= sizeof(log_buf) - 1)
+		*ptr = log_buf;
+	else
+		*ptr = *ptr + 1;
+}
+
+static void log_char(char c)
+{
+	*tail = c;
+	advance(&tail);
+	if (tail == head)
+		advance(&head);
+}
+
+static void print_log(void)
+{
+	char last;
+
+	printf(" log: |\n");
+	last = '\n';
+	while (head != tail) {
+		if (last == '\n')
+			printf("  ");
+		last = *head;
+		putchar(last);
+		advance(&head);
+	}
+	if (last != '\n')
+		putchar('\n');
+}
+
+int lkl_test_run(const struct lkl_test *tests, int nr, const char *fmt, ...)
+{
+	int i, ret, status = TEST_SUCCESS;
+	clock_t start, stop;
+	char name[1024];
+	va_list args;
+
+	va_start(args, fmt);
+	vsnprintf(name, sizeof(name), fmt, args);
+	va_end(args);
+
+	printf("1..%d # %s\n", nr, name);
+	for (i = 1; i <= nr; i++) {
+		const struct lkl_test *t = &tests[i-1];
+		unsigned long delta_us;
+
+		printf("* %d %s\n", i, t->name);
+		fflush(stdout);
+
+		start = clock();
+
+		ret = t->fn(t->arg1, t->arg2, t->arg3);
+
+		stop = clock();
+
+		switch (ret) {
+		case TEST_SUCCESS:
+			printf("ok %d %s\n", i, t->name);
+			break;
+		case TEST_SKIP:
+			printf("ok %d %s # SKIP\n", i, t->name);
+			break;
+		case TEST_BAILOUT:
+			status = TEST_BAILOUT;
+			/* fall through */
+		case TEST_FAILURE:
+		default:
+			if (status != TEST_BAILOUT)
+				status = TEST_FAILURE;
+			printf("not ok %d %s\n", i, t->name);
+		}
+
+		printf(" ---\n");
+		delta_us = (stop - start) * 1000000 / CLOCKS_PER_SEC;
+		printf(" time_us: %lu\n", delta_us);
+		print_log();
+		printf(" ...\n");
+
+		if (status == TEST_BAILOUT) {
+			printf("Bail out!\n");
+			return TEST_FAILURE;
+		}
+
+		fflush(stdout);
+	}
+
+	return status;
+}
+
+
+void lkl_test_log(const char *str, int len)
+{
+	while (len--)
+		log_char(*(str++));
+}
+
+int lkl_test_logf(const char *fmt, ...)
+{
+	char tmp[1024], *c;
+	va_list args;
+	unsigned int n;
+
+	va_start(args, fmt);
+	n = vsnprintf(tmp, sizeof(tmp), fmt, args);
+	va_end(args);
+
+	for (c = tmp; *c != 0; c++)
+		log_char(*c);
+
+	return n > sizeof(tmp) ? sizeof(tmp) : n;
+}
diff --git a/tools/lkl/tests/test.h b/tools/lkl/tests/test.h
new file mode 100644
index 000000000000..f63ad6d419cb
--- /dev/null
+++ b/tools/lkl/tests/test.h
@@ -0,0 +1,72 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_TEST_H
+#define _LKL_TEST_H
+
+#define TEST_SUCCESS	0
+#define TEST_FAILURE	1
+#define TEST_SKIP	2
+#define TEST_TODO	3
+#define TEST_BAILOUT	4
+
+struct lkl_test {
+	const char *name;
+	int (*fn)();
+	void *arg1, *arg2, *arg3;
+};
+
+/**
+ * Simple wrapper to initialize a test entry.
+ * @name - test name; the test function is assumed to be named lkl_test_@name
+ * @vargs - arguments to be passed to the function
+ */
+#define LKL_TEST(name, ...) { #name, lkl_test_##name, __VA_ARGS__ }
+
+/**
+ * lkl_test_run - run a test suite
+ *
+ * @tests - the list of tests to run
+ * @nr - number of tests
+ * @fmt - format string to be used for suite name
+ */
+int lkl_test_run(const struct lkl_test *tests, int nr, const char *fmt, ...);
+
+/**
+ * lkl_test_log - store a string in the test log buffer
+ * @str - the string to log (need not be NUL-terminated)
+ * @len - the string length
+ */
+void lkl_test_log(const char *str, int len);
+
+/**
+ * lkl_test_logf - printf like function to store into the test log buffer
+ * @fmt - printf format string
+ * @vargs - arguments to the format string
+ */
+int lkl_test_logf(const char *fmt, ...) __attribute__((format(printf, 1, 2)));
+
+/**
+ * LKL_TEST_CALL - create a test function for an LKL call
+ *
+ * The test function will be named lkl_test_@name and will return
+ * TEST_SUCCESS if the called function returns @expect. Otherwise
+ * it will return TEST_FAILURE.
+ *
+ * @name - test name; must be unique because it is part of the
+ * test function name; the test function will be named lkl_test_@name
+ * @call - function to call
+ * @expect - expected return value for success
+ * @args - arguments to pass to the LKL call
+ */
+#define LKL_TEST_CALL(name, call, expect, ...)				\
+	static int lkl_test_##name(void)				\
+	{								\
+		long ret;						\
+									\
+		ret = call(__VA_ARGS__);				\
+		lkl_test_logf("%s(%s) = %ld %s\n", #call, #__VA_ARGS__, \
+			ret, ret < 0 ? lkl_strerror(ret) : "");		\
+		return (ret == expect) ? TEST_SUCCESS : TEST_FAILURE;	\
+	}
+
+
+#endif /* _LKL_TEST_H */
diff --git a/tools/lkl/tests/test.sh b/tools/lkl/tests/test.sh
new file mode 100644
index 000000000000..1a5619aed735
--- /dev/null
+++ b/tools/lkl/tests/test.sh
@@ -0,0 +1,179 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+basedir=$(cd $script_dir/..; pwd)
+source ${script_dir}/autoconf.sh
+
+TEST_SUCCESS=0
+TEST_FAILURE=1
+TEST_SKIP=113
+TEST_TODO=114
+TEST_BAILOUT=115
+
+print_log()
+{
+    echo " log: |"
+    while read line; do
+        echo "  $line"
+    done < $1
+}
+
+export_vars()
+{
+    if [ -z "$var_file" ]; then
+        return
+    fi
+
+    for i in $@; do
+        echo "$i=${!i}" >> $var_file
+    done
+}
+
+lkl_test_run()
+{
+    log_file=$(mktemp)
+    export var_file=$(mktemp)
+
+    tid=$1 && shift && tname=$@
+
+    echo "* $tid $tname"
+
+    start=$(date '+%s%9N')
+    # run in a separate shell to avoid -e terminating us
+    $@ 2>&1 | strings >$log_file
+    exit=${PIPESTATUS[0]}
+    stop=$(date '+%s%9N')
+
+    case $exit in
+    $TEST_SUCCESS)
+        echo "ok $tid $tname"
+        ;;
+    $TEST_SKIP)
+        echo "ok $tid $tname # SKIP"
+        ;;
+    $TEST_BAILOUT)
+        echo "not ok $tid $tname"
+        echo "Bail out!"
+        ;;
+    $TEST_FAILURE|*)
+        echo "not ok $tid $tname"
+        ;;
+    esac
+
+    delta=$(((stop-start)/1000))
+
+    echo " ---"
+    echo " time_us: $delta"
+    print_log $log_file
+    echo -e " ..."
+
+    rm $log_file
+    . $var_file
+    rm $var_file
+
+    return $exit
+}
+
+lkl_test_plan()
+{
+    echo "1..$1 # $2"
+    export suite_name="${2// /\-}"
+}
+
+lkl_test_exec()
+{
+    local SUDO=""
+    local WRAPPER=""
+
+    if [ "$1" = "sudo" ]; then
+        SUDO=sudo
+        shift
+    fi
+
+    local file=$1
+    shift
+
+    if [ -n "$LKL_HOST_CONFIG_NT" ]; then
+        file=$file.exe
+    fi
+
+    if file $file | grep ARM; then
+        WRAPPER="qemu-arm-static"
+    elif file $file | grep "FreeBSD" ; then
+        ssh_copy "$file" $BSD_WDIR
+        if [ -n "$SUDO" ]; then
+            SUDO=""
+        fi
+        WRAPPER="$MYSSH $SU"
+        # ssh would mangle pipes ('|'), so escape the pipe char.
+        args="${@//\|/\\\|}"
+        set - $BSD_WDIR/$(basename $file) $args
+        file=""
+    elif [ -n "$GDB" ]; then
+        WRAPPER="gdb"
+        args="$@"
+        set - -ex "run $args" -ex quit $file
+        file=""
+    elif [ -n "$VALGRIND" ]; then
+        WRAPPER="valgrind --suppressions=$script_dir/valgrind.supp \
+                  --leak-check=full --show-leak-kinds=all --xml=yes \
+                  --xml-file=valgrind-$suite_name.xml"
+    fi
+
+    $SUDO $WRAPPER $file "$@"
+}
+
+lkl_test_cmd()
+{
+    local WRAPPER=""
+
+    if [ -z "$QUIET" ]; then
+        SHOPTS="-x"
+    fi
+
+    if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
+        WRAPPER="$MYSSH $SU"
+    fi
+
+    echo "$@" | $WRAPPER sh $SHOPTS
+}
+
+# XXX: $MYSSH and $MYSCP are defined in a circleci docker image.
+# see the definitions in lkl/lkl-docker:circleci/freebsd11/Dockerfile
+ssh_push()
+{
+    while [ -n "$1" ]; do
+        if [[ "$1" = *.sh ]]; then
+            type="script"
+        else
+            type="file"
+        fi
+
+        dir=$(dirname $1)
+        $MYSSH mkdir -p $BSD_WDIR/$dir
+
+        $MYSCP -P 7722 -r $basedir/$1 root@localhost:$BSD_WDIR/$dir
+        if [ "$type" = "script" ]; then
+            $MYSSH chmod a+x $BSD_WDIR/$1
+        fi
+
+        shift
+    done
+}
+
+ssh_copy()
+{
+    $MYSCP -P 7722 -r $1 root@localhost:$2
+}
+
+lkl_test_bsd_cleanup()
+{
+    $MYSSH rm -rf $BSD_WDIR
+}
+
+if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
+    trap lkl_test_bsd_cleanup EXIT
+    export BSD_WDIR=/root/lkl
+    $MYSSH mkdir -p $BSD_WDIR
+fi
diff --git a/tools/lkl/tests/valgrind.supp b/tools/lkl/tests/valgrind.supp
new file mode 100644
index 000000000000..5ce717d759fc
--- /dev/null
+++ b/tools/lkl/tests/valgrind.supp
@@ -0,0 +1,85 @@
+{
+   <unfinished timer 1>
+   Memcheck:Leak
+   match-leak-kinds: possible
+   ...
+   fun:pthread_create@@GLIBC_2.2.5
+   fun:__start_helper_thread
+   fun:__pthread_once_slow
+   fun:timer_create@@GLIBC_2.3.3
+   fun:timer_alloc
+   fun:clockevent_set_state_oneshot
+   ...
+   fun:__clockevents_switch_state
+   fun:clockevents_switch_state
+   fun:tick_setup_periodic
+   ...
+}
+
+{
+   <pid1 kernel thread>
+   Memcheck:Leak
+   match-leak-kinds: possible
+   ...
+   fun:thread_create
+   fun:copy_thread
+   fun:copy_thread_tls
+   ...
+   fun:rest_init
+   fun:start_kernel
+   fun:lkl_run_kernel
+}
+
+{
+   <xfs uninitialized buf error: delete this once upstream is fixed>
+   Memcheck:Value8
+   fun:crc32_body
+   fun:crc32_le_generic
+   fun:__crc32c_le
+   fun:chksum_update
+   fun:crypto_shash_update
+   fun:crc32c
+   fun:xlog_cksum
+}
+
+{
+   <xfs pwrite64 issue: delete this once upstream is fixed>
+   Memcheck:Param
+   pwrite64(buf)
+   ...
+   fun:blk_request
+   fun:blk_enqueue
+   fun:virtio_process_one
+   fun:virtio_process_queue
+   fun:virtio_write
+   fun:__raw_writel
+   fun:writel
+   fun:vm_notify
+   fun:virtqueue_notify
+   fun:virtio_queue_rq
+   fun:blk_mq_dispatch_rq_list
+   fun:blk_mq_sched_dispatch_requests
+}
+
+{
+   <virtio_net_pipe xmits>
+   Memcheck:Param
+   writev(vector[...])
+   ...
+   fun:fd_net_tx
+   fun:net_enqueue
+   fun:virtio_process_one
+   fun:virtio_process_queue
+   fun:virtio_write
+   fun:__raw_writel
+   fun:writel
+   fun:vm_notify
+   fun:virtqueue_notify
+   fun:virtqueue_kick
+   fun:start_xmit
+   fun:__netdev_start_xmit
+   fun:netdev_start_xmit
+   fun:xmit_one
+   fun:dev_hard_start_xmit
+   fun:sch_direct_xmit
+}
\ No newline at end of file
diff --git a/tools/lkl/tests/valgrind2xunit.py b/tools/lkl/tests/valgrind2xunit.py
new file mode 100755
index 000000000000..ab7c12b83377
--- /dev/null
+++ b/tools/lkl/tests/valgrind2xunit.py
@@ -0,0 +1,69 @@
+#!/usr/bin/env python
+# SPDX-License-Identifier: GPL-2.0
+
+##
+## Downloaded from
+## http://humdi.net/wiki/tips/valgrind-to-xunit-xml-converter
+##
+
+import xml.etree.ElementTree as ET
+import sys
+import os
+
+fname = 'valgrind.xml'
+if len(sys.argv) > 1:
+    fname = sys.argv[1]
+
+doc = ET.parse(fname)
+errors = doc.findall('.//error')
+
+out = open(os.path.splitext(os.path.basename(fname))[0]+'_xunit.xml',"w")
+out.write('<?xml version="1.0" encoding="UTF-8"?>\n')
+out.write('<testsuite name="valgrind" tests="'+str(len(errors))+'" errors="0" failures="'+str(len(errors))+'" skip="0">\n')
+errorcount=0
+for error in errors:
+    errorcount=errorcount+1
+
+    kind = error.find('kind')
+    what = error.find('what')
+    if what is None:
+        what = error.find('xwhat/text')
+
+    stack = error.find('stack')
+    frames = stack.findall('frame')
+
+    for frame in frames:
+        fi = frame.find('file')
+        li = frame.find('line')
+        if fi != None and li != None:
+            break
+
+    if fi != None and li != None:
+        out.write('    <testcase classname="ValgrindMemoryCheck" name="Memory check '+str(errorcount)+' ('+kind.text+', '+fi.text+':'+li.text+')" time="0">\n')
+    else:
+        out.write('    <testcase classname="ValgrindMemoryCheck" name="Memory check '+str(errorcount)+' ('+kind.text+')" time="0">\n')
+    out.write('        <error type="'+kind.text+'">\n')
+    out.write('  '+what.text+'\n\n')
+
+    for frame in frames:
+        ip = frame.find('ip')
+        fn = frame.find('fn')
+        fi = frame.find('file')
+        li = frame.find('line')
+
+        if fn is None:
+            bodytext = '(unresolved symbol)'
+        else:
+            bodytext = fn.text
+        bodytext = bodytext.replace("&","&amp;")
+        bodytext = bodytext.replace("<","&lt;")
+        bodytext = bodytext.replace(">","&gt;")
+        if fi != None and li != None:
+            out.write('  '+ip.text+': '+bodytext+' ('+fi.text+':'+li.text+')\n')
+        else:
+            out.write('  '+ip.text+': '+bodytext+'\n')
+
+    out.write('        </error>\n')
+    out.write('    </testcase>\n')
+out.write('</testsuite>\n')
+out.close()
-- 
2.20.1 (Apple Git-117)

* [RFC v2 21/37] lkl tools: "boot" test
@ 2019-11-08  5:02     ` Hajime Tazaki
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: linux-arch, Patrick Collins, Conrad Meyer, Octavian Purdila,
	Motomu Utsumi, Lai Jiangshan, Akira Moroo, Petros Angelatos,
	Yuan Liu, Thomas Liebetraut, Mark Stillwell, David Disseldorp,
	linux-kernel-library, Luca Dariz, Hajime Tazaki

From: Octavian Purdila <tavi.purdila@gmail.com>

Add a simple LKL test application that starts the kernel and performs
simple tests that minimally exercise the LKL API.

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Luca Dariz <luca.dariz@gmail.com>
Signed-off-by: Mark Stillwell <mark@stillwell.me>
Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Thomas Liebetraut <thomas@tommie-lie.de>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
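Note for reviewers unfamiliar with the output format: the boot test prints
a TAP-style stream, i.e. a "1..N # suite" plan line, then "ok" / "not ok"
result lines, each followed by a YAML-ish block delimited by " ---" and
" ...". Below is a minimal Python sketch of how such a stream could be
summarized. The name summarize_tap and the sample text are illustrative
assumptions and are not part of this patch; the regular expressions are
deliberately simpler than the ones in tap13.py.

```python
import re

# Simplified patterns (assumption); tap13.py uses more permissive ones.
RE_PLAN = re.compile(r"^1\.\.(?P<end>\d+)\s*(#\s*(?P<name>.*))?$")
RE_RESULT = re.compile(
    r"^(?P<result>not ok|ok)\s+(?P<id>\d+)\s+(?P<desc>[^#]*?)"
    r"\s*(#\s*(?P<directive>SKIP|TODO))?\s*$"
)


def summarize_tap(text):
    """Return (suite_name, planned, counts) for a TAP-style stream."""
    name, planned = None, 0
    counts = {"ok": 0, "not ok": 0, "skip": 0}
    in_yaml = False
    for raw in text.splitlines():
        line = raw.strip()
        if in_yaml:
            # Everything up to the closing "..." belongs to the YAML-ish block.
            if line == "...":
                in_yaml = False
            continue
        if line == "---":
            in_yaml = True
            continue
        m = RE_PLAN.match(line)
        if m:
            planned = int(m.group("end"))
            name = m.group("name")
            continue
        m = RE_RESULT.match(line)
        if m:
            if m.group("directive") == "SKIP":
                counts["skip"] += 1
            else:
                counts[m.group("result")] += 1
    return name, planned, counts


# Hypothetical sample in the shape printed by lkl_test_run() in test.c.
SAMPLE = """\
1..3 # boot
* 1 mutex
ok 1 mutex
 ---
 time_us: 12
 log: |
 ...
* 2 nanosleep
ok 2 nanosleep # SKIP
 ---
 time_us: 3
 log: |
 ...
* 3 getpid
not ok 3 getpid
 ---
 time_us: 5
 log: |
  lkl_sys_getpid() = 2
 ...
"""

print(summarize_tap(SAMPLE))
```

Lines starting with "*" only announce that a test is about to run, so the
sketch ignores them and counts the "ok" / "not ok" lines alone.
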
 tools/lkl/.gitignore              |   5 +
 tools/lkl/Makefile                |   5 +-
 tools/lkl/Targets                 |   3 +
 tools/lkl/tests/Build             |   3 +
 tools/lkl/tests/boot.c            | 562 ++++++++++++++++++++++++++++++
 tools/lkl/tests/boot.sh           |   9 +
 tools/lkl/tests/cla.c             | 159 +++++++++
 tools/lkl/tests/cla.h             |  33 ++
 tools/lkl/tests/disk.c            | 189 ++++++++++
 tools/lkl/tests/disk.sh           |  61 ++++
 tools/lkl/tests/run.py            | 182 ++++++++++
 tools/lkl/tests/tap13.py          | 209 +++++++++++
 tools/lkl/tests/test.c            | 126 +++++++
 tools/lkl/tests/test.h            |  72 ++++
 tools/lkl/tests/test.sh           | 179 ++++++++++
 tools/lkl/tests/valgrind.supp     |  85 +++++
 tools/lkl/tests/valgrind2xunit.py |  69 ++++
 17 files changed, 1950 insertions(+), 1 deletion(-)
 create mode 100644 tools/lkl/tests/Build
 create mode 100644 tools/lkl/tests/boot.c
 create mode 100755 tools/lkl/tests/boot.sh
 create mode 100644 tools/lkl/tests/cla.c
 create mode 100644 tools/lkl/tests/cla.h
 create mode 100644 tools/lkl/tests/disk.c
 create mode 100755 tools/lkl/tests/disk.sh
 create mode 100755 tools/lkl/tests/run.py
 create mode 100644 tools/lkl/tests/tap13.py
 create mode 100644 tools/lkl/tests/test.c
 create mode 100644 tools/lkl/tests/test.h
 create mode 100644 tools/lkl/tests/test.sh
 create mode 100644 tools/lkl/tests/valgrind.supp
 create mode 100755 tools/lkl/tests/valgrind2xunit.py
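
Note: test.c keeps kernel and test output in a fixed-size circular log
buffer and dumps it in the "log:" field after each test. The intended
behaviour (oldest data is dropped once the buffer is full) can be modelled
with the following Python sketch. The class name RingLog and the tiny
capacity are assumptions chosen for illustration; the C code uses a static
0x10000-byte array and raw pointers instead.

```python
class RingLog:
    """Fixed-capacity character log that drops the oldest data when full."""

    def __init__(self, capacity):
        self.buf = [""] * capacity
        self.head = 0  # index of the oldest stored character
        self.tail = 0  # index where the next character is written

    def _advance(self, idx):
        # Wrap to 0 so the index always stays inside the buffer.
        return (idx + 1) % len(self.buf)

    def write(self, text):
        for ch in text:
            self.buf[self.tail] = ch
            self.tail = self._advance(self.tail)
            if self.tail == self.head:
                # Buffer full: discard the oldest character.
                self.head = self._advance(self.head)

    def drain(self):
        """Return the stored text, oldest first, and empty the log."""
        out = []
        while self.head != self.tail:
            out.append(self.buf[self.head])
            self.head = self._advance(self.head)
        return "".join(out)


log = RingLog(8)
log.write("0123456789")
print(log.drain())  # prints "3456789"
```

Because head == tail is reserved to mean "empty", a buffer of capacity N
holds at most N - 1 characters, which is why only seven of the ten written
characters survive in the example.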

diff --git a/tools/lkl/.gitignore b/tools/lkl/.gitignore
index 1aed58bfe171..4e08254dbd46 100644
--- a/tools/lkl/.gitignore
+++ b/tools/lkl/.gitignore
@@ -2,3 +2,8 @@ Makefile.conf
 include/lkl_autoconf.h
 tests/autoconf.sh
 bin/stat
+tests/net-test
+tests/disk
+tests/boot
+tests/valgrind*.xml
+*.pyc
diff --git a/tools/lkl/Makefile b/tools/lkl/Makefile
index 6d6d2cead03f..9a55df5064e4 100644
--- a/tools/lkl/Makefile
+++ b/tools/lkl/Makefile
@@ -116,8 +116,11 @@ programs_install: $(progs-y:%=$(OUTPUT)%$(EXESUF))
 install: headers_install libraries_install programs_install
 
 
+run-tests:
+	./tests/run.py $(tests)
+
 FORCE: ;
-.PHONY: all clean FORCE
+.PHONY: all clean FORCE run-tests
 .PHONY: headers_install libraries_install programs_install install
 .NOTPARALLEL : lib/lkl.o
 .SECONDARY:
diff --git a/tools/lkl/Targets b/tools/lkl/Targets
index 24c985e64638..a9f74c3cc8fb 100644
--- a/tools/lkl/Targets
+++ b/tools/lkl/Targets
@@ -1,3 +1,6 @@
 libs-y += lib/liblkl
 
+progs-y += tests/boot
+progs-y += tests/disk
+progs-y += tests/net-test
 
diff --git a/tools/lkl/tests/Build b/tools/lkl/tests/Build
new file mode 100644
index 000000000000..ace86a3d3438
--- /dev/null
+++ b/tools/lkl/tests/Build
@@ -0,0 +1,3 @@
+boot-y += boot.o test.o
+disk-y += disk.o cla.o test.o
+net-test-y += net-test.o cla.o test.o
diff --git a/tools/lkl/tests/boot.c b/tools/lkl/tests/boot.c
new file mode 100644
index 000000000000..b021e9540147
--- /dev/null
+++ b/tools/lkl/tests/boot.c
@@ -0,0 +1,562 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <unistd.h>
+#include <string.h>
+#include <time.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <lkl.h>
+#include <lkl_host.h>
+
+#include <sys/stat.h>
+#include <fcntl.h>
+#if defined(__FreeBSD__)
+#include <net/if.h>
+#include <sys/ioctl.h>
+#elif __linux
+#include <sys/epoll.h>
+#include <sys/ioctl.h>
+#elif __MINGW32__
+#include <windows.h>
+#endif
+
+#include "test.h"
+
+#ifndef __MINGW32__
+#define sleep_ns 87654321
+int lkl_test_nanosleep(void)
+{
+	struct lkl_timespec ts = {
+		.tv_sec = 0,
+		.tv_nsec = sleep_ns,
+	};
+	struct timespec start, stop;
+	long delta;
+	long ret;
+
+	clock_gettime(CLOCK_MONOTONIC, &start);
+	ret = lkl_sys_nanosleep((struct __lkl__kernel_timespec *)&ts, NULL);
+	clock_gettime(CLOCK_MONOTONIC, &stop);
+
+	delta = 1e9*(stop.tv_sec - start.tv_sec) +
+		(stop.tv_nsec - start.tv_nsec);
+
+	lkl_test_logf("sleep %ld, expected sleep %d\n", delta, sleep_ns);
+
+	if (ret == 0 && delta > sleep_ns * 0.9)
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+#endif
+
+LKL_TEST_CALL(getpid, lkl_sys_getpid, 1)
+
+void check_latency(long (*f)(void), long *min, long *max, long *avg)
+{
+	int i;
+	unsigned long long start, stop, sum = 0;
+	static const int count = 1000;
+	long delta;
+
+	*min = 1000000000;
+	*max = -1;
+
+	for (i = 0; i < count; i++) {
+		start = lkl_host_ops.time();
+		f();
+		stop = lkl_host_ops.time();
+		delta = stop - start;
+		if (*min > delta)
+			*min = delta;
+		if (*max < delta)
+			*max = delta;
+		sum += delta;
+	}
+	*avg = sum / count;
+}
+
+static long native_getpid(void)
+{
+#ifdef __MINGW32__
+	GetCurrentProcessId();
+#else
+	getpid();
+#endif
+	return 0;
+}
+
+int lkl_test_syscall_latency(void)
+{
+	long min, max, avg;
+
+	lkl_test_logf("avg/min/max: ");
+
+	check_latency(lkl_sys_getpid, &min, &max, &avg);
+
+	lkl_test_logf("lkl:%ld/%ld/%ld ", avg, min, max);
+
+	check_latency(native_getpid, &min, &max, &avg);
+
+	lkl_test_logf("native:%ld/%ld/%ld\n", avg, min, max);
+
+	return TEST_SUCCESS;
+}
+
+#define access_rights 0721
+
+LKL_TEST_CALL(creat, lkl_sys_creat, 0, "/file", access_rights)
+LKL_TEST_CALL(close, lkl_sys_close, 0, 0);
+LKL_TEST_CALL(failopen, lkl_sys_open, -LKL_ENOENT, "/file2", 0, 0);
+LKL_TEST_CALL(umask, lkl_sys_umask, 022,  0777);
+LKL_TEST_CALL(umask2, lkl_sys_umask, 0777, 0);
+LKL_TEST_CALL(open, lkl_sys_open, 0, "/file", LKL_O_RDWR, 0);
+static const char wrbuf[] = "test";
+LKL_TEST_CALL(write, lkl_sys_write, sizeof(wrbuf), 0, wrbuf, sizeof(wrbuf));
+LKL_TEST_CALL(lseek_cur, lkl_sys_lseek, sizeof(wrbuf), 0, 0, LKL_SEEK_CUR);
+LKL_TEST_CALL(lseek_end, lkl_sys_lseek, sizeof(wrbuf), 0, 0, LKL_SEEK_END);
+LKL_TEST_CALL(lseek_set, lkl_sys_lseek, 0, 0, 0, LKL_SEEK_SET);
+
+int lkl_test_read(void)
+{
+	char buf[10] = { 0, };
+	long ret;
+
+	ret = lkl_sys_read(0, buf, sizeof(buf));
+
+	lkl_test_logf("lkl_sys_read=%ld buf=%s\n", ret, buf);
+
+	if (ret == sizeof(wrbuf) && !strcmp(wrbuf, buf))
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+int lkl_test_fstat(void)
+{
+	struct lkl_stat stat;
+	long ret;
+
+	ret = lkl_sys_fstat(0, &stat);
+
+	lkl_test_logf("lkl_sys_fstat=%ld mode=%o size=%zd\n", ret, stat.st_mode,
+		      stat.st_size);
+
+	if (ret == 0 && stat.st_size == sizeof(wrbuf) &&
+	    stat.st_mode == (access_rights | LKL_S_IFREG))
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+LKL_TEST_CALL(mkdir, lkl_sys_mkdir, 0, "/mnt", access_rights)
+
+int lkl_test_stat(void)
+{
+	struct lkl_stat stat;
+	long ret;
+
+	ret = lkl_sys_stat("/mnt", &stat);
+
+	lkl_test_logf("lkl_sys_stat(\"/mnt\")=%ld mode=%o\n", ret,
+		      stat.st_mode);
+
+	if (ret == 0 && stat.st_mode == (access_rights | LKL_S_IFDIR))
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+static int lkl_test_pipe2(void)
+{
+	int pipe_fds[2];
+	int READ_IDX = 0, WRITE_IDX = 1;
+	static const char msg[] = "Hello world!";
+	char str[20];
+	int msg_len_bytes = strlen(msg) + 1;
+	int cmp_res;
+	long ret;
+
+	ret = lkl_sys_pipe2(pipe_fds, LKL_O_NONBLOCK);
+	if (ret) {
+		lkl_test_logf("pipe2: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_write(pipe_fds[WRITE_IDX], msg, msg_len_bytes);
+	if (ret != msg_len_bytes) {
+		if (ret < 0)
+			lkl_test_logf("write error: %s\n", lkl_strerror(ret));
+		else
+			lkl_test_logf("short write: %ld\n", ret);
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_read(pipe_fds[READ_IDX], str, msg_len_bytes);
+	if (ret != msg_len_bytes) {
+		if (ret < 0)
+			lkl_test_logf("read error: %s\n", lkl_strerror(ret));
+		else
+			lkl_test_logf("short read: %ld\n", ret);
+		return TEST_FAILURE;
+	}
+
+	cmp_res = memcmp(msg, str, msg_len_bytes);
+	if (cmp_res) {
+		lkl_test_logf("memcmp failed: %d\n", cmp_res);
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_close(pipe_fds[0]);
+	if (ret) {
+		lkl_test_logf("close error: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_close(pipe_fds[1]);
+	if (ret) {
+		lkl_test_logf("close error: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static int lkl_test_epoll(void)
+{
+	int epoll_fd, pipe_fds[2];
+	int READ_IDX = 0, WRITE_IDX = 1;
+	struct lkl_epoll_event wait_on, read_result;
+	static const char msg[] = "Hello world!";
+	long ret;
+
+	memset(&wait_on, 0, sizeof(wait_on));
+	memset(&read_result, 0, sizeof(read_result));
+
+	ret = lkl_sys_pipe2(pipe_fds, LKL_O_NONBLOCK);
+	if (ret) {
+		lkl_test_logf("pipe2 error: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	epoll_fd = lkl_sys_epoll_create(1);
+	if (epoll_fd < 0) {
+		lkl_test_logf("epoll_create error: %s\n", lkl_strerror(epoll_fd));
+		return TEST_FAILURE;
+	}
+
+	wait_on.events = LKL_POLLIN | LKL_POLLOUT;
+	wait_on.data = pipe_fds[READ_IDX];
+
+	ret = lkl_sys_epoll_ctl(epoll_fd, LKL_EPOLL_CTL_ADD, pipe_fds[READ_IDX],
+				&wait_on);
+	if (ret < 0) {
+		lkl_test_logf("epoll_ctl error: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	/* Shouldn't be ready before we have written something */
+	ret = lkl_sys_epoll_wait(epoll_fd, &read_result, 1, 0);
+	if (ret != 0) {
+		if (ret < 0)
+			lkl_test_logf("epoll_wait error: %s\n",
+				      lkl_strerror(ret));
+		else
+			lkl_test_logf("epoll_wait: bad event: 0x%lx\n", ret);
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_write(pipe_fds[WRITE_IDX], msg, strlen(msg) + 1);
+	if (ret < 0) {
+		lkl_test_logf("write error: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	/* We expect exactly 1 fd to be ready immediately */
+	ret = lkl_sys_epoll_wait(epoll_fd, &read_result, 1, 0);
+	if (ret != 1) {
+		if (ret < 0)
+			lkl_test_logf("epoll_wait error: %s\n",
+				      lkl_strerror(ret));
+		else
+			lkl_test_logf("epoll_wait: bad ev no %ld\n", ret);
+		return TEST_FAILURE;
+	}
+
+	/* Already tested reading from pipe2 so no need to do it
+	 * here
+	 */
+
+	return TEST_SUCCESS;
+}
+
+LKL_TEST_CALL(chdir_proc, lkl_sys_chdir, 0, "proc");
+
+static int dir_fd;
+
+static int lkl_test_open_cwd(void)
+{
+	dir_fd = lkl_sys_open(".", LKL_O_RDONLY | LKL_O_DIRECTORY, 0);
+	if (dir_fd < 0) {
+		lkl_test_logf("failed to open current directory: %s\n",
+			      lkl_strerror(dir_fd));
+		return TEST_FAILURE;
+	}
+
+	return TEST_SUCCESS;
+}
+
+/* column where to insert a line break for the list file tests below. */
+#define COL_LINE_BREAK 70
+
+static int lkl_test_getdents64(void)
+{
+	long ret;
+	char buf[1024], *pos;
+	struct lkl_linux_dirent64 *de;
+	int wr;
+
+	de = (struct lkl_linux_dirent64 *)buf;
+	ret = lkl_sys_getdents64(dir_fd, de, sizeof(buf));
+
+	wr = lkl_test_logf("%d ", dir_fd);
+
+	if (ret < 0)
+		return TEST_FAILURE;
+
+	for (pos = buf; pos - buf < ret; pos += de->d_reclen) {
+		de = (struct lkl_linux_dirent64 *)pos;
+
+		wr += lkl_test_logf("%s ", de->d_name);
+		if (wr >= COL_LINE_BREAK) {
+			lkl_test_logf("\n");
+			wr = 0;
+		}
+	}
+
+	return TEST_SUCCESS;
+}
+
+LKL_TEST_CALL(close_dir_fd, lkl_sys_close, 0, dir_fd);
+LKL_TEST_CALL(chdir_root, lkl_sys_chdir, 0, "/");
+LKL_TEST_CALL(mount_fs_proc, lkl_mount_fs, 0, "proc");
+LKL_TEST_CALL(umount_fs_proc, lkl_umount_timeout, 0, "proc", 0, 1000);
+LKL_TEST_CALL(lo_ifup, lkl_if_up, 0, 1);
+
+static int lkl_test_mutex(void)
+{
+	long ret = TEST_SUCCESS;
+	/*
+	 * Can't do much to verify that this works, so we'll just let Valgrind
+	 * warn us on CI if we've made bad memory accesses.
+	 */
+
+	struct lkl_mutex *mutex;
+
+	mutex = lkl_host_ops.mutex_alloc(0);
+	lkl_host_ops.mutex_lock(mutex);
+	lkl_host_ops.mutex_unlock(mutex);
+	lkl_host_ops.mutex_free(mutex);
+
+	mutex = lkl_host_ops.mutex_alloc(1);
+	lkl_host_ops.mutex_lock(mutex);
+	lkl_host_ops.mutex_lock(mutex);
+	lkl_host_ops.mutex_unlock(mutex);
+	lkl_host_ops.mutex_unlock(mutex);
+	lkl_host_ops.mutex_free(mutex);
+
+	return ret;
+}
+
+static int lkl_test_semaphore(void)
+{
+	long ret = TEST_SUCCESS;
+	/*
+	 * Can't do much to verify that this works, so we'll just let Valgrind
+	 * warn us on CI if we've made bad memory accesses.
+	 */
+
+	struct lkl_sem *sem = lkl_host_ops.sem_alloc(1);
+
+	lkl_host_ops.sem_down(sem);
+	lkl_host_ops.sem_up(sem);
+	lkl_host_ops.sem_free(sem);
+
+	return ret;
+}
+
+static int lkl_test_gettid(void)
+{
+	long tid = lkl_host_ops.gettid();
+
+	lkl_test_logf("%ld", tid);
+
+	/* As far as I know, thread IDs are non-zero on all reasonable
+	 * systems.
+	 */
+	if (tid)
+		return TEST_SUCCESS;
+	else
+		return TEST_FAILURE;
+}
+
+static void test_thread(void *data)
+{
+	int *pipe_fds = (int *) data;
+	char tmp[LKL_PIPE_BUF+1];
+	int ret;
+
+	ret = lkl_sys_read(pipe_fds[0], tmp, sizeof(tmp));
+	if (ret < 0)
+		lkl_test_logf("%s: %s\n", __func__, lkl_strerror(ret));
+}
+
+static int lkl_test_syscall_thread(void)
+{
+	int pipe_fds[2];
+	char tmp[LKL_PIPE_BUF+1];
+	long ret;
+	lkl_thread_t tid;
+
+	ret = lkl_sys_pipe2(pipe_fds, 0);
+	if (ret) {
+		lkl_test_logf("pipe2: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_fcntl(pipe_fds[0], LKL_F_SETPIPE_SZ, 1);
+	if (ret < 0) {
+		lkl_test_logf("fcntl setpipe_sz: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	tid = lkl_host_ops.thread_create(test_thread, pipe_fds);
+	if (!tid) {
+		lkl_test_logf("failed to create thread\n");
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_write(pipe_fds[1], tmp, sizeof(tmp));
+	if (ret != sizeof(tmp)) {
+		if (ret < 0)
+			lkl_test_logf("write error: %s\n", lkl_strerror(ret));
+		else
+			lkl_test_logf("short write: %ld\n", ret);
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_host_ops.thread_join(tid);
+	if (ret) {
+		lkl_test_logf("failed to join thread\n");
+		return TEST_FAILURE;
+	}
+
+	return TEST_SUCCESS;
+}
+
+#ifndef __MINGW32__
+static void thread_get_pid(void *unused)
+{
+	lkl_sys_getpid();
+}
+
+static int lkl_test_many_syscall_threads(void)
+{
+	lkl_thread_t tid;
+	int count = 65, ret;
+
+	while (--count > 0) {
+		tid = lkl_host_ops.thread_create(thread_get_pid, NULL);
+		if (!tid) {
+			lkl_test_logf("failed to create thread\n");
+			return TEST_FAILURE;
+		}
+
+		ret = lkl_host_ops.thread_join(tid);
+		if (ret) {
+			lkl_test_logf("failed to join thread\n");
+			return TEST_FAILURE;
+		}
+	}
+
+	return TEST_SUCCESS;
+}
+#endif
+
+static void thread_quit_immediately(void *unused)
+{
+}
+
+static int lkl_test_join(void)
+{
+	lkl_thread_t tid = lkl_host_ops.thread_create(thread_quit_immediately,
+						      NULL);
+	int ret = lkl_host_ops.thread_join(tid);
+
+	if (ret == 0) {
+		lkl_test_logf("joined %ld\n", tid);
+		return TEST_SUCCESS;
+	}
+
+	lkl_test_logf("failed joining %ld\n", tid);
+	return TEST_FAILURE;
+}
+
+LKL_TEST_CALL(start_kernel, lkl_start_kernel, 0, &lkl_host_ops,
+	     "mem=16M loglevel=8");
+LKL_TEST_CALL(stop_kernel, lkl_sys_halt, 0);
+
+struct lkl_test tests[] = {
+	LKL_TEST(mutex),
+	LKL_TEST(semaphore),
+	LKL_TEST(join),
+	LKL_TEST(start_kernel),
+	LKL_TEST(getpid),
+	LKL_TEST(syscall_latency),
+	LKL_TEST(umask),
+	LKL_TEST(umask2),
+	LKL_TEST(creat),
+	LKL_TEST(close),
+	LKL_TEST(failopen),
+	LKL_TEST(open),
+	LKL_TEST(write),
+	LKL_TEST(lseek_cur),
+	LKL_TEST(lseek_end),
+	LKL_TEST(lseek_set),
+	LKL_TEST(read),
+	LKL_TEST(fstat),
+	LKL_TEST(mkdir),
+	LKL_TEST(stat),
+#ifndef __MINGW32__
+	LKL_TEST(nanosleep),
+#endif
+	LKL_TEST(pipe2),
+	LKL_TEST(epoll),
+	LKL_TEST(mount_fs_proc),
+	LKL_TEST(chdir_proc),
+	LKL_TEST(open_cwd),
+	LKL_TEST(getdents64),
+	LKL_TEST(close_dir_fd),
+	LKL_TEST(chdir_root),
+	LKL_TEST(umount_fs_proc),
+	LKL_TEST(lo_ifup),
+	LKL_TEST(gettid),
+	LKL_TEST(syscall_thread),
+	/*
+	 * Wine has an issue where the FlsCallback is not called when
+	 * the thread terminates which makes testing the automatic
+	 * syscall threads cleanup impossible under wine.
+	 */
+#ifndef __MINGW32__
+	LKL_TEST(many_syscall_threads),
+#endif
+	LKL_TEST(stop_kernel),
+};
+
+int main(int argc, const char **argv)
+{
+	lkl_host_ops.print = lkl_test_log;
+
+	return lkl_test_run(tests, sizeof(tests)/sizeof(struct lkl_test),
+			    "boot");
+}
diff --git a/tools/lkl/tests/boot.sh b/tools/lkl/tests/boot.sh
new file mode 100755
index 000000000000..d985c04b0ac1
--- /dev/null
+++ b/tools/lkl/tests/boot.sh
@@ -0,0 +1,9 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+source $script_dir/test.sh
+
+lkl_test_plan 1 "boot"
+lkl_test_run 1
+lkl_test_exec $script_dir/boot
diff --git a/tools/lkl/tests/cla.c b/tools/lkl/tests/cla.c
new file mode 100644
index 000000000000..a34badeb5f06
--- /dev/null
+++ b/tools/lkl/tests/cla.c
@@ -0,0 +1,159 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <string.h>
+#include <errno.h>
+#include <stdlib.h>
+#ifdef __MINGW32__
+#include <winsock2.h>
+#else
+#include <sys/socket.h>
+#include <netinet/in.h>
+#include <arpa/inet.h>
+#endif
+
+#include "cla.h"
+
+static int cl_arg_parse_bool(struct cl_arg *arg, const char *value)
+{
+	*((int *)arg->store) = 1;
+	return 0;
+}
+
+static int cl_arg_parse_str(struct cl_arg *arg, const char *value)
+{
+	*((const char **)arg->store) = value;
+	return 0;
+}
+
+static int cl_arg_parse_int(struct cl_arg *arg, const char *value)
+{
+	errno = 0;
+	*((int *)arg->store) = strtol(value, NULL, 0);
+	return errno ? -1 : 0;
+}
+
+static int cl_arg_parse_str_set(struct cl_arg *arg, const char *value)
+{
+	const char **set = arg->set;
+	int i;
+
+	for (i = 0; set[i] != NULL; i++) {
+		if (strcmp(set[i], value) == 0) {
+			*((int *)arg->store) = i;
+			return 0;
+		}
+	}
+
+	return -1;
+}
+
+static int cl_arg_parse_ipv4(struct cl_arg *arg, const char *value)
+{
+	unsigned int addr;
+
+	if (!value)
+		return -1;
+
+	addr = inet_addr(value);
+	if (addr == INADDR_NONE)
+		return -1;
+	*((unsigned int *)arg->store) = addr;
+	return 0;
+}
+
+static cl_arg_parser_t parsers[] = {
+	[CL_ARG_BOOL] = cl_arg_parse_bool,
+	[CL_ARG_INT] = cl_arg_parse_int,
+	[CL_ARG_STR] = cl_arg_parse_str,
+	[CL_ARG_STR_SET] = cl_arg_parse_str_set,
+	[CL_ARG_IPV4] = cl_arg_parse_ipv4,
+};
+
+static struct cl_arg *find_short_arg(char name, struct cl_arg *args)
+{
+	struct cl_arg *arg;
+
+	for (arg = args; arg->short_name != 0; arg++) {
+		if (arg->short_name == name)
+			return arg;
+	}
+
+	return NULL;
+}
+
+static struct cl_arg *find_long_arg(const char *name, struct cl_arg *args)
+{
+	struct cl_arg *arg;
+
+	for (arg = args; arg->long_name; arg++) {
+		if (strcmp(arg->long_name, name) == 0)
+			return arg;
+	}
+
+	return NULL;
+}
+
+static void print_help(struct cl_arg *args)
+{
+	struct cl_arg *arg;
+
+	fprintf(stderr, "usage:\n");
+	for (arg = args; arg->long_name; arg++) {
+		fprintf(stderr, "-%c, --%-20s %s", arg->short_name,
+			arg->long_name, arg->help);
+		if (arg->type == CL_ARG_STR_SET) {
+			const char **set = arg->set;
+
+			fprintf(stderr, " [ ");
+			while (*set != NULL)
+				fprintf(stderr, "%s ", *(set++));
+			fprintf(stderr, "]");
+		}
+		fprintf(stderr, "\n");
+	}
+}
+
+int parse_args(int argc, const char **argv, struct cl_arg *args)
+{
+	int i;
+
+	for (i = 1; i < argc; i++) {
+		struct cl_arg *arg = NULL;
+		cl_arg_parser_t parser;
+
+		if (argv[i][0] == '-') {
+			if (argv[i][1] != '-')
+				arg = find_short_arg(argv[i][1], args);
+			else
+				arg = find_long_arg(&argv[i][2], args);
+		}
+
+		if (!arg) {
+			fprintf(stderr, "unknown option '%s'\n", argv[i]);
+			print_help(args);
+			return -1;
+		}
+
+		if (arg->type == CL_ARG_USER || arg->type >= CL_ARG_END)
+			parser = arg->parser;
+		else
+			parser = parsers[arg->type];
+
+		if (!parser) {
+			fprintf(stderr, "can't parse --'%s'/-'%c'\n",
+				arg->long_name, arg->short_name);
+			return -1;
+		}
+
+		if (parser(arg, argv[i + 1]) < 0) {
+			fprintf(stderr, "can't parse '%s'\n", argv[i]);
+			print_help(args);
+			return -1;
+		}
+
+		if (arg->has_arg)
+			i++;
+	}
+
+	return 0;
+}
diff --git a/tools/lkl/tests/cla.h b/tools/lkl/tests/cla.h
new file mode 100644
index 000000000000..f8369be02e5a
--- /dev/null
+++ b/tools/lkl/tests/cla.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_TEST_CLA_H
+#define _LKL_TEST_CLA_H
+
+enum cl_arg_type {
+	CL_ARG_USER = 0,
+	CL_ARG_BOOL,
+	CL_ARG_INT,
+	CL_ARG_STR,
+	CL_ARG_STR_SET,
+	CL_ARG_IPV4,
+	CL_ARG_END,
+};
+
+struct cl_arg;
+
+typedef int (*cl_arg_parser_t)(struct cl_arg *arg, const char *value);
+
+struct cl_arg {
+	const char *long_name;
+	char short_name;
+	const char *help;
+	int has_arg;
+	enum cl_arg_type type;
+	void *store;
+	void *set;
+	cl_arg_parser_t parser;
+};
+
+int parse_args(int argc, const char **argv, struct cl_arg *args);
+
+
+#endif /* _LKL_TEST_CLA_H */
diff --git a/tools/lkl/tests/disk.c b/tools/lkl/tests/disk.c
new file mode 100644
index 000000000000..0aa039876b54
--- /dev/null
+++ b/tools/lkl/tests/disk.c
@@ -0,0 +1,189 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <unistd.h>
+#include <string.h>
+#include <time.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <lkl.h>
+#include <lkl_host.h>
+#ifndef __MINGW32__
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#else
+#include <windows.h>
+#endif
+
+#include "test.h"
+#include "cla.h"
+
+static struct {
+	int printk;
+	const char *disk;
+	const char *fstype;
+	int partition;
+} cla;
+
+struct cl_arg args[] = {
+	{"disk", 'd', "disk file to use", 1, CL_ARG_STR, &cla.disk},
+	{"partition", 'P', "partition to mount", 1, CL_ARG_INT, &cla.partition},
+	{"type", 't', "filesystem type", 1, CL_ARG_STR, &cla.fstype},
+	{0},
+};
+
+
+static struct lkl_disk disk;
+static int disk_id = -1;
+
+int lkl_test_disk_add(void)
+{
+#ifdef __MINGW32__
+	disk.handle = CreateFile(cla.disk, GENERIC_READ | GENERIC_WRITE,
+			       0, NULL, OPEN_EXISTING, 0, NULL);
+	if (disk.handle == INVALID_HANDLE_VALUE)
+#else
+	disk.fd = open(cla.disk, O_RDWR);
+	if (disk.fd < 0)
+#endif
+		goto out_unlink;
+
+	disk.ops = NULL;
+
+	disk_id = lkl_disk_add(&disk);
+	if (disk_id < 0)
+		goto out_close;
+
+	goto out;
+
+out_close:
+#ifdef __MINGW32__
+	CloseHandle(disk.handle);
+#else
+	close(disk.fd);
+#endif
+
+out_unlink:
+#ifdef __MINGW32__
+	DeleteFile(cla.disk);
+#else
+	unlink(cla.disk);
+#endif
+
+out:
+	lkl_test_logf("disk fd/handle %x disk_id %d", disk.fd, disk_id);
+
+	if (disk_id >= 0)
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+int lkl_test_disk_remove(void)
+{
+	int ret;
+
+	ret = lkl_disk_remove(disk);
+
+#ifdef __MINGW32__
+	CloseHandle(disk.handle);
+#else
+	close(disk.fd);
+#endif
+
+	if (ret == 0)
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+
+static char mnt_point[32];
+
+LKL_TEST_CALL(mount_dev, lkl_mount_dev, 0, disk_id, cla.partition, cla.fstype,
+	      0, NULL, mnt_point, sizeof(mnt_point))
+
+static int lkl_test_umount_dev(void)
+{
+	long ret, ret2;
+
+	ret = lkl_sys_chdir("/");
+
+	ret2 = lkl_umount_dev(disk_id, cla.partition, 0, 1000);
+
+	lkl_test_logf("%ld %ld", ret, ret2);
+
+	if (!ret && !ret2)
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+struct lkl_dir *dir;
+
+static int lkl_test_opendir(void)
+{
+	int err;
+
+	dir = lkl_opendir(mnt_point, &err);
+
+	lkl_test_logf("lkl_opendir(%s) = %d %s\n", mnt_point, err,
+		      lkl_strerror(err));
+
+	if (err == 0)
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+static int lkl_test_readdir(void)
+{
+	struct lkl_linux_dirent64 *de = lkl_readdir(dir);
+	int wr = 0;
+
+	while (de) {
+		wr += lkl_test_logf("%s ", de->d_name);
+		if (wr >= 70) {
+			lkl_test_logf("\n");
+			wr = 0;
+			break;
+		}
+		de = lkl_readdir(dir);
+	}
+
+	if (lkl_errdir(dir) == 0)
+		return TEST_SUCCESS;
+
+	return TEST_FAILURE;
+}
+
+LKL_TEST_CALL(closedir, lkl_closedir, 0, dir);
+LKL_TEST_CALL(chdir_mnt_point, lkl_sys_chdir, 0, mnt_point);
+LKL_TEST_CALL(start_kernel, lkl_start_kernel, 0, &lkl_host_ops,
+	     "mem=16M loglevel=8");
+LKL_TEST_CALL(stop_kernel, lkl_sys_halt, 0);
+
+struct lkl_test tests[] = {
+	LKL_TEST(disk_add),
+	LKL_TEST(start_kernel),
+	LKL_TEST(mount_dev),
+	LKL_TEST(chdir_mnt_point),
+	LKL_TEST(opendir),
+	LKL_TEST(readdir),
+	LKL_TEST(closedir),
+	LKL_TEST(umount_dev),
+	LKL_TEST(stop_kernel),
+	LKL_TEST(disk_remove),
+
+};
+
+int main(int argc, const char **argv)
+{
+	if (parse_args(argc, argv, args) < 0)
+		return -1;
+
+	lkl_host_ops.print = lkl_test_log;
+
+	return lkl_test_run(tests, sizeof(tests)/sizeof(struct lkl_test),
+			    "disk %s", cla.fstype);
+}
diff --git a/tools/lkl/tests/disk.sh b/tools/lkl/tests/disk.sh
new file mode 100755
index 000000000000..9bdcb16f2d5c
--- /dev/null
+++ b/tools/lkl/tests/disk.sh
@@ -0,0 +1,61 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+
+source $script_dir/test.sh
+
+function prepfs()
+{
+    set -e
+
+    file=`mktemp`
+
+    dd if=/dev/zero of=$file bs=1024 count=204800
+
+    yes | mkfs.$1 $file
+
+    if ! [ -z $BSD_WDIR ]; then
+        $MYSSH mkdir -p $BSD_WDIR
+        ssh_copy $file $BSD_WDIR
+        rm $file
+        file=$BSD_WDIR/$(basename $file)
+    fi
+
+    export_vars file
+}
+
+function cleanfs()
+{
+    set -e
+
+    if ! [ -z $BSD_WDIR ]; then
+        $MYSSH rm $1
+        $MYSSH rm $BSD_WDIR/disk
+    else
+        rm $1
+    fi
+}
+
+if [ "$1" = "-t" ]; then
+    shift
+    fstype=$1
+    shift
+fi
+
+if [ -z "$fstype" ]; then
+    fstype="ext4"
+fi
+
+if [ -z $(which mkfs.$fstype) ]; then
+    lkl_test_plan 0 "disk $fstype"
+    echo "no mkfs.$fstype command"
+    exit 0
+fi
+
+lkl_test_plan 1 "disk $fstype"
+lkl_test_run 1 prepfs $fstype
+lkl_test_exec $script_dir/disk -d $file -t $fstype $@
+lkl_test_plan 1 "disk $fstype"
+lkl_test_run 1 cleanfs $file
+
diff --git a/tools/lkl/tests/run.py b/tools/lkl/tests/run.py
new file mode 100755
index 000000000000..8fea72686a7a
--- /dev/null
+++ b/tools/lkl/tests/run.py
@@ -0,0 +1,182 @@
+#!/usr/bin/env python
+# SPDX-License-Identifier: GPL-2.0
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; version 2 of the License
+#
+# Author: Octavian Purdila <tavi@cs.pub.ro>
+#
+
+from __future__ import print_function
+
+import argparse
+import os
+import subprocess
+import sys
+import tap13
+import xml.etree.ElementTree as ET
+
+from junit_xml import TestSuite, TestCase
+
+
+class Reporter(tap13.Reporter):
+    def start(self, obj):
+        if type(obj) is tap13.Test:
+            if obj.result == "*":
+                end='\r'
+            else:
+                end='\n'
+            print("  TEST       %-8s %.50s" %
+                  (obj.result, obj.description + " " + obj.comment), end=end)
+
+        elif type(obj) is tap13.Suite:
+            if obj.tests_planned == 0:
+                status = "skip"
+            else:
+                status = ""
+            print("  SUITE      %-8s %s" % (status, obj.name))
+
+    def end(self, obj):
+        if type(obj) is tap13.Test:
+            if obj.result != "ok":
+                try:
+                    print(obj.yaml["log"], end='')
+                except:
+                    None
+
+
+mydir=os.path.dirname(os.path.realpath(__file__))
+
+tests = [
+    'boot.sh',
+    'disk.sh -t ext4',
+    'disk.sh -t vfat',
+    'net.sh -b loopback',
+    'net.sh -b tap',
+    'net.sh -b pipe',
+    'net.sh -b raw',
+    'net.sh -b macvtap',
+    'lklfuse.sh -t ext4',
+    'lklfuse.sh -t vfat',
+    'hijack-test.sh'
+]
+
+parser = argparse.ArgumentParser(description='LKL test runner')
+parser.add_argument('tests', nargs='?', action='append',
+                    help='tests to run %s' % tests)
+parser.add_argument('--junit-dir',
+                    help='directory where to store the junit suites')
+parser.add_argument('--gdb', action='store_true', default=False,
+                    help='run simple tests under gdb; implies --pass-through')
+parser.add_argument('--pass-through', action='store_true',  default=False,
+                    help='run the test without interpreting the test output')
+parser.add_argument('--valgrind', action='store_true', default=False,
+                    help='run simple tests under valgrind')
+
+args = parser.parse_args()
+if args.tests == [None]:
+    args.tests = tests
+
+if args.gdb:
+    args.pass_through=True
+    os.environ['GDB']="yes"
+
+if args.valgrind:
+    os.environ['VALGRIND']="yes"
+
+tap = tap13.Parser(Reporter())
+
+os.environ['PATH'] += ":" + mydir
+
+exit_code = 0
+
+for t in args.tests:
+    if not t:
+        continue
+    if args.pass_through:
+        print(t)
+        if subprocess.call(t, shell=True) != 0:
+            exit_code = 1
+    else:
+        p = subprocess.Popen(t, shell=True, stdout=subprocess.PIPE)
+        tap.parse(p.stdout)
+
+if args.pass_through:
+    sys.exit(exit_code)
+
+suites_count = 0
+tests_total = 0
+tests_not_ok = 0
+tests_ok = 0
+tests_skip = 0
+val_errs = 0
+val_fails = 0
+val_skips = 0
+
+for s in tap.run.suites:
+
+    junit_tests = []
+    suites_count += 1
+
+    for t in s.tests:
+        try:
+            secs = t.yaml["time_us"] / 1000000.0
+        except:
+            secs = 0
+        try:
+            log = t.yaml['log']
+        except:
+            log = ""
+
+        jt = TestCase(t.description, elapsed_sec=secs, stdout=log)
+        if t.result == 'skip':
+            jt.add_skipped_info(output=log)
+        elif t.result == 'not ok':
+            jt.add_error_info(output=log)
+
+        junit_tests.append(jt)
+
+        tests_total += 1
+        if t.result == "ok":
+            tests_ok += 1
+        elif t.result == "not ok":
+            tests_not_ok += 1
+            exit_code = 1
+        elif t.result == "skip":
+            tests_skip += 1
+
+    if args.junit_dir:
+        js = TestSuite(s.name, junit_tests)
+        with open(os.path.join(args.junit_dir, os.path.basename(s.name) + '.xml'), 'w') as f:
+            js.to_file(f, [js])
+
+        if os.getenv('VALGRIND') is not None:
+            val_xml = 'valgrind-%s.xml' % os.path.basename(s.name).replace(' ','-')
+            # skipped tests don't generate xml file
+            if os.path.exists(val_xml) is False:
+                continue
+
+            cmd = 'mv %s %s' % (val_xml, args.junit_dir)
+            subprocess.call(cmd, shell=True, )
+
+            cmd = mydir + '/valgrind2xunit.py ' + val_xml
+            subprocess.call(cmd, shell=True, cwd=args.junit_dir)
+
+            # count valgrind results
+            doc = ET.parse(os.path.join(args.junit_dir, 'valgrind-%s_xunit.xml' \
+                                        % (os.path.basename(s.name).replace(' ','-'))))
+            ts = doc.getroot()
+            val_errs += int(ts.get('errors'))
+            val_fails += int(ts.get('failures'))
+            val_skips += int(ts.get('skip'))
+
+print("Summary: %d suites run, %d tests, %d ok, %d not ok, %d skipped" %
+      (suites_count, tests_total, tests_ok, tests_not_ok, tests_skip))
+
+if os.getenv('VALGRIND') is not None:
+    print(" valgrind (memcheck): %d failures, %d skipped" % (val_fails, val_skips))
+    if val_errs or val_fails:
+        exit_code = 1
+
+sys.exit(exit_code)
diff --git a/tools/lkl/tests/tap13.py b/tools/lkl/tests/tap13.py
new file mode 100644
index 000000000000..65c73cda7ca1
--- /dev/null
+++ b/tools/lkl/tests/tap13.py
@@ -0,0 +1,209 @@
+#!/usr/bin/env python
+# SPDX-License-Identifier: GPL-2.0
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; version 2 of the License
+#
+# Author: Octavian Purdila <tavi@cs.pub.ro>
+#
+# Based on TAP13:
+#
+# Copyright 2013, Red Hat, Inc.
+# Author: Josef Skladanka <jskladan@redhat.com>
+#
+from __future__ import print_function
+
+import re
+import sys
+import yamlish
+
+
+class Reporter(object):
+
+    def start(self, obj):
+        None
+
+    def end(self, obj):
+        None
+
+
+class Test(object):
+    def __init__(self, reporter, result, id, description=None, directive=None,
+                 comment=None):
+        self.reporter = reporter
+        self.result = result
+        if directive:
+            self.result = directive.lower()
+        if id:
+            self.id = int(id)
+        else:
+            self.id = None
+        if description:
+            self.description = description
+        else:
+            self.description = ""
+        if comment:
+            self.comment = "# " + comment
+        else:
+            self.comment = ""
+        self.yaml = None
+        self._yaml_buffer = None
+        self.diagnostics = []
+
+        self.reporter.start(self)
+
+    def end(self):
+        if not self.yaml:
+            self.yaml = yamlish.load(self._yaml_buffer)
+            self.reporter.end(self)
+
+
+class Suite(object):
+    def __init__(self, reporter, start, end, explanation):
+        self.reporter = reporter
+        self.tests = []
+        self.name = explanation
+        self.tests_planned = int(end)
+
+        self.__tests_counter = 0
+        self.__tests_base = 0
+
+        self.reporter.start(self)
+
+    def newTest(self, args):
+        try:
+            self.tests[-1].end()
+        except IndexError:
+            None
+
+        if 'id' not in args or not args['id']:
+            args['id'] = self.__tests_counter
+        else:
+            args['id'] = int(args['id']) + self.__tests_base
+
+        if args['id'] < self.__tests_counter:
+            print("error: bad test id %d, fixing it" % (args['id']))
+            args['id'] = self.__tests_counter
+        # according to TAP13 specs, missing tests must be handled as 'not ok'
+        # here we add the missing tests in sequence
+        while args['id'] > (self.__tests_counter + 1):
+            comment = 'test %d not present' % self.__tests_counter
+            self.tests.append(Test(self.reporter, 'not ok',
+                                   self.__tests_counter, comment=comment))
+            self.__tests_counter += 1
+
+        if args['id'] == self.__tests_counter:
+            if args['directive']:
+                self.test().result = args['directive'].lower()
+            else:
+                self.test().result = args['result']
+            self.reporter.start(self.test())
+        else:
+            self.tests.append(Test(self.reporter, **args))
+            self.__tests_counter += 1
+
+    def test(self):
+        return self.tests[-1]
+
+    def end(self, name, planned):
+        if name == self.name:
+            self.tests_planned += int(planned)
+            self.__tests_base = self.__tests_counter
+            return False
+        try:
+            self.test().end()
+        except IndexError:
+            None
+        if len(self.tests) != self.tests_planned:
+            for i in range(len(self.tests), self.tests_planned):
+                self.tests.append(Test(self.reporter, 'not ok', i+1,
+                                       comment='test not present'))
+        return True
+
+
+class Run(object):
+
+    def __init__(self, reporter):
+        self.reporter = reporter
+        self.suites = []
+
+    def suite(self):
+        return self.suites[-1]
+
+    def test(self):
+        return self.suites[-1].tests[-1]
+
+    def newSuite(self, args):
+        new = False
+        try:
+            if self.suite().end(args['explanation'], args['end']):
+                new = True
+        except IndexError:
+            new = True
+        if new:
+            self.suites.append(Suite(self.reporter, **args))
+
+    def newTest(self, args):
+        self.suite().newTest(args)
+
+
+class Parser(object):
+    RE_PLAN = re.compile(r"^\s*(?P<start>\d+)\.\.(?P<end>\d+)\s*(#\s*(?P<explanation>.*))?\s*$")
+    RE_TEST_LINE = re.compile(r"^\s*(?P<result>(not\s+)?ok|[*]+)\s*(?P<id>\d+)?\s*(?P<description>[^#]+)?\s*(#\s*(?P<directive>TODO|SKIP)?\s*(?P<comment>.+)?)?\s*$",  re.IGNORECASE)
+    RE_EXPLANATION = re.compile(r"^\s*#\s*(?P<explanation>.+)?\s*$")
+    RE_YAMLISH_START = re.compile(r"^\s*---.*$")
+    RE_YAMLISH_END = re.compile(r"^\s*\.\.\.\s*$")
+
+    def __init__(self, reporter):
+        self.seek_test = False
+        self.in_test = False
+        self.in_yaml = False
+        self.run = Run(reporter)
+
+    def parse(self, source):
+        # to avoid input buffering
+        while True:
+            line = source.readline()
+            if not line:
+                break
+
+            if self.in_yaml:
+                if Parser.RE_YAMLISH_END.match(line):
+                    self.run.test()._yaml_buffer.append(line.strip())
+                    self.in_yaml = False
+                else:
+                    self.run.test()._yaml_buffer.append(line.rstrip())
+                continue
+
+            line = line.strip()
+
+            if self.in_test:
+                if Parser.RE_EXPLANATION.match(line):
+                    self.run.test().diagnostics.append(line)
+                    continue
+                if Parser.RE_YAMLISH_START.match(line):
+                    self.run.test()._yaml_buffer = [line.strip()]
+                    self.in_yaml = True
+                    continue
+
+            m = Parser.RE_PLAN.match(line)
+            if m:
+                self.seek_test = True
+                args = m.groupdict()
+                self.run.newSuite(args)
+                continue
+
+            if self.seek_test:
+                m = Parser.RE_TEST_LINE.match(line)
+                if m:
+                    args = m.groupdict()
+                    self.run.newTest(args)
+                    self.in_test = True
+                    continue
+
+            print(line)
+        try:
+            self.run.suite().end(None, 0)
+        except IndexError:
+            None
diff --git a/tools/lkl/tests/test.c b/tools/lkl/tests/test.c
new file mode 100644
index 000000000000..3e334d106c48
--- /dev/null
+++ b/tools/lkl/tests/test.c
@@ -0,0 +1,126 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <stdarg.h>
+#include <time.h>
+
+#include "test.h"
+
+/* circular log buffer */
+
+static char log_buf[0x10000];
+static char *head = log_buf, *tail = log_buf;
+
+static inline void advance(char **ptr)
+{
+	if ((unsigned int)(*ptr - log_buf) >= sizeof(log_buf) - 1)
+		*ptr = log_buf;
+	else
+		*ptr = *ptr + 1;
+}
+
+static void log_char(char c)
+{
+	*tail = c;
+	advance(&tail);
+	if (tail == head)
+		advance(&head);
+}
+
+static void print_log(void)
+{
+	char last;
+
+	printf(" log: |\n");
+	last = '\n';
+	while (head != tail) {
+		if (last == '\n')
+			printf("  ");
+		last = *head;
+		putchar(last);
+		advance(&head);
+	}
+	if (last != '\n')
+		putchar('\n');
+}
+
+int lkl_test_run(const struct lkl_test *tests, int nr, const char *fmt, ...)
+{
+	int i, ret, status = TEST_SUCCESS;
+	clock_t start, stop;
+	char name[1024];
+	va_list args;
+
+	va_start(args, fmt);
+	vsnprintf(name, sizeof(name), fmt, args);
+	va_end(args);
+
+	printf("1..%d # %s\n", nr, name);
+	for (i = 1; i <= nr; i++) {
+		const struct lkl_test *t = &tests[i-1];
+		unsigned long delta_us;
+
+		printf("* %d %s\n", i, t->name);
+		fflush(stdout);
+
+		start = clock();
+
+		ret = t->fn(t->arg1, t->arg2, t->arg3);
+
+		stop = clock();
+
+		switch (ret) {
+		case TEST_SUCCESS:
+			printf("ok %d %s\n", i, t->name);
+			break;
+		case TEST_SKIP:
+			printf("ok %d %s # SKIP\n", i, t->name);
+			break;
+		case TEST_BAILOUT:
+			status = TEST_BAILOUT;
+			/* fall through */
+		case TEST_FAILURE:
+		default:
+			if (status != TEST_BAILOUT)
+				status = TEST_FAILURE;
+			printf("not ok %d %s\n", i, t->name);
+		}
+
+		printf(" ---\n");
+		delta_us = (stop - start) * 1000000 / CLOCKS_PER_SEC;
+		printf(" time_us: %lu\n", delta_us);
+		print_log();
+		printf(" ...\n");
+
+		if (status == TEST_BAILOUT) {
+			printf("Bail out!\n");
+			return TEST_FAILURE;
+		}
+
+		fflush(stdout);
+	}
+
+	return status;
+}
+
+
+void lkl_test_log(const char *str, int len)
+{
+	while (len--)
+		log_char(*(str++));
+}
+
+int lkl_test_logf(const char *fmt, ...)
+{
+	char tmp[1024], *c;
+	va_list args;
+	unsigned int n;
+
+	va_start(args, fmt);
+	n = vsnprintf(tmp, sizeof(tmp), fmt, args);
+	va_end(args);
+
+	for (c = tmp; *c != 0; c++)
+		log_char(*c);
+
+	return n > sizeof(tmp) ? sizeof(tmp) : n;
+}
diff --git a/tools/lkl/tests/test.h b/tools/lkl/tests/test.h
new file mode 100644
index 000000000000..f63ad6d419cb
--- /dev/null
+++ b/tools/lkl/tests/test.h
@@ -0,0 +1,72 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_TEST_H
+#define _LKL_TEST_H
+
+#define TEST_SUCCESS	0
+#define TEST_FAILURE	1
+#define TEST_SKIP	2
+#define TEST_TODO	3
+#define TEST_BAILOUT	4
+
+struct lkl_test {
+	const char *name;
+	int (*fn)();
+	void *arg1, *arg2, *arg3;
+};
+
+/**
+ * Simple wrapper to initialize a test entry.
+ * @name - test name; the test function is assumed to be named lkl_test_@name
+ * @vargs - arguments to be passed to the function
+ */
+#define LKL_TEST(name, ...) { #name, lkl_test_##name, __VA_ARGS__ }
+
+/**
+ * lkl_test_run - run a test suite
+ *
+ * @tests - the list of tests to run
+ * @nr - number of tests
+ * @fmt - format string to be used for suite name
+ */
+int lkl_test_run(const struct lkl_test *tests, int nr, const char *fmt, ...);
+
+/**
+ * lkl_test_log - store a string in the test log buffer
+ * @str - the string to log (need not be NUL-terminated)
+ * @len - the string length
+ */
+void lkl_test_log(const char *str, int len);
+
+/**
+ * lkl_test_logf - printf like function to store into the test log buffer
+ * @fmt - printf format string
+ * @vargs - arguments to the format string
+ */
+int lkl_test_logf(const char *fmt, ...) __attribute__((format(printf, 1, 2)));
+
+/**
+ * LKL_TEST_CALL - create a test function for an LKL call
+ *
+ * The test function will be named lkl_test_@name and will return
+ * TEST_SUCCESS if the called function returns @expect. Otherwise it
+ * will return TEST_FAILURE.
+ *
+ * @name - test name; must be unique because it is part of the test
+ * function name (lkl_test_@name)
+ * @call - function to call
+ * @expect - expected return value for success
+ * @args - arguments to pass to the LKL call
+ */
+#define LKL_TEST_CALL(name, call, expect, ...)				\
+	static int lkl_test_##name(void)				\
+	{								\
+		long ret;						\
+									\
+		ret = call(__VA_ARGS__);				\
+		lkl_test_logf("%s(%s) = %ld %s\n", #call, #__VA_ARGS__, \
+			ret, ret < 0 ? lkl_strerror(ret) : "");		\
+		return (ret == expect) ? TEST_SUCCESS : TEST_FAILURE;	\
+	}
+
+
+#endif /* _LKL_TEST_H */
diff --git a/tools/lkl/tests/test.sh b/tools/lkl/tests/test.sh
new file mode 100644
index 000000000000..1a5619aed735
--- /dev/null
+++ b/tools/lkl/tests/test.sh
@@ -0,0 +1,179 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+basedir=$(cd $script_dir/..; pwd)
+source ${script_dir}/autoconf.sh
+
+TEST_SUCCESS=0
+TEST_FAILURE=1
+TEST_SKIP=113
+TEST_TODO=114
+TEST_BAILOUT=115
+
+print_log()
+{
+    echo " log: |"
+    while read line; do
+        echo "  $line"
+    done < $1
+}
+
+export_vars()
+{
+    if [ -z "$var_file" ]; then
+        return
+    fi
+
+    for i in $@; do
+        echo "$i=${!i}" >> $var_file
+    done
+}
+
+lkl_test_run()
+{
+    log_file=$(mktemp)
+    export var_file=$(mktemp)
+
+    tid=$1 && shift && tname=$@
+
+    echo "* $tid $tname"
+
+    start=$(date '+%s%9N')
+    # run in a separate shell to avoid -e terminating us
+    $@ 2>&1 | strings >$log_file
+    exit=${PIPESTATUS[0]}
+    stop=$(date '+%s%9N')
+
+    case $exit in
+    $TEST_SUCCESS)
+        echo "ok $tid $tname"
+        ;;
+    $TEST_SKIP)
+        echo "ok $tid $tname # SKIP"
+        ;;
+    $TEST_BAILOUT)
+        echo "not ok $tid $tname"
+        echo "Bail out!"
+        ;;
+    $TEST_FAILURE|*)
+        echo "not ok $tid $tname"
+        ;;
+    esac
+
+    delta=$(((stop-start)/1000))
+
+    echo " ---"
+    echo " time_us: $delta"
+    print_log $log_file
+    echo -e " ..."
+
+    rm $log_file
+    . $var_file
+    rm $var_file
+
+    return $exit
+}
+
+lkl_test_plan()
+{
+    echo "1..$1 # $2"
+    export suite_name="${2// /\-}"
+}
+
+lkl_test_exec()
+{
+    local SUDO=""
+    local WRAPPER=""
+
+    if [ "$1" = "sudo" ]; then
+        SUDO=sudo
+        shift
+    fi
+
+    local file=$1
+    shift
+
+    if [ -n "$LKL_HOST_CONFIG_NT" ]; then
+        file=$file.exe
+    fi
+
+    if file $file | grep ARM; then
+        WRAPPER="qemu-arm-static"
+    elif file $file | grep "FreeBSD" ; then
+        ssh_copy "$file" $BSD_WDIR
+        if [ -n "$SUDO" ]; then
+            SUDO=""
+        fi
+        WRAPPER="$MYSSH $SU"
+        # ssh mangles pipes ('|'), so escape the pipe character.
+        args="${@//\|/\\\|}"
+        set - $BSD_WDIR/$(basename $file) $args
+        file=""
+    elif [ -n "$GDB" ]; then
+        WRAPPER="gdb"
+        args="$@"
+        set - -ex "run $args" -ex quit $file
+        file=""
+    elif [ -n "$VALGRIND" ]; then
+        WRAPPER="valgrind --suppressions=$script_dir/valgrind.supp \
+                  --leak-check=full --show-leak-kinds=all --xml=yes \
+                  --xml-file=valgrind-$suite_name.xml"
+    fi
+
+    $SUDO $WRAPPER $file "$@"
+}
+
+lkl_test_cmd()
+{
+    local WRAPPER=""
+
+    if [ -z "$QUIET" ]; then
+        SHOPTS="-x"
+    fi
+
+    if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
+        WRAPPER="$MYSSH $SU"
+    fi
+
+    echo "$@" | $WRAPPER sh $SHOPTS
+}
+
+# XXX: $MYSSH and $MYSCP are defined in a circleci docker image.
+# see the definitions in lkl/lkl-docker:circleci/freebsd11/Dockerfile
+ssh_push()
+{
+    while [ -n "$1" ]; do
+        if [[ "$1" = *.sh ]]; then
+            type="script"
+        else
+            type="file"
+        fi
+
+        dir=$(dirname $1)
+        $MYSSH mkdir -p $BSD_WDIR/$dir
+
+        $MYSCP -P 7722 -r $basedir/$1 root@localhost:$BSD_WDIR/$dir
+        if [ "$type" = "script" ]; then
+            $MYSSH chmod a+x $BSD_WDIR/$1
+        fi
+
+        shift
+    done
+}
+
+ssh_copy()
+{
+    $MYSCP -P 7722 -r $1 root@localhost:$2
+}
+
+lkl_test_bsd_cleanup()
+{
+    $MYSSH rm -rf $BSD_WDIR
+}
+
+if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
+    trap lkl_test_bsd_cleanup EXIT
+    export BSD_WDIR=/root/lkl
+    $MYSSH mkdir -p $BSD_WDIR
+fi
diff --git a/tools/lkl/tests/valgrind.supp b/tools/lkl/tests/valgrind.supp
new file mode 100644
index 000000000000..5ce717d759fc
--- /dev/null
+++ b/tools/lkl/tests/valgrind.supp
@@ -0,0 +1,85 @@
+{
+   <unfinished timer 1>
+   Memcheck:Leak
+   match-leak-kinds: possible
+   ...
+   fun:pthread_create@@GLIBC_2.2.5
+   fun:__start_helper_thread
+   fun:__pthread_once_slow
+   fun:timer_create@@GLIBC_2.3.3
+   fun:timer_alloc
+   fun:clockevent_set_state_oneshot
+   ...
+   fun:__clockevents_switch_state
+   fun:clockevents_switch_state
+   fun:tick_setup_periodic
+   ...
+}
+
+{
+   <pid1 kernel thread>
+   Memcheck:Leak
+   match-leak-kinds: possible
+   ...
+   fun:thread_create
+   fun:copy_thread
+   fun:copy_thread_tls
+   ...
+   fun:rest_init
+   fun:start_kernel
+   fun:lkl_run_kernel
+}
+
+{
+   <xfs uninitialized buf error: delete this once upstream is fixed>
+   Memcheck:Value8
+   fun:crc32_body
+   fun:crc32_le_generic
+   fun:__crc32c_le
+   fun:chksum_update
+   fun:crypto_shash_update
+   fun:crc32c
+   fun:xlog_cksum
+}
+
+{
+   <xfs pwrite64 issue: delete this once upstream is fixed>
+   Memcheck:Param
+   pwrite64(buf)
+   ...
+   fun:blk_request
+   fun:blk_enqueue
+   fun:virtio_process_one
+   fun:virtio_process_queue
+   fun:virtio_write
+   fun:__raw_writel
+   fun:writel
+   fun:vm_notify
+   fun:virtqueue_notify
+   fun:virtio_queue_rq
+   fun:blk_mq_dispatch_rq_list
+   fun:blk_mq_sched_dispatch_requests
+}
+
+{
+   <virtio_net_pipe xmits>
+   Memcheck:Param
+   writev(vector[...])
+   ...
+   fun:fd_net_tx
+   fun:net_enqueue
+   fun:virtio_process_one
+   fun:virtio_process_queue
+   fun:virtio_write
+   fun:__raw_writel
+   fun:writel
+   fun:vm_notify
+   fun:virtqueue_notify
+   fun:virtqueue_kick
+   fun:start_xmit
+   fun:__netdev_start_xmit
+   fun:netdev_start_xmit
+   fun:xmit_one
+   fun:dev_hard_start_xmit
+   fun:sch_direct_xmit
+}
\ No newline at end of file
diff --git a/tools/lkl/tests/valgrind2xunit.py b/tools/lkl/tests/valgrind2xunit.py
new file mode 100755
index 000000000000..ab7c12b83377
--- /dev/null
+++ b/tools/lkl/tests/valgrind2xunit.py
@@ -0,0 +1,69 @@
+#!/usr/bin/env python
+# SPDX-License-Identifier: GPL-2.0
+
+##
+## Downloaded from
+## http://humdi.net/wiki/tips/valgrind-to-xunit-xml-converter
+##
+
+import xml.etree.ElementTree as ET
+import sys
+import os
+
+fname = 'valgrind.xml'
+if len(sys.argv) > 1:
+    fname = sys.argv[1]
+
+doc = ET.parse(fname)
+errors = doc.findall('.//error')
+
+out = open(os.path.splitext(os.path.basename(fname))[0]+'_xunit.xml',"w")
+out.write('<?xml version="1.0" encoding="UTF-8"?>\n')
+out.write('<testsuite name="valgrind" tests="'+str(len(errors))+'" errors="0" failures="'+str(len(errors))+'" skip="0">\n')
+errorcount=0
+for error in errors:
+    errorcount=errorcount+1
+
+    kind = error.find('kind')
+    what = error.find('what')
+    if what is None:
+        what = error.find('xwhat/text')
+
+    stack = error.find('stack')
+    frames = stack.findall('frame')
+
+    for frame in frames:
+        fi = frame.find('file')
+        li = frame.find('line')
+        if fi != None and li != None:
+            break
+
+    if fi != None and li != None:
+        out.write('    <testcase classname="ValgrindMemoryCheck" name="Memory check '+str(errorcount)+' ('+kind.text+', '+fi.text+':'+li.text+')" time="0">\n')
+    else:
+        out.write('    <testcase classname="ValgrindMemoryCheck" name="Memory check '+str(errorcount)+' ('+kind.text+')" time="0">\n')
+    out.write('        <error type="'+kind.text+'">\n')
+    out.write('  '+what.text+'\n\n')
+
+    for frame in frames:
+        ip = frame.find('ip')
+        fn = frame.find('fn')
+        fi = frame.find('file')
+        li = frame.find('line')
+
+        if fn is None:
+            bodytext = '(unresolved symbol)'
+        else:
+            bodytext = fn.text
+        bodytext = bodytext.replace("&","&amp;")
+        bodytext = bodytext.replace("<","&lt;")
+        bodytext = bodytext.replace(">","&gt;")
+        if fi != None and li != None:
+            out.write('  '+ip.text+': '+bodytext+' ('+fi.text+':'+li.text+')\n')
+        else:
+            out.write('  '+ip.text+': '+bodytext+'\n')
+
+    out.write('        </error>\n')
+    out.write('    </testcase>\n')
+out.write('</testsuite>\n')
+out.close()
-- 
2.20.1 (Apple Git-117)


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um


^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 22/37] lkl tools: tool that reads/writes to/from a filesystem image
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Conrad Meyer, Dan Peebles, Hajime Tazaki, Petros Angelatos,
	Tuomas Tynkkynen, Yuriy Taraday

From: Octavian Purdila <tavi.purdila@gmail.com>

A tool to copy files to and from a filesystem image without mounting the
image on the host filesystem.  No root privilege is therefore needed to
modify the image contents.

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Dan Peebles <pumpkin@me.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Tuomas Tynkkynen <tuomas.tynkkynen@iki.fi>
Signed-off-by: Yuriy Taraday <yorik.sar@gmail.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
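Note for reviewers, not part of the commit message: cptofs and cpfromfs
are the same binary.  The Makefile creates cpfromfs as a symlink, and
main() picks the copy direction from argv[0].  Below is a minimal sketch
of that dispatch.  The helper name invoked_as_cptofs() is hypothetical
and does not exist in the patch.  The patch runs strstr() over the whole
argv[0]; the sketch is a variation that looks only at the last path
component, so a directory name containing "cptofs" would not select the
wrong mode.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): return 1 when the program
 * was invoked under a name containing "cptofs", 0 otherwise.  Only the
 * last path component of argv0 is examined.
 */
int invoked_as_cptofs(const char *argv0)
{
	const char *base;

	assert(argv0 != NULL);

	/* Skip past the last '/' if there is one. */
	base = strrchr(argv0, '/');
	base = base ? base + 1 : argv0;

	return strstr(base, "cptofs") != NULL;
}
```

A caller would use the result the way main() uses the cptofs flag, e.g.
"cptofs = invoked_as_cptofs(argv[0]);" before choosing which argp
description to parse with.
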
 tools/lkl/.gitignore |   2 +
 tools/lkl/Build      |   2 +
 tools/lkl/Makefile   |   3 +
 tools/lkl/Targets    |   3 +
 tools/lkl/cptofs.c   | 635 +++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 645 insertions(+)
 create mode 100644 tools/lkl/cptofs.c

diff --git a/tools/lkl/.gitignore b/tools/lkl/.gitignore
index 4e08254dbd46..1a8210f4d9c4 100644
--- a/tools/lkl/.gitignore
+++ b/tools/lkl/.gitignore
@@ -7,3 +7,5 @@ tests/disk
 tests/boot
 tests/valgrind*.xml
 *.pyc
+cptofs
+cpfromfs
diff --git a/tools/lkl/Build b/tools/lkl/Build
index e69de29bb2d1..a9d12c5ca260 100644
--- a/tools/lkl/Build
+++ b/tools/lkl/Build
@@ -0,0 +1,2 @@
+cptofs-$(LKL_HOST_CONFIG_ARCHIVE) += cptofs.o
+
diff --git a/tools/lkl/Makefile b/tools/lkl/Makefile
index 9a55df5064e4..94f3ba09123d 100644
--- a/tools/lkl/Makefile
+++ b/tools/lkl/Makefile
@@ -85,6 +85,9 @@ $(OUTPUT)%$(EXESUF): $(OUTPUT)%-in.o $(OUTPUT)liblkl.a
 $(OUTPUT)%-in.o: $(OUTPUT)lib/lkl.o FORCE
 	$(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(patsubst %/,%,$(dir $*)) obj=$(notdir $*)
 
+$(OUTPUT)cpfromfs$(EXESUF): cptofs$(EXESUF)
+	$(Q)if ! [ -e $@ ]; then ln -s $< $@; fi
+
 
 clean:
 	$(call QUIET_CLEAN, objects)find $(OUTPUT) -name '*.o' -delete -o -name '\.*.cmd'\
diff --git a/tools/lkl/Targets b/tools/lkl/Targets
index a9f74c3cc8fb..e629b330e5aa 100644
--- a/tools/lkl/Targets
+++ b/tools/lkl/Targets
@@ -4,3 +4,6 @@ progs-y += tests/boot
 progs-y += tests/disk
 progs-y += tests/net-test
 
+progs-$(LKL_HOST_CONFIG_ARCHIVE) += cptofs
+LDLIBS_cptofs-y := -larchive
+LDLIBS_cptofs-$(LKL_HOST_CONFIG_NEEDS_LARGP) += -largp
diff --git a/tools/lkl/cptofs.c b/tools/lkl/cptofs.c
new file mode 100644
index 000000000000..dd490435d5b7
--- /dev/null
+++ b/tools/lkl/cptofs.c
@@ -0,0 +1,635 @@
+// SPDX-License-Identifier: GPL-2.0
+#ifdef __FreeBSD__
+#include <sys/param.h>
+#endif
+
+#include <stdio.h>
+#include <time.h>
+#include <argp.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdlib.h>
+#include <libgen.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <fnmatch.h>
+#include <dirent.h>
+#include <lkl.h>
+#include <lkl_host.h>
+
+static const char doc_cptofs[] = "Copy files to a filesystem image";
+static const char doc_cpfromfs[] = "Copy files from a filesystem image";
+static const char args_doc_cptofs[] = "-t fstype -i fsimage path... fs_path";
+static const char args_doc_cpfromfs[] = "-t fstype -i fsimage fs_path... path";
+
+static struct argp_option options[] = {
+	{"enable-printk", 'p', 0, 0, "show Linux printks"},
+	{"partition", 'P', "int", 0, "partition number"},
+	{"filesystem-type", 't', "string", 0,
+	 "select filesystem type - mandatory"},
+	{"filesystem-image", 'i', "string", 0,
+	 "path to the filesystem image - mandatory"},
+	{"selinux", 's', "string", 0, "selinux attributes for destination"},
+	{0},
+};
+
+static struct cl_args {
+	int printk;
+	int part;
+	const char *fsimg_type;
+	const char *fsimg_path;
+	int npaths;
+	char **paths;
+	const char *selinux;
+} cla;
+
+static int cptofs;
+
+static error_t parse_opt(int key, char *arg, struct argp_state *state)
+{
+	struct cl_args *cla = state->input;
+
+	switch (key) {
+	case 'p':
+		cla->printk = 1;
+		break;
+	case 'P':
+		cla->part = atoi(arg);
+		break;
+	case 't':
+		cla->fsimg_type = arg;
+		break;
+	case 'i':
+		cla->fsimg_path = arg;
+		break;
+	case 's':
+		cla->selinux = arg;
+		break;
+	case ARGP_KEY_ARG:
+		// Capture all remaining arguments in our paths array and stop
+		// parsing here. We treat the last one as the destination and
+		// everything before it as sources, just like cp does.
+		cla->paths = &state->argv[state->next - 1];
+		cla->npaths = state->argc - state->next + 1;
+		state->next = state->argc;
+		break;
+	default:
+		return ARGP_ERR_UNKNOWN;
+	}
+
+	return 0;
+}
+
+static struct argp argp_cptofs = {
+	.options = options,
+	.parser = parse_opt,
+	.args_doc = args_doc_cptofs,
+	.doc = doc_cptofs,
+};
+
+static struct argp argp_cpfromfs = {
+	.options = options,
+	.parser = parse_opt,
+	.args_doc = args_doc_cpfromfs,
+	.doc = doc_cpfromfs,
+};
+
+static int searchdir(const char *fs_path, const char *path, const char *match);
+
+static int open_src(const char *path)
+{
+	int fd;
+
+	if (cptofs)
+		fd = open(path, O_RDONLY, 0);
+	else
+		fd = lkl_sys_open(path, LKL_O_RDONLY, 0);
+
+	if (fd < 0)
+		fprintf(stderr, "unable to open file %s for reading: %s\n",
+			path, cptofs ? strerror(errno) : lkl_strerror(fd));
+
+	return fd;
+}
+
+static int open_dst(const char *path, int mode)
+{
+	int fd;
+
+	if (cptofs)
+		fd = lkl_sys_open(path, LKL_O_RDWR | LKL_O_TRUNC | LKL_O_CREAT,
+				  mode);
+	else
+		fd = open(path, O_RDWR | O_TRUNC | O_CREAT, mode);
+
+	if (fd < 0)
+		fprintf(stderr, "unable to open file %s for writing: %s\n",
+			path, cptofs ? lkl_strerror(fd) : strerror(errno));
+
+	if (fd >= 0 && cla.selinux && cptofs) {
+		int ret = lkl_sys_fsetxattr(fd, "security.selinux", cla.selinux,
+					    strlen(cla.selinux), 0);
+		if (ret)
+			fprintf(stderr,
+				"unable to set selinux attribute on %s: %s\n",
+				path, lkl_strerror(ret));
+	}
+
+	return fd;
+}
+
+static int read_src(int fd, char *buf, int len)
+{
+	int ret;
+
+	if (cptofs)
+		ret = read(fd, buf, len);
+	else
+		ret = lkl_sys_read(fd, buf, len);
+
+	if (ret < 0)
+		fprintf(stderr, "error reading file: %s\n",
+			cptofs ? strerror(errno) : lkl_strerror(ret));
+
+	return ret;
+}
+
+static int write_dst(int fd, char *buf, int len)
+{
+	int ret;
+
+	if (cptofs)
+		ret = lkl_sys_write(fd, buf, len);
+	else
+		ret = write(fd, buf, len);
+
+	if (ret < 0)
+		fprintf(stderr, "error writing file: %s\n",
+			cptofs ? lkl_strerror(ret) : strerror(errno));
+
+	return ret;
+}
+
+static void close_src(int fd)
+{
+	if (cptofs)
+		close(fd);
+	else
+		lkl_sys_close(fd);
+}
+
+static void close_dst(int fd)
+{
+	if (cptofs)
+		lkl_sys_close(fd);
+	else
+		close(fd);
+}
+
+static int copy_file(const char *src, const char *dst, int mode)
+{
+	long len, to_write, wrote;
+	char buf[4096], *ptr;
+	int ret = 0;
+	int fd_src, fd_dst;
+
+	fd_src = open_src(src);
+	if (fd_src < 0)
+		return fd_src;
+
+	fd_dst = open_dst(dst, mode);
+	if (fd_dst < 0)
+		return fd_dst;
+
+	do {
+		len = read_src(fd_src, buf, sizeof(buf));
+
+		if (len > 0) {
+			ptr = buf;
+			to_write = len;
+			do {
+				wrote = write_dst(fd_dst, ptr, to_write);
+
+				if (wrote < 0) {
+					ret = wrote;
+					goto out;
+				}
+
+				to_write -= wrote;
+				ptr += wrote;
+
+			} while (to_write > 0);
+		}
+
+		if (len < 0)
+			ret = len;
+
+	} while (len > 0);
+
+out:
+	close_src(fd_src);
+	close_dst(fd_dst);
+
+	return ret;
+}
+
+static int stat_src(const char *path, unsigned int *type, unsigned int *mode,
+		    long long *size, struct lkl_timespec *mtime,
+		    struct lkl_timespec *atime)
+{
+	struct stat stat;
+	struct lkl_stat lkl_stat;
+	int ret;
+
+	if (cptofs) {
+		ret = lstat(path, &stat);
+		if (type)
+			*type = stat.st_mode & S_IFMT;
+		if (mode)
+			*mode = stat.st_mode & ~S_IFMT;
+		if (size)
+			*size = stat.st_size;
+		if (mtime) {
+			mtime->tv_sec = stat.st_mtim.tv_sec;
+			mtime->tv_nsec = stat.st_mtim.tv_nsec;
+		}
+		if (atime) {
+			atime->tv_sec = stat.st_atim.tv_sec;
+			atime->tv_nsec = stat.st_atim.tv_nsec;
+		}
+	} else {
+		ret = lkl_sys_lstat(path, &lkl_stat);
+		if (type)
+			*type = lkl_stat.st_mode & S_IFMT;
+		if (mode)
+			*mode = lkl_stat.st_mode & ~S_IFMT;
+		if (size)
+			*size = lkl_stat.st_size;
+		if (mtime) {
+			mtime->tv_sec = lkl_stat.lkl_st_mtime;
+			mtime->tv_nsec = lkl_stat.st_mtime_nsec;
+		}
+		if (atime) {
+			atime->tv_sec = lkl_stat.lkl_st_atime;
+			atime->tv_nsec = lkl_stat.st_atime_nsec;
+		}
+	}
+
+	if (ret)
+		fprintf(stderr, "fsimg lstat(%s) error: %s\n",
+			path, cptofs ? strerror(errno) : lkl_strerror(ret));
+
+	return ret;
+}
+
+static int mkdir_dst(const char *path, unsigned int mode)
+{
+	int ret;
+
+	if (cptofs) {
+		ret = lkl_sys_mkdir(path, mode);
+		if (ret == -LKL_EEXIST)
+			ret = 0;
+	} else {
+		ret = mkdir(path, mode);
+		if (ret < 0 && errno == EEXIST)
+			ret = 0;
+	}
+
+	if (ret)
+		fprintf(stderr, "unable to create directory %s: %s\n",
+			path, cptofs ? lkl_strerror(ret) : strerror(errno));
+
+	return ret;
+}
+
+static int readlink_src(const char *src, char *out, int outsize)
+{
+	int ret;
+
+	if (cptofs)
+		ret = readlink(src, out, outsize);
+	else
+		ret = lkl_sys_readlink(src, out, outsize);
+
+	if (ret < 0)
+		fprintf(stderr, "unable to readlink '%s': %s\n", src,
+			cptofs ? strerror(errno) : lkl_strerror(ret));
+
+	return ret;
+}
+
+static int symlink_dst(const char *path, const char *target)
+{
+	int ret;
+
+	if (cptofs)
+		ret = lkl_sys_symlink(target, path);
+	else
+		ret = symlink(target, path);
+
+	if (ret)
+		fprintf(stderr, "unable to symlink '%s' with target '%s': %s\n",
+			path, target, cptofs ? lkl_strerror(ret) :
+			strerror(errno));
+
+	return ret;
+}
+
+static int copy_symlink(const char *src, const char *dst)
+{
+	int ret;
+	long long size, actual_size;
+	char *target = NULL;
+
+	ret = stat_src(src, NULL, NULL, &size, NULL, NULL);
+	if (ret) {
+		ret = -1;
+		goto out;
+	}
+
+	target = malloc(size + 1);
+	if (!target) {
+		fprintf(stderr, "Unable to allocate memory (%lld bytes)\n",
+			size + 1);
+		ret = -1;
+		goto out;
+	}
+
+	actual_size = readlink_src(src, target, size);
+	if (actual_size != size) {
+		fprintf(stderr,
+			"readlink(%s) bad size: got %lld, expected %lld\n",
+			src, actual_size, size);
+		ret = -1;
+		goto out;
+	}
+	target[size] = 0; // readlink doesn't append the trailing null byte
+
+	ret = symlink_dst(dst, target);
+	if (ret)
+		ret = -1;
+
+out:
+	if (target)
+		free(target);
+
+	return ret;
+}
+
+static int do_entry(const char *_src, const char *_dst, const char *name)
+{
+	char src[PATH_MAX], dst[PATH_MAX];
+	struct lkl_timespec mtime, atime;
+	unsigned int type, mode;
+	int ret;
+
+	snprintf(src, sizeof(src), "%s/%s", _src, name);
+	snprintf(dst, sizeof(dst), "%s/%s", _dst, name);
+
+	ret = stat_src(src, &type, &mode, NULL, &mtime, &atime);
+
+	switch (type) {
+	case S_IFREG:
+	{
+		ret = copy_file(src, dst, mode);
+		break;
+	}
+	case S_IFDIR:
+		ret = mkdir_dst(dst, mode);
+		if (ret)
+			break;
+		ret = searchdir(src, dst, NULL);
+		break;
+	case S_IFLNK:
+		ret = copy_symlink(src, dst);
+		break;
+	case S_IFSOCK:
+	case S_IFBLK:
+	case S_IFCHR:
+	case S_IFIFO:
+	default:
+		printf("skipping %s: unsupported entry type %d\n", src, type);
+	}
+
+	if (!ret) {
+		if (cptofs) {
+			struct lkl_timespec lkl_ts[] = { atime, mtime };
+
+			ret = lkl_sys_utimensat(-1, dst,
+						(struct __lkl__kernel_timespec
+						 *)lkl_ts,
+						LKL_AT_SYMLINK_NOFOLLOW);
+		} else {
+			struct timespec ts[] = {
+				{ .tv_sec = atime.tv_sec,
+				  .tv_nsec = atime.tv_nsec, },
+				{ .tv_sec = mtime.tv_sec,
+				  .tv_nsec = mtime.tv_nsec, },
+			};
+
+			ret = utimensat(AT_FDCWD, dst, ts, AT_SYMLINK_NOFOLLOW);
+		}
+	}
+
+	if (ret)
+		printf("error processing entry %s, aborting\n", src);
+
+	return ret;
+}
+
+static DIR *open_dir(const char *path)
+{
+	DIR *dir;
+	int err;
+
+	if (cptofs)
+		dir = opendir(path);
+	else
+		dir = (DIR *)lkl_opendir(path, &err);
+
+	if (!dir)
+		fprintf(stderr, "unable to open directory %s: %s\n",
+			path, cptofs ? strerror(errno) : lkl_strerror(err));
+	return dir;
+}
+
+static const char *read_dir(DIR *dir, const char *path)
+{
+	struct lkl_dir *lkl_dir = (struct lkl_dir *)dir;
+	const char *name = NULL;
+	const char *err = NULL;
+
+	if (cptofs) {
+		struct dirent *de = readdir(dir);
+
+		if (de)
+			name = de->d_name;
+	} else {
+		struct lkl_linux_dirent64 *de = lkl_readdir(lkl_dir);
+
+		if (de)
+			name = de->d_name;
+	}
+
+	if (!name) {
+		if (cptofs) {
+			if (errno)
+				err = strerror(errno);
+		} else {
+			if (lkl_errdir(lkl_dir))
+				err = lkl_strerror(lkl_errdir(lkl_dir));
+		}
+	}
+
+	if (err)
+		fprintf(stderr, "error while reading directory %s: %s\n",
+			path, err);
+	return name;
+}
+
+static void close_dir(DIR *dir)
+{
+	if (cptofs)
+		closedir(dir);
+	else
+		lkl_closedir((struct lkl_dir *)dir);
+}
+
+static int searchdir(const char *src, const char *dst, const char *match)
+{
+	DIR *dir;
+	const char *name;
+	int ret = 0;
+
+	dir = open_dir(src);
+	if (!dir)
+		return -1;
+
+	while ((name = read_dir(dir, src))) {
+		if (!strcmp(name, ".") || !strcmp(name, "..") ||
+		    (match && fnmatch(match, name, 0) != 0))
+			continue;
+
+		ret = do_entry(src, dst, name);
+		if (ret)
+			goto out;
+	}
+
+out:
+	close_dir(dir);
+
+	return ret;
+}
+
+static int match_root(const char *src)
+{
+	const char *c = src;
+
+	while (*c) {
+		switch (*c) {
+		case '.':
+			if (c > src && c[-1] == '.')
+				return 0;
+			break;
+		case '/':
+			break;
+		default:
+			return 0;
+		}
+		c++;
+	}
+
+	return 1;
+}
+
+int copy_one(const char *src, const char *mpoint, const char *dst)
+{
+	char *src_path_dir, *src_path_base;
+	char src_path[PATH_MAX], dst_path[PATH_MAX];
+
+	if (cptofs) {
+		snprintf(src_path, sizeof(src_path),  "%s", src);
+		snprintf(dst_path, sizeof(dst_path),  "%s/%s", mpoint, dst);
+	} else {
+		snprintf(src_path, sizeof(src_path),  "%s/%s", mpoint, src);
+		snprintf(dst_path, sizeof(dst_path),  "%s", dst);
+	}
+
+	if (match_root(src))
+		return searchdir(src_path, dst_path, NULL);
+
+	src_path_dir = dirname(strdup(src_path));
+	src_path_base = basename(strdup(src_path));
+
+	return searchdir(src_path_dir, dst_path, src_path_base);
+}
+
+int main(int argc, char **argv)
+{
+	struct lkl_disk disk;
+	long ret, umount_ret;
+	int i;
+	char mpoint[32];
+	unsigned int disk_id;
+
+	if (strstr(argv[0], "cptofs")) {
+		cptofs = 1;
+		ret = argp_parse(&argp_cptofs, argc, argv, 0, 0, &cla);
+	} else {
+		ret = argp_parse(&argp_cpfromfs, argc, argv, 0, 0, &cla);
+	}
+
+	if (ret < 0)
+		return -1;
+
+	if (!cla.printk)
+		lkl_host_ops.print = NULL;
+
+	disk.fd = open(cla.fsimg_path, cptofs ? O_RDWR : O_RDONLY);
+	if (disk.fd < 0) {
+		fprintf(stderr, "can't open fsimg %s: %s\n", cla.fsimg_path,
+			strerror(errno));
+		ret = 1;
+		goto out;
+	}
+
+	disk.ops = NULL;
+
+	ret = lkl_disk_add(&disk);
+	if (ret < 0) {
+		fprintf(stderr, "can't add disk: %s\n", lkl_strerror(ret));
+		goto out_close;
+	}
+	disk_id = ret;
+
+	lkl_start_kernel(&lkl_host_ops, "mem=100M");
+
+	ret = lkl_mount_dev(disk_id, cla.part, cla.fsimg_type,
+			    cptofs ? 0 : LKL_MS_RDONLY,
+			    NULL, mpoint, sizeof(mpoint));
+	if (ret) {
+		fprintf(stderr, "can't mount disk: %s\n", lkl_strerror(ret));
+		goto out_close;
+	}
+
+	lkl_sys_umask(0);
+
+	for (i = 0; i < cla.npaths - 1; i++) {
+		ret = copy_one(cla.paths[i], mpoint, cla.paths[cla.npaths - 1]);
+		if (ret)
+			break;
+	}
+
+	umount_ret = lkl_umount_dev(disk_id, cla.part, 0, 1000);
+	if (ret == 0)
+		ret = umount_ret;
+
+out_close:
+	close(disk.fd);
+
+out:
+	lkl_sys_halt();
+
+	return ret;
+}
-- 
2.20.1 (Apple Git-117)
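
The inner loop of copy_file() has to cope with short writes: write() may
accept fewer bytes than requested, and the next call must resume at the
first byte that was not written.  Below is a minimal host-only sketch of
such a loop.  The helper name write_all() is hypothetical and not part of
the patch, and the sketch uses plain POSIX write() rather than
lkl_sys_write().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): write all len bytes of buf
 * to fd, retrying after short writes and EINTR.
 * Returns 0 on success, -1 on error with errno set by write().
 */
int write_all(int fd, const char *buf, size_t len)
{
	assert(buf != NULL);

	while (len > 0) {
		ssize_t wrote = write(fd, buf, len);

		if (wrote < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, nothing written: retry */
			return -1;
		}

		/* Advance by what was actually written, not by the request. */
		buf += wrote;
		len -= (size_t)wrote;
	}

	return 0;
}
```

The point of the sketch is the pointer arithmetic: the buffer pointer and
the remaining length move by the same amount, the return value of
write().
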

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 22/37] lkl tools: tool that reads/writes to/from a filesystem image
@ 2019-11-08  5:02     ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: linux-arch, Conrad Meyer, Octavian Purdila, Akira Moroo,
	Petros Angelatos, Dan Peebles, linux-kernel-library,
	Yuriy Taraday, Tuomas Tynkkynen, Hajime Tazaki

From: Octavian Purdila <tavi.purdila@gmail.com>

A tool to read/write to/from a filesystem image without mounting the
file to host filesystem.  Thus there is no root priviledge to modify the
contents.

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Dan Peebles <pumpkin@me.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Tuomas Tynkkynen <tuomas.tynkkynen@iki.fi>
Signed-off-by: Yuriy Taraday <yorik.sar@gmail.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/.gitignore |   2 +
 tools/lkl/Build      |   2 +
 tools/lkl/Makefile   |   3 +
 tools/lkl/Targets    |   3 +
 tools/lkl/cptofs.c   | 635 +++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 645 insertions(+)
 create mode 100644 tools/lkl/cptofs.c

diff --git a/tools/lkl/.gitignore b/tools/lkl/.gitignore
index 4e08254dbd46..1a8210f4d9c4 100644
--- a/tools/lkl/.gitignore
+++ b/tools/lkl/.gitignore
@@ -7,3 +7,5 @@ tests/disk
 tests/boot
 tests/valgrind*.xml
 *.pyc
+cptofs
+cpfromfs
diff --git a/tools/lkl/Build b/tools/lkl/Build
index e69de29bb2d1..a9d12c5ca260 100644
--- a/tools/lkl/Build
+++ b/tools/lkl/Build
@@ -0,0 +1,2 @@
+cptofs-$(LKL_HOST_CONFIG_ARCHIVE) += cptofs.o
+
diff --git a/tools/lkl/Makefile b/tools/lkl/Makefile
index 9a55df5064e4..94f3ba09123d 100644
--- a/tools/lkl/Makefile
+++ b/tools/lkl/Makefile
@@ -85,6 +85,9 @@ $(OUTPUT)%$(EXESUF): $(OUTPUT)%-in.o $(OUTPUT)liblkl.a
 $(OUTPUT)%-in.o: $(OUTPUT)lib/lkl.o FORCE
 	$(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(patsubst %/,%,$(dir $*)) obj=$(notdir $*)
 
+$(OUTPUT)cpfromfs$(EXESUF): cptofs$(EXESUF)
+	$(Q)if ! [ -e $@ ]; then ln -s $< $@; fi
+
 
 clean:
 	$(call QUIET_CLEAN, objects)find $(OUTPUT) -name '*.o' -delete -o -name '\.*.cmd'\
diff --git a/tools/lkl/Targets b/tools/lkl/Targets
index a9f74c3cc8fb..e629b330e5aa 100644
--- a/tools/lkl/Targets
+++ b/tools/lkl/Targets
@@ -4,3 +4,6 @@ progs-y += tests/boot
 progs-y += tests/disk
 progs-y += tests/net-test
 
+progs-$(LKL_HOST_CONFIG_ARCHIVE) += cptofs
+LDLIBS_cptofs-y := -larchive
+LDLIBS_cptofs-$(LKL_HOST_CONFIG_NEEDS_LARGP) += -largp
diff --git a/tools/lkl/cptofs.c b/tools/lkl/cptofs.c
new file mode 100644
index 000000000000..dd490435d5b7
--- /dev/null
+++ b/tools/lkl/cptofs.c
@@ -0,0 +1,635 @@
+// SPDX-License-Identifier: GPL-2.0
+#ifdef __FreeBSD__
+#include <sys/param.h>
+#endif
+
+#include <stdio.h>
+#include <time.h>
+#include <argp.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdlib.h>
+#include <libgen.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <fnmatch.h>
+#include <dirent.h>
+#include <lkl.h>
+#include <lkl_host.h>
+
+static const char doc_cptofs[] = "Copy files to a filesystem image";
+static const char doc_cpfromfs[] = "Copy files from a filesystem image";
+static const char args_doc_cptofs[] = "-t fstype -i fsimage path... fs_path";
+static const char args_doc_cpfromfs[] = "-t fstype -i fsimage fs_path... path";
+
+static struct argp_option options[] = {
+	{"enable-printk", 'p', 0, 0, "show Linux printks"},
+	{"partition", 'P', "int", 0, "partition number"},
+	{"filesystem-type", 't', "string", 0,
+	 "select filesystem type - mandatory"},
+	{"filesystem-image", 'i', "string", 0,
+	 "path to the filesystem image - mandatory"},
+	{"selinux", 's', "string", 0, "selinux attributes for destination"},
+	{0},
+};
+
+static struct cl_args {
+	int printk;
+	int part;
+	const char *fsimg_type;
+	const char *fsimg_path;
+	int npaths;
+	char **paths;
+	const char *selinux;
+} cla;
+
+static int cptofs;
+
+static error_t parse_opt(int key, char *arg, struct argp_state *state)
+{
+	struct cl_args *cla = state->input;
+
+	switch (key) {
+	case 'p':
+		cla->printk = 1;
+		break;
+	case 'P':
+		cla->part = atoi(arg);
+		break;
+	case 't':
+		cla->fsimg_type = arg;
+		break;
+	case 'i':
+		cla->fsimg_path = arg;
+		break;
+	case 's':
+		cla->selinux = arg;
+		break;
+	case ARGP_KEY_ARG:
+		// Capture all remaining arguments in our paths array and stop
+		// parsing here. We treat the last one as the destination and
+		// everything before it as sources, just like cp does.
+		cla->paths = &state->argv[state->next - 1];
+		cla->npaths = state->argc - state->next + 1;
+		state->next = state->argc;
+		break;
+	default:
+		return ARGP_ERR_UNKNOWN;
+	}
+
+	return 0;
+}
+
+static struct argp argp_cptofs = {
+	.options = options,
+	.parser = parse_opt,
+	.args_doc = args_doc_cptofs,
+	.doc = doc_cptofs,
+};
+
+static struct argp argp_cpfromfs = {
+	.options = options,
+	.parser = parse_opt,
+	.args_doc = args_doc_cpfromfs,
+	.doc = doc_cpfromfs,
+};
+
+static int searchdir(const char *fs_path, const char *path, const char *match);
+
+static int open_src(const char *path)
+{
+	int fd;
+
+	if (cptofs)
+		fd = open(path, O_RDONLY, 0);
+	else
+		fd = lkl_sys_open(path, LKL_O_RDONLY, 0);
+
+	if (fd < 0)
+		fprintf(stderr, "unable to open file %s for reading: %s\n",
+			path, cptofs ? strerror(errno) : lkl_strerror(fd));
+
+	return fd;
+}
+
+static int open_dst(const char *path, int mode)
+{
+	int fd;
+
+	if (cptofs)
+		fd = lkl_sys_open(path, LKL_O_RDWR | LKL_O_TRUNC | LKL_O_CREAT,
+				  mode);
+	else
+		fd = open(path, O_RDWR | O_TRUNC | O_CREAT, mode);
+
+	if (fd < 0)
+		fprintf(stderr, "unable to open file %s for writing: %s\n",
+			path, cptofs ? lkl_strerror(fd) : strerror(errno));
+
+	if (cla.selinux && cptofs) {
+		int ret = lkl_sys_fsetxattr(fd, "security.selinux", cla.selinux,
+					    strlen(cla.selinux), 0);
+		if (ret)
+			fprintf(stderr,
+				"unable to set selinux attribute on %s: %s\n",
+				path, lkl_strerror(ret));
+	}
+
+	return fd;
+}
+
+static int read_src(int fd, char *buf, int len)
+{
+	int ret;
+
+	if (cptofs)
+		ret = read(fd, buf, len);
+	else
+		ret = lkl_sys_read(fd, buf, len);
+
+	if (ret < 0)
+		fprintf(stderr, "error reading file: %s\n",
+			cptofs ? strerror(errno) : lkl_strerror(ret));
+
+	return ret;
+}
+
+static int write_dst(int fd, char *buf, int len)
+{
+	int ret;
+
+	if (cptofs)
+		ret = lkl_sys_write(fd, buf, len);
+	else
+		ret = write(fd, buf, len);
+
+	if (ret < 0)
+		fprintf(stderr, "error writing file: %s\n",
+			cptofs ? lkl_strerror(ret) : strerror(errno));
+
+	return ret;
+}
+
+static void close_src(int fd)
+{
+	if (cptofs)
+		close(fd);
+	else
+		lkl_sys_close(fd);
+}
+
+static void close_dst(int fd)
+{
+	if (cptofs)
+		lkl_sys_close(fd);
+	else
+		close(fd);
+}
+
+static int copy_file(const char *src, const char *dst, int mode)
+{
+	long len, to_write, wrote;
+	char buf[4096], *ptr;
+	int ret = 0;
+	int fd_src, fd_dst;
+
+	fd_src = open_src(src);
+	if (fd_src < 0)
+		return fd_src;
+
+	fd_dst = open_dst(dst, mode);
+	if (fd_dst < 0)
+		return fd_dst;
+
+	do {
+		len = read_src(fd_src, buf, sizeof(buf));
+
+		if (len > 0) {
+			ptr = buf;
+			to_write = len;
+			do {
+				wrote = write_dst(fd_dst, ptr, to_write);
+
+				if (wrote < 0) {
+					ret = wrote;
+					goto out;
+				}
+
+				to_write -= wrote;
+				ptr += len;
+
+			} while (to_write > 0);
+		}
+
+		if (len < 0)
+			ret = len;
+
+	} while (len > 0);
+
+out:
+	close_src(fd_src);
+	close_dst(fd_dst);
+
+	return ret;
+}
+
+static int stat_src(const char *path, unsigned int *type, unsigned int *mode,
+		    long long *size, struct lkl_timespec *mtime,
+		    struct lkl_timespec *atime)
+{
+	struct stat stat;
+	struct lkl_stat lkl_stat;
+	int ret;
+
+	if (cptofs) {
+		ret = lstat(path, &stat);
+		if (type)
+			*type = stat.st_mode & S_IFMT;
+		if (mode)
+			*mode = stat.st_mode & ~S_IFMT;
+		if (size)
+			*size = stat.st_size;
+		if (mtime) {
+			mtime->tv_sec = stat.st_mtim.tv_sec;
+			mtime->tv_nsec = stat.st_mtim.tv_nsec;
+		}
+		if (atime) {
+			atime->tv_sec = stat.st_atim.tv_sec;
+			atime->tv_nsec = stat.st_atim.tv_nsec;
+		}
+	} else {
+		ret = lkl_sys_lstat(path, &lkl_stat);
+		if (type)
+			*type = lkl_stat.st_mode & S_IFMT;
+		if (mode)
+			*mode = lkl_stat.st_mode & ~S_IFMT;
+		if (size)
+			*size = lkl_stat.st_size;
+		if (mtime) {
+			mtime->tv_sec = lkl_stat.lkl_st_mtime;
+			mtime->tv_nsec = lkl_stat.st_mtime_nsec;
+		}
+		if (atime) {
+			atime->tv_sec = lkl_stat.lkl_st_atime;
+			atime->tv_nsec = lkl_stat.st_atime_nsec;
+		}
+	}
+
+	if (ret)
+		fprintf(stderr, "fsimg lstat(%s) error: %s\n",
+			path, cptofs ? strerror(errno) : lkl_strerror(ret));
+
+	return ret;
+}
+
+static int mkdir_dst(const char *path, unsigned int mode)
+{
+	int ret;
+
+	if (cptofs) {
+		ret = lkl_sys_mkdir(path, mode);
+		if (ret == -LKL_EEXIST)
+			ret = 0;
+	} else {
+		ret = mkdir(path, mode);
+		if (ret < 0 && errno == EEXIST)
+			ret = 0;
+	}
+
+	if (ret)
+		fprintf(stderr, "unable to create directory %s: %s\n",
+			path, cptofs ? strerror(errno) : lkl_strerror(ret));
+
+	return ret;
+}
+
+static int readlink_src(const char *src, char *out, int outsize)
+{
+	int ret;
+
+	if (cptofs)
+		ret = readlink(src, out, outsize);
+	else
+		ret = lkl_sys_readlink(src, out, outsize);
+
+	if (ret < 0)
+		fprintf(stderr, "unable to readlink '%s': %s\n", src,
+			cptofs ? strerror(errno) : lkl_strerror(ret));
+
+	return ret;
+}
+
+static int symlink_dst(const char *path, const char *target)
+{
+	int ret;
+
+	if (cptofs)
+		ret = lkl_sys_symlink(target, path);
+	else
+		ret = symlink(target, path);
+
+	if (ret)
+		fprintf(stderr, "unable to symlink '%s' with target '%s': %s\n",
+			path, target, cptofs ? lkl_strerror(ret) :
+			strerror(errno));
+
+	return ret;
+}
+
+static int copy_symlink(const char *src, const char *dst)
+{
+	int ret;
+	long long size, actual_size;
+	char *target = NULL;
+
+	ret = stat_src(src, NULL, NULL, &size, NULL, NULL);
+	if (ret) {
+		ret = -1;
+		goto out;
+	}
+
+	target = malloc(size + 1);
+	if (!target) {
+		fprintf(stderr, "Unable to allocate memory (%lld bytes)\n",
+			size + 1);
+		ret = -1;
+		goto out;
+	}
+
+	actual_size = readlink_src(src, target, size);
+	if (actual_size != size) {
+		fprintf(stderr,
+			"readlink(%s) bad size: got %lld, expected %lld\n",
+			src, actual_size, size);
+		ret = -1;
+		goto out;
+	}
+	target[size] = 0; // readlink doesn't append the trailing null byte
+
+	ret = symlink_dst(dst, target);
+	if (ret)
+		ret = -1;
+
+out:
+	if (target)
+		free(target);
+
+	return ret;
+}
+
+static int do_entry(const char *_src, const char *_dst, const char *name)
+{
+	char src[PATH_MAX], dst[PATH_MAX];
+	struct lkl_timespec mtime, atime;
+	unsigned int type, mode;
+	int ret;
+
+	snprintf(src, sizeof(src), "%s/%s", _src, name);
+	snprintf(dst, sizeof(dst), "%s/%s", _dst, name);
+
+	ret = stat_src(src, &type, &mode, NULL, &mtime, &atime);
+
+	switch (type) {
+	case S_IFREG:
+	{
+		ret = copy_file(src, dst, mode);
+		break;
+	}
+	case S_IFDIR:
+		ret = mkdir_dst(dst, mode);
+		if (ret)
+			break;
+		ret = searchdir(src, dst, NULL);
+		break;
+	case S_IFLNK:
+		ret = copy_symlink(src, dst);
+		break;
+	case S_IFSOCK:
+	case S_IFBLK:
+	case S_IFCHR:
+	case S_IFIFO:
+	default:
+		printf("skipping %s: unsupported entry type %d\n", src, type);
+	}
+
+	if (!ret) {
+		if (cptofs) {
+			struct lkl_timespec lkl_ts[] = { atime, mtime };
+
+			ret = lkl_sys_utimensat(-1, dst,
+						(struct __lkl__kernel_timespec
+						 *)lkl_ts,
+						LKL_AT_SYMLINK_NOFOLLOW);
+		} else {
+			struct timespec ts[] = {
+				{ .tv_sec = atime.tv_sec,
+				  .tv_nsec = atime.tv_nsec, },
+				{ .tv_sec = mtime.tv_sec,
+				  .tv_nsec = mtime.tv_nsec, },
+			};
+
+			ret = utimensat(-1, dst, ts, AT_SYMLINK_NOFOLLOW);
+		}
+	}
+
+	if (ret)
+		printf("error processing entry %s, aborting\n", src);
+
+	return ret;
+}
+
+static DIR *open_dir(const char *path)
+{
+	DIR *dir;
+	int err;
+
+	if (cptofs)
+		dir = opendir(path);
+	else
+		dir = (DIR *)lkl_opendir(path, &err);
+
+	if (!dir)
+		fprintf(stderr, "unable to open directory %s: %s\n",
+			path, cptofs ? strerror(errno) : lkl_strerror(err));
+	return dir;
+}
+
+static const char *read_dir(DIR *dir, const char *path)
+{
+	struct lkl_dir *lkl_dir = (struct lkl_dir *)dir;
+	const char *name = NULL;
+	const char *err = NULL;
+
+	if (cptofs) {
+		struct dirent *de = readdir(dir);
+
+		if (de)
+			name = de->d_name;
+	} else {
+		struct lkl_linux_dirent64 *de = lkl_readdir(lkl_dir);
+
+		if (de)
+			name = de->d_name;
+	}
+
+	if (!name) {
+		if (cptofs) {
+			if (errno)
+				err = strerror(errno);
+		} else {
+			if (lkl_errdir(lkl_dir))
+				err = lkl_strerror(lkl_errdir(lkl_dir));
+		}
+	}
+
+	if (err)
+		fprintf(stderr, "error while reading directory %s: %s\n",
+			path, err);
+	return name;
+}
+
+static void close_dir(DIR *dir)
+{
+	if (cptofs)
+		closedir(dir);
+	else
+		lkl_closedir((struct lkl_dir *)dir);
+}
+
+static int searchdir(const char *src, const char *dst, const char *match)
+{
+	DIR *dir;
+	const char *name;
+	int ret = 0;
+
+	dir = open_dir(src);
+	if (!dir)
+		return -1;
+
+	while ((name = read_dir(dir, src))) {
+		if (!strcmp(name, ".") || !strcmp(name, "..") ||
+		    (match && fnmatch(match, name, 0) != 0))
+			continue;
+
+		ret = do_entry(src, dst, name);
+		if (ret)
+			goto out;
+	}
+
+out:
+	close_dir(dir);
+
+	return ret;
+}
+
+static int match_root(const char *src)
+{
+	const char *c = src;
+
+	while (*c) {
+		switch (*c) {
+		case '.':
+			if (c > src && c[-1] == '.')
+				return 0;
+			break;
+		case '/':
+			break;
+		default:
+			return 0;
+		}
+		c++;
+	}
+
+	return 1;
+}
+
+int copy_one(const char *src, const char *mpoint, const char *dst)
+{
+	char *src_path_dir, *src_path_base;
+	char src_path[PATH_MAX], dst_path[PATH_MAX];
+
+	if (cptofs) {
+		snprintf(src_path, sizeof(src_path),  "%s", src);
+		snprintf(dst_path, sizeof(dst_path),  "%s/%s", mpoint, dst);
+	} else {
+		snprintf(src_path, sizeof(src_path),  "%s/%s", mpoint, src);
+		snprintf(dst_path, sizeof(dst_path),  "%s", dst);
+	}
+
+	if (match_root(src))
+		return searchdir(src_path, dst, NULL);
+
+	src_path_dir = dirname(strdup(src_path));
+	src_path_base = basename(strdup(src_path));
+
+	return searchdir(src_path_dir, dst_path, src_path_base);
+}
+
+int main(int argc, char **argv)
+{
+	struct lkl_disk disk;
+	long ret, umount_ret;
+	int i;
+	char mpoint[32];
+	unsigned int disk_id;
+
+	if (strstr(argv[0], "cptofs")) {
+		cptofs = 1;
+		ret = argp_parse(&argp_cptofs, argc, argv, 0, 0, &cla);
+	} else {
+		ret = argp_parse(&argp_cpfromfs, argc, argv, 0, 0, &cla);
+	}
+
+	if (ret < 0)
+		return -1;
+
+	if (!cla.printk)
+		lkl_host_ops.print = NULL;
+
+	disk.fd = open(cla.fsimg_path, cptofs ? O_RDWR : O_RDONLY);
+	if (disk.fd < 0) {
+		fprintf(stderr, "can't open fsimg %s: %s\n", cla.fsimg_path,
+			strerror(errno));
+		ret = 1;
+		goto out;
+	}
+
+	disk.ops = NULL;
+
+	ret = lkl_disk_add(&disk);
+	if (ret < 0) {
+		fprintf(stderr, "can't add disk: %s\n", lkl_strerror(ret));
+		goto out_close;
+	}
+	disk_id = ret;
+
+	lkl_start_kernel(&lkl_host_ops, "mem=100M");
+
+	ret = lkl_mount_dev(disk_id, cla.part, cla.fsimg_type,
+			    cptofs ? 0 : LKL_MS_RDONLY,
+			    NULL, mpoint, sizeof(mpoint));
+	if (ret) {
+		fprintf(stderr, "can't mount disk: %s\n", lkl_strerror(ret));
+		goto out_close;
+	}
+
+	lkl_sys_umask(0);
+
+	for (i = 0; i < cla.npaths - 1; i++) {
+		ret = copy_one(cla.paths[i], mpoint, cla.paths[cla.npaths - 1]);
+		if (ret)
+			break;
+	}
+
+	umount_ret = lkl_umount_dev(disk_id, cla.part, 0, 1000);
+	if (ret == 0)
+		ret = umount_ret;
+
+out_close:
+	close(disk.fd);
+
+out:
+	lkl_sys_halt();
+
+	return ret;
+}
-- 
2.20.1 (Apple Git-117)


^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 23/37] lkl tools: tool that converts a filesystem image to tar
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Conrad Meyer, Hajime Tazaki, Petros Angelatos

From: Octavian Purdila <tavi.purdila@gmail.com>

Add fs2tar, a simple utility that converts a filesystem image to a tar
archive, preserving file permissions and extended attributes.

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
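Note (illustration only, not part of the patch): copy_xattr() in fs2tar.c
walks the buffer filled in by llistxattr, which packs NUL-terminated
attribute names back to back, and advances by strlen(name) + 1 on each
step. A minimal host-side sketch of that walk follows. It assumes a
well-formed buffer whose last byte is a NUL, and count_xattr_names() is a
hypothetical helper that does not exist in fs2tar.c.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of fs2tar.c: count the names in a
 * listxattr-style buffer (NUL-terminated strings packed back to back)
 * and report whether 'wanted' is one of them.
 */
static size_t count_xattr_names(const char *list, size_t list_size,
				const char *wanted, int *found)
{
	const char *name;
	size_t count = 0;

	*found = 0;
	/* Each step skips one name plus its terminating NUL byte. */
	for (name = list; (size_t)(name - list) < list_size;
	     name += strlen(name) + 1) {
		if (strcmp(name, wanted) == 0)
			*found = 1;
		count++;
	}

	return count;
}
```

The same loop shape is what lets fs2tar pick out "security.selinux" while
still visiting every other attribute in the list.
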
 tools/lkl/.gitignore |   1 +
 tools/lkl/Build      |   1 +
 tools/lkl/Targets    |   4 +
 tools/lkl/fs2tar.c   | 410 +++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 416 insertions(+)
 create mode 100644 tools/lkl/fs2tar.c
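
Note (illustration only, not part of the patch): searchdir() in fs2tar.c
reads a batch of directory entries with getdents64 and steps through the
buffer using each record's d_reclen. The sketch below shows that walk over
variable-length records on the host, without LKL. struct demo_dirent,
demo_append() and demo_count_entries() are hypothetical names, and the
two-field layout is a simplification of struct lkl_linux_dirent64.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record layout: a length prefix followed by the name. */
struct demo_dirent {
	unsigned short d_reclen;	/* size of the whole record, in bytes */
	char d_name[];			/* NUL-terminated entry name */
};

/* Append one record at offset 'off' and return the offset just past it. */
static size_t demo_append(char *buf, size_t off, const char *name)
{
	unsigned short reclen =
		(unsigned short)(sizeof(struct demo_dirent) + strlen(name) + 1);

	memcpy(buf + off, &reclen, sizeof(reclen));
	strcpy(buf + off + offsetof(struct demo_dirent, d_name), name);
	return off + reclen;
}

/* Count the entries in buf[0..buf_len), skipping "." and "..". */
static size_t demo_count_entries(const char *buf, size_t buf_len)
{
	size_t pos = 0, count = 0;

	while (pos < buf_len) {
		unsigned short reclen;
		const char *name =
			buf + pos + offsetof(struct demo_dirent, d_name);

		/* memcpy avoids an unaligned read from the byte buffer. */
		memcpy(&reclen, buf + pos, sizeof(reclen));
		if (reclen == 0)
			break;	/* malformed record: do not loop forever */
		pos += reclen;	/* advance before any 'continue' */

		if (!strcmp(name, ".") || !strcmp(name, ".."))
			continue;
		count++;
	}

	return count;
}
```

Advancing the position before the "." and ".." check keeps the skip path
and the normal path on the same stride, which is the property the loop in
searchdir() relies on.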

diff --git a/tools/lkl/.gitignore b/tools/lkl/.gitignore
index 1a8210f4d9c4..138e65efcad2 100644
--- a/tools/lkl/.gitignore
+++ b/tools/lkl/.gitignore
@@ -9,3 +9,4 @@ tests/valgrind*.xml
 *.pyc
 cptofs
 cpfromfs
+fs2tar
diff --git a/tools/lkl/Build b/tools/lkl/Build
index a9d12c5ca260..73b37363a6de 100644
--- a/tools/lkl/Build
+++ b/tools/lkl/Build
@@ -1,2 +1,3 @@
 cptofs-$(LKL_HOST_CONFIG_ARCHIVE) += cptofs.o
+fs2tar-$(LKL_HOST_CONFIG_ARCHIVE) += fs2tar.o
 
diff --git a/tools/lkl/Targets b/tools/lkl/Targets
index e629b330e5aa..05f5bd1dddcc 100644
--- a/tools/lkl/Targets
+++ b/tools/lkl/Targets
@@ -7,3 +7,7 @@ progs-y += tests/net-test
 progs-$(LKL_HOST_CONFIG_ARCHIVE) += cptofs
 LDLIBS_cptofs-y := -larchive
 LDLIBS_cptofs-$(LKL_HOST_CONFIG_NEEDS_LARGP) += -largp
+
+progs-$(LKL_HOST_CONFIG_ARCHIVE) += fs2tar
+LDLIBS_fs2tar-y := -larchive
+LDLIBS_fs2tar-$(LKL_HOST_CONFIG_NEEDS_LARGP) += -largp
diff --git a/tools/lkl/fs2tar.c b/tools/lkl/fs2tar.c
new file mode 100644
index 000000000000..d2834afcce93
--- /dev/null
+++ b/tools/lkl/fs2tar.c
@@ -0,0 +1,410 @@
+// SPDX-License-Identifier: GPL-2.0
+#ifdef __FreeBSD__
+#include <sys/param.h>
+#endif
+
+#include <stdio.h>
+#include <time.h>
+#include <argp.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdlib.h>
+#include <libgen.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <archive.h>
+#include <archive_entry.h>
+#include <lkl.h>
+#include <lkl_host.h>
+
+char doc[] = "";
+char args_doc[] = "-t fstype fsimage_path tar_path";
+static struct argp_option options[] = {
+	{"enable-printk", 'p', 0, 0, "show Linux printks"},
+	{"partition", 'P', "int", 0, "partition number"},
+	{"filesystem-type", 't', "string", 0,
+	 "select filesystem type - mandatory"},
+	{"selinux-contexts", 's', "file", 0,
+	 "export selinux contexts to file"},
+	{0},
+};
+
+static struct cl_args {
+	int printk;
+	int part;
+	const char *fsimg_type;
+	const char *fsimg_path;
+	const char *tar_path;
+	FILE *selinux;
+} cla;
+
+static error_t parse_opt(int key, char *arg, struct argp_state *state)
+{
+	struct cl_args *cla = state->input;
+
+	switch (key) {
+	case 'p':
+		cla->printk = 1;
+		break;
+	case 'P':
+		cla->part = atoi(arg);
+		break;
+	case 't':
+		cla->fsimg_type = arg;
+		break;
+	case 's':
+		cla->selinux = fopen(arg, "w");
+		if (!cla->selinux) {
+			fprintf(stderr,
+				"failed to open selinux contexts file: %s\n",
+				strerror(errno));
+			return -1;
+		}
+		break;
+	case ARGP_KEY_ARG:
+		if (!cla->fsimg_path)
+			cla->fsimg_path = arg;
+		else if (!cla->tar_path)
+			cla->tar_path = arg;
+		else
+			return -1;
+		break;
+	case ARGP_KEY_END:
+		if (state->arg_num < 2 || !cla->fsimg_type)
+			argp_usage(state);
+	default:
+		return ARGP_ERR_UNKNOWN;
+	}
+
+	return 0;
+}
+
+static struct argp argp = { options, parse_opt, args_doc, doc };
+
+static struct archive *tar;
+
+static int searchdir(const char *fsimg_path, const char *path);
+
+static int copy_file(const char *fsimg_path, const char *path)
+{
+	long fsimg_fd;
+	char buff[4096];
+	long len, wrote;
+	int ret = 0;
+
+	fsimg_fd = lkl_sys_open(fsimg_path, LKL_O_RDONLY, 0);
+	if (fsimg_fd < 0) {
+		fprintf(stderr, "fsimg error opening %s: %s\n", fsimg_path,
+			lkl_strerror(fsimg_fd));
+		return fsimg_fd;
+	}
+
+	do {
+		len = lkl_sys_read(fsimg_fd, buff, sizeof(buff));
+		if (len > 0) {
+			wrote = archive_write_data(tar, buff, len);
+			if (wrote != len) {
+				ret = -archive_errno(tar);
+				fprintf(stderr,
+					"error writing file %s to archive: %s [%d %ld]\n",
+					path, archive_error_string(tar), ret,
+					len);
+				break;
+			}
+		}
+
+		if (len < 0) {
+			fprintf(stderr, "error reading fsimg file %s: %s\n",
+				fsimg_path, lkl_strerror(len));
+			ret = len;
+		}
+
+	} while (len > 0);
+
+	lkl_sys_close(fsimg_fd);
+
+	return ret;
+}
+
+static int add_link(const char *fsimg_path, const char *path,
+		    struct archive_entry *entry)
+{
+	char buf[4096] = { 0, };
+	long len;
+
+	len = lkl_sys_readlink(fsimg_path, buf, sizeof(buf));
+	if (len < 0) {
+		fprintf(stderr, "fsimg readlink error %s: %s\n",
+			fsimg_path, lkl_strerror(len));
+		return len;
+	}
+
+	archive_entry_set_symlink(entry, buf);
+
+	return 0;
+}
+
+static inline void fsimg_copy_stat(struct stat *st, struct lkl_stat *fst)
+{
+	st->st_dev = fst->st_dev;
+	st->st_ino = fst->st_ino;
+	st->st_mode = fst->st_mode;
+	st->st_nlink = fst->st_nlink;
+	st->st_uid = fst->st_uid;
+	st->st_gid = fst->st_gid;
+	st->st_rdev = fst->st_rdev;
+	st->st_size = fst->st_size;
+	st->st_blksize = fst->st_blksize;
+	st->st_blocks = fst->st_blocks;
+	st->st_atim.tv_sec = fst->lkl_st_atime;
+	st->st_atim.tv_nsec = fst->st_atime_nsec;
+	st->st_mtim.tv_sec = fst->lkl_st_mtime;
+	st->st_mtim.tv_nsec = fst->st_mtime_nsec;
+	st->st_ctim.tv_sec = fst->lkl_st_ctime;
+	st->st_ctim.tv_nsec = fst->st_ctime_nsec;
+}
+
+static int copy_xattr(const char *fsimg_path, const char *path,
+		      struct archive_entry *entry)
+{
+	long ret;
+	char *xattr_list, *i;
+	long xattr_list_size;
+
+	ret = lkl_sys_llistxattr(fsimg_path, NULL, 0);
+	if (ret < 0) {
+		fprintf(stderr, "fsimg llistxattr(%s) error: %s\n",
+			path, lkl_strerror(ret));
+		return ret;
+	}
+
+	if (!ret)
+		return 0;
+
+	xattr_list = malloc(ret);
+
+	ret = lkl_sys_llistxattr(fsimg_path, xattr_list, ret);
+	if (ret < 0) {
+		fprintf(stderr, "fsimg llistxattr(%s) error: %s\n", path,
+			lkl_strerror(ret));
+		free(xattr_list);
+		return ret;
+	}
+
+	xattr_list_size = ret;
+
+	for (i = xattr_list; i - xattr_list < xattr_list_size;
+	     i += strlen(i) + 1) {
+		void *xattr_buf;
+
+		ret = lkl_sys_lgetxattr(fsimg_path, i, NULL, 0);
+		if (ret < 0) {
+			fprintf(stderr, "fsimg lgetxattr(%s) error: %s\n", path,
+				lkl_strerror(ret));
+			free(xattr_list);
+			return ret;
+		}
+
+		xattr_buf = malloc(ret);
+
+		ret = lkl_sys_lgetxattr(fsimg_path, i, xattr_buf, ret);
+		if (ret < 0) {
+			fprintf(stderr, "fsimg lgetxattr2(%s) error: %s\n",
+				path, lkl_strerror(ret));
+			free(xattr_list);
+			free(xattr_buf);
+			return ret;
+		}
+
+		if (cla.selinux && strcmp(i, "security.selinux") == 0)
+			fprintf(cla.selinux, "%s %.*s\n", path, (int)ret,
+				(char *)xattr_buf);
+
+		/* clearing here would drop the xattrs already added */
+		archive_entry_xattr_add_entry(entry, i, xattr_buf, ret);
+
+		free(xattr_buf);
+	}
+
+	free(xattr_list);
+
+	return 0;
+}
+
+static int do_entry(const char *fsimg_path, const char *path,
+		    const struct lkl_linux_dirent64 *de)
+{
+	char fsimg_new_path[PATH_MAX], new_path[PATH_MAX];
+	struct lkl_stat fsimg_stat;
+	struct stat stat;
+	struct archive_entry *entry;
+	int ftype;
+	long ret;
+
+	snprintf(new_path, sizeof(new_path), "%s/%s", path, de->d_name);
+	snprintf(fsimg_new_path, sizeof(fsimg_new_path), "%s/%s", fsimg_path,
+		 de->d_name);
+
+	ret = lkl_sys_lstat(fsimg_new_path, &fsimg_stat);
+	if (ret) {
+		fprintf(stderr, "fsimg lstat(%s) error: %s\n",
+			fsimg_new_path, lkl_strerror(ret));
+		return ret;
+	}
+
+	entry = archive_entry_new();
+
+	archive_entry_set_pathname(entry, new_path);
+	fsimg_copy_stat(&stat, &fsimg_stat);
+	archive_entry_copy_stat(entry, &stat);
+	ret = copy_xattr(fsimg_new_path, new_path, entry);
+	if (ret)
+		return ret;
+	/* TODO: ACLs */
+
+	ftype = stat.st_mode & S_IFMT;
+
+	switch (ftype) {
+	case S_IFREG:
+		archive_write_header(tar, entry);
+		ret = copy_file(fsimg_new_path, new_path);
+		break;
+	case S_IFDIR:
+		archive_write_header(tar, entry);
+		ret = searchdir(fsimg_new_path, new_path);
+		break;
+	case S_IFLNK:
+		ret = add_link(fsimg_new_path, new_path, entry);
+		/* fall through */
+	case S_IFSOCK:
+	case S_IFBLK:
+	case S_IFCHR:
+	case S_IFIFO:
+		if (ret)
+			break;
+		archive_write_header(tar, entry);
+		break;
+	default:
+		printf("skipping %s: unsupported entry type %d\n", new_path,
+		       ftype);
+	}
+
+	archive_entry_free(entry);
+
+	if (ret)
+		printf("error processing entry %s, aborting\n", new_path);
+
+	return ret;
+}
+
+static int searchdir(const char *fsimg_path, const char *path)
+{
+	long ret = 0, fd;
+	char buf[1024], *pos;
+	long buf_len;
+
+	fd = lkl_sys_open(fsimg_path, LKL_O_RDONLY | LKL_O_DIRECTORY, 0);
+	if (fd < 0) {
+		fprintf(stderr, "failed to open dir %s: %s\n", fsimg_path,
+			lkl_strerror(fd));
+		return fd;
+	}
+
+	do {
+		struct lkl_linux_dirent64 *de;
+
+		de = (struct lkl_linux_dirent64 *) buf;
+		buf_len = lkl_sys_getdents64(fd, de, sizeof(buf));
+		if (buf_len < 0) {
+			fprintf(stderr, "getdents64 error: %s\n",
+				lkl_strerror(buf_len));
+			break;
+		}
+
+		for (pos = buf; pos - buf < buf_len; pos += de->d_reclen) {
+			de = (struct lkl_linux_dirent64 *)pos;
+
+			if (!strcmp(de->d_name, ".") ||
+			    !strcmp(de->d_name, ".."))
+				continue;
+
+			ret = do_entry(fsimg_path, path, de);
+			if (ret)
+				goto out;
+		}
+
+	} while (buf_len > 0);
+
+out:
+	lkl_sys_close(fd);
+	return ret;
+}
+
+int main(int argc, char **argv)
+{
+	struct lkl_disk disk;
+	long ret;
+	char mpoint[32];
+	unsigned int disk_id;
+
+	if (argp_parse(&argp, argc, argv, 0, 0, &cla) < 0)
+		return -1;
+
+	if (!cla.printk)
+		lkl_host_ops.print = NULL;
+
+	disk.fd = open(cla.fsimg_path, O_RDONLY);
+	if (disk.fd < 0) {
+		fprintf(stderr, "can't open fsimg %s: %s\n", cla.fsimg_path,
+			strerror(errno));
+		ret = 1;
+		goto out;
+	}
+
+	disk.ops = NULL;
+
+	ret = lkl_disk_add(&disk);
+	if (ret < 0) {
+		fprintf(stderr, "can't add disk: %s\n", lkl_strerror(ret));
+		goto out_close;
+	}
+	disk_id = ret;
+
+	lkl_start_kernel(&lkl_host_ops, "mem=10M");
+
+	ret = lkl_mount_dev(disk_id, cla.part, cla.fsimg_type, LKL_MS_RDONLY,
+			    NULL, mpoint, sizeof(mpoint));
+	if (ret) {
+		fprintf(stderr, "can't mount disk: %s\n", lkl_strerror(ret));
+		goto out_close;
+	}
+
+	ret = lkl_sys_chdir(mpoint);
+	if (ret) {
+		fprintf(stderr, "can't chdir to %s: %s\n", mpoint,
+			lkl_strerror(ret));
+		goto out_umount;
+	}
+
+	tar = archive_write_new();
+	archive_write_set_format_pax_restricted(tar);
+	archive_write_open_filename(tar, cla.tar_path);
+
+	ret = searchdir(mpoint, "");
+
+	archive_write_free(tar);
+
+	if (cla.selinux)
+		fclose(cla.selinux);
+
+out_umount:
+	lkl_umount_dev(disk_id, cla.part, 0, 1000);
+
+out_close:
+	close(disk.fd);
+
+out:
+	lkl_sys_halt();
+
+	return ret;
+}
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 24/37] lkl tools: virtio: add network device support
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	David Disseldorp, Hajime Tazaki, Motomu Utsumi, Patrick Collins,
	Thomas Liebetraut, Xiao Jia, Yuan Liu

From: Octavian Purdila <tavi.purdila@gmail.com>

This commit adds a basic virtio-net device implementation for use by the
virtio-net driver in LKL. It also adds several virtio-net backends (TAP,
macvtap, raw socket, pipe, VDE, and DPDK) that can serve as network
devices.

Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Thomas Liebetraut <thomas@tommie-lie.de>
Signed-off-by: Xiao Jia <xiaoj@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/include/lkl.h            | 392 ++++++++++++++
 tools/lkl/include/lkl_host.h       |  76 +++
 tools/lkl/lib/Build                |  10 +
 tools/lkl/lib/net.c                | 818 +++++++++++++++++++++++++++++
 tools/lkl/lib/virtio_net.c         | 322 ++++++++++++
 tools/lkl/lib/virtio_net_dpdk.c    | 480 +++++++++++++++++
 tools/lkl/lib/virtio_net_fd.c      | 217 ++++++++
 tools/lkl/lib/virtio_net_fd.h      |  28 +
 tools/lkl/lib/virtio_net_macvtap.c |  32 ++
 tools/lkl/lib/virtio_net_pipe.c    |  76 +++
 tools/lkl/lib/virtio_net_raw.c     |  94 ++++
 tools/lkl/lib/virtio_net_tap.c     | 111 ++++
 tools/lkl/lib/virtio_net_vde.c     | 168 ++++++
 tools/lkl/tests/net-setup.sh       | 134 +++++
 tools/lkl/tests/net-test.c         | 317 +++++++++++
 tools/lkl/tests/net.sh             | 186 +++++++
 16 files changed, 3461 insertions(+)
 create mode 100644 tools/lkl/lib/net.c
 create mode 100644 tools/lkl/lib/virtio_net.c
 create mode 100644 tools/lkl/lib/virtio_net_dpdk.c
 create mode 100644 tools/lkl/lib/virtio_net_fd.c
 create mode 100644 tools/lkl/lib/virtio_net_fd.h
 create mode 100644 tools/lkl/lib/virtio_net_macvtap.c
 create mode 100644 tools/lkl/lib/virtio_net_pipe.c
 create mode 100644 tools/lkl/lib/virtio_net_raw.c
 create mode 100644 tools/lkl/lib/virtio_net_tap.c
 create mode 100644 tools/lkl/lib/virtio_net_vde.c
 create mode 100644 tools/lkl/tests/net-setup.sh
 create mode 100644 tools/lkl/tests/net-test.c
 create mode 100755 tools/lkl/tests/net.sh
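
Note (illustration only, not part of the patch): the lkl.h hunk in this
patch declares each netdev function under an #ifdef and supplies a static
inline fallback otherwise, so callers build the same way with or without
virtio-net support. A self-contained sketch of that pattern follows. All
demo_* names and DEMO_CONFIG_NET are hypothetical stand-ins (for the
lkl_netdev_* functions and LKL_HOST_CONFIG_VIRTIO_NET), and the host's
ENOSYS is used here in place of LKL_ENOSYS.

```c
#include <errno.h>
#include <stddef.h>

struct demo_netdev;	/* opaque handle, like struct lkl_netdev in lkl.h */

/*
 * DEMO_CONFIG_NET is left undefined in this sketch, so the fallback
 * stubs in the #else branch are what gets compiled.
 */
#ifdef DEMO_CONFIG_NET
int demo_netdev_add(struct demo_netdev *nd);
struct demo_netdev *demo_netdev_tap_create(const char *ifname);
#else
static inline int demo_netdev_add(struct demo_netdev *nd)
{
	(void)nd;
	return -ENOSYS;	/* feature compiled out */
}

static inline struct demo_netdev *demo_netdev_tap_create(const char *ifname)
{
	(void)ifname;
	return NULL;	/* constructors signal failure with NULL */
}
#endif

/* Caller code is the same whether or not the feature is built in. */
static int demo_setup(const char *ifname)
{
	struct demo_netdev *nd = demo_netdev_tap_create(ifname);

	if (!nd)
		return -ENODEV;	/* error code chosen for this sketch only */
	return demo_netdev_add(nd);
}
```

With this shape a tool needs no #ifdef of its own: a missing backend
simply shows up as a NULL device or a negative return value at run time.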

diff --git a/tools/lkl/include/lkl.h b/tools/lkl/include/lkl.h
index 8bda12d4c6de..710fa38af905 100644
--- a/tools/lkl/include/lkl.h
+++ b/tools/lkl/include/lkl.h
@@ -529,6 +529,398 @@ int lkl_dirfd(struct lkl_dir *dir);
  */
 int lkl_mount_fs(char *fstype);
 
+/**
+ * lkl_if_up - activate network interface
+ *
+ * @ifindex - the ifindex of the interface
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_if_up(int ifindex);
+
+/**
+ * lkl_if_down - deactivate network interface
+ *
+ * @ifindex - the ifindex of the interface
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_if_down(int ifindex);
+
+/**
+ * lkl_if_set_mtu - set MTU on interface
+ *
+ * @ifindex - the ifindex of the interface
+ * @mtu - the requested MTU size
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_if_set_mtu(int ifindex, int mtu);
+
+/**
+ * lkl_if_set_ipv4 - set IPv4 address on interface
+ *
+ * @ifindex - the ifindex of the interface
+ * @addr - 4-byte IP address (i.e., struct in_addr)
+ * @netmask_len - prefix length of the @addr
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_if_set_ipv4(int ifindex, unsigned int addr, unsigned int netmask_len);
+
+/**
+ * lkl_set_ipv4_gateway - add an IPv4 default route
+ *
+ * @addr - 4-byte IP address of the gateway (i.e., struct in_addr)
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_set_ipv4_gateway(unsigned int addr);
+
+/**
+ * lkl_if_set_ipv4_gateway - add an IPv4 default route in rule table
+ *
+ * @ifindex - the ifindex of the interface, used for tableid calculation
+ * @addr - 4-byte IP address of the interface
+ * @netmask_len - prefix length of the @addr
+ * @gw_addr - 4-byte IP address of the gateway
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_if_set_ipv4_gateway(int ifindex, unsigned int addr,
+		unsigned int netmask_len, unsigned int gw_addr);
+
+/**
+ * lkl_if_set_ipv6 - set IPv6 address on interface
+ * must be called after interface is up.
+ *
+ * @ifindex - the ifindex of the interface
+ * @addr - 16-byte IPv6 address (i.e., struct in6_addr)
+ * @netprefix_len - prefix length of the @addr
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_if_set_ipv6(int ifindex, void *addr, unsigned int netprefix_len);
+
+/**
+ * lkl_set_ipv6_gateway - add an IPv6 default route
+ *
+ * @addr - 16-byte IPv6 address of the gateway (i.e., struct in6_addr)
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_set_ipv6_gateway(void *addr);
+
+/**
+ * lkl_if_set_ipv6_gateway - add an IPv6 default route in rule table
+ *
+ * @ifindex - the ifindex of the interface, used for tableid calculation
+ * @addr - 16-byte IPv6 address of the interface
+ * @netmask_len - prefix length of the @addr
+ * @gw_addr - 16-byte IPv6 address of the gateway (i.e., struct in6_addr)
+ * @returns - 0 on success, a negative value on error
+ */
+int lkl_if_set_ipv6_gateway(int ifindex, void *addr,
+		unsigned int netmask_len, void *gw_addr);
+
+/**
+ * lkl_ifname_to_ifindex - obtain ifindex of an interface by name
+ *
+ * @name - name of the interface
+ * @returns - the ifindex of the interface if no error
+ */
+int lkl_ifname_to_ifindex(const char *name);
+
+/**
+ * lkl_netdev - host network device handle, defined in lkl_host.h.
+ */
+struct lkl_netdev;
+
+/**
+ * lkl_netdev_args - arguments to lkl_netdev_add
+ * @mac - optional MAC address for the device
+ * @offload - offload bits for the device
+ */
+struct lkl_netdev_args {
+	void *mac;
+	unsigned int offload;
+};
+
+/**
+ * lkl_netdev_add - add a new network device
+ *
+ * Must be called before calling lkl_start_kernel.
+ *
+ * @nd - the network device host handle
+ * @args - arguments that configure the netdev. Can be NULL
+ * @returns a network device id (0 is valid) or a strictly negative value in
+ * case of error
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET
+int lkl_netdev_add(struct lkl_netdev *nd, struct lkl_netdev_args *args);
+#else
+static inline int lkl_netdev_add(struct lkl_netdev *nd,
+				 struct lkl_netdev_args *args)
+{
+	return -LKL_ENOSYS;
+}
+#endif
+
+/**
+ * lkl_netdev_remove - remove a previously added network device
+ *
+ * Attempts to release all resources held by a network device created
+ * via lkl_netdev_add.
+ *
+ * @id - the network device id, as returned by lkl_netdev_add
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET
+void lkl_netdev_remove(int id);
+#else
+static inline void lkl_netdev_remove(int id)
+{
+}
+#endif
+
+/**
+ * lkl_netdev_free - frees a network device
+ *
+ * @nd - the network device to free
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET
+void lkl_netdev_free(struct lkl_netdev *nd);
+#else
+static inline void lkl_netdev_free(struct lkl_netdev *nd)
+{
+}
+#endif
+
+/**
+ * lkl_netdev_get_ifindex - retrieve the interface index for a given network
+ * device id
+ *
+ * @id - the network device id
+ * @returns the interface index or a strictly negative value in case of error
+ */
+int lkl_netdev_get_ifindex(int id);
+
+/**
+ * lkl_netdev_tap_create - create TAP net_device for the virtio net backend
+ *
+ * @ifname - interface name for the TAP device. Needs to be configured
+ * on the host in advance
+ * @offload - offload bits for the device
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET
+struct lkl_netdev *lkl_netdev_tap_create(const char *ifname, int offload);
+#else
+static inline struct lkl_netdev *
+lkl_netdev_tap_create(const char *ifname, int offload)
+{
+	return NULL;
+}
+#endif
+
+/**
+ * lkl_netdev_dpdk_create - create DPDK net_device for the virtio net backend
+ *
+ * @ifname - interface name for the DPDK device. The name is only used
+ * internally.
+ * @offload - offload bits for the device
+ * @mac - pointer to the MAC address of the DPDK device
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET_DPDK
+struct lkl_netdev *lkl_netdev_dpdk_create(const char *ifname, int offload,
+					unsigned char *mac);
+#else
+static inline struct lkl_netdev *
+lkl_netdev_dpdk_create(const char *ifname, int offload, unsigned char *mac)
+{
+	return NULL;
+}
+#endif
+
+/**
+ * lkl_netdev_vde_create - create VDE net_device for the virtio net backend
+ *
+ * @switch_path - path to the VDE switch directory. The switch needs to be
+ * started on the host in advance.
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET_VDE
+struct lkl_netdev *lkl_netdev_vde_create(const char *switch_path);
+#else
+static inline struct lkl_netdev *lkl_netdev_vde_create(const char *switch_path)
+{
+	return NULL;
+}
+#endif
+
+/**
+ * lkl_netdev_raw_create - create raw socket net_device for the virtio net
+ *                         backend
+ *
+ * @ifname - interface name for the snoop device.
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET
+struct lkl_netdev *lkl_netdev_raw_create(const char *ifname);
+#else
+static inline struct lkl_netdev *lkl_netdev_raw_create(const char *ifname)
+{
+	return NULL;
+}
+#endif
+
+/**
+ * lkl_netdev_macvtap_create - create macvtap net_device for the virtio
+ * net backend
+ *
+ * @path - a file name for the macvtap device. Needs to be configured
+ * on the host in advance
+ * @offload - offload bits for the device
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET_MACVTAP
+struct lkl_netdev *lkl_netdev_macvtap_create(const char *path, int offload);
+#else
+static inline struct lkl_netdev *
+lkl_netdev_macvtap_create(const char *path, int offload)
+{
+	return NULL;
+}
+#endif
+
+/**
+ * lkl_netdev_pipe_create - create pipe net_device for the virtio
+ * net backend
+ *
+ * @ifname - file names for the rx and tx pipes, separated by "|" (e.g.,
+ * "rx_name|tx_name"). They need to be configured on the host in advance.
+ * @offload - offload bits for the device
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET
+struct lkl_netdev *lkl_netdev_pipe_create(const char *ifname, int offload);
+#else
+static inline struct lkl_netdev *
+lkl_netdev_pipe_create(const char *ifname, int offload)
+{
+	return NULL;
+}
+#endif
+
+/*
+ * lkl_register_dbg_handler - register a signal handler that loads a debug lib.
+ *
+ * The signal handler is triggered by Ctrl-Z. It creates a new pthread which
+ * calls dbg_entrance().
+ *
+ * If you run the program from a shell script, make sure you ignore SIGTSTP
+ * with "trap '' TSTP" in the shell script.
+ */
+void lkl_register_dbg_handler(void);
+
+/**
+ * lkl_add_neighbor - add a permanent arp entry
+ * @ifindex - the ifindex of the interface
+ * @af - address family of the ip address. Must be LKL_AF_INET or LKL_AF_INET6
+ * @ip - ip address of the entry in network byte order
+ * @mac - mac address of the entry
+ */
+int lkl_add_neighbor(int ifindex, int af, void *ip, void *mac);
+
+/**
+ * lkl_if_add_ip - add an ip address
+ * @ifindex - the ifindex of the interface
+ * @af - address family of the ip address. Must be LKL_AF_INET or LKL_AF_INET6
+ * @addr - ip address of the entry in network byte order
+ * @netprefix_len - prefix length of the @addr
+ */
+int lkl_if_add_ip(int ifindex, int af, void *addr, unsigned int netprefix_len);
+
+/**
+ * lkl_if_del_ip - delete an ip address
+ * @ifindex - the ifindex of the interface
+ * @af - address family of the ip address. Must be LKL_AF_INET or LKL_AF_INET6
+ * @addr - ip address of the entry in network byte order
+ * @netprefix_len - prefix length of the @addr
+ */
+int lkl_if_del_ip(int ifindex, int af, void *addr, unsigned int netprefix_len);
+
+/**
+ * lkl_add_gateway - add a gateway
+ * @af - address family of the ip address. Must be LKL_AF_INET or LKL_AF_INET6
+ * @gwaddr - gateway IP address (struct in_addr or struct in6_addr, per @af)
+ */
+int lkl_add_gateway(int af, void *gwaddr);
+
+/*
+ * XXX Should I use OIF selector?
+ * temporary table idx = ifindex * 2 + 0 <- ipv4
+ * temporary table idx = ifindex * 2 + 1 <- ipv6
+ */
+/**
+ * lkl_if_add_rule_from_saddr - create an ip rule table with "from" selector
+ * @ifindex - the ifindex of the interface, used for table id calculation
+ * @af - address family of the ip address. Must be LKL_AF_INET or LKL_AF_INET6
+ * @saddr - network byte order ip address, "from" selector address of this rule
+ */
+int lkl_if_add_rule_from_saddr(int ifindex, int af, void *saddr);
+
+/**
+ * lkl_if_add_gateway - add gateway to rule table
+ * @ifindex - the ifindex of the interface, used for table id calculation
+ * @af - address family of the ip address. Must be LKL_AF_INET or LKL_AF_INET6
+ * @gwaddr - gateway IP address (struct in_addr or struct in6_addr, per @af)
+ */
+int lkl_if_add_gateway(int ifindex, int af, void *gwaddr);
+
+/**
+ * lkl_if_add_linklocal - add linklocal route to rule table
+ * @ifindex - the ifindex of the interface, used for table id calculation
+ * @af - address family of the ip address. Must be LKL_AF_INET or LKL_AF_INET6
+ * @addr - ip address of the entry in network byte order
+ * @netprefix_len - prefix length of the @addr
+ */
+int lkl_if_add_linklocal(int ifindex, int af,  void *addr, int netprefix_len);
+
+/**
+ * lkl_if_wait_ipv6_dad - wait for DAD to complete for an IPv6 address.
+ * Must be called after the interface is up.
+ *
+ * @ifindex - the ifindex of the interface
+ * @addr - ip address of the entry in network byte order
+ */
+int lkl_if_wait_ipv6_dad(int ifindex, void *addr);
+
+/**
+ * lkl_set_fd_limit - set the maximum number of file descriptors allowed
+ * @fd_limit - fd max limit
+ */
+int lkl_set_fd_limit(unsigned int fd_limit);
+
+/**
+ * lkl_qdisc_add - set qdisc rule onto an interface
+ *
+ * @ifindex - the ifindex of the interface
+ * @root - the name of the root class (e.g., "root")
+ * @type - the type of qdisc (e.g., "fq")
+ */
+int lkl_qdisc_add(int ifindex, const char *root, const char *type);
+
+/**
+ * lkl_qdisc_parse_add - Add a qdisc entry for an interface with strings
+ *
+ * @ifindex - the ifindex of the interface
+ * @entries - strings of qdisc configurations in the form of
+ *            "root|type;root|type;..."
+ */
+void lkl_qdisc_parse_add(int ifindex, const char *entries);
+
+/**
+ * lkl_sysctl - write a sysctl value
+ *
+ * @path - the path to a sysctl entry (e.g., "net.ipv4.tcp_wmem")
+ * @value - the value of the sysctl (e.g., "4096 87380 2147483647")
+ */
+int lkl_sysctl(const char *path, const char *value);
+
+/**
+ * lkl_sysctl_parse_write - Configure sysctl parameters with strings
+ *
+ * @sysctls - sysctl parameters in the form of "key=value;..."
+ */
+void lkl_sysctl_parse_write(const char *sysctls);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/tools/lkl/include/lkl_host.h b/tools/lkl/include/lkl_host.h
index a630efc95f0f..ab9c3f2a69fb 100644
--- a/tools/lkl/include/lkl_host.h
+++ b/tools/lkl/include/lkl_host.h
@@ -76,6 +76,82 @@ struct lkl_dev_blk_ops {
 	int (*request)(struct lkl_disk disk, struct lkl_blk_req *req);
 };
 
+struct lkl_netdev {
+	struct lkl_dev_net_ops *ops;
+	int id;
+	uint8_t has_vnet_hdr: 1;
+};
+
+/**
+ * struct lkl_dev_net_ops - network device host operations
+ */
+struct lkl_dev_net_ops {
+	/**
+	 * @tx: writes an L2 packet into the net device
+	 *
+	 * The data buffer can only hold 0 or 1 complete packets.
+	 *
+	 * @nd - pointer to the network device;
+	 * @iov - pointer to the buffer vector;
+	 * @cnt - # of vectors in iov.
+	 *
+	 * @returns number of bytes transmitted
+	 */
+	int (*tx)(struct lkl_netdev *nd, struct iovec *iov, int cnt);
+
+	/**
+	 * @rx: reads a packet from the net device.
+	 *
+	 * It must only read one complete packet if present.
+	 *
+	 * If the buffer is too small for the packet, the implementation may
+	 * decide to drop it or trim it.
+	 *
+	 * @nd - pointer to the network device
+	 * @iov - pointer to the buffer vector to store the packet
+	 * @cnt - # of vectors in iov.
+	 *
+	 * @returns number of bytes read for success or < 0 if error
+	 */
+	int (*rx)(struct lkl_netdev *nd, struct iovec *iov, int cnt);
+
+#define LKL_DEV_NET_POLL_RX		1
+#define LKL_DEV_NET_POLL_TX		2
+#define LKL_DEV_NET_POLL_HUP		4
+
+	/**
+	 * @poll: polls a net device
+	 *
+	 * Supports the following events: LKL_DEV_NET_POLL_RX
+	 * (readable), LKL_DEV_NET_POLL_TX (writable) or
+	 * LKL_DEV_NET_POLL_HUP (the close operation has been issued
+	 * and we need to clean up). Blocks until one event is
+	 * available.
+	 *
+	 * @nd - pointer to the network device
+	 *
+	 * @returns - LKL_DEV_NET_POLL_RX, LKL_DEV_NET_POLL_TX,
+	 * LKL_DEV_NET_POLL_HUP or a negative value for errors
+	 */
+	int (*poll)(struct lkl_netdev *nd);
+
+	/**
+	 * @poll_hup: makes poll wake up and return LKL_DEV_NET_POLL_HUP
+	 *
+	 * @nd - pointer to the network device
+	 */
+	void (*poll_hup)(struct lkl_netdev *nd);
+
+	/**
+	 * @free: frees a network device
+	 *
+	 * Implementation must release its resources and free the network device
+	 * structure.
+	 *
+	 * @nd - pointer to the network device
+	 */
+	void (*free)(struct lkl_netdev *nd);
+};
 
 #ifdef __cplusplus
 }
diff --git a/tools/lkl/lib/Build b/tools/lkl/lib/Build
index a7a3bff27bb1..1f1d55f259a3 100644
--- a/tools/lkl/lib/Build
+++ b/tools/lkl/lib/Build
@@ -1,8 +1,10 @@
 CFLAGS_config.o += -I$(srctree)/tools/perf/pmu-events
 CFLAGS_posix-host.o += -D_FILE_OFFSET_BITS=64
+CFLAGS_virtio_net_vde.o += $(shell pkg-config --cflags vdeplug 2>/dev/null)
 
 liblkl-y += fs.o
 liblkl-y += iomem.o
+liblkl-y += net.o
 liblkl-y += jmp_buf.o
 liblkl-$(LKL_HOST_CONFIG_POSIX) += posix-host.o
 liblkl-y += utils.o
@@ -10,5 +12,13 @@ liblkl-y += virtio_blk.o
 liblkl-y += virtio.o
 liblkl-y += dbg.o
 liblkl-y += dbg_handler.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET) += virtio_net.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET) += virtio_net_fd.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET) += virtio_net_tap.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET) += virtio_net_raw.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET_MACVTAP) += virtio_net_macvtap.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET_DPDK) += virtio_net_dpdk.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET_VDE) += virtio_net_vde.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET) += virtio_net_pipe.o
 liblkl-y += ../../perf/pmu-events/jsmn.o
 liblkl-y += config.o
diff --git a/tools/lkl/lib/net.c b/tools/lkl/lib/net.c
new file mode 100644
index 000000000000..316965ffd21e
--- /dev/null
+++ b/tools/lkl/lib/net.c
@@ -0,0 +1,818 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <string.h>
+#include <stdio.h>
+#include "endian.h"
+#include <lkl_host.h>
+
+#ifdef __MINGW32__
+#include <ws2tcpip.h>
+
+int lkl_inet_pton(int af, const char *src, void *dst)
+{
+	struct addrinfo hint, *res = NULL;
+	int err;
+
+	memset(&hint, 0, sizeof(struct addrinfo));
+
+	hint.ai_family = af;
+	hint.ai_flags = AI_NUMERICHOST;
+
+	err = getaddrinfo(src, NULL, &hint, &res);
+	if (err)
+		return 0;
+
+	switch (af) {
+	case AF_INET:
+		*(struct in_addr *)dst =
+			((struct sockaddr_in *)res->ai_addr)->sin_addr;
+		break;
+	case AF_INET6:
+		*(struct in6_addr *)dst =
+			((struct sockaddr_in6 *)res->ai_addr)->sin6_addr;
+		break;
+	default:
+		freeaddrinfo(res);
+		return 0;
+	}
+
+	freeaddrinfo(res);
+	return 1;
+}
+#endif
+
+static inline void set_sockaddr(struct lkl_sockaddr_in *sin, unsigned int addr,
+				unsigned short port)
+{
+	sin->sin_family = LKL_AF_INET;
+	sin->sin_addr.lkl_s_addr = addr;
+	sin->sin_port = port;
+}
+
+static inline int ifindex_to_name(int sock, struct lkl_ifreq *ifr, int ifindex)
+{
+	ifr->lkl_ifr_ifindex = ifindex;
+	return lkl_sys_ioctl(sock, LKL_SIOCGIFNAME, (long)ifr);
+}
+
+int lkl_ifname_to_ifindex(const char *name)
+{
+	struct lkl_ifreq ifr;
+	int fd, ret;
+
+	fd = lkl_sys_socket(LKL_AF_INET, LKL_SOCK_DGRAM, 0);
+	if (fd < 0)
+		return fd;
+
+	snprintf(ifr.lkl_ifr_name, sizeof(ifr.lkl_ifr_name), "%s", name);
+	ret = lkl_sys_ioctl(fd, LKL_SIOCGIFINDEX, (long)&ifr);
+	lkl_sys_close(fd);
+	if (ret < 0)
+		return ret;
+
+	return ifr.lkl_ifr_ifindex;
+}
+
+int lkl_if_up(int ifindex)
+{
+	struct lkl_ifreq ifr;
+	int err, sock = lkl_sys_socket(LKL_AF_INET, LKL_SOCK_DGRAM, 0);
+
+	if (sock < 0)
+		return sock;
+	/* Fall through on failure so that the socket is always closed. */
+	err = ifindex_to_name(sock, &ifr, ifindex);
+	if (!err)
+		err = lkl_sys_ioctl(sock, LKL_SIOCGIFFLAGS, (long)&ifr);
+
+	if (!err) {
+		ifr.lkl_ifr_flags |= LKL_IFF_UP;
+		err = lkl_sys_ioctl(sock, LKL_SIOCSIFFLAGS, (long)&ifr);
+	}
+
+	lkl_sys_close(sock);
+
+	return err;
+}
+
+int lkl_if_down(int ifindex)
+{
+	struct lkl_ifreq ifr;
+	int err, sock;
+
+	sock = lkl_sys_socket(LKL_AF_INET, LKL_SOCK_DGRAM, 0);
+	if (sock < 0)
+		return sock;
+
+	/* Fall through on failure so that the socket is always closed. */
+	err = ifindex_to_name(sock, &ifr, ifindex);
+	if (!err)
+		err = lkl_sys_ioctl(sock, LKL_SIOCGIFFLAGS, (long)&ifr);
+
+	if (!err) {
+		ifr.lkl_ifr_flags &= ~LKL_IFF_UP;
+		err = lkl_sys_ioctl(sock, LKL_SIOCSIFFLAGS, (long)&ifr);
+	}
+
+	lkl_sys_close(sock);
+
+	return err;
+}
+
+int lkl_if_set_mtu(int ifindex, int mtu)
+{
+	struct lkl_ifreq ifr;
+	int err, sock;
+
+	sock = lkl_sys_socket(LKL_AF_INET, LKL_SOCK_DGRAM, 0);
+	if (sock < 0)
+		return sock;
+
+	/* Fall through on failure so that the socket is always closed. */
+	err = ifindex_to_name(sock, &ifr, ifindex);
+	if (!err) {
+		ifr.lkl_ifr_mtu = mtu;
+
+		err = lkl_sys_ioctl(sock, LKL_SIOCSIFMTU, (long)&ifr);
+	}
+
+	lkl_sys_close(sock);
+
+	return err;
+}
+
+int lkl_if_set_ipv4(int ifindex, unsigned int addr, unsigned int netmask_len)
+{
+	return lkl_if_add_ip(ifindex, LKL_AF_INET, &addr, netmask_len);
+}
+
+int lkl_if_set_ipv4_gateway(int ifindex, unsigned int src_addr,
+		unsigned int src_masklen, unsigned int via_addr)
+{
+	int err;
+
+	err = lkl_if_add_rule_from_saddr(ifindex, LKL_AF_INET, &src_addr);
+	if (err)
+		return err;
+	err = lkl_if_add_linklocal(ifindex, LKL_AF_INET,
+					&src_addr, src_masklen);
+	if (err)
+		return err;
+	return lkl_if_add_gateway(ifindex, LKL_AF_INET, &via_addr);
+}
+
+int lkl_set_ipv4_gateway(unsigned int addr)
+{
+	return lkl_add_gateway(LKL_AF_INET, &addr);
+}
+
+int lkl_netdev_get_ifindex(int id)
+{
+	struct lkl_ifreq ifr;
+	int sock, ret;
+
+	sock = lkl_sys_socket(LKL_AF_INET, LKL_SOCK_DGRAM, 0);
+	if (sock < 0)
+		return sock;
+
+	snprintf(ifr.lkl_ifr_name, sizeof(ifr.lkl_ifr_name), "eth%d", id);
+	ret = lkl_sys_ioctl(sock, LKL_SIOCGIFINDEX, (long)&ifr);
+	lkl_sys_close(sock);
+
+	return ret < 0 ? ret : ifr.lkl_ifr_ifindex;
+}
+
+static int netlink_sock(unsigned int groups)
+{
+	struct lkl_sockaddr_nl la;
+	int fd, err;
+
+	fd = lkl_sys_socket(LKL_AF_NETLINK, LKL_SOCK_DGRAM, LKL_NETLINK_ROUTE);
+	if (fd < 0)
+		return fd;
+
+	memset(&la, 0, sizeof(la));
+	la.nl_family = LKL_AF_NETLINK;
+	la.nl_groups = groups;
+	err = lkl_sys_bind(fd, (struct lkl_sockaddr *)&la, sizeof(la));
+	if (err < 0)
+		lkl_sys_close(fd);
+
+	return err < 0 ? err : fd;
+}
+
+static int parse_rtattr(struct lkl_rtattr *tb[], int max,
+			struct lkl_rtattr *rta, int len)
+{
+	unsigned short type;
+
+	memset(tb, 0, sizeof(struct lkl_rtattr *) * (max + 1));
+	while (LKL_RTA_OK(rta, len)) {
+		type = rta->rta_type;
+		if ((type <= max) && (!tb[type]))
+			tb[type] = rta;
+		rta = LKL_RTA_NEXT(rta, len);
+	}
+	if (len)
+		lkl_printf("!!!Deficit %d, rta_len=%d\n", len,
+			rta->rta_len);
+	return 0;
+}
+
+struct addr_filter {
+	unsigned int ifindex;
+	void *addr;
+};
+
+static unsigned int get_ifa_flags(struct lkl_ifaddrmsg *ifa,
+				  struct lkl_rtattr *ifa_flags_attr)
+{
+	return ifa_flags_attr ? *(unsigned int *)LKL_RTA_DATA(ifa_flags_attr) :
+				ifa->ifa_flags;
+}
+
+/* returns:
+ * 0 - DAD succeeded.
+ * -1 - DAD failed or other error.
+ * 1 - should wait for a new message.
+ */
+static int check_ipv6_dad(struct lkl_sockaddr_nl *nladdr,
+			  struct lkl_nlmsghdr *n, void *arg)
+{
+	struct addr_filter *filter = arg;
+	struct lkl_ifaddrmsg *ifa = LKL_NLMSG_DATA(n);
+	struct lkl_rtattr *rta_tb[LKL_IFA_MAX+1];
+	unsigned int ifa_flags;
+	int len = n->nlmsg_len;
+
+	if (n->nlmsg_type != LKL_RTM_NEWADDR)
+		return 1;
+
+	len -= LKL_NLMSG_LENGTH(sizeof(*ifa));
+	if (len < 0) {
+		lkl_printf("BUG: wrong nlmsg len %d\n", len);
+		return -1;
+	}
+
+	parse_rtattr(rta_tb, LKL_IFA_MAX, LKL_IFA_RTA(ifa),
+		     n->nlmsg_len - LKL_NLMSG_LENGTH(sizeof(*ifa)));
+
+	ifa_flags = get_ifa_flags(ifa, rta_tb[LKL_IFA_FLAGS]);
+
+	if (ifa->ifa_index != filter->ifindex)
+		return 1;
+	if (ifa->ifa_family != LKL_AF_INET6)
+		return 1;
+
+	if (!rta_tb[LKL_IFA_LOCAL])
+		rta_tb[LKL_IFA_LOCAL] = rta_tb[LKL_IFA_ADDRESS];
+
+	if (!rta_tb[LKL_IFA_LOCAL] ||
+	    (filter->addr && memcmp(LKL_RTA_DATA(rta_tb[LKL_IFA_LOCAL]),
+				    filter->addr, 16))) {
+		return 1;
+	}
+	if (ifa_flags & LKL_IFA_F_DADFAILED) {
+		lkl_printf("IPV6 DAD failed.\n");
+		return -1;
+	}
+	if (!(ifa_flags & LKL_IFA_F_TENTATIVE))
+		return 0;
+	return 1;
+}
+
+/* Copied from iproute2/lib/ */
+static int rtnl_listen(int fd, int (*handler)(struct lkl_sockaddr_nl *nladdr,
+					      struct lkl_nlmsghdr *, void *),
+		       void *arg)
+{
+	int status;
+	struct lkl_nlmsghdr *h;
+	struct lkl_sockaddr_nl nladdr = { .nl_family = LKL_AF_NETLINK };
+	struct lkl_iovec iov;
+	struct lkl_user_msghdr msg = {
+		.msg_name = &nladdr,
+		.msg_namelen = sizeof(nladdr),
+		.msg_iov = &iov,
+		.msg_iovlen = 1,
+	};
+	char   buf[16384];
+
+	iov.iov_base = buf;
+	while (1) {
+		iov.iov_len = sizeof(buf);
+		status = lkl_sys_recvmsg(fd, &msg, 0);
+
+		if (status < 0) {
+			if (status == -LKL_EINTR || status == -LKL_EAGAIN)
+				continue;
+			lkl_printf("netlink receive error %s (%d)\n",
+				lkl_strerror(status), status);
+			if (status == -LKL_ENOBUFS)
+				continue;
+			return status;
+		}
+		if (status == 0) {
+			lkl_printf("EOF on netlink\n");
+			return -1;
+		}
+		if (msg.msg_namelen != sizeof(nladdr)) {
+			lkl_printf("Sender address length == %d\n",
+				msg.msg_namelen);
+			return -1;
+		}
+
+		for (h = (struct lkl_nlmsghdr *)buf;
+		     (unsigned int)status >= sizeof(*h);) {
+			int err;
+			int len = h->nlmsg_len;
+			int l = len - sizeof(*h);
+
+			if (l < 0 || len > status) {
+				if (msg.msg_flags & LKL_MSG_TRUNC) {
+					lkl_printf("Truncated message\n");
+					return -1;
+				}
+				lkl_printf("!!!malformed message: len=%d\n",
+					len);
+				return -1;
+			}
+
+			err = handler(&nladdr, h, arg);
+			if (err <= 0)
+				return err;
+
+			status -= LKL_NLMSG_ALIGN(len);
+			h = (struct lkl_nlmsghdr *)((char *)h +
+						    LKL_NLMSG_ALIGN(len));
+		}
+		if (msg.msg_flags & LKL_MSG_TRUNC) {
+			lkl_printf("Message truncated\n");
+			continue;
+		}
+		if (status) {
+			lkl_printf("!!!Remnant of size %d\n", status);
+			return -1;
+		}
+	}
+}
+
+int lkl_if_wait_ipv6_dad(int ifindex, void *addr)
+{
+	struct addr_filter filter = {.ifindex = ifindex, .addr = addr};
+	int fd, ret;
+	struct {
+		struct lkl_nlmsghdr		nlmsg_info;
+		struct lkl_ifaddrmsg	ifaddrmsg_info;
+	} req;
+
+	fd = netlink_sock(1 << (LKL_RTNLGRP_IPV6_IFADDR - 1));
+	if (fd < 0)
+		return fd;
+
+	memset(&req, 0, sizeof(req));
+	req.nlmsg_info.nlmsg_len =
+			LKL_NLMSG_LENGTH(sizeof(struct lkl_ifaddrmsg));
+	req.nlmsg_info.nlmsg_flags = LKL_NLM_F_REQUEST | LKL_NLM_F_DUMP;
+	req.nlmsg_info.nlmsg_type = LKL_RTM_GETADDR;
+	req.ifaddrmsg_info.ifa_family = LKL_AF_INET6;
+	req.ifaddrmsg_info.ifa_index = ifindex;
+	ret = lkl_sys_send(fd, &req, req.nlmsg_info.nlmsg_len, 0);
+	if (ret < 0)
+		lkl_perror("lkl_sys_send", ret);
+	else
+		ret = rtnl_listen(fd, check_ipv6_dad, (void *)&filter);
+
+	lkl_sys_close(fd);
+	return ret;
+}
+
+int lkl_if_set_ipv6(int ifindex, void *addr, unsigned int netprefix_len)
+{
+	int err = lkl_if_add_ip(ifindex, LKL_AF_INET6, addr, netprefix_len);
+
+	if (err)
+		return err;
+	return lkl_if_wait_ipv6_dad(ifindex, addr);
+}
+
+int lkl_if_set_ipv6_gateway(int ifindex, void *src_addr,
+		unsigned int src_masklen, void *via_addr)
+{
+	int err;
+
+	err = lkl_if_add_rule_from_saddr(ifindex, LKL_AF_INET6, src_addr);
+	if (err)
+		return err;
+	err = lkl_if_add_linklocal(ifindex, LKL_AF_INET6,
+					src_addr, src_masklen);
+	if (err)
+		return err;
+	return lkl_if_add_gateway(ifindex, LKL_AF_INET6, via_addr);
+}
+
+int lkl_set_ipv6_gateway(void *addr)
+{
+	return lkl_add_gateway(LKL_AF_INET6, addr);
+}
+
+/* returns:
+ * 0 - success.
+ * < 0 - error number.
+ * 1 - should wait for a new message.
+ */
+static int check_error(struct lkl_sockaddr_nl *nladdr, struct lkl_nlmsghdr *n,
+		       void *arg)
+{
+	unsigned int s = *(unsigned int *)arg;
+
+	if (nladdr->nl_pid != 0 || n->nlmsg_seq != s) {
+		/* Don't forget to skip that message. */
+		return 1;
+	}
+
+	if (n->nlmsg_type == LKL_NLMSG_ERROR) {
+		struct lkl_nlmsgerr *err =
+			(struct lkl_nlmsgerr *)LKL_NLMSG_DATA(n);
+		int l = n->nlmsg_len - sizeof(*n);
+
+		if (l < (int)sizeof(struct lkl_nlmsgerr))
+			lkl_printf("ERROR truncated\n");
+		else if (!err->error)
+			return 0;
+
+		lkl_printf("RTNETLINK answers: %s\n",
+			lkl_strerror(-err->error));
+		return err->error;
+	}
+	lkl_printf("Unexpected reply!!!\n");
+	return -1;
+}
+
+static unsigned int seq;
+static int rtnl_talk(int fd, struct lkl_nlmsghdr *n)
+{
+	int status;
+	struct lkl_sockaddr_nl nladdr = {.nl_family = LKL_AF_NETLINK};
+	struct lkl_iovec iov = {.iov_base = (void *)n, .iov_len = n->nlmsg_len};
+	struct lkl_user_msghdr msg = {
+			.msg_name = &nladdr,
+			.msg_namelen = sizeof(nladdr),
+			.msg_iov = &iov,
+			.msg_iovlen = 1,
+	};
+
+	n->nlmsg_seq = seq;
+	n->nlmsg_flags |= LKL_NLM_F_ACK;
+
+	status = lkl_sys_sendmsg(fd, &msg, 0);
+	if (status < 0) {
+		lkl_perror("Cannot talk to rtnetlink", status);
+		return status;
+	}
+
+	status = rtnl_listen(fd, check_error, (void *)&seq);
+	seq++;
+	return status;
+}
+
+static int addattr_l(struct lkl_nlmsghdr *n, unsigned int maxlen,
+		     int type, const void *data, int alen)
+{
+	int len = LKL_RTA_LENGTH(alen);
+	struct lkl_rtattr *rta;
+
+	if (LKL_NLMSG_ALIGN(n->nlmsg_len) + LKL_RTA_ALIGN(len) > maxlen) {
+		lkl_printf("%s ERROR: message exceeded bound of %d\n", __func__,
+			   maxlen);
+		return -1;
+	}
+	rta = ((struct lkl_rtattr *) (((void *) (n)) +
+				      LKL_NLMSG_ALIGN(n->nlmsg_len)));
+	rta->rta_type = type;
+	rta->rta_len = len;
+	memcpy(LKL_RTA_DATA(rta), data, alen);
+	n->nlmsg_len = LKL_NLMSG_ALIGN(n->nlmsg_len) + LKL_RTA_ALIGN(len);
+	return 0;
+}
+
+int lkl_add_neighbor(int ifindex, int af, void *ip, void *mac)
+{
+	struct {
+		struct lkl_nlmsghdr n;
+		struct lkl_ndmsg r;
+		char buf[1024];
+	} req = {
+		.n.nlmsg_len = LKL_NLMSG_LENGTH(sizeof(struct lkl_ndmsg)),
+		.n.nlmsg_type = LKL_RTM_NEWNEIGH,
+		.n.nlmsg_flags = LKL_NLM_F_REQUEST |
+				 LKL_NLM_F_CREATE | LKL_NLM_F_REPLACE,
+		.r.ndm_family = af,
+		.r.ndm_ifindex = ifindex,
+		.r.ndm_state = LKL_NUD_PERMANENT,
+
+	};
+	int err, addr_sz;
+	int fd;
+
+	if (af == LKL_AF_INET)
+		addr_sz = 4;
+	else if (af == LKL_AF_INET6)
+		addr_sz = 16;
+	else {
+		lkl_printf("Bad address family: %d\n", af);
+		return -1;
+	}
+
+	fd = netlink_sock(0);
+	if (fd < 0)
+		return fd;
+
+	// create the IP attribute
+	addattr_l(&req.n, sizeof(req), LKL_NDA_DST, ip, addr_sz);
+
+	// create the MAC attribute
+	addattr_l(&req.n, sizeof(req), LKL_NDA_LLADDR, mac, 6);
+
+	err = rtnl_talk(fd, &req.n);
+	lkl_sys_close(fd);
+	return err;
+}
+
+static int ipaddr_modify(int cmd, int flags, int ifindex, int af, void *addr,
+			 unsigned int netprefix_len)
+{
+	struct {
+		struct lkl_nlmsghdr n;
+		struct lkl_ifaddrmsg ifa;
+		char buf[256];
+	} req = {
+		.n.nlmsg_len = LKL_NLMSG_LENGTH(sizeof(struct lkl_ifaddrmsg)),
+		.n.nlmsg_flags = LKL_NLM_F_REQUEST | flags,
+		.n.nlmsg_type = cmd,
+		.ifa.ifa_family = af,
+		.ifa.ifa_prefixlen = netprefix_len,
+		.ifa.ifa_index = ifindex,
+	};
+	int err, addr_sz;
+	int fd;
+
+	if (af == LKL_AF_INET)
+		addr_sz = 4;
+	else if (af == LKL_AF_INET6)
+		addr_sz = 16;
+	else {
+		lkl_printf("Bad address family: %d\n", af);
+		return -1;
+	}
+
+	fd = netlink_sock(0);
+	if (fd < 0)
+		return fd;
+
+	// create the IP attribute
+	addattr_l(&req.n, sizeof(req), LKL_IFA_LOCAL, addr, addr_sz);
+
+	err = rtnl_talk(fd, &req.n);
+
+	lkl_sys_close(fd);
+	return err;
+}
+
+int lkl_if_add_ip(int ifindex, int af, void *addr, unsigned int netprefix_len)
+{
+	return ipaddr_modify(LKL_RTM_NEWADDR, LKL_NLM_F_CREATE | LKL_NLM_F_EXCL,
+			     ifindex, af, addr, netprefix_len);
+}
+
+int lkl_if_del_ip(int ifindex, int af, void *addr, unsigned int netprefix_len)
+{
+	return ipaddr_modify(LKL_RTM_DELADDR, 0, ifindex, af,
+			     addr, netprefix_len);
+}
+
+static int iproute_modify(int cmd, unsigned int flags, int ifindex, int af,
+		void *route_addr, int route_masklen, void *gwaddr)
+{
+	struct {
+		struct lkl_nlmsghdr	n;
+		struct lkl_rtmsg	r;
+		char			buf[1024];
+	} req = {
+		.n.nlmsg_len = LKL_NLMSG_LENGTH(sizeof(struct lkl_rtmsg)),
+		.n.nlmsg_flags = LKL_NLM_F_REQUEST | flags,
+		.n.nlmsg_type = cmd,
+		.r.rtm_family = af,
+		.r.rtm_table = LKL_RT_TABLE_MAIN,
+		.r.rtm_scope = LKL_RT_SCOPE_UNIVERSE,
+	};
+	int err, addr_sz;
+	int i, fd;
+
+	if (af == LKL_AF_INET)
+		addr_sz = 4;
+	else if (af == LKL_AF_INET6)
+		addr_sz = 16;
+	else {
+		lkl_printf("Bad address family: %d\n", af);
+		return -1;
+	}
+
+	fd = netlink_sock(0);
+	if (fd < 0) {
+		lkl_printf("netlink_sock error: %d\n", fd);
+		return fd;
+	}
+
+	if (cmd != LKL_RTM_DELROUTE) {
+		req.r.rtm_protocol = LKL_RTPROT_BOOT;
+		req.r.rtm_scope = LKL_RT_SCOPE_UNIVERSE;
+		req.r.rtm_type = LKL_RTN_UNICAST;
+	}
+
+	if (gwaddr)
+		addattr_l(&req.n, sizeof(req),
+				LKL_RTA_GATEWAY, gwaddr, addr_sz);
+
+	if (af == LKL_AF_INET && route_addr) {
+		unsigned int netaddr = *(unsigned int *)route_addr;
+
+		netaddr = ntohl(netaddr);
+		netaddr = (netaddr >> (32 - route_masklen));
+		netaddr = (netaddr << (32 - route_masklen));
+		netaddr =  htonl(netaddr);
+		*(unsigned int *)route_addr = netaddr;
+		req.r.rtm_dst_len = route_masklen;
+		addattr_l(&req.n, sizeof(req), LKL_RTA_DST,
+					route_addr, addr_sz);
+	}
+
+	if (af == LKL_AF_INET6 && route_addr) {
+		struct lkl_in6_addr netaddr =
+			*(struct lkl_in6_addr *)route_addr;
+		int hostbits = 128 - route_masklen;
+		int rmbyte = hostbits / 8;
+		int rmbit = hostbits % 8;
+
+		/* Clear the low (128 - route_masklen) host bits. */
+		for (i = 0; i < rmbyte; i++)
+			netaddr.in6_u.u6_addr8[15-i] = 0;
+		if (rmbit)
+			netaddr.in6_u.u6_addr8[15-rmbyte] &= 0xff << rmbit;
+		*(struct lkl_in6_addr *)route_addr = netaddr;
+		req.r.rtm_dst_len = route_masklen;
+		addattr_l(&req.n, sizeof(req), LKL_RTA_DST,
+					route_addr, addr_sz);
+	}
+
+	if (ifindex != LKL_RT_TABLE_MAIN) {
+		if (af == LKL_AF_INET)
+			req.r.rtm_table = ifindex * 2;
+		else if (af == LKL_AF_INET6)
+			req.r.rtm_table = ifindex * 2 + 1;
+		addattr_l(&req.n, sizeof(req), LKL_RTA_OIF, &ifindex, 4);
+	}
+	err = rtnl_talk(fd, &req.n);
+	lkl_sys_close(fd);
+	return err;
+}
+
+int lkl_if_add_linklocal(int ifindex, int af,  void *addr, int netprefix_len)
+{
+	return iproute_modify(LKL_RTM_NEWROUTE, LKL_NLM_F_CREATE|LKL_NLM_F_EXCL,
+			ifindex, af, addr, netprefix_len, NULL);
+}
+
+int lkl_if_add_gateway(int ifindex, int af, void *gwaddr)
+{
+	return iproute_modify(LKL_RTM_NEWROUTE, LKL_NLM_F_CREATE|LKL_NLM_F_EXCL,
+			ifindex, af, NULL, 0, gwaddr);
+}
+
+int lkl_add_gateway(int af, void *gwaddr)
+{
+	return iproute_modify(LKL_RTM_NEWROUTE, LKL_NLM_F_CREATE|LKL_NLM_F_EXCL,
+			LKL_RT_TABLE_MAIN, af, NULL, 0, gwaddr);
+}
+
+static int iprule_modify(int cmd, int ifindex, int af, void *saddr)
+{
+	struct {
+		struct lkl_nlmsghdr	n;
+		struct lkl_rtmsg		r;
+		char			buf[1024];
+	} req = {
+		.n.nlmsg_type = cmd,
+		.n.nlmsg_len = LKL_NLMSG_LENGTH(sizeof(struct lkl_rtmsg)),
+		.n.nlmsg_flags = LKL_NLM_F_REQUEST,
+		.r.rtm_protocol = LKL_RTPROT_BOOT,
+		.r.rtm_scope = LKL_RT_SCOPE_UNIVERSE,
+		.r.rtm_family = af,
+		.r.rtm_type = LKL_RTN_UNSPEC,
+	};
+	int fd, err;
+	int addr_sz;
+
+	if (af == LKL_AF_INET)
+		addr_sz = 4;
+	else if (af == LKL_AF_INET6)
+		addr_sz = 16;
+	else {
+		lkl_printf("Bad address family: %d\n", af);
+		return -1;
+	}
+
+	fd = netlink_sock(0);
+	if (fd < 0)
+		return fd;
+
+	if (cmd == LKL_RTM_NEWRULE) {
+		req.n.nlmsg_flags |= LKL_NLM_F_CREATE|LKL_NLM_F_EXCL;
+		req.r.rtm_type = LKL_RTN_UNICAST;
+	}
+
+	//set from address
+	req.r.rtm_src_len = 8 * addr_sz;
+	addattr_l(&req.n, sizeof(req), LKL_FRA_SRC, saddr, addr_sz);
+
+	//use ifindex as table id
+	if (af == LKL_AF_INET)
+		req.r.rtm_table = ifindex * 2;
+	else if (af == LKL_AF_INET6)
+		req.r.rtm_table = ifindex * 2 + 1;
+	err = rtnl_talk(fd, &req.n);
+
+	lkl_sys_close(fd);
+	return err;
+}
+
+int lkl_if_add_rule_from_saddr(int ifindex, int af, void *saddr)
+{
+	return iprule_modify(LKL_RTM_NEWRULE, ifindex, af, saddr);
+}
+
+static int qdisc_add(int cmd, int flags, int ifindex,
+		     const char *root, const char *type)
+{
+	struct {
+		struct lkl_nlmsghdr n;
+		struct lkl_tcmsg tc;
+		char buf[2*1024];
+	} req = {
+		.n.nlmsg_len = LKL_NLMSG_LENGTH(sizeof(struct lkl_tcmsg)),
+		.n.nlmsg_flags = LKL_NLM_F_REQUEST|flags,
+		.n.nlmsg_type = cmd,
+		.tc.tcm_family = LKL_AF_UNSPEC,
+	};
+	int err, fd;
+
+	if (!root || !type) {
+		lkl_printf("root and type arguments are required\n");
+		return -1;
+	}
+
+	if (strcmp(root, "root") == 0)
+		req.tc.tcm_parent = LKL_TC_H_ROOT;
+	req.tc.tcm_ifindex = ifindex;
+
+	fd = netlink_sock(0);
+	if (fd < 0)
+		return fd;
+
+	// create the qdisc attribute
+	addattr_l(&req.n, sizeof(req), LKL_TCA_KIND, type, strlen(type)+1);
+
+	err = rtnl_talk(fd, &req.n);
+	lkl_sys_close(fd);
+	return err;
+}
+
+int lkl_qdisc_add(int ifindex, const char *root, const char *type)
+{
+	return qdisc_add(LKL_RTM_NEWQDISC, LKL_NLM_F_CREATE | LKL_NLM_F_EXCL,
+			 ifindex, root, type);
+}
+
+/* Add a qdisc entry for an interface in the form of
+ * "root|type;root|type;..."
+ */
+void lkl_qdisc_parse_add(int ifindex, const char *entries)
+{
+	char *saveptr = NULL, *token = NULL;
+	char *root = NULL, *type = NULL;
+	char strings[256];
+	int ret = 0;
+
+	snprintf(strings, sizeof(strings), "%s", entries);
+
+	for (token = strtok_r(strings, ";", &saveptr); token;
+	     token = strtok_r(NULL, ";", &saveptr)) {
+		root = strtok(token, "|");
+		type = strtok(NULL, "|");
+		ret = lkl_qdisc_add(ifindex, root, type);
+		if (ret) {
+			lkl_printf("Failed to add qdisc entry: %s\n",
+				   lkl_strerror(ret));
+			return;
+		}
+	}
+}
diff --git a/tools/lkl/lib/virtio_net.c b/tools/lkl/lib/virtio_net.c
new file mode 100644
index 000000000000..cd720b363f18
--- /dev/null
+++ b/tools/lkl/lib/virtio_net.c
@@ -0,0 +1,322 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <string.h>
+#include <lkl_host.h>
+#include "virtio.h"
+#include "endian.h"
+
+#include <lkl/linux/virtio_net.h>
+
+#define netdev_of(x) (container_of(x, struct virtio_net_dev, dev))
+#define BIT(x) (1ULL << (x))
+
+/* We always have 2 queues on a netdev: one for tx, one for rx. */
+#define RX_QUEUE_IDX 0
+#define TX_QUEUE_IDX 1
+
+#define NUM_QUEUES (TX_QUEUE_IDX + 1)
+#define QUEUE_DEPTH 128
+
+/* In fact, we'll hit the limit on the devs string below long before
+ * we hit this, but it's good enough for now.
+ */
+#define MAX_NET_DEVS 16
+
+#ifdef DEBUG
+#define bad_request(s) do {			\
+		lkl_printf("%s\n", s);		\
+		panic();			\
+	} while (0)
+#else
+#define bad_request(s) lkl_printf("virtio_net: %s\n", s)
+#endif /* DEBUG */
+
+struct virtio_net_dev {
+	struct virtio_dev dev;
+	struct lkl_virtio_net_config config;
+	struct lkl_netdev *nd;
+	struct lkl_mutex **queue_locks;
+	lkl_thread_t poll_tid;
+};
+
+static int net_check_features(struct virtio_dev *dev)
+{
+	if (dev->driver_features == dev->device_features)
+		return 0;
+
+	return -LKL_EINVAL;
+}
+
+static void net_acquire_queue(struct virtio_dev *dev, int queue_idx)
+{
+	lkl_host_ops.mutex_lock(netdev_of(dev)->queue_locks[queue_idx]);
+}
+
+static void net_release_queue(struct virtio_dev *dev, int queue_idx)
+{
+	lkl_host_ops.mutex_unlock(netdev_of(dev)->queue_locks[queue_idx]);
+}
+
+/*
+ * The buffers passed through "req" from the virtio_net driver always start
+ * with a vnet_hdr. We need to check whether the backend device expects a
+ * vnet_hdr and adjust the buffer offset accordingly.
+ */
+static int net_enqueue(struct virtio_dev *dev, int q, struct virtio_req *req)
+{
+	struct lkl_virtio_net_hdr_v1 *header;
+	struct virtio_net_dev *net_dev;
+	struct iovec *iov;
+	int ret;
+
+	header = req->buf[0].iov_base;
+	net_dev = netdev_of(dev);
+	/*
+	 * If the backend device does not expect a vnet_hdr, adjust buf
+	 * accordingly. We adjust req->buf in place so that it can be passed
+	 * directly to the tx/rx call, and undo the change after the call.
+	 * Note that it's ok to pass an iov with an entry whose len == 0; the
+	 * backend skips to the next entry correctly.
+	 */
+	if (!net_dev->nd->has_vnet_hdr) {
+		req->buf[0].iov_base += sizeof(*header);
+		req->buf[0].iov_len -= sizeof(*header);
+	}
+	iov = req->buf;
+
+	/* Pick which virtqueue to send the buffer(s) to */
+	if (q == TX_QUEUE_IDX) {
+		ret = net_dev->nd->ops->tx(net_dev->nd, iov, req->buf_count);
+		if (ret < 0)
+			return -1;
+	} else if (q == RX_QUEUE_IDX) {
+		int i, len;
+
+		ret = net_dev->nd->ops->rx(net_dev->nd, iov, req->buf_count);
+		if (ret < 0)
+			return -1;
+		if (net_dev->nd->has_vnet_hdr) {
+			/*
+			 * If the number of bytes returned exactly matches the
+			 * total space in the iov then there is a good chance we
+			 * did not supply a large enough buffer for the whole
+			 * pkt, i.e., pkt has been truncated.  This is only
+			 * likely to happen under mergeable RX buffer mode.
+			 */
+			if (req->total_len == (unsigned int)ret)
+				lkl_printf("PKT is likely truncated! len=%d\n",
+				    ret);
+		} else {
+			header->flags = 0;
+			header->gso_type = LKL_VIRTIO_NET_HDR_GSO_NONE;
+		}
+		/*
+		 * Have to compute how many descriptors we've consumed (really
+		 * only matters to the mergeable RX mode) and return it
+		 * through "num_buffers".
+		 */
+		for (i = 0, len = ret; len > 0; i++)
+			len -= req->buf[i].iov_len;
+		header->num_buffers = i;
+
+		if (dev->device_features & BIT(LKL_VIRTIO_NET_F_GUEST_CSUM))
+			header->flags |= LKL_VIRTIO_NET_HDR_F_DATA_VALID;
+	} else {
+		bad_request("tried to push on non-existent queue");
+		return -1;
+	}
+	if (!net_dev->nd->has_vnet_hdr) {
+		/* Undo the adjustment */
+		req->buf[0].iov_base -= sizeof(*header);
+		req->buf[0].iov_len += sizeof(*header);
+		ret += sizeof(struct lkl_virtio_net_hdr_v1);
+	}
+	virtio_req_complete(req, ret);
+	return 0;
+}
+
+static struct virtio_dev_ops net_ops = {
+	.check_features = net_check_features,
+	.enqueue = net_enqueue,
+	.acquire_queue = net_acquire_queue,
+	.release_queue = net_release_queue,
+};
+
+void poll_thread(void *arg)
+{
+	struct virtio_net_dev *dev = arg;
+
+	/* Synchronization is handled in virtio_process_queue */
+	do {
+		int ret = dev->nd->ops->poll(dev->nd);
+
+		if (ret < 0) {
+			lkl_printf("virtio net poll error: %d\n", ret);
+			continue;
+		}
+
+		if (ret & LKL_DEV_NET_POLL_HUP)
+			break;
+		if (ret & LKL_DEV_NET_POLL_RX)
+			virtio_process_queue(&dev->dev, RX_QUEUE_IDX);
+		if (ret & LKL_DEV_NET_POLL_TX)
+			virtio_process_queue(&dev->dev, TX_QUEUE_IDX);
+	} while (1);
+}
+
+struct virtio_net_dev *registered_devs[MAX_NET_DEVS];
+static int registered_dev_idx;
+
+static int dev_register(struct virtio_net_dev *dev)
+{
+	if (registered_dev_idx == MAX_NET_DEVS) {
+		lkl_printf("Too many virtio_net devices!\n");
+		/* This error code is a little bit of a lie */
+		return -LKL_ENOMEM;
+	}
+
+	/* registered_dev_idx is incremented by the caller */
+	registered_devs[registered_dev_idx] = dev;
+	return 0;
+}
+
+static void free_queue_locks(struct lkl_mutex **queues, int num_queues)
+{
+	int i = 0;
+
+	if (!queues)
+		return;
+
+	for (i = 0; i < num_queues; i++)
+		lkl_host_ops.mutex_free(queues[i]);
+
+	lkl_host_ops.mem_free(queues);
+}
+
+static struct lkl_mutex **init_queue_locks(int num_queues)
+{
+	int i;
+	struct lkl_mutex **ret = lkl_host_ops.mem_alloc(
+		sizeof(struct lkl_mutex *) * num_queues);
+	if (!ret)
+		return NULL;
+
+	memset(ret, 0, sizeof(struct lkl_mutex *) * num_queues);
+	for (i = 0; i < num_queues; i++) {
+		ret[i] = lkl_host_ops.mutex_alloc(1);
+		if (!ret[i]) {
+			free_queue_locks(ret, i);
+			return NULL;
+		}
+	}
+
+	return ret;
+}
+
+int lkl_netdev_add(struct lkl_netdev *nd, struct lkl_netdev_args *args)
+{
+	struct virtio_net_dev *dev;
+	int ret = -LKL_ENOMEM;
+
+	dev = lkl_host_ops.mem_alloc(sizeof(*dev));
+	if (!dev)
+		return -LKL_ENOMEM;
+
+	memset(dev, 0, sizeof(*dev));
+
+	dev->dev.device_id = LKL_VIRTIO_ID_NET;
+	if (args) {
+		if (args->mac) {
+			dev->dev.device_features |= BIT(LKL_VIRTIO_NET_F_MAC);
+			memcpy(dev->config.mac, args->mac, LKL_ETH_ALEN);
+		}
+		dev->dev.device_features |= args->offload;
+
+	}
+	dev->dev.config_data = &dev->config;
+	dev->dev.config_len = sizeof(dev->config);
+	dev->dev.ops = &net_ops;
+	dev->nd = nd;
+	dev->queue_locks = init_queue_locks(NUM_QUEUES);
+
+	if (!dev->queue_locks)
+		goto out_free;
+
+	/*
+	 * MUST match the number of queue locks we initialized. We could init
+	 * the queues in virtio_dev_setup to help enforce this, but netdevs are
+	 * the only flavor that need these locks, so it's better to do it
+	 * here.
+	 */
+	ret = virtio_dev_setup(&dev->dev, NUM_QUEUES, QUEUE_DEPTH);
+
+	if (ret)
+		goto out_free;
+
+	/*
+	 * We may receive TSO packets of up to 64KB, so collect as many
+	 * descriptors as are available, up to 64KB in total length.
+	 */
+	if (dev->dev.device_features & BIT(LKL_VIRTIO_NET_F_MRG_RXBUF))
+		virtio_set_queue_max_merge_len(&dev->dev, RX_QUEUE_IDX, 65536);
+	ret = -LKL_ENOMEM;
+	dev->poll_tid = lkl_host_ops.thread_create(poll_thread, dev);
+	if (dev->poll_tid == 0)
+		goto out_cleanup_dev;
+
+	ret = dev_register(dev);
+	if (ret < 0)
+		goto out_cleanup_dev;
+
+	return registered_dev_idx++;
+
+out_cleanup_dev:
+	virtio_dev_cleanup(&dev->dev);
+
+out_free:
+	if (dev->queue_locks)
+		free_queue_locks(dev->queue_locks, NUM_QUEUES);
+	lkl_host_ops.mem_free(dev);
+
+	return ret;
+}
+
+/* Remove a netdev by id. Nothing is returned; failures are only logged. */
+void lkl_netdev_remove(int id)
+{
+	struct virtio_net_dev *dev;
+	int ret;
+
+	if (id < 0 || id >= registered_dev_idx) {
+		lkl_printf("%s: invalid id: %d\n", __func__, id);
+		return;
+	}
+
+	dev = registered_devs[id];
+
+	dev->nd->ops->poll_hup(dev->nd);
+	lkl_host_ops.thread_join(dev->poll_tid);
+
+	ret = lkl_netdev_get_ifindex(id);
+	if (ret < 0) {
+		lkl_printf("%s: failed to get ifindex for id %d: %s\n",
+			   __func__, id, lkl_strerror(ret));
+		return;
+	}
+
+	ret = lkl_if_down(ret);
+	if (ret < 0) {
+		lkl_printf("%s: failed to put interface id %d down: %s\n",
+			   __func__, id, lkl_strerror(ret));
+		return;
+	}
+
+	virtio_dev_cleanup(&dev->dev);
+
+	free_queue_locks(dev->queue_locks, NUM_QUEUES);
+	lkl_host_ops.mem_free(dev);
+}
+
+void lkl_netdev_free(struct lkl_netdev *nd)
+{
+	nd->ops->free(nd);
+}
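
In net_enqueue() above, the RX path reports through "num_buffers" how many descriptors a received packet consumed. A minimal standalone sketch of that count follows, with an extra bound on the iovec array that is an assumption added here and is not in the original loop. The helper name buffers_consumed() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/*
 * Count how many leading iovec entries are needed to hold len bytes.
 * Returns cnt if len exceeds the total capacity of the array.
 */
static int buffers_consumed(const struct iovec *iov, int cnt, size_t len)
{
	int i = 0;

	assert(iov && cnt >= 0);
	while (i < cnt && len > 0) {
		size_t room = iov[i].iov_len;

		/* Consume this entry fully, or whatever is left of len. */
		len -= (len < room) ? len : room;
		i++;
	}
	return i;
}
```

Using an unsigned remaining length with a clamped subtraction avoids relying on the counter going negative to end the loop.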
diff --git a/tools/lkl/lib/virtio_net_dpdk.c b/tools/lkl/lib/virtio_net_dpdk.c
new file mode 100644
index 000000000000..9512769554a5
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_dpdk.c
@@ -0,0 +1,480 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Intel DPDK based virtual network interface feature for LKL
+ * Copyright (c) 2015,2016 Ryo Nakamura, Hajime Tazaki
+ *
+ * Author: Ryo Nakamura <upa@wide.ad.jp>
+ *         Hajime Tazaki <thehajime@gmail.com>
+ */
+
+//#define DEBUG
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdint.h>
+#include <unistd.h>
+
+#include <rte_eal.h>
+#include <rte_ethdev.h>
+#include <rte_mempool.h>
+#include <rte_net.h>
+
+#include <lkl_host.h>
+
+static char *ealargs[4] = {
+	"lkl_vif_dpdk",
+	"-c 1",
+	"-n 1",
+	"--log-level=0",
+};
+
+#define MAX_PKT_BURST           16
+/* XXX: cache disabled because the mempool cache is not thread-safe. */
+#define MEMPOOL_CACHE_SZ        0
+/* for TSO pkt */
+#define MAX_PACKET_SZ           (65535 \
+	- (sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM))
+#define MBUF_NUM                (512*2) /* vmxnet3 requires 1024 */
+#define MBUF_SIZ        \
+	(MAX_PACKET_SZ + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
+#define NUMDESC         512	/* nb_min on vmxnet3 is 512 */
+#define NUMQUEUE        1
+
+#define BIT(x) (1ULL << (x))
+
+static int portid;
+
+struct lkl_netdev_dpdk {
+	struct lkl_netdev dev;
+	int portid;
+	struct rte_mempool *rxpool, *txpool; /* ring buffer pool */
+	/* burst receive context by rump dpdk code */
+	struct rte_mbuf *rcv_mbuf[MAX_PKT_BURST];
+	int npkts;
+	int bufidx;
+	int offload;
+	unsigned int close: 1;
+	unsigned int busy_poll: 1;
+};
+
+static int dpdk_net_tx_prep(struct rte_mbuf *rm,
+		struct lkl_virtio_net_hdr_v1 *header)
+{
+	struct rte_net_hdr_lens hdr_lens;
+	uint32_t ptype;
+
+#ifdef DEBUG
+	lkl_printf("dpdk-tx: gso_type=%d, gso=%d, hdrlen=%d validation=%d\n",
+		header->gso_type, header->gso_size, header->hdr_len,
+		rte_validate_tx_offload(rm));
+#endif
+
+	ptype = rte_net_get_ptype(rm, &hdr_lens, RTE_PTYPE_ALL_MASK);
+	rm->l2_len = hdr_lens.l2_len;
+	rm->l3_len = hdr_lens.l3_len;
+	rm->l4_len = hdr_lens.l4_len; // including tcp opts
+
+	if ((ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP) {
+		if ((ptype & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_IPV4)
+			rm->ol_flags = PKT_TX_IPV4;
+		else if ((ptype & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_IPV6)
+			rm->ol_flags = PKT_TX_IPV6;
+
+		rm->ol_flags |= PKT_TX_TCP_CKSUM;
+		rm->tso_segsz = header->gso_size;
+		/* TSO case */
+		if (header->gso_type == LKL_VIRTIO_NET_HDR_GSO_TCPV4)
+			rm->ol_flags |= (PKT_TX_TCP_SEG | PKT_TX_IP_CKSUM);
+		else if (header->gso_type == LKL_VIRTIO_NET_HDR_GSO_TCPV6)
+			rm->ol_flags |= PKT_TX_TCP_SEG;
+	}
+
+	return sizeof(struct lkl_virtio_net_hdr_v1);
+
+}
+
+static int dpdk_net_tx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
+{
+	void *pkt;
+	struct rte_mbuf *rm;
+	struct lkl_netdev_dpdk *nd_dpdk;
+	struct lkl_virtio_net_hdr_v1 *header = NULL;
+	int i, len, sent = 0;
+	void *data = NULL;
+
+	nd_dpdk = (struct lkl_netdev_dpdk *) nd;
+
+	/*
+	 * XXX: DPDK's mempool cache was reported not to be thread safe (see
+	 * http://www.dpdk.io/ml/archives/dev/2014-February/001401.html), so
+	 * the cache is disabled for now by setting MEMPOOL_CACHE_SZ to 0.
+	 */
+	rm = rte_pktmbuf_alloc(nd_dpdk->txpool);
+	if (!rm)
+		return -1;
+
+	for (i = 0; i < cnt; i++) {
+		data = iov[i].iov_base;
+		len = (int)iov[i].iov_len;
+
+		if (i == 0) {
+			header = data;
+			data += sizeof(*header);
+			len -= sizeof(*header);
+		}
+
+		if (len == 0)
+			continue;
+
+		pkt = rte_pktmbuf_append(rm, len);
+		if (pkt) {
+			/* XXX: I wanna have M_EXT flag !!! */
+			memcpy(pkt, data, len);
+			sent += len;
+		} else {
+			lkl_printf("dpdk-tx: failed to append: idx=%d len=%d\n",
+				   i, len);
+			rte_pktmbuf_free(rm);
+			return -1;
+		}
+#ifdef DEBUG
+		lkl_printf("dpdk-tx: pkt[%d]len=%d\n", i, len);
+#endif
+	}
+
+	/* preparation for TX offloads */
+	sent += dpdk_net_tx_prep(rm, header);
+
+	/* XXX: should be bulk-transmitted !! */
+	if (rte_eth_tx_prepare(nd_dpdk->portid, 0, &rm, 1) != 1)
+		lkl_printf("tx_prep failed\n");
+
+	/* The PMD owns the mbuf once it is queued; free it only on failure. */
+	if (rte_eth_tx_burst(nd_dpdk->portid, 0, &rm, 1) != 1)
+		rte_pktmbuf_free(rm);
+	return sent;
+}
+
+static int __dpdk_net_rx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
+{
+	struct lkl_netdev_dpdk *nd_dpdk;
+	int i = 0;
+	struct rte_mbuf *rm, *first;
+	void *r_data;
+	size_t read = 0, r_size, copylen = 0, offset = 0;
+	struct lkl_virtio_net_hdr_v1 *header = iov[0].iov_base;
+	uint16_t mtu;
+
+	nd_dpdk = (struct lkl_netdev_dpdk *) nd;
+	memset(header, 0, sizeof(struct lkl_virtio_net_hdr_v1));
+
+	first = nd_dpdk->rcv_mbuf[nd_dpdk->bufidx];
+
+	for (rm = nd_dpdk->rcv_mbuf[nd_dpdk->bufidx]; rm; rm = rm->next) {
+		r_data = rte_pktmbuf_mtod(rm, void *);
+		r_size = rte_pktmbuf_data_len(rm);
+
+#ifdef DEBUG
+		lkl_printf("dpdk-rx: mbuf pktlen=%d orig_len=%lu\n",
+			   r_size, iov[i].iov_len);
+#endif
+		/* mergeable buffer starts data after vnet header at [0] */
+		if (nd_dpdk->offload & BIT(LKL_VIRTIO_NET_F_MRG_RXBUF) &&
+		    i == 0)
+			offset = sizeof(struct lkl_virtio_net_hdr_v1);
+		else if (nd_dpdk->offload & BIT(LKL_VIRTIO_NET_F_GUEST_TSO4) &&
+			 i == 0)
+			i++;
+		else
+			offset = sizeof(struct lkl_virtio_net_hdr_v1);
+
+		read += r_size;
+		while (r_size > 0) {
+			if (i >= cnt) {
+				fprintf(stderr,
+					"dpdk-rx: buffer full. skip it. ");
+				fprintf(stderr,
+					"(cnt=%d, idx=%d, size=%lu)\n",
+					cnt, i, (unsigned long)r_size);
+				goto end;
+			}
+
+			copylen = r_size < (iov[i].iov_len - offset) ? r_size
+				: iov[i].iov_len - offset;
+			memcpy(iov[i].iov_base + offset, r_data, copylen);
+			r_data += copylen;
+			r_size -= copylen;
+			offset = 0;
+			i++;
+		}
+	}
+
+end:
+	/* TSO (big_packet mode) */
+	header->flags = LKL_VIRTIO_NET_HDR_F_DATA_VALID;
+	rte_eth_dev_get_mtu(nd_dpdk->portid, &mtu);
+
+	if (read > (mtu + sizeof(struct ether_hdr)
+		    + sizeof(struct lkl_virtio_net_hdr_v1))) {
+		struct rte_net_hdr_lens hdr_lens;
+		uint32_t ptype;
+
+		ptype = rte_net_get_ptype(first, &hdr_lens, RTE_PTYPE_ALL_MASK);
+
+		if ((ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP) {
+			if ((ptype & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_IPV4 &&
+			    nd_dpdk->offload & BIT(LKL_VIRTIO_NET_F_GUEST_TSO4))
+				header->gso_type = LKL_VIRTIO_NET_HDR_GSO_TCPV4;
+			/* XXX: Intel X540 doesn't support LRO
+			 * with tcpv6 packets
+			 */
+			if ((ptype & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_IPV6 &&
+			    nd_dpdk->offload & BIT(LKL_VIRTIO_NET_F_GUEST_TSO6))
+				header->gso_type = LKL_VIRTIO_NET_HDR_GSO_TCPV6;
+		}
+
+		header->gso_size = mtu - hdr_lens.l3_len - hdr_lens.l4_len;
+		header->hdr_len = hdr_lens.l2_len + hdr_lens.l3_len
+			+ hdr_lens.l4_len;
+	}
+
+	read += sizeof(struct lkl_virtio_net_hdr_v1);
+
+#ifdef DEBUG
+	lkl_printf("dpdk-rx: len=%d mtu=%d type=%d, size=%d, hdrlen=%d\n",
+		   read, mtu, header->gso_type,
+		   header->gso_size, header->hdr_len);
+#endif
+
+	return read;
+}
+
+
+/*
+ * this function is not thread-safe.
+ *
+ * nd_dpdk->rcv_mbuf is specifically not safe in parallel access.  if future
+ * refactor allows us to read in parallel, the buffer (nd_dpdk->rcv_mbuf) shall
+ * be guarded.
+ */
+static int dpdk_net_rx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
+{
+	struct lkl_netdev_dpdk *nd_dpdk;
+	int read = 0;
+
+	nd_dpdk = (struct lkl_netdev_dpdk *) nd;
+
+	if (nd_dpdk->npkts == 0) {
+		nd_dpdk->npkts = rte_eth_rx_burst(nd_dpdk->portid, 0,
+						  nd_dpdk->rcv_mbuf,
+						  MAX_PKT_BURST);
+		if (nd_dpdk->npkts <= 0) {
+			/* XXX: need to implement proper poll()
+			 * or interrupt mode PMD of dpdk, which is only
+			 * available on ixgbe/igb/e1000 (as of Jan. 2016)
+			 */
+			if (!nd_dpdk->busy_poll)
+				usleep(1);
+			return -1;
+		}
+		nd_dpdk->bufidx = 0;
+	}
+
+	/* mergeable buffer */
+	read = __dpdk_net_rx(nd, iov, cnt);
+
+	rte_pktmbuf_free(nd_dpdk->rcv_mbuf[nd_dpdk->bufidx]);
+
+	nd_dpdk->bufidx++;
+	nd_dpdk->npkts--;
+
+	return read;
+}
+
+static int dpdk_net_poll(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_dpdk *nd_dpdk =
+		container_of(nd, struct lkl_netdev_dpdk, dev);
+
+	if (nd_dpdk->close)
+		return LKL_DEV_NET_POLL_HUP;
+	/*
+	 * dpdk's interrupt mode has an equivalent of epoll_wait(2),
+	 * which we could apply here, but AFAIK the mode is only available
+	 * on a few NIC drivers like ixgbe/igb/e1000 (with dpdk v2.2.0);
+	 * vmxnet3, for example, is not supported.
+	 */
+	return LKL_DEV_NET_POLL_RX | LKL_DEV_NET_POLL_TX;
+}
+
+static void dpdk_net_poll_hup(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_dpdk *nd_dpdk =
+		container_of(nd, struct lkl_netdev_dpdk, dev);
+
+	nd_dpdk->close = 1;
+}
+
+static void dpdk_net_free(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_dpdk *nd_dpdk =
+		container_of(nd, struct lkl_netdev_dpdk, dev);
+
+	free(nd_dpdk);
+}
+
+struct lkl_dev_net_ops dpdk_net_ops = {
+	.tx = dpdk_net_tx,
+	.rx = dpdk_net_rx,
+	.poll = dpdk_net_poll,
+	.poll_hup = dpdk_net_poll_hup,
+	.free = dpdk_net_free,
+};
+
+
+static int dpdk_init;
+struct lkl_netdev *lkl_netdev_dpdk_create(const char *ifparams, int offload,
+					 unsigned char *mac)
+{
+	int ret = 0;
+	struct rte_eth_conf portconf;
+	struct rte_eth_link link;
+	struct lkl_netdev_dpdk *nd;
+	struct rte_eth_dev_info dev_info;
+	char poolname[RTE_MEMZONE_NAMESIZE];
+	char *debug = getenv("LKL_HIJACK_DEBUG");
+	int lkl_debug = 0;
+
+	if (!dpdk_init) {
+		if (debug)
+			lkl_debug = strtol(debug, NULL, 0);
+		if (lkl_debug & 0x400)
+			ealargs[3] = "--log-level=100";
+
+		ret = rte_eal_init(sizeof(ealargs) / sizeof(ealargs[0]),
+				   ealargs);
+		if (ret < 0) {
+			lkl_printf("dpdk: failed to initialize eal\n");
+			return NULL;
+		}
+		dpdk_init = 1;
+	}
+	nd = malloc(sizeof(struct lkl_netdev_dpdk));
+	memset(nd, 0, sizeof(struct lkl_netdev_dpdk));
+	nd->dev.ops = &dpdk_net_ops;
+	nd->portid = portid++;
+	/* busy-poll mode is selected when 'ifparams' contains "busy" */
+	nd->busy_poll = strstr(ifparams, "busy") ? 1 : 0;
+	/* we always enable big_packet mode with dpdk. */
+	nd->offload = offload;
+
+	snprintf(poolname, RTE_MEMZONE_NAMESIZE, "%s%s", "tx-", ifparams);
+	nd->txpool =
+		rte_mempool_create(poolname,
+				   MBUF_NUM, MBUF_SIZ, MEMPOOL_CACHE_SZ,
+				   sizeof(struct rte_pktmbuf_pool_private),
+				   rte_pktmbuf_pool_init, NULL,
+				   rte_pktmbuf_init, NULL, 0, 0);
+
+	if (!nd->txpool) {
+		lkl_printf("dpdk: failed to allocate tx pool\n");
+		free(nd);
+		return NULL;
+	}
+
+
+	snprintf(poolname, RTE_MEMZONE_NAMESIZE, "%s%s", "rx-", ifparams);
+	nd->rxpool =
+		rte_mempool_create(poolname, MBUF_NUM, MBUF_SIZ, 0,
+				   sizeof(struct rte_pktmbuf_pool_private),
+				   rte_pktmbuf_pool_init, NULL,
+				   rte_pktmbuf_init, NULL, 0, 0);
+	if (!nd->rxpool) {
+		lkl_printf("dpdk: failed to allocate rx pool\n");
+		free(nd);
+		return NULL;
+	}
+
+	memset(&portconf, 0, sizeof(portconf));
+
+	/* offload bits */
+	/* configure the NIC for TSO/LRO only if the user asks for it. */
+	if (offload & (BIT(LKL_VIRTIO_NET_F_GUEST_TSO4) |
+			BIT(LKL_VIRTIO_NET_F_GUEST_TSO6) |
+			BIT(LKL_VIRTIO_NET_F_MRG_RXBUF))) {
+		portconf.rxmode.enable_lro = 1;
+		portconf.rxmode.hw_strip_crc = 1;
+	}
+
+	ret = rte_eth_dev_configure(nd->portid, NUMQUEUE, NUMQUEUE,
+				    &portconf);
+	if (ret < 0) {
+		lkl_printf("dpdk: failed to configure port\n");
+		free(nd);
+		return NULL;
+	}
+
+	rte_eth_dev_info_get(nd->portid, &dev_info);
+
+	ret = rte_eth_rx_queue_setup(nd->portid, 0, NUMDESC, 0,
+				     &dev_info.default_rxconf, nd->rxpool);
+	if (ret < 0) {
+		lkl_printf("dpdk: failed to setup rx queue\n");
+		free(nd);
+		return NULL;
+	}
+
+	dev_info.default_txconf.txq_flags = 0;
+
+	dev_info.default_txconf.txq_flags |= ETH_TXQ_FLAGS_NOXSUMSCTP;
+	dev_info.default_txconf.txq_flags |= ETH_TXQ_FLAGS_NOVLANOFFL;
+
+
+	ret = rte_eth_tx_queue_setup(nd->portid, 0, NUMDESC, 0,
+				     &dev_info.default_txconf);
+	if (ret < 0) {
+		lkl_printf("dpdk: failed to setup tx queue\n");
+		free(nd);
+		return NULL;
+	}
+
+	ret = rte_eth_dev_start(nd->portid);
+	/* XXX: this function returns positive val (e.g., 12)
+	 * if there's an error
+	 */
+	if (ret != 0) {
+		lkl_printf("dpdk: failed to start device\n");
+		free(nd);
+		return NULL;
+	}
+
+	if (mac) {
+		rte_eth_macaddr_get(nd->portid, (struct ether_addr *)mac);
+		lkl_printf("Port %d: %02x:%02x:%02x:%02x:%02x:%02x\n",
+			   nd->portid,
+			   mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
+	}
+
+	rte_eth_dev_set_link_up(nd->portid);
+
+	rte_eth_link_get(nd->portid, &link);
+	if (!link.link_status) {
+		fprintf(stderr, "dpdk: interface state is down\n");
+		rte_eth_link_get(nd->portid, &link);
+		if (!link.link_status) {
+			fprintf(stderr, "dpdk: interface is down, giving up\n");
+			free(nd);
+			return NULL;
+		}
+		lkl_printf("dpdk: interface state should be up now.\n");
+	}
+
+	/* should be promisc ? */
+	rte_eth_promiscuous_enable(nd->portid);
+
+	/* dpdk devices are always assumed to carry a vnet_hdr. */
+	nd->dev.has_vnet_hdr = 1;
+
+	return (struct lkl_netdev *) nd;
+}
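
__dpdk_net_rx() above scatters each received mbuf segment over the caller's iovec array, optionally leaving room for the vnet header in the first entry. A minimal standalone sketch of that scatter step, with no DPDK dependency, could look like the following. The name scatter_copy() and its parameters are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Copy len bytes from src into iov[0..cnt), skipping the first offset
 * bytes of iov[0]. Returns the number of bytes copied, which is short
 * when the iovec array is too small for the whole source.
 */
static size_t scatter_copy(const struct iovec *iov, int cnt, size_t offset,
			   const void *src, size_t len)
{
	const char *p = src;
	size_t copied = 0;
	int i;

	assert(iov && src);
	for (i = 0; i < cnt && len > 0; i++) {
		size_t room, n;

		room = iov[i].iov_len > offset ? iov[i].iov_len - offset : 0;
		n = len < room ? len : room;
		if (n)
			memcpy((char *)iov[i].iov_base + offset, p, n);
		p += n;		/* advance the source along with the iov */
		len -= n;
		copied += n;
		offset = 0;	/* the offset applies to the first entry only */
	}
	return copied;
}
```

The source pointer has to move forward by the amount copied on every iteration; otherwise a segment that spans several entries would repeat its first bytes in each of them.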
diff --git a/tools/lkl/lib/virtio_net_fd.c b/tools/lkl/lib/virtio_net_fd.c
new file mode 100644
index 000000000000..f8664455e696
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_fd.c
@@ -0,0 +1,217 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * POSIX file descriptor based virtual network interface feature for LKL
+ * Copyright (c) 2015,2016 Ryo Nakamura, Hajime Tazaki
+ *
+ * Author: Ryo Nakamura <upa@wide.ad.jp>
+ *         Hajime Tazaki <thehajime@gmail.com>
+ *         Octavian Purdila <octavian.purdila@intel.com>
+ *
+ */
+#include <stdio.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <string.h>
+#include <unistd.h>
+#ifdef __FreeBSD__
+#include <sys/syslimits.h>
+#else
+#include <limits.h>
+#endif
+#include <fcntl.h>
+#include <sys/poll.h>
+#include <sys/uio.h>
+
+#include "virtio.h"
+#include "virtio_net_fd.h"
+
+struct lkl_netdev_fd {
+	struct lkl_netdev dev;
+	/* file-descriptor based device */
+	int fd_rx;
+	int fd_tx;
+	/*
+	 * Controls the poll mask for fd. Can be accessed concurrently from
+	 * poll, tx, or rx routines but there is no need for synchronization
+	 * because:
+	 *
+	 * (a) TX and RX routines set different variables so even if they update
+	 * at the same time there is no race condition
+	 *
+	 * (b) Even if poll and TX / RX update at the same time poll cannot
+	 * stall: when poll resets the poll variable we know that TX / RX will
+	 * run which means that eventually the poll variable will be set.
+	 */
+	int poll_tx, poll_rx;
+	/* control pipe */
+	int pipe[2];
+};
+
+static int fd_net_tx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
+{
+	int ret;
+	struct lkl_netdev_fd *nd_fd =
+		container_of(nd, struct lkl_netdev_fd, dev);
+
+	do {
+		ret = writev(nd_fd->fd_tx, iov, cnt);
+	} while (ret == -1 && errno == EINTR);
+
+	if (ret < 0) {
+		if (errno != EAGAIN) {
+			perror("write to fd netdev fails");
+		} else {
+			char tmp = 0;
+
+			nd_fd->poll_tx = 1;
+			if (write(nd_fd->pipe[1], &tmp, 1) <= 0)
+				perror("virtio net fd pipe write");
+		}
+	}
+	return ret;
+}
+
+static int fd_net_rx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
+{
+	int ret;
+	struct lkl_netdev_fd *nd_fd =
+		container_of(nd, struct lkl_netdev_fd, dev);
+
+	do {
+		ret = readv(nd_fd->fd_rx, (struct iovec *)iov, cnt);
+	} while (ret == -1 && errno == EINTR);
+
+	if (ret < 0) {
+		if (errno != EAGAIN) {
+			perror("virtio net fd read");
+		} else {
+			char tmp = 0;
+
+			nd_fd->poll_rx = 1;
+			if (write(nd_fd->pipe[1], &tmp, 1) < 0)
+				perror("virtio net fd pipe write");
+		}
+	}
+	return ret;
+}
+
+static int fd_net_poll(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_fd *nd_fd =
+		container_of(nd, struct lkl_netdev_fd, dev);
+	struct pollfd pfds[3] = {
+		{
+			.fd = nd_fd->fd_rx,
+		},
+		{
+			.fd = nd_fd->fd_tx,
+		},
+		{
+			.fd = nd_fd->pipe[0],
+			.events = POLLIN,
+		},
+	};
+	int ret;
+
+	if (nd_fd->poll_rx)
+		pfds[0].events |= POLLIN|POLLPRI;
+	if (nd_fd->poll_tx)
+		pfds[1].events |= POLLOUT;
+
+	do {
+		ret = poll(pfds, 3, -1);
+	} while (ret == -1 && errno == EINTR);
+
+	if (ret < 0) {
+		perror("virtio net fd poll");
+		return 0;
+	}
+
+	if (pfds[2].revents & (POLLHUP|POLLNVAL))
+		return LKL_DEV_NET_POLL_HUP;
+
+	if (pfds[2].revents & POLLIN) {
+		char tmp[PIPE_BUF];
+
+		ret = read(nd_fd->pipe[0], tmp, PIPE_BUF);
+		if (ret == 0)
+			return LKL_DEV_NET_POLL_HUP;
+		if (ret < 0)
+			perror("virtio net fd pipe read");
+	}
+
+	ret = 0;
+
+	if (pfds[0].revents & (POLLIN|POLLPRI)) {
+		nd_fd->poll_rx = 0;
+		ret |= LKL_DEV_NET_POLL_RX;
+	}
+
+	if (pfds[1].revents & POLLOUT) {
+		nd_fd->poll_tx = 0;
+		ret |= LKL_DEV_NET_POLL_TX;
+	}
+
+	return ret;
+}
+
+static void fd_net_poll_hup(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_fd *nd_fd =
+		container_of(nd, struct lkl_netdev_fd, dev);
+
+	/* this will cause a POLLHUP / POLLNVAL in the poll function */
+	close(nd_fd->pipe[0]);
+	close(nd_fd->pipe[1]);
+}
+
+static void fd_net_free(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_fd *nd_fd =
+		container_of(nd, struct lkl_netdev_fd, dev);
+
+	close(nd_fd->fd_rx);
+	close(nd_fd->fd_tx);
+	free(nd_fd);
+}
+
+struct lkl_dev_net_ops fd_net_ops =  {
+	.tx = fd_net_tx,
+	.rx = fd_net_rx,
+	.poll = fd_net_poll,
+	.poll_hup = fd_net_poll_hup,
+	.free = fd_net_free,
+};
+
+struct lkl_netdev *lkl_register_netdev_fd(int fd_rx, int fd_tx)
+{
+	struct lkl_netdev_fd *nd;
+
+	nd = malloc(sizeof(*nd));
+	if (!nd) {
+		fprintf(stderr, "fdnet: failed to allocate memory\n");
+		/* TODO: propagate the error state, maybe use errno for that? */
+		return NULL;
+	}
+
+	memset(nd, 0, sizeof(*nd));
+
+	nd->fd_rx = fd_rx;
+	nd->fd_tx = fd_tx;
+	if (pipe(nd->pipe) < 0) {
+		perror("pipe");
+		free(nd);
+		return NULL;
+	}
+
+	if (fcntl(nd->pipe[0], F_SETFL, O_NONBLOCK) < 0) {
+		perror("fcntl");
+		close(nd->pipe[0]);
+		close(nd->pipe[1]);
+		free(nd);
+		return NULL;
+	}
+
+	nd->dev.ops = &fd_net_ops;
+	return &nd->dev;
+}
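
The control pipe in struct lkl_netdev_fd lets the tx/rx routines wake a poll() that is blocked in fd_net_poll(). A minimal standalone sketch of that wake-up pattern follows, assuming a single waiter. The names struct waker, waker_init(), waker_kick() and waker_wait() are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical wrapper around the wake-up pipe. */
struct waker {
	int pipe[2];
};

static int waker_init(struct waker *w)
{
	assert(w);
	if (pipe(w->pipe) < 0)
		return -1;
	/* Non-blocking read end, so that draining never stalls. */
	if (fcntl(w->pipe[0], F_SETFL, O_NONBLOCK) < 0) {
		close(w->pipe[0]);
		close(w->pipe[1]);
		return -1;
	}
	return 0;
}

/* Called from the tx/rx side to interrupt a blocked poll(). */
static int waker_kick(struct waker *w)
{
	char c = 0;

	return write(w->pipe[1], &c, 1) == 1 ? 0 : -1;
}

/*
 * Wait up to timeout_ms for a kick. Returns 1 if woken, 0 on timeout and
 * -1 on error. Pending wake-up bytes are drained before returning.
 */
static int waker_wait(struct waker *w, int timeout_ms)
{
	struct pollfd pfd = { .fd = w->pipe[0], .events = POLLIN };
	char buf[64];
	int ret;

	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret == -1 && errno == EINTR);
	if (ret <= 0)
		return ret;
	while (read(w->pipe[0], buf, sizeof(buf)) > 0)
		;
	return 1;
}
```

Several kicks collapse into one wake-up because the waiter drains the pipe, which matches the use here: the poll loop only needs to re-read the poll_tx and poll_rx flags.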
diff --git a/tools/lkl/lib/virtio_net_fd.h b/tools/lkl/lib/virtio_net_fd.h
new file mode 100644
index 000000000000..713ba13cca7c
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_fd.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _VIRTIO_NET_FD_H
+#define _VIRTIO_NET_FD_H
+
+struct ifreq;
+
+/**
+ * lkl_register_netdev_fd - register a file descriptor-based network
+ * device as a NIC
+ *
+ * @fd_rx - a POSIX file descriptor number for input
+ * @fd_tx - a POSIX file descriptor number for output
+ * @returns a struct lkl_netdev entry for virtio-net
+ */
+struct lkl_netdev *lkl_register_netdev_fd(int fd_rx, int fd_tx);
+
+
+/**
+ * lkl_netdev_tap_init - initialize tap related structure for lkl_netdev.
+ *
+ * @path - the path to open the device.
+ * @offload - offload bits for the device
+ * @ifr - struct ifreq for ioctl.
+ */
+struct lkl_netdev *lkl_netdev_tap_init(const char *path, int offload,
+				       struct ifreq *ifr);
+
+#endif /* _VIRTIO_NET_FD_H */
diff --git a/tools/lkl/lib/virtio_net_macvtap.c b/tools/lkl/lib/virtio_net_macvtap.c
new file mode 100644
index 000000000000..5d6d2c822f2d
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_macvtap.c
@@ -0,0 +1,32 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * macvtap based virtual network interface feature for LKL
+ * Copyright (c) 2016 Hajime Tazaki
+ *
+ * Author: Hajime Tazaki <thehajime@gmail.com>
+ *
+ * Current implementation is linux-specific.
+ */
+
+/*
+ * You need to configure host device in advance.
+ *
+ * sudo ip link add link eth0 name vtap0 type macvtap mode passthru
+ * sudo ip link set dev vtap0 up
+ * sudo chown thehajime /dev/tap22
+ */
+
+#include <net/if.h>
+#include <linux/if_tun.h>
+
+#include "virtio.h"
+#include "virtio_net_fd.h"
+
+struct lkl_netdev *lkl_netdev_macvtap_create(const char *path, int offload)
+{
+	struct ifreq ifr = {
+		.ifr_flags = IFF_TAP | IFF_NO_PI,
+	};
+
+	return lkl_netdev_tap_init(path, offload, &ifr);
+}
diff --git a/tools/lkl/lib/virtio_net_pipe.c b/tools/lkl/lib/virtio_net_pipe.c
new file mode 100644
index 000000000000..c68d4c855499
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_pipe.c
@@ -0,0 +1,76 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * pipe based virtual network interface feature for LKL
+ * Copyright (c) 2017,2016 Motomu Utsumi
+ *
+ * Author: Motomu Utsumi <motomuman@gmail.com>
+ *
+ * Current implementation is linux-specific.
+ */
+#include <stdio.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <string.h>
+#include <fcntl.h>
+
+#include "virtio.h"
+#include "virtio_net_fd.h"
+
+struct lkl_netdev *lkl_netdev_pipe_create(const char *_ifname, int offload)
+{
+	struct lkl_netdev *nd;
+	int fd_rx, fd_tx;
+	char *ifname = strdup(_ifname), *ifname_rx = NULL, *ifname_tx = NULL;
+
+	ifname_rx = strtok(ifname, "|");
+	if (ifname_rx == NULL) {
+		fprintf(stderr, "invalid ifname format: %s\n", _ifname);
+		free(ifname);
+		return NULL;
+	}
+
+	ifname_tx = strtok(NULL, "|");
+	if (ifname_tx == NULL) {
+		fprintf(stderr, "invalid ifname format: %s\n", _ifname);
+		free(ifname);
+		return NULL;
+	}
+
+	if (strtok(NULL, "|") != NULL) {
+		fprintf(stderr, "invalid ifname format: %s\n", _ifname);
+		free(ifname);
+		return NULL;
+	}
+
+	fd_rx = open(ifname_rx, O_RDWR|O_NONBLOCK);
+	if (fd_rx < 0) {
+		perror("can not open ifname_rx pipe");
+		free(ifname);
+		return NULL;
+	}
+
+	fd_tx = open(ifname_tx, O_RDWR|O_NONBLOCK);
+	if (fd_tx < 0) {
+		perror("can not open ifname_tx pipe");
+		close(fd_rx);
+		free(ifname);
+		return NULL;
+	}
+
+	nd = lkl_register_netdev_fd(fd_rx, fd_tx);
+	if (!nd) {
+		perror("failed to register pipe netdev");
+		close(fd_rx);
+		close(fd_tx);
+		free(ifname);
+		return NULL;
+	}
+
+	free(ifname);
+	/*
+	 * To avoid a mismatch with the LKL instance on the other side,
+	 * the vnet hdr is always enabled.
+	 */
+	nd->has_vnet_hdr = 1;
+	return nd;
+}
diff --git a/tools/lkl/lib/virtio_net_raw.c b/tools/lkl/lib/virtio_net_raw.c
new file mode 100644
index 000000000000..363ccf628569
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_raw.c
@@ -0,0 +1,94 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * raw socket based virtual network interface feature for LKL
+ * Copyright (c) 2015,2016 Ryo Nakamura, Hajime Tazaki
+ *
+ * Author: Ryo Nakamura <upa@wide.ad.jp>
+ *         Hajime Tazaki <thehajime@gmail.com>
+ *
+ * Current implementation is linux-specific.
+ */
+
+#include <stdio.h>
+#include <errno.h>
+#include <string.h>
+#include <unistd.h>
+#include <net/if.h>
+#include <arpa/inet.h>
+#ifdef __linux__
+#include <linux/if_ether.h>
+#include <linux/if_packet.h>
+#elif __FreeBSD__
+#include <netinet/in.h>
+#endif
+#include <fcntl.h>
+
+#include "virtio.h"
+#include "virtio_net_fd.h"
+
+/* since Linux 3.14 (man 7 packet) */
+#ifndef PACKET_QDISC_BYPASS
+#define PACKET_QDISC_BYPASS 20
+#endif
+
+struct lkl_netdev *lkl_netdev_raw_create(const char *ifname)
+{
+#ifdef __linux__
+	int ret;
+	int ifindex = if_nametoindex(ifname);
+	struct sockaddr_ll ll = {
+		.sll_family = PF_PACKET,
+		.sll_ifindex = ifindex,
+		.sll_protocol = htons(ETH_P_ALL),
+	};
+	struct packet_mreq mreq = {
+		.mr_type = PACKET_MR_PROMISC,
+		.mr_ifindex = ifindex,
+	};
+#endif
+	int fd, fd_flags;
+#ifdef __linux__
+	int val;
+
+	if (ifindex == 0) {
+		perror("if_nametoindex");
+		return NULL;
+	}
+
+	fd = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
+#elif __FreeBSD__
+	fd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
+#endif
+	if (fd < 0) {
+		perror("socket");
+		return NULL;
+	}
+
+#ifdef __linux__
+	ret = bind(fd, (struct sockaddr *)&ll, sizeof(ll));
+	if (ret) {
+		perror("bind");
+		close(fd);
+		return NULL;
+	}
+
+	ret = setsockopt(fd, SOL_PACKET, PACKET_ADD_MEMBERSHIP, &mreq,
+			sizeof(mreq));
+	if (ret) {
+		perror("PACKET_ADD_MEMBERSHIP PACKET_MR_PROMISC");
+		close(fd);
+		return NULL;
+	}
+
+	val = 1;
+	ret = setsockopt(fd, SOL_PACKET, PACKET_QDISC_BYPASS, &val,
+			 sizeof(val));
+	if (ret)
+		perror("PACKET_QDISC_BYPASS, ignoring");
+#endif
+
+	fd_flags = fcntl(fd, F_GETFL, 0);
+	fcntl(fd, F_SETFL, fd_flags | O_NONBLOCK);
+
+	return lkl_register_netdev_fd(fd, fd);
+}
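
lkl_netdev_raw_create() switches the socket to non-blocking mode before registering it. As a point of reference, the usual POSIX idiom reads the file status flags with F_GETFL and writes them back with O_NONBLOCK added. A minimal sketch follows; the helper name set_nonblocking() is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Add O_NONBLOCK to the file status flags of fd, keeping the others. */
static int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}
```

F_GETFD and F_SETFD operate on the descriptor flags (FD_CLOEXEC), which are a separate set from the status flags handled here.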
diff --git a/tools/lkl/lib/virtio_net_tap.c b/tools/lkl/lib/virtio_net_tap.c
new file mode 100644
index 000000000000..f1f64cee9695
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_tap.c
@@ -0,0 +1,111 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * tun/tap based virtual network interface feature for LKL
+ * Copyright (c) 2015,2016 Ryo Nakamura, Hajime Tazaki
+ *
+ * Author: Ryo Nakamura <upa@wide.ad.jp>
+ *         Hajime Tazaki <thehajime@gmail.com>
+ *         Octavian Purdila <octavian.purdila@intel.com>
+ *
+ * Current implementation is linux-specific.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <string.h>
+#include <fcntl.h>
+#include <net/if.h>
+#ifdef __linux__
+#include <linux/if_tun.h>
+#elif __FreeBSD__
+#include <net/if_tun.h>
+#endif
+#include <sys/ioctl.h>
+
+#include "virtio.h"
+#include "virtio_net_fd.h"
+
+#define BIT(x) (1ULL << (x))
+
+struct lkl_netdev *lkl_netdev_tap_init(const char *path, int offload,
+				       struct ifreq *ifr)
+{
+	struct lkl_netdev *nd;
+	int fd, vnet_hdr_sz = 0;
+#ifdef __linux__
+	int ret, tap_arg = 0;
+
+	if (offload & BIT(LKL_VIRTIO_NET_F_GUEST_CSUM))
+		tap_arg |= TUN_F_CSUM;
+	if (offload & (BIT(LKL_VIRTIO_NET_F_GUEST_TSO4) |
+	    BIT(LKL_VIRTIO_NET_F_MRG_RXBUF)))
+		tap_arg |= TUN_F_TSO4 | TUN_F_CSUM;
+	if (offload & (BIT(LKL_VIRTIO_NET_F_GUEST_TSO6)))
+		tap_arg |= TUN_F_TSO6 | TUN_F_CSUM;
+
+	if (tap_arg || (offload & (BIT(LKL_VIRTIO_NET_F_CSUM) |
+				   BIT(LKL_VIRTIO_NET_F_HOST_TSO4) |
+				   BIT(LKL_VIRTIO_NET_F_HOST_TSO6)))) {
+		ifr->ifr_flags |= IFF_VNET_HDR;
+		vnet_hdr_sz = sizeof(struct lkl_virtio_net_hdr_v1);
+	}
+#endif
+	fd = open(path, O_RDWR|O_NONBLOCK);
+	if (fd < 0) {
+		perror("open");
+		return NULL;
+	}
+
+#ifdef __linux__
+	ret = ioctl(fd, TUNSETIFF, ifr);
+	if (ret < 0) {
+		fprintf(stderr, "%s: failed to attach to: %s\n",
+			path, strerror(errno));
+		close(fd);
+		return NULL;
+	}
+	if (vnet_hdr_sz && ioctl(fd, TUNSETVNETHDRSZ, &vnet_hdr_sz) != 0) {
+		fprintf(stderr, "%s: failed to TUNSETVNETHDRSZ to: %s\n",
+			path, strerror(errno));
+		close(fd);
+		return NULL;
+	}
+	if (ioctl(fd, TUNSETOFFLOAD, tap_arg) != 0) {
+		fprintf(stderr, "%s: failed to TUNSETOFFLOAD: %s\n",
+			path, strerror(errno));
+		close(fd);
+		return NULL;
+	}
+#endif
+	nd = lkl_register_netdev_fd(fd, fd);
+	if (!nd) {
+		perror("failed to register netdev");
+		close(fd);
+		return NULL;
+	}
+
+	nd->has_vnet_hdr = (vnet_hdr_sz != 0);
+	return nd;
+}
+
+struct lkl_netdev *lkl_netdev_tap_create(const char *ifname, int offload)
+{
+#ifdef __linux__
+	char *path = "/dev/net/tun";
+#elif __FreeBSD__
+	char path[32];
+
+	snprintf(path, sizeof(path), "/dev/%s", ifname);
+#endif
+
+	struct ifreq ifr = {
+#ifdef __linux__
+		.ifr_flags = IFF_TAP | IFF_NO_PI,
+#endif
+	};
+
+	strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
+
+	return lkl_netdev_tap_init(path, offload, &ifr);
+}
diff --git a/tools/lkl/lib/virtio_net_vde.c b/tools/lkl/lib/virtio_net_vde.c
new file mode 100644
index 000000000000..1d017aba91ae
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_vde.c
@@ -0,0 +1,168 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+#include <poll.h>
+#include <lkl.h>
+#include <lkl_host.h>
+
+#include "virtio.h"
+
+#include <libvdeplug.h>
+
+struct lkl_netdev_vde {
+	struct lkl_netdev dev;
+	VDECONN *conn;
+};
+
+struct lkl_netdev *nuse_vif_vde_create(char *switch_path);
+static int net_vde_tx(struct lkl_netdev *nd, struct iovec *iov, int cnt);
+static int net_vde_rx(struct lkl_netdev *nd, struct iovec *iov, int cnt);
+static int net_vde_poll_with_timeout(struct lkl_netdev *nd, int timeout);
+static int net_vde_poll(struct lkl_netdev *nd);
+static void net_vde_poll_hup(struct lkl_netdev *nd);
+static void net_vde_free(struct lkl_netdev *nd);
+
+struct lkl_dev_net_ops vde_net_ops = {
+	.tx = net_vde_tx,
+	.rx = net_vde_rx,
+	.poll = net_vde_poll,
+	.poll_hup = net_vde_poll_hup,
+	.free = net_vde_free,
+};
+
+int net_vde_tx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
+{
+	int ret;
+	struct lkl_netdev_vde *nd_vde =
+		container_of(nd, struct lkl_netdev_vde, dev);
+	void *data = iov[0].iov_base;
+	int len = (int)iov[0].iov_len;
+
+	ret = vde_send(nd_vde->conn, data, len, 0);
+	if (ret <= 0 && errno == EAGAIN)
+		return -1;
+	return ret;
+}
+
+int net_vde_rx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
+{
+	int ret;
+	struct lkl_netdev_vde *nd_vde =
+		container_of(nd, struct lkl_netdev_vde, dev);
+	void *data = iov[0].iov_base;
+	int len = (int)iov[0].iov_len;
+
+	/*
+	 * Due to a bug in libvdeplug we have to first poll to make sure
+	 * that there is data available.
+	 * The correct solution would be to just use
+	 *   ret = vde_recv(nd_vde->conn, data, len, MSG_DONTWAIT);
+	 * This should be changed once libvdeplug is fixed.
+	 */
+	ret = 0;
+	if (net_vde_poll_with_timeout(nd, 0) & LKL_DEV_NET_POLL_RX)
+		ret = vde_recv(nd_vde->conn, data, len, 0);
+	if (ret <= 0)
+		return -1;
+	return ret;
+}
+
+int net_vde_poll_with_timeout(struct lkl_netdev *nd, int timeout)
+{
+	int ret;
+	struct lkl_netdev_vde *nd_vde =
+		container_of(nd, struct lkl_netdev_vde, dev);
+	struct pollfd pollfds[] = {
+			{
+					.fd = vde_datafd(nd_vde->conn),
+					.events = POLLIN | POLLOUT,
+			},
+			{
+					.fd = vde_ctlfd(nd_vde->conn),
+					.events = POLLHUP | POLLIN
+			}
+	};
+
+	while (poll(pollfds, 2, timeout) < 0 && errno == EINTR)
+		;
+
+	ret = 0;
+
+	if (pollfds[1].revents & (POLLHUP | POLLNVAL | POLLIN))
+		return LKL_DEV_NET_POLL_HUP;
+	if (pollfds[0].revents & (POLLHUP | POLLNVAL))
+		return LKL_DEV_NET_POLL_HUP;
+
+	if (pollfds[0].revents & POLLIN)
+		ret |= LKL_DEV_NET_POLL_RX;
+	if (pollfds[0].revents & POLLOUT)
+		ret |= LKL_DEV_NET_POLL_TX;
+
+	return ret;
+}
+
+int net_vde_poll(struct lkl_netdev *nd)
+{
+	return net_vde_poll_with_timeout(nd, -1);
+}
+
+void net_vde_poll_hup(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_vde *nd_vde =
+		container_of(nd, struct lkl_netdev_vde, dev);
+
+	vde_close(nd_vde->conn);
+}
+
+void net_vde_free(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_vde *nd_vde =
+		container_of(nd, struct lkl_netdev_vde, dev);
+
+	free(nd_vde);
+}
+
+struct lkl_netdev *lkl_netdev_vde_create(char const *switch_path)
+{
+	struct lkl_netdev_vde *nd;
+	struct vde_open_args open_args = {.port = 0, .group = 0, .mode = 0700 };
+	char *switch_path_copy = 0;
+
+	nd = malloc(sizeof(*nd));
+	if (!nd) {
+		fprintf(stderr, "Failed to allocate memory.\n");
+		/* TODO: propagate the error state, maybe use errno? */
+		return 0;
+	}
+	nd->dev.ops = &vde_net_ops;
+
+	/* vde_open() allows the null pointer as path which means
+	 * "VDE default path"
+	 */
+	if (switch_path != 0) {
+		/* vde_open() takes a non-const char * which is a bug in their
+		 * function declaration. Even though the implementation does not
+		 * modify the string, we shouldn't just cast away the const.
+		 */
+		size_t switch_path_length = strlen(switch_path);
+
+		switch_path_copy = calloc(switch_path_length + 1, sizeof(char));
+		if (!switch_path_copy) {
+			fprintf(stderr, "Failed to allocate memory.\n");
+			free(nd); /* TODO: propagate the error state */
+			return 0;
+		}
+		strncpy(switch_path_copy, switch_path, switch_path_length);
+	}
+	nd->conn = vde_open(switch_path_copy, "lkl-virtio-net", &open_args);
+	free(switch_path_copy);
+	if (nd->conn == 0) {
+		fprintf(stderr, "Failed to connect to vde switch.\n");
+		free(nd); /* TODO: propagate the error state */
+		return 0;
+	}
+
+	return &nd->dev;
+}
diff --git a/tools/lkl/tests/net-setup.sh b/tools/lkl/tests/net-setup.sh
new file mode 100644
index 000000000000..cc260ed68a7b
--- /dev/null
+++ b/tools/lkl/tests/net-setup.sh
@@ -0,0 +1,134 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
+TEST_TAP_IFNAME=tap
+else
+TEST_TAP_IFNAME=lkl_test_tap
+fi
+TEST_IP_NETWORK=192.168.113.0
+TEST_IP_NETMASK=24
+TEST_IP6_NETWORK=fc03::0
+TEST_IP6_NETMASK=64
+TEST_MAC0="aa:bb:cc:dd:ee:ff"
+TEST_MAC1="aa:bb:cc:dd:ee:aa"
+TEST_NETSERVER_PORT=11223
+
+# $1 - count
+# $2 - netcount
+ip_add()
+{
+    IP_HEX=$(printf '%.2X%.2X%.2X%.2X\n' \
+         `echo $TEST_IP_NETWORK | sed -e 's/\./ /g'`)
+    NET_COUNT=$(( 1 << (32 - $TEST_IP_NETMASK) ))
+    NEXT_IP_HEX=$(printf %.8X `echo $((0x$IP_HEX + $1 + ${2:-0} * $NET_COUNT))`)
+    NEXT_IP=$(printf '%d.%d.%d.%d\n' \
+          `echo $NEXT_IP_HEX | sed -r 's/(..)/0x\1 /g'`)
+    echo -n "$NEXT_IP"
+}
+
+# $1 - count
+# $2 - netcount
+ip6_add()
+{
+    IP6_PREFIX=${TEST_IP6_NETWORK%*::*}
+    IP6_HOST=${TEST_IP6_NETWORK#*::*}
+    echo -n "$(printf "%x" $((0x$IP6_PREFIX+${2:-0})))::$(($IP6_HOST+$1))"
+}
+
+ip_host()
+{
+
+    ip_add 1 $1
+}
+
+ip_lkl()
+{
+    ip_add 2 $1
+}
+
+ip_host_mask()
+{
+    echo -n "$(ip_host $1)/$TEST_IP_NETMASK"
+}
+
+ip_net_mask()
+{
+    echo "$(ip_add 0 $1)/$TEST_IP_NETMASK"
+}
+
+ip6_host()
+{
+    ip6_add 1 $1
+}
+
+ip6_lkl()
+{
+    ip6_add 2 $1
+}
+
+ip6_host_mask()
+{
+    echo -n "$(ip6_host $1)/$TEST_IP6_NETMASK"
+}
+
+ip6_net_mask()
+{
+    echo "$(ip6_add 0 $1)/$TEST_IP6_NETMASK"
+}
+
+tap_ifname()
+{
+    echo -n "$TEST_TAP_IFNAME${1:-0}"
+}
+
+tap_prepare()
+{
+    if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+        if ! lkl_test_cmd test -d /dev/net &>/dev/null; then
+            lkl_test_cmd sudo mkdir /dev/net
+            lkl_test_cmd sudo ln -s /dev/tun /dev/net/tun
+        fi
+        TAP_USER="vpn"
+        ANDROID_USER="vpn,vpn,net_admin,inet"
+        export_vars ANDROID_USER
+    else
+        TAP_USER=$USER
+    fi
+}
+
+tap_setup()
+{
+    if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
+        lkl_test_cmd sudo ifconfig tap create
+        lkl_test_cmd sudo sysctl net.link.tap.up_on_open=1
+        lkl_test_cmd sudo sysctl net.link.tap.user_open=1
+        lkl_test_cmd sudo ifconfig $(tap_ifname) $(ip_host)
+        lkl_test_cmd sudo ifconfig $(tap_ifname) inet6 $(ip6_host)
+        return
+    fi
+
+    lkl_test_cmd sudo ip tuntap add dev $(tap_ifname $1) mode tap user $TAP_USER
+    lkl_test_cmd sudo ip link set dev $(tap_ifname $1) up
+    lkl_test_cmd sudo ip addr add dev $(tap_ifname $1) $(ip_host_mask $1)
+    lkl_test_cmd sudo ip -6 addr add dev $(tap_ifname $1) $(ip6_host_mask $1)
+
+    if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+        lkl_test_cmd sudo ip route add $(ip_net_mask $1) \
+                     dev $(tap_ifname $1) proto kernel scope link \
+                     src $(ip_host $1) table local
+        lkl_test_cmd sudo ip -6 route add $(ip6_net_mask $1) \
+                     dev $(tap_ifname $1) table local
+    fi
+}
+
+tap_cleanup()
+{
+    if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
+        lkl_test_cmd sudo ifconfig $(tap_ifname) destroy
+        return
+    fi
+
+    lkl_test_cmd sudo ip link set dev $(tap_ifname $1) down
+    lkl_test_cmd sudo ip tuntap del dev $(tap_ifname $1) mode tap
+}
diff --git a/tools/lkl/tests/net-test.c b/tools/lkl/tests/net-test.c
new file mode 100644
index 000000000000..d2fd19f1b995
--- /dev/null
+++ b/tools/lkl/tests/net-test.c
@@ -0,0 +1,317 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <errno.h>
+#ifdef __FreeBSD__
+#include <sys/types.h>
+#endif
+#ifdef __MINGW32__
+#include <winsock2.h>
+#else
+#include <sys/socket.h>
+#include <netinet/in.h>
+#include <arpa/inet.h>
+#endif
+
+#include <lkl.h>
+#include <lkl_host.h>
+
+#include "cla.h"
+#include "test.h"
+
+enum {
+	BACKEND_TAP,
+	BACKEND_MACVTAP,
+	BACKEND_RAW,
+	BACKEND_DPDK,
+	BACKEND_PIPE,
+	BACKEND_NONE,
+};
+
+const char *backends[] = { "tap", "macvtap", "raw", "dpdk", "pipe", "loopback",
+			   NULL };
+static struct {
+	int backend;
+	const char *ifname;
+	int dhcp, nmlen;
+	unsigned int ip, dst, gateway, sleep;
+} cla = {
+	.backend = BACKEND_NONE,
+	.ip = INADDR_NONE,
+	.gateway = INADDR_NONE,
+	.dst = INADDR_NONE,
+	.sleep = 0,
+};
+
+
+struct cl_arg args[] = {
+	{"backend", 'b', "network backend type", 1, CL_ARG_STR_SET,
+	 &cla.backend, backends},
+	{"ifname", 'i', "interface name", 1, CL_ARG_STR, &cla.ifname},
+	{"dhcp", 'd', "use dhcp to configure LKL", 0, CL_ARG_BOOL, &cla.dhcp},
+	{"ip", 'I', "IPv4 address to use", 1, CL_ARG_IPV4, &cla.ip},
+	{"netmask-len", 'n', "IPv4 netmask length", 1, CL_ARG_INT,
+	 &cla.nmlen},
+	{"gateway", 'g', "IPv4 gateway to use", 1, CL_ARG_IPV4, &cla.gateway},
+	{"dst", 'D', "IPv4 destination address", 1, CL_ARG_IPV4, &cla.dst},
+	{"sleep", 's', "sleep", 1, CL_ARG_INT, &cla.sleep},
+	{0},
+};
+
+u_short
+in_cksum(const u_short *addr, register int len, u_short csum)
+{
+	int nleft = len;
+	const u_short *w = addr;
+	u_short answer;
+	int sum = csum;
+
+	while (nleft > 1)  {
+		sum += *w++;
+		nleft -= 2;
+	}
+
+	if (nleft == 1)
+		sum += htons(*(u_char *)w << 8);
+
+	sum = (sum >> 16) + (sum & 0xffff);
+	sum += (sum >> 16);
+	answer = ~sum;
+	return answer;
+}
+
+static int lkl_test_sleep(void)
+{
+	struct lkl_timespec ts = {
+		.tv_sec = cla.sleep,
+	};
+	int ret;
+
+	ret = lkl_sys_nanosleep((struct __lkl__kernel_timespec *)&ts, NULL);
+	if (ret < 0) {
+		lkl_test_logf("nanosleep error: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static int lkl_test_icmp(void)
+{
+	int sock, ret;
+	struct lkl_iphdr *iph;
+	struct lkl_icmphdr *icmp;
+	struct lkl_sockaddr_in saddr;
+	struct lkl_pollfd pfd;
+	char buf[32];
+
+	if (cla.dst == INADDR_NONE)
+		return TEST_SKIP;
+
+	memset(&saddr, 0, sizeof(saddr));
+	saddr.sin_family = AF_INET;
+	saddr.sin_addr.lkl_s_addr = cla.dst;
+
+	lkl_test_logf("pinging %s\n",
+		      inet_ntoa(*(struct in_addr *)&saddr.sin_addr));
+
+	sock = lkl_sys_socket(LKL_AF_INET, LKL_SOCK_RAW, LKL_IPPROTO_ICMP);
+	if (sock < 0) {
+		lkl_test_logf("socket error (%s)\n", lkl_strerror(sock));
+		return TEST_FAILURE;
+	}
+
+	icmp = malloc(sizeof(*icmp));
+	icmp->type = LKL_ICMP_ECHO;
+	icmp->code = 0;
+	icmp->checksum = 0;
+	icmp->un.echo.sequence = 0;
+	icmp->un.echo.id = 0;
+	icmp->checksum = in_cksum((u_short *)icmp, sizeof(*icmp), 0);
+
+	ret = lkl_sys_sendto(sock, icmp, sizeof(*icmp), 0,
+			     (struct lkl_sockaddr *)&saddr,
+			     sizeof(saddr));
+	if (ret < 0) {
+		lkl_test_logf("sendto error (%s)\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	free(icmp);
+
+	pfd.fd = sock;
+	pfd.events = LKL_POLLIN;
+	pfd.revents = 0;
+
+	ret = lkl_sys_poll(&pfd, 1, 1000);
+	if (ret < 0) {
+		lkl_test_logf("poll error (%s)\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_recv(sock, buf, sizeof(buf), LKL_MSG_DONTWAIT);
+	if (ret < 0) {
+		lkl_test_logf("recv error (%s)\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	iph = (struct lkl_iphdr *)buf;
+	icmp = (struct lkl_icmphdr *)(buf + iph->ihl * 4);
+	/* DHCP server may issue an ICMP echo request to a dhcp client */
+	if ((icmp->type != LKL_ICMP_ECHOREPLY || icmp->code != 0) &&
+	    (icmp->type != LKL_ICMP_ECHO)) {
+		lkl_test_logf("no ICMP echo reply (type=%d, code=%d)\n",
+			      icmp->type, icmp->code);
+		return TEST_FAILURE;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static struct lkl_netdev *nd;
+
+static int lkl_test_nd_create(void)
+{
+	switch (cla.backend) {
+	case BACKEND_NONE:
+		return TEST_SKIP;
+	case BACKEND_TAP:
+		nd = lkl_netdev_tap_create(cla.ifname, 0);
+		break;
+	case BACKEND_DPDK:
+		nd = lkl_netdev_dpdk_create(cla.ifname, 0, NULL);
+		break;
+	case BACKEND_RAW:
+		nd = lkl_netdev_raw_create(cla.ifname);
+		break;
+	case BACKEND_MACVTAP:
+		nd = lkl_netdev_macvtap_create(cla.ifname, 0);
+		break;
+	case BACKEND_PIPE:
+		nd = lkl_netdev_pipe_create(cla.ifname, 0);
+		break;
+	}
+
+	if (!nd) {
+		lkl_test_logf("failed to create netdev\n");
+		return TEST_BAILOUT;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static int nd_id;
+
+static int lkl_test_nd_add(void)
+{
+	if (cla.backend == BACKEND_NONE)
+		return TEST_SKIP;
+
+	nd_id = lkl_netdev_add(nd, NULL);
+	if (nd_id < 0) {
+		lkl_test_logf("failed to add netdev: %s\n",
+			      lkl_strerror(nd_id));
+		return TEST_BAILOUT;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static int lkl_test_nd_remove(void)
+{
+	if (cla.backend == BACKEND_NONE)
+		return TEST_SKIP;
+
+	lkl_netdev_remove(nd_id);
+	lkl_netdev_free(nd);
+	return TEST_SUCCESS;
+}
+
+LKL_TEST_CALL(start_kernel, lkl_start_kernel, 0, &lkl_host_ops,
+	"mem=16M loglevel=8 %s", cla.dhcp ? "ip=dhcp" : "");
+LKL_TEST_CALL(stop_kernel, lkl_sys_halt, 0);
+
+static int nd_ifindex;
+
+static int lkl_test_nd_ifindex(void)
+{
+	if (cla.backend == BACKEND_NONE)
+		return TEST_SKIP;
+
+	nd_ifindex = lkl_netdev_get_ifindex(nd_id);
+	if (nd_ifindex < 0) {
+		lkl_test_logf("failed to get ifindex for netdev id %d: %s\n",
+			      nd_id, lkl_strerror(nd_ifindex));
+		return TEST_BAILOUT;
+	}
+
+	return TEST_SUCCESS;
+}
+
+LKL_TEST_CALL(if_up, lkl_if_up, 0,
+	      cla.backend == BACKEND_NONE ? 1 : nd_ifindex);
+
+static int lkl_test_set_ipv4(void)
+{
+	int ret;
+
+	if (cla.backend == BACKEND_NONE || cla.ip == LKL_INADDR_NONE)
+		return TEST_SKIP;
+
+	ret = lkl_if_set_ipv4(nd_ifindex, cla.ip, cla.nmlen);
+	if (ret < 0) {
+		lkl_test_logf("failed to set IPv4 address: %s\n",
+			      lkl_strerror(ret));
+		return TEST_BAILOUT;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static int lkl_test_set_gateway(void)
+{
+	int ret;
+
+	if (cla.backend == BACKEND_NONE || cla.gateway == LKL_INADDR_NONE)
+		return TEST_SKIP;
+
+	ret = lkl_set_ipv4_gateway(cla.gateway);
+	if (ret < 0) {
+		lkl_test_logf("failed to set IPv4 gateway: %s\n",
+			      lkl_strerror(ret));
+		return TEST_BAILOUT;
+	}
+
+	return TEST_SUCCESS;
+}
+
+struct lkl_test tests[] = {
+	LKL_TEST(nd_create),
+	LKL_TEST(nd_add),
+	LKL_TEST(start_kernel),
+	LKL_TEST(nd_ifindex),
+	LKL_TEST(if_up),
+	LKL_TEST(set_ipv4),
+	LKL_TEST(set_gateway),
+	LKL_TEST(sleep),
+	LKL_TEST(icmp),
+	LKL_TEST(nd_remove),
+	LKL_TEST(stop_kernel),
+};
+
+int main(int argc, const char **argv)
+{
+	if (parse_args(argc, argv, args) < 0)
+		return -1;
+
+	if (cla.ip != LKL_INADDR_NONE && (cla.nmlen < 0 || cla.nmlen > 32)) {
+		fprintf(stderr, "invalid netmask length %d\n", cla.nmlen);
+		return -1;
+	}
+
+	lkl_host_ops.print = lkl_test_log;
+
+	return lkl_test_run(tests, sizeof(tests)/sizeof(struct lkl_test),
+			    "net %s", backends[cla.backend]);
+}
diff --git a/tools/lkl/tests/net.sh b/tools/lkl/tests/net.sh
new file mode 100755
index 000000000000..cd8de53fe0fd
--- /dev/null
+++ b/tools/lkl/tests/net.sh
@@ -0,0 +1,186 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+
+source $script_dir/test.sh
+source $script_dir/net-setup.sh
+
+cleanup_backend()
+{
+    set -e
+
+    case "$1" in
+    "tap")
+        tap_cleanup
+        ;;
+    "pipe")
+        rm -rf $work_dir
+        ;;
+    "raw")
+        ;;
+    "macvtap")
+        sudo ip link del dev $(tap_ifname) type macvtap
+        ;;
+    "loopback")
+        ;;
+    esac
+}
+
+get_test_ip()
+{
+    # DHCP test parameters
+    TEST_HOST=8.8.8.8
+    HOST_IF=$(lkl_test_cmd ip route get $TEST_HOST | head -n1 |cut -d ' ' -f5)
+    HOST_GW=$(lkl_test_cmd ip route get $TEST_HOST | head -n1 | cut -d ' ' -f3)
+    if lkl_test_cmd ping -c1 -w1 $HOST_GW; then
+        TEST_IP_REMOTE=$HOST_GW
+    elif lkl_test_cmd ping -c1 -w1 $TEST_HOST; then
+        TEST_IP_REMOTE=$TEST_HOST
+    else
+        echo "could not find remote test ip"
+        return $TEST_SKIP
+    fi
+
+    export_vars HOST_IF TEST_IP_REMOTE
+}
+
+setup_backend()
+{
+    set -e
+
+    if [ "$LKL_HOST_CONFIG_POSIX" != "y" ] &&
+       [ "$1" != "loopback" ]; then
+        echo "not a posix environment"
+        return $TEST_SKIP
+    fi
+
+    case "$1" in
+    "loopback")
+        ;;
+    "pipe")
+        if [ -z "$(lkl_test_cmd which mkfifo)" ]; then
+            echo "no mkfifo command"
+            return $TEST_SKIP
+        else
+            work_dir=$(lkl_test_cmd mktemp -d)
+        fi
+        fifo1=$work_dir/fifo1
+        fifo2=$work_dir/fifo2
+        lkl_test_cmd mkfifo $fifo1
+        lkl_test_cmd mkfifo $fifo2
+        export_vars work_dir fifo1 fifo2
+        ;;
+    "tap")
+        tap_prepare
+        if ! lkl_test_cmd test -c /dev/net/tun; then
+            if [ -z "$LKL_HOST_CONFIG_BSD" ]; then
+                echo "missing /dev/net/tun"
+                return $TEST_SKIP
+            fi
+        fi
+        tap_setup
+        ;;
+    "raw")
+        if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
+            return $TEST_SKIP
+        fi
+        get_test_ip
+        ;;
+    "macvtap")
+        get_test_ip
+        if ! lkl_test_cmd sudo ip link add link $HOST_IF \
+             name $(tap_ifname) type macvtap mode passthru; then
+            echo "failed to create macvtap, skipping"
+            return $TEST_SKIP
+        fi
+        MACVTAP=/dev/tap$(lkl_test_cmd ip link show dev $(tap_ifname) | \
+                                 grep -o '^[0-9]*')
+        lkl_test_cmd sudo ip link set dev $(tap_ifname) up
+        lkl_test_cmd sudo chown $USER $MACVTAP
+        export_vars MACVTAP
+        ;;
+    "dpdk")
+        if [ -z "$LKL_TEST_NET_DPDK" ]; then
+            echo "DPDK needs user setup"
+            return $TEST_SKIP
+        fi
+        ;;
+    *)
+        echo "don't know how to setup backend $1"
+        return $TEST_FAILURE
+        ;;
+    esac
+}
+
+run_tests()
+{
+    case "$1" in
+    "loopback")
+        lkl_test_exec $script_dir/net-test --dst 127.0.0.1
+        ;;
+    "pipe")
+        VALGRIND="" lkl_test_exec $script_dir/net-test --backend pipe \
+                      --ifname "$fifo1|$fifo2" \
+                      --ip $(ip_host) --netmask-len $TEST_IP_NETMASK \
+                      --sleep 1800 >/dev/null &
+        cp $script_dir/net-test $script_dir/net-test2
+
+        sleep 10
+        lkl_test_exec $script_dir/net-test2 --backend pipe \
+                      --ifname "$fifo2|$fifo1" \
+                      --ip $(ip_lkl) --netmask-len $TEST_IP_NETMASK \
+                      --dst $(ip_host)
+        rm -f $script_dir/net-test2
+        kill $!
+        wait $! 2>/dev/null
+        ;;
+    "tap")
+        lkl_test_exec $script_dir/net-test --backend tap \
+                      --ifname $(tap_ifname) \
+                      --ip $(ip_lkl) --netmask-len $TEST_IP_NETMASK \
+                      --dst $(ip_host)
+        ;;
+    "raw")
+        lkl_test_exec sudo $script_dir/net-test --backend raw \
+                      --ifname $HOST_IF --dhcp --dst $TEST_IP_REMOTE
+        ;;
+    "macvtap")
+        lkl_test_exec $script_dir/net-test --backend macvtap \
+                      --ifname $MACVTAP \
+                      --dhcp --dst $TEST_IP_REMOTE
+        ;;
+    "dpdk")
+        lkl_test_exec sudo $script_dir/net-test --backend dpdk \
+                      --ifname dpdk0 \
+                      --ip $(ip_lkl) --netmask-len $TEST_IP_NETMASK \
+                      --dst $(ip_host)
+        ;;
+    esac
+}
+
+if [ "$1" = "-b" ]; then
+    shift
+    backend=$1
+    shift
+fi
+
+if [ -z "$backend" ]; then
+    backend="loopback"
+fi
+
+lkl_test_plan 1 "net $backend"
+lkl_test_run 1 setup_backend $backend
+
+if [ $? = $TEST_SKIP ]; then
+    exit 0
+fi
+
+trap "cleanup_backend $backend" EXIT
+
+run_tests $backend
+
+trap : EXIT
+lkl_test_plan 1 "net $backend"
+lkl_test_run 1 cleanup_backend $backend
+
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 24/37] lkl tools: virtio: add network device support
@ 2019-11-08  5:02     ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: linux-arch, Patrick Collins, Xiao Jia, Octavian Purdila,
	Motomu Utsumi, Akira Moroo, Yuan Liu, Thomas Liebetraut,
	David Disseldorp, linux-kernel-library, Hajime Tazaki

From: Octavian Purdila <tavi.purdila@gmail.com>

This commit adds a basic virtio_net device implementation for use by the
virtio-net driver in LKL. It also adds several virtio_net backends (tap,
macvtap, raw socket, pipe, DPDK and VDE) that can be used as network devices.
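
As an illustration for reviewers (not part of the patch): the tap and raw
backends in this series both put a host file descriptor into non-blocking
mode before handing it to lkl_register_netdev_fd(). A minimal sketch of that
step on a POSIX host could look like the following; set_nonblocking() is a
hypothetical helper name used only for this example.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of this patch: switch a host descriptor
 * to non-blocking mode. File status flags such as O_NONBLOCK are read
 * with F_GETFL and written back with F_SETFL (F_GETFD would return the
 * descriptor flags, e.g. FD_CLOEXEC, instead).
 *
 * Returns 0 on success, -1 on error with errno set by fcntl().
 */
int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}
```

A caller would invoke set_nonblocking(fd) right after socket() or open() and
treat a negative return value as a setup failure.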

Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Thomas Liebetraut <thomas@tommie-lie.de>
Signed-off-by: Xiao Jia <xiaoj@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/include/lkl.h            | 392 ++++++++++++++
 tools/lkl/include/lkl_host.h       |  76 +++
 tools/lkl/lib/Build                |  10 +
 tools/lkl/lib/net.c                | 818 +++++++++++++++++++++++++++++
 tools/lkl/lib/virtio_net.c         | 322 ++++++++++++
 tools/lkl/lib/virtio_net_dpdk.c    | 480 +++++++++++++++++
 tools/lkl/lib/virtio_net_fd.c      | 217 ++++++++
 tools/lkl/lib/virtio_net_fd.h      |  28 +
 tools/lkl/lib/virtio_net_macvtap.c |  32 ++
 tools/lkl/lib/virtio_net_pipe.c    |  76 +++
 tools/lkl/lib/virtio_net_raw.c     |  94 ++++
 tools/lkl/lib/virtio_net_tap.c     | 111 ++++
 tools/lkl/lib/virtio_net_vde.c     | 168 ++++++
 tools/lkl/tests/net-setup.sh       | 134 +++++
 tools/lkl/tests/net-test.c         | 317 +++++++++++
 tools/lkl/tests/net.sh             | 186 +++++++
 16 files changed, 3461 insertions(+)
 create mode 100644 tools/lkl/lib/net.c
 create mode 100644 tools/lkl/lib/virtio_net.c
 create mode 100644 tools/lkl/lib/virtio_net_dpdk.c
 create mode 100644 tools/lkl/lib/virtio_net_fd.c
 create mode 100644 tools/lkl/lib/virtio_net_fd.h
 create mode 100644 tools/lkl/lib/virtio_net_macvtap.c
 create mode 100644 tools/lkl/lib/virtio_net_pipe.c
 create mode 100644 tools/lkl/lib/virtio_net_raw.c
 create mode 100644 tools/lkl/lib/virtio_net_tap.c
 create mode 100644 tools/lkl/lib/virtio_net_vde.c
 create mode 100644 tools/lkl/tests/net-setup.sh
 create mode 100644 tools/lkl/tests/net-test.c
 create mode 100755 tools/lkl/tests/net.sh

diff --git a/tools/lkl/include/lkl.h b/tools/lkl/include/lkl.h
index 8bda12d4c6de..710fa38af905 100644
--- a/tools/lkl/include/lkl.h
+++ b/tools/lkl/include/lkl.h
@@ -529,6 +529,398 @@ int lkl_dirfd(struct lkl_dir *dir);
  */
 int lkl_mount_fs(char *fstype);
 
+/**
+ * lkl_if_up - activate network interface
+ *
+ * @ifindex - the ifindex of the interface
+ * @returns - 0 on success, otherwise a negative value
+ */
+int lkl_if_up(int ifindex);
+
+/**
+ * lkl_if_down - deactivate network interface
+ *
+ * @ifindex - the ifindex of the interface
+ * @returns - 0 on success, otherwise a negative value
+ */
+int lkl_if_down(int ifindex);
+
+/**
+ * lkl_if_set_mtu - set MTU on interface
+ *
+ * @ifindex - the ifindex of the interface
+ * @mtu - the requested MTU size
+ * @returns - 0 on success, otherwise a negative value
+ */
+int lkl_if_set_mtu(int ifindex, int mtu);
+
+/**
+ * lkl_if_set_ipv4 - set IPv4 address on interface
+ *
+ * @ifindex - the ifindex of the interface
+ * @addr - 4-byte IP address (i.e., struct in_addr)
+ * @netmask_len - prefix length of the @addr
+ * @returns - 0 on success, otherwise a negative value
+ */
+int lkl_if_set_ipv4(int ifindex, unsigned int addr, unsigned int netmask_len);
+
+/**
+ * lkl_set_ipv4_gateway - add an IPv4 default route
+ *
+ * @addr - 4-byte IP address of the gateway (i.e., struct in_addr)
+ * @returns - 0 on success, otherwise a negative value
+ */
+int lkl_set_ipv4_gateway(unsigned int addr);
+
+/**
+ * lkl_if_set_ipv4_gateway - add an IPv4 default route in rule table
+ *
+ * @ifindex - the ifindex of the interface, used for tableid calculation
+ * @addr - 4-byte IP address of the interface
+ * @netmask_len - prefix length of the @addr
+ * @gw_addr - 4-byte IP address of the gateway
+ * @returns - 0 on success, otherwise a negative value
+ */
+int lkl_if_set_ipv4_gateway(int ifindex, unsigned int addr,
+		unsigned int netmask_len, unsigned int gw_addr);
+
+/**
+ * lkl_if_set_ipv6 - set IPv6 address on interface
+ * must be called after interface is up.
+ *
+ * @ifindex - the ifindex of the interface
+ * @addr - 16-byte IPv6 address (i.e., struct in6_addr)
+ * @netprefix_len - prefix length of the @addr
+ * @returns - 0 on success, otherwise a negative value
+ */
+int lkl_if_set_ipv6(int ifindex, void *addr, unsigned int netprefix_len);
+
+/**
+ * lkl_set_ipv6_gateway - add an IPv6 default route
+ *
+ * @addr - 16-byte IPv6 address of the gateway (i.e., struct in6_addr)
+ * @returns - 0 on success, otherwise a negative value
+ */
+int lkl_set_ipv6_gateway(void *addr);
+
+/**
+ * lkl_if_set_ipv6_gateway - add an IPv6 default route in rule table
+ *
+ * @ifindex - the ifindex of the interface, used for tableid calculation
+ * @addr - 16-byte IP address of the interface
+ * @netmask_len - prefix length of the @addr
+ * @gw_addr - 16-byte IP address of the gateway (i.e., struct in6_addr)
+ * @returns - 0 on success, otherwise a negative value
+ */
+int lkl_if_set_ipv6_gateway(int ifindex, void *addr,
+		unsigned int netmask_len, void *gw_addr);
+
+/**
+ * lkl_ifname_to_ifindex - obtain ifindex of an interface by name
+ *
+ * @name - string of an interface
+ * @returns - the ifindex of the interface if no error
+ */
+int lkl_ifname_to_ifindex(const char *name);
+
+/**
+ * lkl_netdev - host network device handle, defined in lkl_host.h.
+ */
+struct lkl_netdev;
+
+/**
+ * lkl_netdev_args - arguments to lkl_netdev_add
+ * @mac - optional MAC address for the device
+ * @offload - offload bits for the device
+ */
+struct lkl_netdev_args {
+	void *mac;
+	unsigned int offload;
+};
+
+/**
+ * lkl_netdev_add - add a new network device
+ *
+ * Must be called before calling lkl_start_kernel.
+ *
+ * @nd - the network device host handle
+ * @args - arguments that configure the netdev. Can be NULL
+ * @returns a network device id (0 is valid) or a strictly negative value in
+ * case of error
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET
+int lkl_netdev_add(struct lkl_netdev *nd, struct lkl_netdev_args *args);
+#else
+static inline int lkl_netdev_add(struct lkl_netdev *nd,
+				 struct lkl_netdev_args *args)
+{
+	return -LKL_ENOSYS;
+}
+#endif
+
+/**
+ * lkl_netdev_remove - remove a previously added network device
+ *
+ * Attempts to release all resources held by a network device created
+ * via lkl_netdev_add.
+ *
+ * @id - the network device id, as returned by @lkl_netdev_add
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET
+void lkl_netdev_remove(int id);
+#else
+static inline void lkl_netdev_remove(int id)
+{
+}
+#endif
+
+/**
+ * lkl_netdev_free - frees a network device
+ *
+ * @nd - the network device to free
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET
+void lkl_netdev_free(struct lkl_netdev *nd);
+#else
+static inline void lkl_netdev_free(struct lkl_netdev *nd)
+{
+}
+#endif
+
+/**
+ * lkl_netdev_get_ifindex - retrieve the interface index for a given network
+ * device id
+ *
+ * @id - the network device id
+ * @returns the interface index or a strictly negative value in case of error
+ */
+int lkl_netdev_get_ifindex(int id);
+
+/**
+ * lkl_netdev_tap_create - create TAP net_device for the virtio net backend
+ *
+ * @ifname - interface name for the TAP device. It needs to be configured
+ * on the host in advance
+ * @offload - offload bits for the device
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET
+struct lkl_netdev *lkl_netdev_tap_create(const char *ifname, int offload);
+#else
+static inline struct lkl_netdev *
+lkl_netdev_tap_create(const char *ifname, int offload)
+{
+	return NULL;
+}
+#endif
+
+/**
+ * lkl_netdev_dpdk_create - create DPDK net_device for the virtio net backend
+ *
+ * @ifname - interface name for the DPDK device. The name is only used
+ * internally.
+ * @offload - offload bits for the device
+ * @mac - pointer to the MAC address of the DPDK device
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET_DPDK
+struct lkl_netdev *lkl_netdev_dpdk_create(const char *ifname, int offload,
+					unsigned char *mac);
+#else
+static inline struct lkl_netdev *
+lkl_netdev_dpdk_create(const char *ifname, int offload, unsigned char *mac)
+{
+	return NULL;
+}
+#endif
+
+/**
+ * lkl_netdev_vde_create - create VDE net_device for the virtio net backend
+ *
+ * @switch_path - path to the VDE switch directory. The switch needs to be
+ * started on the host in advance.
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET_VDE
+struct lkl_netdev *lkl_netdev_vde_create(const char *switch_path);
+#else
+static inline struct lkl_netdev *lkl_netdev_vde_create(const char *switch_path)
+{
+	return NULL;
+}
+#endif
+
+/**
+ * lkl_netdev_raw_create - create raw socket net_device for the virtio net
+ *                         backend
+ *
+ * @ifname - interface name for the snoop device.
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET
+struct lkl_netdev *lkl_netdev_raw_create(const char *ifname);
+#else
+static inline struct lkl_netdev *lkl_netdev_raw_create(const char *ifname)
+{
+	return NULL;
+}
+#endif
+
+/**
+ * lkl_netdev_macvtap_create - create macvtap net_device for the virtio
+ * net backend
+ *
+ * @path - a file name for the macvtap device. Needs to be configured
+ * on the host in advance
+ * @offload - offload bits for the device
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET_MACVTAP
+struct lkl_netdev *lkl_netdev_macvtap_create(const char *path, int offload);
+#else
+static inline struct lkl_netdev *
+lkl_netdev_macvtap_create(const char *path, int offload)
+{
+	return NULL;
+}
+#endif
+
+/**
+ * lkl_netdev_pipe_create - create pipe net_device for the virtio
+ * net backend
+ *
+ * @ifname - file names for the rx and tx pipe devices. Need to be configured
+ * on the host in advance. The delimiter is "|", e.g. "rx_name|tx_name".
+ * @offload - offload bits for the device
+ */
+#ifdef LKL_HOST_CONFIG_VIRTIO_NET
+struct lkl_netdev *lkl_netdev_pipe_create(const char *ifname, int offload);
+#else
+static inline struct lkl_netdev *
+lkl_netdev_pipe_create(const char *ifname, int offload)
+{
+	return NULL;
+}
+#endif
+
+/*
+ * lkl_register_dbg_handler - register a signal handler that loads a debug lib.
+ *
+ * The signal handler is triggered by Ctrl-Z. It creates a new pthread which
+ * calls dbg_entrance().
+ *
+ * If you run the program from shell script, make sure you ignore SIGTSTP by
+ * "trap '' TSTP" in the shell script.
+ */
+void lkl_register_dbg_handler(void);
+
+/**
+ * lkl_add_neighbor - add a permanent arp entry
+ * @ifindex - the ifindex of the interface
+ * @af - address family of the ip address. Must be LKL_AF_INET or LKL_AF_INET6
+ * @addr - ip address of the entry in network byte order
+ * @mac - mac address of the entry
+ */
+int lkl_add_neighbor(int ifindex, int af, void *addr, void *mac);
+
+/**
+ * lkl_if_add_ip - add an ip address
+ * @ifindex - the ifindex of the interface
+ * @af - address family of the ip address. Must be LKL_AF_INET or LKL_AF_INET6
+ * @addr - ip address of the entry in network byte order
+ * @netprefix_len - prefix length of the @addr
+ */
+int lkl_if_add_ip(int ifindex, int af, void *addr, unsigned int netprefix_len);
+
+/**
+ * lkl_if_del_ip - delete an ip address
+ * @ifindex - the ifindex of the interface
+ * @af - address family of the ip address. Must be LKL_AF_INET or LKL_AF_INET6
+ * @addr - ip address of the entry in network byte order
+ * @netprefix_len - prefix length of the @addr
+ */
+int lkl_if_del_ip(int ifindex, int af, void *addr, unsigned int netprefix_len);
+
+/**
+ * lkl_add_gateway - add a gateway
+ * @af - address family of the ip address. Must be LKL_AF_INET or LKL_AF_INET6
+ * @gwaddr - 4-byte IP address of the gateway (i.e., struct in_addr)
+ */
+int lkl_add_gateway(int af, void *gwaddr);
+
+/**
+ * XXX Should I use OIF selector?
+ * temporary table idx = ifindex * 2 + 0 <- ipv4
+ * temporary table idx = ifindex * 2 + 1 <- ipv6
+ */
+/**
+ * lkl_if_add_rule_from_saddr - create an ip rule table with "from" selector
+ * @ifindex - the ifindex of the interface, used for table id calculation
+ * @af - address family of the ip address. Must be LKL_AF_INET or LKL_AF_INET6
+ * @saddr - network byte order ip address, "from" selector address of this rule
+ */
+int lkl_if_add_rule_from_saddr(int ifindex, int af, void *saddr);
+
+/**
+ * lkl_if_add_gateway - add gateway to rule table
+ * @ifindex - the ifindex of the interface, used for table id calculation
+ * @af - address family of the ip address. Must be LKL_AF_INET or LKL_AF_INET6
+ * @gwaddr - 4-byte IP address of the gateway (i.e., struct in_addr)
+ */
+int lkl_if_add_gateway(int ifindex, int af, void *gwaddr);
+
+/**
+ * lkl_if_add_linklocal - add linklocal route to rule table
+ * @ifindex - the ifindex of the interface, used for table id calculation
+ * @af - address family of the ip address. Must be LKL_AF_INET or LKL_AF_INET6
+ * @addr - ip address of the entry in network byte order
+ * @netprefix_len - prefix length of the @addr
+ */
+int lkl_if_add_linklocal(int ifindex, int af,  void *addr, int netprefix_len);
+
+/**
+ * lkl_if_wait_ipv6_dad - wait for DAD to be done for an ipv6 address
+ * must be called after interface is up
+ *
+ * @ifindex - the ifindex of the interface
+ * @addr - ip address of the entry in network byte order
+ */
+int lkl_if_wait_ipv6_dad(int ifindex, void *addr);
+
+/**
+ * lkl_set_fd_limit - set the maximum number of file descriptors allowed
+ * @fd_limit - fd max limit
+ */
+int lkl_set_fd_limit(unsigned int fd_limit);
+
+/**
+ * lkl_qdisc_add - set qdisc rule onto an interface
+ *
+ * @ifindex - the ifindex of the interface
+ * @root - the name of root class (e.g., "root")
+ * @type - the type of qdisc (e.g., "fq")
+ */
+int lkl_qdisc_add(int ifindex, const char *root, const char *type);
+
+/**
+ * lkl_qdisc_parse_add - Add a qdisc entry for an interface with strings
+ *
+ * @ifindex - the ifindex of the interface
+ * @entries - strings of qdisc configurations in the form of
+ *            "root|type;root|type;..."
+ */
+void lkl_qdisc_parse_add(int ifindex, const char *entries);
+
+/**
+ * lkl_sysctl - write a sysctl value
+ *
+ * @path - the path to a sysctl entry (e.g., "net.ipv4.tcp_wmem")
+ * @value - the value of the sysctl (e.g., "4096 87380 2147483647")
+ */
+int lkl_sysctl(const char *path, const char *value);
+
+/**
+ * lkl_sysctl_parse_write - Configure sysctl parameters with strings
+ *
+ * @sysctls - sysctl parameters in the form of "key=value;..."
+ */
+void lkl_sysctl_parse_write(const char *sysctls);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/tools/lkl/include/lkl_host.h b/tools/lkl/include/lkl_host.h
index a630efc95f0f..ab9c3f2a69fb 100644
--- a/tools/lkl/include/lkl_host.h
+++ b/tools/lkl/include/lkl_host.h
@@ -76,6 +76,82 @@ struct lkl_dev_blk_ops {
 	int (*request)(struct lkl_disk disk, struct lkl_blk_req *req);
 };
 
+struct lkl_netdev {
+	struct lkl_dev_net_ops *ops;
+	int id;
+	uint8_t has_vnet_hdr: 1;
+};
+
+/**
+ * struct lkl_dev_net_ops - network device host operations
+ */
+struct lkl_dev_net_ops {
+	/**
+	 * @tx: writes an L2 packet into the net device
+	 *
+	 * The data buffer can only hold 0 or 1 complete packets.
+	 *
+	 * @nd - pointer to the network device;
+	 * @iov - pointer to the buffer vector;
+	 * @cnt - # of vectors in iov.
+	 *
+	 * @returns number of bytes transmitted
+	 */
+	int (*tx)(struct lkl_netdev *nd, struct iovec *iov, int cnt);
+
+	/**
+	 * @rx: reads a packet from the net device.
+	 *
+	 * It must only read one complete packet if present.
+	 *
+	 * If the buffer is too small for the packet, the implementation may
+	 * decide to drop it or trim it.
+	 *
+	 * @nd - pointer to the network device
+	 * @iov - pointer to the buffer vector to store the packet
+	 * @cnt - # of vectors in iov.
+	 *
+	 * @returns number of bytes read for success or < 0 if error
+	 */
+	int (*rx)(struct lkl_netdev *nd, struct iovec *iov, int cnt);
+
+#define LKL_DEV_NET_POLL_RX		1
+#define LKL_DEV_NET_POLL_TX		2
+#define LKL_DEV_NET_POLL_HUP		4
+
+	/**
+	 * @poll: polls a net device
+	 *
+	 * Supports the following events: LKL_DEV_NET_POLL_RX
+	 * (readable), LKL_DEV_NET_POLL_TX (writable) or
+	 * LKL_DEV_NET_POLL_HUP (the close operation has been issued
+	 * and we need to clean up). Blocks until one event is
+	 * available.
+	 *
+	 * @nd - pointer to the network device
+	 *
+	 * @returns - LKL_DEV_NET_POLL_RX, LKL_DEV_NET_POLL_TX,
+	 * LKL_DEV_NET_POLL_HUP or a negative value for errors
+	 */
+	int (*poll)(struct lkl_netdev *nd);
+
+	/**
+	 * @poll_hup: make poll wakeup and return LKL_DEV_NET_POLL_HUP
+	 *
+	 * @nd - pointer to the network device
+	 */
+	void (*poll_hup)(struct lkl_netdev *nd);
+
+	/**
+	 * @free: frees a network device
+	 *
+	 * Implementation must release its resources and free the network device
+	 * structure.
+	 *
+	 * @nd - pointer to the network device
+	 */
+	void (*free)(struct lkl_netdev *nd);
+};
 
 #ifdef __cplusplus
 }
diff --git a/tools/lkl/lib/Build b/tools/lkl/lib/Build
index a7a3bff27bb1..1f1d55f259a3 100644
--- a/tools/lkl/lib/Build
+++ b/tools/lkl/lib/Build
@@ -1,8 +1,10 @@
 CFLAGS_config.o += -I$(srctree)/tools/perf/pmu-events
 CFLAGS_posix-host.o += -D_FILE_OFFSET_BITS=64
+CFLAGS_virtio_net_vde.o += $(shell pkg-config --cflags vdeplug 2>/dev/null)
 
 liblkl-y += fs.o
 liblkl-y += iomem.o
+liblkl-y += net.o
 liblkl-y += jmp_buf.o
 liblkl-$(LKL_HOST_CONFIG_POSIX) += posix-host.o
 liblkl-y += utils.o
@@ -10,5 +12,13 @@ liblkl-y += virtio_blk.o
 liblkl-y += virtio.o
 liblkl-y += dbg.o
 liblkl-y += dbg_handler.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET) += virtio_net.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET) += virtio_net_fd.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET) += virtio_net_tap.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET) += virtio_net_raw.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET_MACVTAP) += virtio_net_macvtap.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET_DPDK) += virtio_net_dpdk.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET_VDE) += virtio_net_vde.o
+liblkl-$(LKL_HOST_CONFIG_VIRTIO_NET) += virtio_net_pipe.o
 liblkl-y += ../../perf/pmu-events/jsmn.o
 liblkl-y += config.o
diff --git a/tools/lkl/lib/net.c b/tools/lkl/lib/net.c
new file mode 100644
index 000000000000..316965ffd21e
--- /dev/null
+++ b/tools/lkl/lib/net.c
@@ -0,0 +1,818 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <string.h>
+#include <stdio.h>
+#include "endian.h"
+#include <lkl_host.h>
+
+#ifdef __MINGW32__
+#include <ws2tcpip.h>
+
+int lkl_inet_pton(int af, const char *src, void *dst)
+{
+	struct addrinfo hint, *res = NULL;
+	int err;
+
+	memset(&hint, 0, sizeof(struct addrinfo));
+
+	hint.ai_family = af;
+	hint.ai_flags = AI_NUMERICHOST;
+
+	err = getaddrinfo(src, NULL, &hint, &res);
+	if (err)
+		return 0;
+
+	switch (af) {
+	case AF_INET:
+		*(struct in_addr *)dst =
+			((struct sockaddr_in *)res->ai_addr)->sin_addr;
+		break;
+	case AF_INET6:
+		*(struct in6_addr *)dst =
+			((struct sockaddr_in6 *)res->ai_addr)->sin6_addr;
+		break;
+	default:
+		freeaddrinfo(res);
+		return 0;
+	}
+
+	freeaddrinfo(res);
+	return 1;
+}
+#endif
+
+static inline void set_sockaddr(struct lkl_sockaddr_in *sin, unsigned int addr,
+				unsigned short port)
+{
+	sin->sin_family = LKL_AF_INET;
+	sin->sin_addr.lkl_s_addr = addr;
+	sin->sin_port = port;
+}
+
+static inline int ifindex_to_name(int sock, struct lkl_ifreq *ifr, int ifindex)
+{
+	ifr->lkl_ifr_ifindex = ifindex;
+	return lkl_sys_ioctl(sock, LKL_SIOCGIFNAME, (long)ifr);
+}
+
+int lkl_ifname_to_ifindex(const char *name)
+{
+	struct lkl_ifreq ifr;
+	int fd, ret;
+
+	fd = lkl_sys_socket(LKL_AF_INET, LKL_SOCK_DGRAM, 0);
+	if (fd < 0)
+		return fd;
+
+	snprintf(ifr.lkl_ifr_name, sizeof(ifr.lkl_ifr_name), "%s", name);
+
+	ret = lkl_sys_ioctl(fd, LKL_SIOCGIFINDEX, (long)&ifr);
+	lkl_sys_close(fd);
+	if (ret < 0)
+		return ret;
+	return ifr.lkl_ifr_ifindex;
+}
+
+int lkl_if_up(int ifindex)
+{
+	struct lkl_ifreq ifr;
+	int err, sock = lkl_sys_socket(LKL_AF_INET, LKL_SOCK_DGRAM, 0);
+
+	if (sock < 0)
+		return sock;
+	err = ifindex_to_name(sock, &ifr, ifindex);
+	if (err < 0)
+		goto out;
+
+	err = lkl_sys_ioctl(sock, LKL_SIOCGIFFLAGS, (long)&ifr);
+	if (!err) {
+		ifr.lkl_ifr_flags |= LKL_IFF_UP;
+		err = lkl_sys_ioctl(sock, LKL_SIOCSIFFLAGS, (long)&ifr);
+	}
+
+out:
+	lkl_sys_close(sock);
+	return err;
+}
+
+int lkl_if_down(int ifindex)
+{
+	struct lkl_ifreq ifr;
+	int err, sock;
+
+	sock = lkl_sys_socket(LKL_AF_INET, LKL_SOCK_DGRAM, 0);
+	if (sock < 0)
+		return sock;
+
+	err = ifindex_to_name(sock, &ifr, ifindex);
+	if (err < 0)
+		goto out;
+
+	err = lkl_sys_ioctl(sock, LKL_SIOCGIFFLAGS, (long)&ifr);
+	if (!err) {
+		ifr.lkl_ifr_flags &= ~LKL_IFF_UP;
+		err = lkl_sys_ioctl(sock, LKL_SIOCSIFFLAGS, (long)&ifr);
+	}
+
+out:
+	lkl_sys_close(sock);
+	return err;
+}
+
+int lkl_if_set_mtu(int ifindex, int mtu)
+{
+	struct lkl_ifreq ifr;
+	int err, sock;
+
+	sock = lkl_sys_socket(LKL_AF_INET, LKL_SOCK_DGRAM, 0);
+	if (sock < 0)
+		return sock;
+
+	err = ifindex_to_name(sock, &ifr, ifindex);
+	if (err < 0)
+		goto out;
+
+	ifr.lkl_ifr_mtu = mtu;
+
+	err = lkl_sys_ioctl(sock, LKL_SIOCSIFMTU, (long)&ifr);
+
+out:
+	lkl_sys_close(sock);
+	return err;
+}
+
+int lkl_if_set_ipv4(int ifindex, unsigned int addr, unsigned int netmask_len)
+{
+	return lkl_if_add_ip(ifindex, LKL_AF_INET, &addr, netmask_len);
+}
+
+int lkl_if_set_ipv4_gateway(int ifindex, unsigned int src_addr,
+		unsigned int src_masklen, unsigned int via_addr)
+{
+	int err;
+
+	err = lkl_if_add_rule_from_saddr(ifindex, LKL_AF_INET, &src_addr);
+	if (err)
+		return err;
+	err = lkl_if_add_linklocal(ifindex, LKL_AF_INET,
+					&src_addr, src_masklen);
+	if (err)
+		return err;
+	return lkl_if_add_gateway(ifindex, LKL_AF_INET, &via_addr);
+}
+
+int lkl_set_ipv4_gateway(unsigned int addr)
+{
+	return lkl_add_gateway(LKL_AF_INET, &addr);
+}
+
+int lkl_netdev_get_ifindex(int id)
+{
+	struct lkl_ifreq ifr;
+	int sock, ret;
+
+	sock = lkl_sys_socket(LKL_AF_INET, LKL_SOCK_DGRAM, 0);
+	if (sock < 0)
+		return sock;
+
+	snprintf(ifr.lkl_ifr_name, sizeof(ifr.lkl_ifr_name), "eth%d", id);
+	ret = lkl_sys_ioctl(sock, LKL_SIOCGIFINDEX, (long)&ifr);
+	lkl_sys_close(sock);
+
+	return ret < 0 ? ret : ifr.lkl_ifr_ifindex;
+}
+
+static int netlink_sock(unsigned int groups)
+{
+	struct lkl_sockaddr_nl la;
+	int fd, err;
+
+	fd = lkl_sys_socket(LKL_AF_NETLINK, LKL_SOCK_DGRAM, LKL_NETLINK_ROUTE);
+	if (fd < 0)
+		return fd;
+
+	memset(&la, 0, sizeof(la));
+	la.nl_family = LKL_AF_NETLINK;
+	la.nl_groups = groups;
+	err = lkl_sys_bind(fd, (struct lkl_sockaddr *)&la, sizeof(la));
+	if (err < 0)
+		lkl_sys_close(fd);
+
+	return err < 0 ? err : fd;
+}
+
+static int parse_rtattr(struct lkl_rtattr *tb[], int max,
+			struct lkl_rtattr *rta, int len)
+{
+	unsigned short type;
+
+	memset(tb, 0, sizeof(struct lkl_rtattr *) * (max + 1));
+	while (LKL_RTA_OK(rta, len)) {
+		type = rta->rta_type;
+		if ((type <= max) && (!tb[type]))
+			tb[type] = rta;
+		rta = LKL_RTA_NEXT(rta, len);
+	}
+	if (len)
+		lkl_printf("!!!Deficit %d, rta_len=%d\n", len,
+			rta->rta_len);
+	return 0;
+}
+
+struct addr_filter {
+	unsigned int ifindex;
+	void *addr;
+};
+
+static unsigned int get_ifa_flags(struct lkl_ifaddrmsg *ifa,
+				  struct lkl_rtattr *ifa_flags_attr)
+{
+	return ifa_flags_attr ? *(unsigned int *)LKL_RTA_DATA(ifa_flags_attr) :
+				ifa->ifa_flags;
+}
+
+/* returns:
+ * 0 - DAD succeeded.
+ * -1 - DAD failed or other error.
+ * 1 - should wait for new msg.
+ */
+static int check_ipv6_dad(struct lkl_sockaddr_nl *nladdr,
+			  struct lkl_nlmsghdr *n, void *arg)
+{
+	struct addr_filter *filter = arg;
+	struct lkl_ifaddrmsg *ifa = LKL_NLMSG_DATA(n);
+	struct lkl_rtattr *rta_tb[LKL_IFA_MAX+1];
+	unsigned int ifa_flags;
+	int len = n->nlmsg_len;
+
+	if (n->nlmsg_type != LKL_RTM_NEWADDR)
+		return 1;
+
+	len -= LKL_NLMSG_LENGTH(sizeof(*ifa));
+	if (len < 0) {
+		lkl_printf("BUG: wrong nlmsg len %d\n", len);
+		return -1;
+	}
+
+	parse_rtattr(rta_tb, LKL_IFA_MAX, LKL_IFA_RTA(ifa),
+		     n->nlmsg_len - LKL_NLMSG_LENGTH(sizeof(*ifa)));
+
+	ifa_flags = get_ifa_flags(ifa, rta_tb[LKL_IFA_FLAGS]);
+
+	if (ifa->ifa_index != filter->ifindex)
+		return 1;
+	if (ifa->ifa_family != LKL_AF_INET6)
+		return 1;
+
+	if (!rta_tb[LKL_IFA_LOCAL])
+		rta_tb[LKL_IFA_LOCAL] = rta_tb[LKL_IFA_ADDRESS];
+
+	if (!rta_tb[LKL_IFA_LOCAL] ||
+	    (filter->addr && memcmp(LKL_RTA_DATA(rta_tb[LKL_IFA_LOCAL]),
+				    filter->addr, 16))) {
+		return 1;
+	}
+	if (ifa_flags & LKL_IFA_F_DADFAILED) {
+		lkl_printf("IPV6 DAD failed.\n");
+		return -1;
+	}
+	if (!(ifa_flags & LKL_IFA_F_TENTATIVE))
+		return 0;
+	return 1;
+}
+
+/* Copied from iproute2/lib/ */
+static int rtnl_listen(int fd, int (*handler)(struct lkl_sockaddr_nl *nladdr,
+					      struct lkl_nlmsghdr *, void *),
+		       void *arg)
+{
+	int status;
+	struct lkl_nlmsghdr *h;
+	struct lkl_sockaddr_nl nladdr = { .nl_family = LKL_AF_NETLINK };
+	struct lkl_iovec iov;
+	struct lkl_user_msghdr msg = {
+		.msg_name = &nladdr,
+		.msg_namelen = sizeof(nladdr),
+		.msg_iov = &iov,
+		.msg_iovlen = 1,
+	};
+	char   buf[16384];
+
+	iov.iov_base = buf;
+	while (1) {
+		iov.iov_len = sizeof(buf);
+		status = lkl_sys_recvmsg(fd, &msg, 0);
+
+		if (status < 0) {
+			if (status == -LKL_EINTR || status == -LKL_EAGAIN)
+				continue;
+			lkl_printf("netlink receive error %s (%d)\n",
+				lkl_strerror(status), status);
+			if (status == -LKL_ENOBUFS)
+				continue;
+			return status;
+		}
+		if (status == 0) {
+			lkl_printf("EOF on netlink\n");
+			return -1;
+		}
+		if (msg.msg_namelen != sizeof(nladdr)) {
+			lkl_printf("Sender address length == %d\n",
+				msg.msg_namelen);
+			return -1;
+		}
+
+		for (h = (struct lkl_nlmsghdr *)buf;
+		     (unsigned int)status >= sizeof(*h);) {
+			int err;
+			int len = h->nlmsg_len;
+			int l = len - sizeof(*h);
+
+			if (l < 0 || len > status) {
+				if (msg.msg_flags & LKL_MSG_TRUNC) {
+					lkl_printf("Truncated message\n");
+					return -1;
+				}
+				lkl_printf("!!!malformed message: len=%d\n",
+					len);
+				return -1;
+			}
+
+			err = handler(&nladdr, h, arg);
+			if (err <= 0)
+				return err;
+
+			status -= LKL_NLMSG_ALIGN(len);
+			h = (struct lkl_nlmsghdr *)((char *)h +
+						    LKL_NLMSG_ALIGN(len));
+		}
+		if (msg.msg_flags & LKL_MSG_TRUNC) {
+			lkl_printf("Message truncated\n");
+			continue;
+		}
+		if (status) {
+			lkl_printf("!!!Remnant of size %d\n", status);
+			return -1;
+		}
+	}
+}
+
+int lkl_if_wait_ipv6_dad(int ifindex, void *addr)
+{
+	struct addr_filter filter = {.ifindex = ifindex, .addr = addr};
+	int fd, ret;
+	struct {
+		struct lkl_nlmsghdr		nlmsg_info;
+		struct lkl_ifaddrmsg	ifaddrmsg_info;
+	} req;
+
+	fd = netlink_sock(1 << (LKL_RTNLGRP_IPV6_IFADDR - 1));
+	if (fd < 0)
+		return fd;
+
+	memset(&req, 0, sizeof(req));
+	req.nlmsg_info.nlmsg_len =
+			LKL_NLMSG_LENGTH(sizeof(struct lkl_ifaddrmsg));
+	req.nlmsg_info.nlmsg_flags = LKL_NLM_F_REQUEST | LKL_NLM_F_DUMP;
+	req.nlmsg_info.nlmsg_type = LKL_RTM_GETADDR;
+	req.ifaddrmsg_info.ifa_family = LKL_AF_INET6;
+	req.ifaddrmsg_info.ifa_index = ifindex;
+	ret = lkl_sys_send(fd, &req, req.nlmsg_info.nlmsg_len, 0);
+	if (ret < 0)
+		lkl_perror("lkl_sys_send", ret);
+	else
+		ret = rtnl_listen(fd, check_ipv6_dad, (void *)&filter);
+
+	lkl_sys_close(fd);
+	return ret;
+}
+
+int lkl_if_set_ipv6(int ifindex, void *addr, unsigned int netprefix_len)
+{
+	int err = lkl_if_add_ip(ifindex, LKL_AF_INET6, addr, netprefix_len);
+
+	if (err)
+		return err;
+	return lkl_if_wait_ipv6_dad(ifindex, addr);
+}
+
+int lkl_if_set_ipv6_gateway(int ifindex, void *src_addr,
+		unsigned int src_masklen, void *via_addr)
+{
+	int err;
+
+	err = lkl_if_add_rule_from_saddr(ifindex, LKL_AF_INET6, src_addr);
+	if (err)
+		return err;
+	err = lkl_if_add_linklocal(ifindex, LKL_AF_INET6,
+					src_addr, src_masklen);
+	if (err)
+		return err;
+	return lkl_if_add_gateway(ifindex, LKL_AF_INET6, via_addr);
+}
+
+int lkl_set_ipv6_gateway(void *addr)
+{
+	return lkl_add_gateway(LKL_AF_INET6, addr);
+}
+
+/* returns:
+ * 0 - success.
+ * < 0 - error number.
+ * 1 - should wait for new msg.
+ */
+static int check_error(struct lkl_sockaddr_nl *nladdr, struct lkl_nlmsghdr *n,
+		       void *arg)
+{
+	unsigned int s = *(unsigned int *)arg;
+
+	if (nladdr->nl_pid != 0 || n->nlmsg_seq != s) {
+		/* Don't forget to skip that message. */
+		return 1;
+	}
+
+	if (n->nlmsg_type == LKL_NLMSG_ERROR) {
+		struct lkl_nlmsgerr *err =
+			(struct lkl_nlmsgerr *)LKL_NLMSG_DATA(n);
+		int l = n->nlmsg_len - sizeof(*n);
+
+		if (l < (int)sizeof(struct lkl_nlmsgerr))
+			lkl_printf("ERROR truncated\n");
+		else if (!err->error)
+			return 0;
+
+		lkl_printf("RTNETLINK answers: %s\n",
+			lkl_strerror(-err->error));
+		return err->error;
+	}
+	lkl_printf("Unexpected reply!!!\n");
+	return -1;
+}
+
+static unsigned int seq;
+static int rtnl_talk(int fd, struct lkl_nlmsghdr *n)
+{
+	int status;
+	struct lkl_sockaddr_nl nladdr = {.nl_family = LKL_AF_NETLINK};
+	struct lkl_iovec iov = {.iov_base = (void *)n, .iov_len = n->nlmsg_len};
+	struct lkl_user_msghdr msg = {
+			.msg_name = &nladdr,
+			.msg_namelen = sizeof(nladdr),
+			.msg_iov = &iov,
+			.msg_iovlen = 1,
+	};
+
+	n->nlmsg_seq = seq;
+	n->nlmsg_flags |= LKL_NLM_F_ACK;
+
+	status = lkl_sys_sendmsg(fd, &msg, 0);
+	if (status < 0) {
+		lkl_perror("Cannot talk to rtnetlink", status);
+		return status;
+	}
+
+	status = rtnl_listen(fd, check_error, (void *)&seq);
+	seq++;
+	return status;
+}
+
+static int addattr_l(struct lkl_nlmsghdr *n, unsigned int maxlen,
+		     int type, const void *data, int alen)
+{
+	int len = LKL_RTA_LENGTH(alen);
+	struct lkl_rtattr *rta;
+
+	if (LKL_NLMSG_ALIGN(n->nlmsg_len) + LKL_RTA_ALIGN(len) > maxlen) {
+		lkl_printf("%s ERROR: message exceeded bound of %d\n", __func__,
+			   maxlen);
+		return -1;
+	}
+	rta = ((struct lkl_rtattr *) (((void *) (n)) +
+				      LKL_NLMSG_ALIGN(n->nlmsg_len)));
+	rta->rta_type = type;
+	rta->rta_len = len;
+	memcpy(LKL_RTA_DATA(rta), data, alen);
+	n->nlmsg_len = LKL_NLMSG_ALIGN(n->nlmsg_len) + LKL_RTA_ALIGN(len);
+	return 0;
+}
+
+int lkl_add_neighbor(int ifindex, int af, void *ip, void *mac)
+{
+	struct {
+		struct lkl_nlmsghdr n;
+		struct lkl_ndmsg r;
+		char buf[1024];
+	} req = {
+		.n.nlmsg_len = LKL_NLMSG_LENGTH(sizeof(struct lkl_ndmsg)),
+		.n.nlmsg_type = LKL_RTM_NEWNEIGH,
+		.n.nlmsg_flags = LKL_NLM_F_REQUEST |
+				 LKL_NLM_F_CREATE | LKL_NLM_F_REPLACE,
+		.r.ndm_family = af,
+		.r.ndm_ifindex = ifindex,
+		.r.ndm_state = LKL_NUD_PERMANENT,
+
+	};
+	int err, addr_sz;
+	int fd;
+
+	if (af == LKL_AF_INET)
+		addr_sz = 4;
+	else if (af == LKL_AF_INET6)
+		addr_sz = 16;
+	else {
+		lkl_printf("Bad address family: %d\n", af);
+		return -1;
+	}
+
+	fd = netlink_sock(0);
+	if (fd < 0)
+		return fd;
+
+	// create the IP attribute
+	addattr_l(&req.n, sizeof(req), LKL_NDA_DST, ip, addr_sz);
+
+	// create the MAC attribute
+	addattr_l(&req.n, sizeof(req), LKL_NDA_LLADDR, mac, 6);
+
+	err = rtnl_talk(fd, &req.n);
+	lkl_sys_close(fd);
+	return err;
+}
+
+static int ipaddr_modify(int cmd, int flags, int ifindex, int af, void *addr,
+			 unsigned int netprefix_len)
+{
+	struct {
+		struct lkl_nlmsghdr n;
+		struct lkl_ifaddrmsg ifa;
+		char buf[256];
+	} req = {
+		.n.nlmsg_len = LKL_NLMSG_LENGTH(sizeof(struct lkl_ifaddrmsg)),
+		.n.nlmsg_flags = LKL_NLM_F_REQUEST | flags,
+		.n.nlmsg_type = cmd,
+		.ifa.ifa_family = af,
+		.ifa.ifa_prefixlen = netprefix_len,
+		.ifa.ifa_index = ifindex,
+	};
+	int err, addr_sz;
+	int fd;
+
+	if (af == LKL_AF_INET)
+		addr_sz = 4;
+	else if (af == LKL_AF_INET6)
+		addr_sz = 16;
+	else {
+		lkl_printf("Bad address family: %d\n", af);
+		return -1;
+	}
+
+	fd = netlink_sock(0);
+	if (fd < 0)
+		return fd;
+
+	// create the IP attribute
+	addattr_l(&req.n, sizeof(req), LKL_IFA_LOCAL, addr, addr_sz);
+
+	err = rtnl_talk(fd, &req.n);
+
+	lkl_sys_close(fd);
+	return err;
+}
+
+int lkl_if_add_ip(int ifindex, int af, void *addr, unsigned int netprefix_len)
+{
+	return ipaddr_modify(LKL_RTM_NEWADDR, LKL_NLM_F_CREATE | LKL_NLM_F_EXCL,
+			     ifindex, af, addr, netprefix_len);
+}
+
+int lkl_if_del_ip(int ifindex, int af, void *addr, unsigned int netprefix_len)
+{
+	return ipaddr_modify(LKL_RTM_DELADDR, 0, ifindex, af,
+			     addr, netprefix_len);
+}
+
+static int iproute_modify(int cmd, unsigned int flags, int ifindex, int af,
+		void *route_addr, int route_masklen, void *gwaddr)
+{
+	struct {
+		struct lkl_nlmsghdr	n;
+		struct lkl_rtmsg	r;
+		char			buf[1024];
+	} req = {
+		.n.nlmsg_len = LKL_NLMSG_LENGTH(sizeof(struct lkl_rtmsg)),
+		.n.nlmsg_flags = LKL_NLM_F_REQUEST | flags,
+		.n.nlmsg_type = cmd,
+		.r.rtm_family = af,
+		.r.rtm_table = LKL_RT_TABLE_MAIN,
+		.r.rtm_scope = LKL_RT_SCOPE_UNIVERSE,
+	};
+	int err, addr_sz;
+	int i, fd;
+
+	if (af == LKL_AF_INET)
+		addr_sz = 4;
+	else if (af == LKL_AF_INET6)
+		addr_sz = 16;
+	else {
+		lkl_printf("Bad address family: %d\n", af);
+		return -1;
+	}
+
+	fd = netlink_sock(0);
+	if (fd < 0) {
+		lkl_printf("netlink_sock error: %d\n", fd);
+		return fd;
+	}
+
+	if (cmd != LKL_RTM_DELROUTE) {
+		req.r.rtm_protocol = LKL_RTPROT_BOOT;
+		req.r.rtm_scope = LKL_RT_SCOPE_UNIVERSE;
+		req.r.rtm_type = LKL_RTN_UNICAST;
+	}
+
+	if (gwaddr)
+		addattr_l(&req.n, sizeof(req),
+				LKL_RTA_GATEWAY, gwaddr, addr_sz);
+
+	if (af == LKL_AF_INET && route_addr) {
+		unsigned int netaddr = *(unsigned int *)route_addr;
+
+		netaddr = ntohl(netaddr);
+		netaddr = (netaddr >> (32 - route_masklen));
+		netaddr = (netaddr << (32 - route_masklen));
+		netaddr =  htonl(netaddr);
+		*(unsigned int *)route_addr = netaddr;
+		req.r.rtm_dst_len = route_masklen;
+		addattr_l(&req.n, sizeof(req), LKL_RTA_DST,
+					route_addr, addr_sz);
+	}
+
+	if (af == LKL_AF_INET6 && route_addr) {
+		struct lkl_in6_addr netaddr =
+			*(struct lkl_in6_addr *)route_addr;
+		/* host part to clear: the low (128 - route_masklen) bits */
+		int rmbyte = (128 - route_masklen) / 8;
+		int rmbit = (128 - route_masklen) % 8;
+
+		for (i = 0; i < rmbyte; i++)
+			netaddr.in6_u.u6_addr8[15-i] = 0;
+		if (rmbit)
+			netaddr.in6_u.u6_addr8[15-rmbyte] &= 0xff << rmbit;
+
+		*(struct lkl_in6_addr *)route_addr = netaddr;
+		req.r.rtm_dst_len = route_masklen;
+		addattr_l(&req.n, sizeof(req), LKL_RTA_DST,
+					route_addr, addr_sz);
+	}
+
+	if (ifindex != LKL_RT_TABLE_MAIN) {
+		if (af == LKL_AF_INET)
+			req.r.rtm_table = ifindex * 2;
+		else if (af == LKL_AF_INET6)
+			req.r.rtm_table = ifindex * 2 + 1;
+		addattr_l(&req.n, sizeof(req), LKL_RTA_OIF, &ifindex,
+			  sizeof(ifindex));
+	}
+	err = rtnl_talk(fd, &req.n);
+	lkl_sys_close(fd);
+	return err;
+}
+
+int lkl_if_add_linklocal(int ifindex, int af,  void *addr, int netprefix_len)
+{
+	return iproute_modify(LKL_RTM_NEWROUTE, LKL_NLM_F_CREATE|LKL_NLM_F_EXCL,
+			ifindex, af, addr, netprefix_len, NULL);
+}
+
+int lkl_if_add_gateway(int ifindex, int af, void *gwaddr)
+{
+	return iproute_modify(LKL_RTM_NEWROUTE, LKL_NLM_F_CREATE|LKL_NLM_F_EXCL,
+			ifindex, af, NULL, 0, gwaddr);
+}
+
+int lkl_add_gateway(int af, void *gwaddr)
+{
+	return iproute_modify(LKL_RTM_NEWROUTE, LKL_NLM_F_CREATE|LKL_NLM_F_EXCL,
+			LKL_RT_TABLE_MAIN, af, NULL, 0, gwaddr);
+}
+
+static int iprule_modify(int cmd, int ifindex, int af, void *saddr)
+{
+	struct {
+		struct lkl_nlmsghdr	n;
+		struct lkl_rtmsg		r;
+		char			buf[1024];
+	} req = {
+		.n.nlmsg_type = cmd,
+		.n.nlmsg_len = LKL_NLMSG_LENGTH(sizeof(struct lkl_rtmsg)),
+		.n.nlmsg_flags = LKL_NLM_F_REQUEST,
+		.r.rtm_protocol = LKL_RTPROT_BOOT,
+		.r.rtm_scope = LKL_RT_SCOPE_UNIVERSE,
+		.r.rtm_family = af,
+		.r.rtm_type = LKL_RTN_UNSPEC,
+	};
+	int fd, err;
+	int addr_sz;
+
+	if (af == LKL_AF_INET)
+		addr_sz = 4;
+	else if (af == LKL_AF_INET6)
+		addr_sz = 16;
+	else {
+		lkl_printf("Bad address family: %d\n", af);
+		return -1;
+	}
+
+	fd = netlink_sock(0);
+	if (fd < 0)
+		return fd;
+
+	if (cmd == LKL_RTM_NEWRULE) {
+		req.n.nlmsg_flags |= LKL_NLM_F_CREATE|LKL_NLM_F_EXCL;
+		req.r.rtm_type = LKL_RTN_UNICAST;
+	}
+
+	//set from address
+	req.r.rtm_src_len = 8 * addr_sz;
+	addattr_l(&req.n, sizeof(req), LKL_FRA_SRC, saddr, addr_sz);
+
+	//use ifindex as table id
+	if (af == LKL_AF_INET)
+		req.r.rtm_table = ifindex * 2;
+	else if (af == LKL_AF_INET6)
+		req.r.rtm_table = ifindex * 2 + 1;
+	err = rtnl_talk(fd, &req.n);
+
+	lkl_sys_close(fd);
+	return err;
+}
+
+int lkl_if_add_rule_from_saddr(int ifindex, int af, void *saddr)
+{
+	return iprule_modify(LKL_RTM_NEWRULE, ifindex, af, saddr);
+}
+
+static int qdisc_add(int cmd, int flags, int ifindex,
+		     const char *root, const char *type)
+{
+	struct {
+		struct lkl_nlmsghdr n;
+		struct lkl_tcmsg tc;
+		char buf[2*1024];
+	} req = {
+		.n.nlmsg_len = LKL_NLMSG_LENGTH(sizeof(struct lkl_tcmsg)),
+		.n.nlmsg_flags = LKL_NLM_F_REQUEST|flags,
+		.n.nlmsg_type = cmd,
+		.tc.tcm_family = LKL_AF_UNSPEC,
+	};
+	int err, fd;
+
+	if (!root || !type) {
+		lkl_printf("root and type arguments must not be NULL\n");
+		return -1;
+	}
+
+	if (strcmp(root, "root") == 0)
+		req.tc.tcm_parent = LKL_TC_H_ROOT;
+	req.tc.tcm_ifindex = ifindex;
+
+	fd = netlink_sock(0);
+	if (fd < 0)
+		return fd;
+
+	// create the qdisc attribute
+	addattr_l(&req.n, sizeof(req), LKL_TCA_KIND, type, strlen(type)+1);
+
+	err = rtnl_talk(fd, &req.n);
+	lkl_sys_close(fd);
+	return err;
+}
+
+int lkl_qdisc_add(int ifindex, const char *root, const char *type)
+{
+	return qdisc_add(LKL_RTM_NEWQDISC, LKL_NLM_F_CREATE | LKL_NLM_F_EXCL,
+			 ifindex, root, type);
+}
+
+/* Add a qdisc entry for an interface in the form of
+ * "root|type;root|type;..."
+ */
+void lkl_qdisc_parse_add(int ifindex, const char *entries)
+{
+	char *saveptr = NULL, *token = NULL;
+	char *root = NULL, *type = NULL, *sub = NULL;
+	char strings[256];
+	int ret = 0;
+
+	snprintf(strings, sizeof(strings), "%s", entries);
+
+	for (token = strtok_r(strings, ";", &saveptr); token;
+	     token = strtok_r(NULL, ";", &saveptr)) {
+		root = strtok_r(token, "|", &sub);
+		type = strtok_r(NULL, "|", &sub);
+		ret = lkl_qdisc_add(ifindex, root, type);
+		if (ret) {
+			lkl_printf("Failed to add qdisc entry: %s\n",
+				   lkl_strerror(ret));
+			return;
+		}
+	}
+}
diff --git a/tools/lkl/lib/virtio_net.c b/tools/lkl/lib/virtio_net.c
new file mode 100644
index 000000000000..cd720b363f18
--- /dev/null
+++ b/tools/lkl/lib/virtio_net.c
@@ -0,0 +1,322 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <string.h>
+#include <lkl_host.h>
+#include "virtio.h"
+#include "endian.h"
+
+#include <lkl/linux/virtio_net.h>
+
+#define netdev_of(x) (container_of(x, struct virtio_net_dev, dev))
+#define BIT(x) (1ULL << (x))
+
+/* We always have 2 queues on a netdev: one for tx, one for rx. */
+#define RX_QUEUE_IDX 0
+#define TX_QUEUE_IDX 1
+
+#define NUM_QUEUES (TX_QUEUE_IDX + 1)
+#define QUEUE_DEPTH 128
+
+/* In fact, we'll hit the limit on the devs string below long before
+ * we hit this, but it's good enough for now.
+ */
+#define MAX_NET_DEVS 16
+
+#ifdef DEBUG
+#define bad_request(s) do {			\
+		lkl_printf("%s\n", s);		\
+		panic();			\
+	} while (0)
+#else
+#define bad_request(s) lkl_printf("virtio_net: %s\n", s)
+#endif /* DEBUG */
+
+struct virtio_net_dev {
+	struct virtio_dev dev;
+	struct lkl_virtio_net_config config;
+	struct lkl_netdev *nd;
+	struct lkl_mutex **queue_locks;
+	lkl_thread_t poll_tid;
+};
+
+static int net_check_features(struct virtio_dev *dev)
+{
+	if (dev->driver_features == dev->device_features)
+		return 0;
+
+	return -LKL_EINVAL;
+}
+
+static void net_acquire_queue(struct virtio_dev *dev, int queue_idx)
+{
+	lkl_host_ops.mutex_lock(netdev_of(dev)->queue_locks[queue_idx]);
+}
+
+static void net_release_queue(struct virtio_dev *dev, int queue_idx)
+{
+	lkl_host_ops.mutex_unlock(netdev_of(dev)->queue_locks[queue_idx]);
+}
+
+/*
+ * The buffers passed through "req" from the virtio_net driver always start
+ * with a vnet_hdr. We need to check the backend device if it expects vnet_hdr
+ * and adjust buffer offset accordingly.
+ */
+static int net_enqueue(struct virtio_dev *dev, int q, struct virtio_req *req)
+{
+	struct lkl_virtio_net_hdr_v1 *header;
+	struct virtio_net_dev *net_dev;
+	struct iovec *iov;
+	int ret;
+
+	header = req->buf[0].iov_base;
+	net_dev = netdev_of(dev);
+	/*
+	 * If the backend device does not expect a vnet_hdr, adjust buf
+	 * accordingly. (We make the adjustment to req->buf so it can be used
+	 * directly for the tx/rx call, but remember to undo the change after
+	 * the call.)  Note that it's ok to pass iov with entry's len==0.  The
+	 * caller will skip to the next entry correctly.
+	 */
+	if (!net_dev->nd->has_vnet_hdr) {
+		req->buf[0].iov_base += sizeof(*header);
+		req->buf[0].iov_len -= sizeof(*header);
+	}
+	iov = req->buf;
+
+	/* Pick which virtqueue to send the buffer(s) to */
+	if (q == TX_QUEUE_IDX) {
+		ret = net_dev->nd->ops->tx(net_dev->nd, iov, req->buf_count);
+		if (ret < 0)
+			return -1;
+	} else if (q == RX_QUEUE_IDX) {
+		int i, len;
+
+		ret = net_dev->nd->ops->rx(net_dev->nd, iov, req->buf_count);
+		if (ret < 0)
+			return -1;
+		if (net_dev->nd->has_vnet_hdr) {
+			/*
+			 * If the number of bytes returned exactly matches the
+			 * total space in the iov then there is a good chance we
+			 * did not supply a large enough buffer for the whole
+			 * pkt, i.e., pkt has been truncated.  This is only
+			 * likely to happen under mergeable RX buffer mode.
+			 */
+			if (req->total_len == (unsigned int)ret)
+				lkl_printf("PKT is likely truncated! len=%d\n",
+				    ret);
+		} else {
+			header->flags = 0;
+			header->gso_type = LKL_VIRTIO_NET_HDR_GSO_NONE;
+		}
+		/*
+		 * Have to compute how many descriptors we've consumed (really
+		 * only matters to the mergeable RX mode) and return it
+		 * through "num_buffers".
+		 */
+		for (i = 0, len = ret; len > 0; i++)
+			len -= req->buf[i].iov_len;
+		header->num_buffers = i;
+
+		if (dev->device_features & BIT(LKL_VIRTIO_NET_F_GUEST_CSUM))
+			header->flags |= LKL_VIRTIO_NET_HDR_F_DATA_VALID;
+	} else {
+		bad_request("tried to push on non-existent queue");
+		return -1;
+	}
+	if (!net_dev->nd->has_vnet_hdr) {
+		/* Undo the adjustment */
+		req->buf[0].iov_base -= sizeof(*header);
+		req->buf[0].iov_len += sizeof(*header);
+		ret += sizeof(struct lkl_virtio_net_hdr_v1);
+	}
+	virtio_req_complete(req, ret);
+	return 0;
+}
+
+static struct virtio_dev_ops net_ops = {
+	.check_features = net_check_features,
+	.enqueue = net_enqueue,
+	.acquire_queue = net_acquire_queue,
+	.release_queue = net_release_queue,
+};
+
+void poll_thread(void *arg)
+{
+	struct virtio_net_dev *dev = arg;
+
+	/* Synchronization is handled in virtio_process_queue */
+	do {
+		int ret = dev->nd->ops->poll(dev->nd);
+
+		if (ret < 0) {
+			lkl_printf("virtio net poll error: %d\n", ret);
+			continue;
+		}
+
+		if (ret & LKL_DEV_NET_POLL_HUP)
+			break;
+		if (ret & LKL_DEV_NET_POLL_RX)
+			virtio_process_queue(&dev->dev, 0);
+		if (ret & LKL_DEV_NET_POLL_TX)
+			virtio_process_queue(&dev->dev, 1);
+	} while (1);
+}
+
+struct virtio_net_dev *registered_devs[MAX_NET_DEVS];
+static int registered_dev_idx;
+
+static int dev_register(struct virtio_net_dev *dev)
+{
+	if (registered_dev_idx == MAX_NET_DEVS) {
+		lkl_printf("Too many virtio_net devices!\n");
+		/* This error code is a little bit of a lie */
+		return -LKL_ENOMEM;
+	}
+
+	/* registered_dev_idx is incremented by the caller */
+	registered_devs[registered_dev_idx] = dev;
+	return 0;
+}
+
+static void free_queue_locks(struct lkl_mutex **queues, int num_queues)
+{
+	int i = 0;
+
+	if (!queues)
+		return;
+
+	for (i = 0; i < num_queues; i++)
+		lkl_host_ops.mutex_free(queues[i]);
+
+	lkl_host_ops.mem_free(queues);
+}
+
+static struct lkl_mutex **init_queue_locks(int num_queues)
+{
+	int i;
+	struct lkl_mutex **ret = lkl_host_ops.mem_alloc(
+		sizeof(struct lkl_mutex *) * num_queues);
+	if (!ret)
+		return NULL;
+
+	memset(ret, 0, sizeof(struct lkl_mutex *) * num_queues);
+	for (i = 0; i < num_queues; i++) {
+		ret[i] = lkl_host_ops.mutex_alloc(1);
+		if (!ret[i]) {
+			free_queue_locks(ret, i);
+			return NULL;
+		}
+	}
+
+	return ret;
+}
+
+int lkl_netdev_add(struct lkl_netdev *nd, struct lkl_netdev_args *args)
+{
+	struct virtio_net_dev *dev;
+	int ret = -LKL_ENOMEM;
+
+	dev = lkl_host_ops.mem_alloc(sizeof(*dev));
+	if (!dev)
+		return -LKL_ENOMEM;
+
+	memset(dev, 0, sizeof(*dev));
+
+	dev->dev.device_id = LKL_VIRTIO_ID_NET;
+	if (args) {
+		if (args->mac) {
+			dev->dev.device_features |= BIT(LKL_VIRTIO_NET_F_MAC);
+			memcpy(dev->config.mac, args->mac, LKL_ETH_ALEN);
+		}
+		dev->dev.device_features |= args->offload;
+
+	}
+	dev->dev.config_data = &dev->config;
+	dev->dev.config_len = sizeof(dev->config);
+	dev->dev.ops = &net_ops;
+	dev->nd = nd;
+	dev->queue_locks = init_queue_locks(NUM_QUEUES);
+
+	if (!dev->queue_locks)
+		goto out_free;
+
+	/*
+	 * MUST match the number of queue locks we initialized. We could init
+	 * the queues in virtio_dev_setup to help enforce this, but netdevs are
+	 * the only flavor that need these locks, so it's better to do it
+	 * here.
+	 */
+	ret = virtio_dev_setup(&dev->dev, NUM_QUEUES, QUEUE_DEPTH);
+
+	if (ret)
+		goto out_free;
+
+	/*
+	 * We may receive a TSO packet of up to 64KB, so collect as many
+	 * descriptors as are available, up to 64KB in total length.
+	 */
+	if (dev->dev.device_features & BIT(LKL_VIRTIO_NET_F_MRG_RXBUF))
+		virtio_set_queue_max_merge_len(&dev->dev, RX_QUEUE_IDX, 65536);
+
+	dev->poll_tid = lkl_host_ops.thread_create(poll_thread, dev);
+	if (dev->poll_tid == 0)
+		goto out_cleanup_dev;
+
+	ret = dev_register(dev);
+	if (ret < 0)
+		goto out_cleanup_dev;
+
+	return registered_dev_idx++;
+
+out_cleanup_dev:
+	virtio_dev_cleanup(&dev->dev);
+
+out_free:
+	if (dev->queue_locks)
+		free_queue_locks(dev->queue_locks, NUM_QUEUES);
+	lkl_host_ops.mem_free(dev);
+
+	return ret;
+}
+
+/* Errors are logged; this function does not return a status. */
+void lkl_netdev_remove(int id)
+{
+	struct virtio_net_dev *dev;
+	int ret;
+
+	if (id < 0 || id >= registered_dev_idx) {
+		lkl_printf("%s: invalid id: %d\n", __func__, id);
+		return;
+	}
+
+	dev = registered_devs[id];
+
+	dev->nd->ops->poll_hup(dev->nd);
+	lkl_host_ops.thread_join(dev->poll_tid);
+
+	ret = lkl_netdev_get_ifindex(id);
+	if (ret < 0) {
+		lkl_printf("%s: failed to get ifindex for id %d: %s\n",
+			   __func__, id, lkl_strerror(ret));
+		return;
+	}
+
+	ret = lkl_if_down(ret);
+	if (ret < 0) {
+		lkl_printf("%s: failed to put interface id %d down: %s\n",
+			   __func__, id, lkl_strerror(ret));
+		return;
+	}
+
+	virtio_dev_cleanup(&dev->dev);
+
+	free_queue_locks(dev->queue_locks, NUM_QUEUES);
+	lkl_host_ops.mem_free(dev);
+}
+
+void lkl_netdev_free(struct lkl_netdev *nd)
+{
+	nd->ops->free(nd);
+}
diff --git a/tools/lkl/lib/virtio_net_dpdk.c b/tools/lkl/lib/virtio_net_dpdk.c
new file mode 100644
index 000000000000..9512769554a5
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_dpdk.c
@@ -0,0 +1,480 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Intel DPDK based virtual network interface feature for LKL
+ * Copyright (c) 2015,2016 Ryo Nakamura, Hajime Tazaki
+ *
+ * Author: Ryo Nakamura <upa@wide.ad.jp>
+ *         Hajime Tazaki <thehajime@gmail.com>
+ */
+
+//#define DEBUG
+
+#include <stdio.h>
+#include <string.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/queue.h>
+
+#include <rte_eal.h>
+#include <rte_ethdev.h>
+#include <rte_mempool.h>
+#include <rte_net.h>
+
+#include <lkl_host.h>
+
+static char *ealargs[4] = {
+	"lkl_vif_dpdk",
+	"-c 1",
+	"-n 1",
+	"--log-level=0",
+};
+
+#define MAX_PKT_BURST           16
+/* XXX: cache disabled because the mempool cache is not thread-safe. */
+#define MEMPOOL_CACHE_SZ        0
+/* for TSO pkt */
+#define MAX_PACKET_SZ           (65535 \
+	- (sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM))
+#define MBUF_NUM                (512*2) /* vmxnet3 requires 1024 */
+#define MBUF_SIZ        \
+	(MAX_PACKET_SZ + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
+#define NUMDESC         512	/* nb_min on vmxnet3 is 512 */
+#define NUMQUEUE        1
+
+#define BIT(x) (1ULL << (x))
+
+static int portid;
+
+struct lkl_netdev_dpdk {
+	struct lkl_netdev dev;
+	int portid;
+	struct rte_mempool *rxpool, *txpool; /* ring buffer pool */
+	/* burst receive context by rump dpdk code */
+	struct rte_mbuf *rcv_mbuf[MAX_PKT_BURST];
+	int npkts;
+	int bufidx;
+	int offload;
+	int close: 1;
+	int busy_poll: 1;
+};
+
+static int dpdk_net_tx_prep(struct rte_mbuf *rm,
+		struct lkl_virtio_net_hdr_v1 *header)
+{
+	struct rte_net_hdr_lens hdr_lens;
+	uint32_t ptype;
+
+#ifdef DEBUG
+	lkl_printf("dpdk-tx: gso_type=%d, gso=%d, hdrlen=%d validation=%d\n",
+		header->gso_type, header->gso_size, header->hdr_len,
+		rte_validate_tx_offload(rm));
+#endif
+
+	ptype = rte_net_get_ptype(rm, &hdr_lens, RTE_PTYPE_ALL_MASK);
+	rm->l2_len = hdr_lens.l2_len;
+	rm->l3_len = hdr_lens.l3_len;
+	rm->l4_len = hdr_lens.l4_len; // including tcp opts
+
+	if ((ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP) {
+		if ((ptype & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_IPV4)
+			rm->ol_flags = PKT_TX_IPV4;
+		else if ((ptype & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_IPV6)
+			rm->ol_flags = PKT_TX_IPV6;
+
+		rm->ol_flags |= PKT_TX_TCP_CKSUM;
+		rm->tso_segsz = header->gso_size;
+		/* TSO case */
+		if (header->gso_type == LKL_VIRTIO_NET_HDR_GSO_TCPV4)
+			rm->ol_flags |= (PKT_TX_TCP_SEG | PKT_TX_IP_CKSUM);
+		else if (header->gso_type == LKL_VIRTIO_NET_HDR_GSO_TCPV6)
+			rm->ol_flags |= PKT_TX_TCP_SEG;
+	}
+
+	return sizeof(struct lkl_virtio_net_hdr_v1);
+
+}
+
+static int dpdk_net_tx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
+{
+	void *pkt;
+	struct rte_mbuf *rm;
+	struct lkl_netdev_dpdk *nd_dpdk;
+	struct lkl_virtio_net_hdr_v1 *header = NULL;
+	int i, len, sent = 0;
+	void *data = NULL;
+
+	nd_dpdk = (struct lkl_netdev_dpdk *) nd;
+
+	/*
+	 * XXX: it has been reported that DPDK's mempool with cache is not
+	 * thread-safe (e.g.,
+	 * http://www.dpdk.io/ml/archives/dev/2014-February/001401.html), so
+	 * rte_pktmbuf_alloc() may not be thread-safe here.  The cache is
+	 * therefore tentatively disabled by setting MEMPOOL_CACHE_SZ to 0.
+	 */
+	rm = rte_pktmbuf_alloc(nd_dpdk->txpool);
+
+	for (i = 0; i < cnt; i++) {
+		data = iov[i].iov_base;
+		len = (int)iov[i].iov_len;
+
+		if (i == 0) {
+			header = data;
+			data += sizeof(*header);
+			len -= sizeof(*header);
+		}
+
+		if (len == 0)
+			continue;
+
+		pkt = rte_pktmbuf_append(rm, len);
+		if (pkt) {
+			/* XXX: I wanna have M_EXT flag !!! */
+			memcpy(pkt, data, len);
+			sent += len;
+		} else {
+			lkl_printf("dpdk-tx: failed to append: idx=%d len=%d\n",
+				   i, len);
+			rte_pktmbuf_free(rm);
+			return -1;
+		}
+#ifdef DEBUG
+		lkl_printf("dpdk-tx: pkt[%d]len=%d\n", i, len);
+#endif
+	}
+
+	/* preparation for TX offloads */
+	sent += dpdk_net_tx_prep(rm, header);
+
+	/* XXX: should be bulk-transmitted !! */
+	if (rte_eth_tx_prepare(nd_dpdk->portid, 0, &rm, 1) != 1)
+		lkl_printf("tx_prep failed\n");
+
+	rte_eth_tx_burst(nd_dpdk->portid, 0, &rm, 1);
+
+	rte_pktmbuf_free(rm);
+	return sent;
+}
+
+static int __dpdk_net_rx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
+{
+	struct lkl_netdev_dpdk *nd_dpdk;
+	int i = 0;
+	struct rte_mbuf *rm, *first;
+	void *r_data;
+	size_t read = 0, r_size, copylen = 0, offset = 0;
+	struct lkl_virtio_net_hdr_v1 *header = iov[0].iov_base;
+	uint16_t mtu;
+
+	nd_dpdk = (struct lkl_netdev_dpdk *) nd;
+	memset(header, 0, sizeof(struct lkl_virtio_net_hdr_v1));
+
+	first = nd_dpdk->rcv_mbuf[nd_dpdk->bufidx];
+
+	for (rm = nd_dpdk->rcv_mbuf[nd_dpdk->bufidx]; rm; rm = rm->next) {
+		r_data = rte_pktmbuf_mtod(rm, void *);
+		r_size = rte_pktmbuf_data_len(rm);
+
+#ifdef DEBUG
+		lkl_printf("dpdk-rx: mbuf pktlen=%d orig_len=%lu\n",
+			   r_size, iov[i].iov_len);
+#endif
+		/* mergeable buffer starts data after vnet header at [0] */
+		if (nd_dpdk->offload & BIT(LKL_VIRTIO_NET_F_MRG_RXBUF) &&
+		    i == 0)
+			offset = sizeof(struct lkl_virtio_net_hdr_v1);
+		else if (nd_dpdk->offload & BIT(LKL_VIRTIO_NET_F_GUEST_TSO4) &&
+			 i == 0)
+			i++;
+		else
+			offset = sizeof(struct lkl_virtio_net_hdr_v1);
+
+		read += r_size;
+		while (r_size > 0) {
+			if (i >= cnt) {
+				fprintf(stderr,
+					"dpdk-rx: buffer full. skip it. ");
+				fprintf(stderr,
+					"(cnt=%d, buf[%d]=%lu, size=%lu)\n",
+					i, cnt, iov[i].iov_len, r_size);
+				goto end;
+			}
+
+			copylen = r_size < (iov[i].iov_len - offset) ? r_size
+				: iov[i].iov_len - offset;
+			memcpy(iov[i].iov_base + offset, r_data, copylen);
+
+			r_size -= copylen;
+			offset = 0;
+			i++;
+		}
+	}
+
+end:
+	/* TSO (big_packet mode) */
+	header->flags = LKL_VIRTIO_NET_HDR_F_DATA_VALID;
+	rte_eth_dev_get_mtu(nd_dpdk->portid, &mtu);
+
+	if (read > (mtu + sizeof(struct ether_hdr)
+		    + sizeof(struct lkl_virtio_net_hdr_v1))) {
+		struct rte_net_hdr_lens hdr_lens;
+		uint32_t ptype;
+
+		ptype = rte_net_get_ptype(first, &hdr_lens, RTE_PTYPE_ALL_MASK);
+
+		if ((ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP) {
+			if ((ptype & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_IPV4 &&
+			    nd_dpdk->offload & BIT(LKL_VIRTIO_NET_F_GUEST_TSO4))
+				header->gso_type = LKL_VIRTIO_NET_HDR_GSO_TCPV4;
+			/* XXX: Intel X540 doesn't support LRO
+			 * with tcpv6 packets
+			 */
+			if ((ptype & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_IPV6 &&
+			    nd_dpdk->offload & BIT(LKL_VIRTIO_NET_F_GUEST_TSO6))
+				header->gso_type = LKL_VIRTIO_NET_HDR_GSO_TCPV6;
+		}
+
+		header->gso_size = mtu - hdr_lens.l3_len - hdr_lens.l4_len;
+		header->hdr_len = hdr_lens.l2_len + hdr_lens.l3_len
+			+ hdr_lens.l4_len;
+	}
+
+	read += sizeof(struct lkl_virtio_net_hdr_v1);
+
+#ifdef DEBUG
+	lkl_printf("dpdk-rx: len=%d mtu=%d type=%d, size=%d, hdrlen=%d\n",
+		   read, mtu, header->gso_type,
+		   header->gso_size, header->hdr_len);
+#endif
+
+	return read;
+}
+
+
+/*
+ * This function is not thread-safe.
+ *
+ * In particular, nd_dpdk->rcv_mbuf is not safe for parallel access.  If a
+ * future refactor allows parallel reads, the buffer (nd_dpdk->rcv_mbuf) must
+ * be guarded.
+ */
+static int dpdk_net_rx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
+{
+	struct lkl_netdev_dpdk *nd_dpdk;
+	int read = 0;
+
+	nd_dpdk = (struct lkl_netdev_dpdk *) nd;
+
+	if (nd_dpdk->npkts == 0) {
+		nd_dpdk->npkts = rte_eth_rx_burst(nd_dpdk->portid, 0,
+						  nd_dpdk->rcv_mbuf,
+						  MAX_PKT_BURST);
+		if (nd_dpdk->npkts <= 0) {
+			/* XXX: need to implement proper poll()
+			 * or interrupt mode PMD of dpdk, which is only
+			 * available on ixgbe/igb/e1000 (as of Jan. 2016)
+			 */
+			if (!nd_dpdk->busy_poll)
+				usleep(1);
+			return -1;
+		}
+		nd_dpdk->bufidx = 0;
+	}
+
+	/* mergeable buffer */
+	read = __dpdk_net_rx(nd, iov, cnt);
+
+	rte_pktmbuf_free(nd_dpdk->rcv_mbuf[nd_dpdk->bufidx]);
+
+	nd_dpdk->bufidx++;
+	nd_dpdk->npkts--;
+
+	return read;
+}
+
+static int dpdk_net_poll(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_dpdk *nd_dpdk =
+		container_of(nd, struct lkl_netdev_dpdk, dev);
+
+	if (nd_dpdk->close)
+		return LKL_DEV_NET_POLL_HUP;
+	/*
+	 * dpdk's interrupt mode has an equivalent of epoll_wait(2),
+	 * which we could apply here, but AFAIK the mode is only available
+	 * on a limited set of NIC drivers such as ixgbe/igb/e1000 (with dpdk
+	 * v2.2.0); vmxnet3, for example, is not supported.
+	 */
+	return LKL_DEV_NET_POLL_RX | LKL_DEV_NET_POLL_TX;
+}
+
+static void dpdk_net_poll_hup(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_dpdk *nd_dpdk =
+		container_of(nd, struct lkl_netdev_dpdk, dev);
+
+	nd_dpdk->close = 1;
+}
+
+static void dpdk_net_free(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_dpdk *nd_dpdk =
+		container_of(nd, struct lkl_netdev_dpdk, dev);
+
+	free(nd_dpdk);
+}
+
+struct lkl_dev_net_ops dpdk_net_ops = {
+	.tx = dpdk_net_tx,
+	.rx = dpdk_net_rx,
+	.poll = dpdk_net_poll,
+	.poll_hup = dpdk_net_poll_hup,
+	.free = dpdk_net_free,
+};
+
+
+static int dpdk_init;
+struct lkl_netdev *lkl_netdev_dpdk_create(const char *ifparams, int offload,
+					 unsigned char *mac)
+{
+	int ret = 0;
+	struct rte_eth_conf portconf;
+	struct rte_eth_link link;
+	struct lkl_netdev_dpdk *nd;
+	struct rte_eth_dev_info dev_info;
+	char poolname[RTE_MEMZONE_NAMESIZE];
+	char *debug = getenv("LKL_HIJACK_DEBUG");
+	int lkl_debug = 0;
+
+	if (!dpdk_init) {
+		if (debug)
+			lkl_debug = strtol(debug, NULL, 0);
+		if (lkl_debug & 0x400)
+			ealargs[3] = "--log-level=100";
+
+		ret = rte_eal_init(sizeof(ealargs) / sizeof(ealargs[0]),
+				   ealargs);
+		if (ret < 0)
+			lkl_printf("dpdk: failed to initialize eal\n");
+
+		dpdk_init = 1;
+	}
+
+	nd = malloc(sizeof(struct lkl_netdev_dpdk));
+	memset(nd, 0, sizeof(struct lkl_netdev_dpdk));
+	nd->dev.ops = &dpdk_net_ops;
+	nd->portid = portid++;
+	/* busy-poll mode is requested via 'ifparams' of the form "*-busy" */
+	nd->busy_poll = strstr(ifparams, "busy") ? 1 : 0;
+	/* we always enable big_packet mode with dpdk. */
+	nd->offload = offload;
+
+	snprintf(poolname, RTE_MEMZONE_NAMESIZE, "%s%s", "tx-", ifparams);
+	nd->txpool =
+		rte_mempool_create(poolname,
+				   MBUF_NUM, MBUF_SIZ, MEMPOOL_CACHE_SZ,
+				   sizeof(struct rte_pktmbuf_pool_private),
+				   rte_pktmbuf_pool_init, NULL,
+				   rte_pktmbuf_init, NULL, 0, 0);
+
+	if (!nd->txpool) {
+		lkl_printf("dpdk: failed to allocate tx pool\n");
+		free(nd);
+		return NULL;
+	}
+
+
+	snprintf(poolname, RTE_MEMZONE_NAMESIZE, "%s%s", "rx-", ifparams);
+	nd->rxpool =
+		rte_mempool_create(poolname, MBUF_NUM, MBUF_SIZ, 0,
+				   sizeof(struct rte_pktmbuf_pool_private),
+				   rte_pktmbuf_pool_init, NULL,
+				   rte_pktmbuf_init, NULL, 0, 0);
+	if (!nd->rxpool) {
+		lkl_printf("dpdk: failed to allocate rx pool\n");
+		free(nd);
+		return NULL;
+	}
+
+	memset(&portconf, 0, sizeof(portconf));
+
+	/* offload bits */
+	/* we configure the NIC to use TSO *only if* the user requests it. */
+	if (offload & (BIT(LKL_VIRTIO_NET_F_GUEST_TSO4) |
+			BIT(LKL_VIRTIO_NET_F_GUEST_TSO6) |
+			BIT(LKL_VIRTIO_NET_F_MRG_RXBUF))) {
+		portconf.rxmode.enable_lro = 1;
+		portconf.rxmode.hw_strip_crc = 1;
+	}
+
+	ret = rte_eth_dev_configure(nd->portid, NUMQUEUE, NUMQUEUE,
+				    &portconf);
+	if (ret < 0) {
+		lkl_printf("dpdk: failed to configure port\n");
+		free(nd);
+		return NULL;
+	}
+
+	rte_eth_dev_info_get(nd->portid, &dev_info);
+
+	ret = rte_eth_rx_queue_setup(nd->portid, 0, NUMDESC, 0,
+				     &dev_info.default_rxconf, nd->rxpool);
+	if (ret < 0) {
+		lkl_printf("dpdk: failed to setup rx queue\n");
+		free(nd);
+		return NULL;
+	}
+
+	dev_info.default_txconf.txq_flags = 0;
+
+	dev_info.default_txconf.txq_flags |= ETH_TXQ_FLAGS_NOXSUMSCTP;
+	dev_info.default_txconf.txq_flags |= ETH_TXQ_FLAGS_NOVLANOFFL;
+
+
+	ret = rte_eth_tx_queue_setup(nd->portid, 0, NUMDESC, 0,
+				     &dev_info.default_txconf);
+	if (ret < 0) {
+		lkl_printf("dpdk: failed to setup tx queue\n");
+		free(nd);
+		return NULL;
+	}
+
+	ret = rte_eth_dev_start(nd->portid);
+	/* XXX: this function returns positive val (e.g., 12)
+	 * if there's an error
+	 */
+	if (ret != 0) {
+		lkl_printf("dpdk: failed to start device\n");
+		free(nd);
+		return NULL;
+	}
+
+	if (mac) {
+		rte_eth_macaddr_get(nd->portid, (struct ether_addr *)mac);
+		lkl_printf("Port %d: %02x:%02x:%02x:%02x:%02x:%02x\n",
+			   nd->portid,
+			   mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
+	}
+
+	rte_eth_dev_set_link_up(nd->portid);
+
+	rte_eth_link_get(nd->portid, &link);
+	if (!link.link_status) {
+		fprintf(stderr, "dpdk: interface state is down\n");
+		rte_eth_link_get(nd->portid, &link);
+		if (!link.link_status) {
+			fprintf(stderr,
+				"dpdk: interface state is down.. Giving up.\n");
+			return NULL;
+		}
+		lkl_printf("dpdk: interface state should be up now.\n");
+	}
+
+	/* should be promisc ? */
+	rte_eth_promiscuous_enable(nd->portid);
+
+	/* we always assume a vnet_hdr is present for a dpdk device. */
+	nd->dev.has_vnet_hdr = 1;
+
+	return (struct lkl_netdev *) nd;
+}
diff --git a/tools/lkl/lib/virtio_net_fd.c b/tools/lkl/lib/virtio_net_fd.c
new file mode 100644
index 000000000000..f8664455e696
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_fd.c
@@ -0,0 +1,217 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * POSIX file descriptor based virtual network interface feature for LKL
+ * Copyright (c) 2015,2016 Ryo Nakamura, Hajime Tazaki
+ *
+ * Author: Ryo Nakamura <upa@wide.ad.jp>
+ *         Hajime Tazaki <thehajime@gmail.com>
+ *         Octavian Purdila <octavian.purdila@intel.com>
+ *
+ */
+#include <stdio.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <string.h>
+#include <unistd.h>
+#ifdef __FreeBSD__
+#include <sys/syslimits.h>
+#else
+#include <limits.h>
+#endif
+#include <fcntl.h>
+#include <sys/poll.h>
+#include <sys/uio.h>
+
+#include "virtio.h"
+#include "virtio_net_fd.h"
+
+struct lkl_netdev_fd {
+	struct lkl_netdev dev;
+	/* file-descriptor based device */
+	int fd_rx;
+	int fd_tx;
+	/*
+	 * Controls the poll mask for fd. Can be accessed concurrently from
+	 * poll, tx, or rx routines but there is no need for synchronization
+	 * because:
+	 *
+	 * (a) TX and RX routines set different variables so even if they update
+	 * at the same time there is no race condition
+	 *
+	 * (b) Even if poll and TX / RX update at the same time poll cannot
+	 * stall: when poll resets the poll variable we know that TX / RX will
+	 * run which means that eventually the poll variable will be set.
+	 */
+	int poll_tx, poll_rx;
+	/* control pipe */
+	int pipe[2];
+};
+
+static int fd_net_tx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
+{
+	int ret;
+	struct lkl_netdev_fd *nd_fd =
+		container_of(nd, struct lkl_netdev_fd, dev);
+
+	do {
+		ret = writev(nd_fd->fd_tx, iov, cnt);
+	} while (ret == -1 && errno == EINTR);
+
+	if (ret < 0) {
+		if (errno != EAGAIN) {
+			perror("write to fd netdev fails");
+		} else {
+			char tmp = 0;
+
+			nd_fd->poll_tx = 1;
+			if (write(nd_fd->pipe[1], &tmp, 1) <= 0)
+				perror("virtio net fd pipe write");
+		}
+	}
+	return ret;
+}
+
+static int fd_net_rx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
+{
+	int ret;
+	struct lkl_netdev_fd *nd_fd =
+		container_of(nd, struct lkl_netdev_fd, dev);
+
+	do {
+		ret = readv(nd_fd->fd_rx, (struct iovec *)iov, cnt);
+	} while (ret == -1 && errno == EINTR);
+
+	if (ret < 0) {
+		if (errno != EAGAIN) {
+			perror("virtio net fd read");
+		} else {
+			char tmp = 0;
+
+			nd_fd->poll_rx = 1;
+			if (write(nd_fd->pipe[1], &tmp, 1) < 0)
+				perror("virtio net fd pipe write");
+		}
+	}
+	return ret;
+}
+
+static int fd_net_poll(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_fd *nd_fd =
+		container_of(nd, struct lkl_netdev_fd, dev);
+	struct pollfd pfds[3] = {
+		{
+			.fd = nd_fd->fd_rx,
+		},
+		{
+			.fd = nd_fd->fd_tx,
+		},
+		{
+			.fd = nd_fd->pipe[0],
+			.events = POLLIN,
+		},
+	};
+	int ret;
+
+	if (nd_fd->poll_rx)
+		pfds[0].events |= POLLIN|POLLPRI;
+	if (nd_fd->poll_tx)
+		pfds[1].events |= POLLOUT;
+
+	do {
+		ret = poll(pfds, 3, -1);
+	} while (ret == -1 && errno == EINTR);
+
+	if (ret < 0) {
+		perror("virtio net fd poll");
+		return 0;
+	}
+
+	if (pfds[2].revents & (POLLHUP|POLLNVAL))
+		return LKL_DEV_NET_POLL_HUP;
+
+	if (pfds[2].revents & POLLIN) {
+		char tmp[PIPE_BUF];
+
+		ret = read(nd_fd->pipe[0], tmp, PIPE_BUF);
+		if (ret == 0)
+			return LKL_DEV_NET_POLL_HUP;
+		if (ret < 0)
+			perror("virtio net fd pipe read");
+	}
+
+	ret = 0;
+
+	if (pfds[0].revents & (POLLIN|POLLPRI)) {
+		nd_fd->poll_rx = 0;
+		ret |= LKL_DEV_NET_POLL_RX;
+	}
+
+	if (pfds[1].revents & POLLOUT) {
+		nd_fd->poll_tx = 0;
+		ret |= LKL_DEV_NET_POLL_TX;
+	}
+
+	return ret;
+}
+
+static void fd_net_poll_hup(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_fd *nd_fd =
+		container_of(nd, struct lkl_netdev_fd, dev);
+
+	/* this will cause a POLLHUP / POLLNVAL in the poll function */
+	close(nd_fd->pipe[0]);
+	close(nd_fd->pipe[1]);
+}
+
+static void fd_net_free(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_fd *nd_fd =
+		container_of(nd, struct lkl_netdev_fd, dev);
+
+	close(nd_fd->fd_rx);
+	close(nd_fd->fd_tx);
+	free(nd_fd);
+}
+
+struct lkl_dev_net_ops fd_net_ops =  {
+	.tx = fd_net_tx,
+	.rx = fd_net_rx,
+	.poll = fd_net_poll,
+	.poll_hup = fd_net_poll_hup,
+	.free = fd_net_free,
+};
+
+struct lkl_netdev *lkl_register_netdev_fd(int fd_rx, int fd_tx)
+{
+	struct lkl_netdev_fd *nd;
+
+	nd = malloc(sizeof(*nd));
+	if (!nd) {
+		fprintf(stderr, "fdnet: failed to allocate memory\n");
+		/* TODO: propagate the error state, maybe use errno for that? */
+		return NULL;
+	}
+
+	memset(nd, 0, sizeof(*nd));
+
+	nd->fd_rx = fd_rx;
+	nd->fd_tx = fd_tx;
+	if (pipe(nd->pipe) < 0) {
+		perror("pipe");
+		free(nd);
+		return NULL;
+	}
+
+	if (fcntl(nd->pipe[0], F_SETFL, O_NONBLOCK) < 0) {
+		perror("fcntl");
+		close(nd->pipe[0]);
+		close(nd->pipe[1]);
+		free(nd);
+		return NULL;
+	}
+
+	nd->dev.ops = &fd_net_ops;
+	return &nd->dev;
+}
diff --git a/tools/lkl/lib/virtio_net_fd.h b/tools/lkl/lib/virtio_net_fd.h
new file mode 100644
index 000000000000..713ba13cca7c
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_fd.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _VIRTIO_NET_FD_H
+#define _VIRTIO_NET_FD_H
+
+struct ifreq;
+
+/**
+ * lkl_register_netdev_fd - register a file descriptor-based network
+ * device as a NIC
+ *
+ * @fd_rx - a POSIX file descriptor number for input
+ * @fd_tx - a POSIX file descriptor number for output
+ * @returns a struct lkl_netdev entry for virtio-net, or NULL on failure
+ */
+struct lkl_netdev *lkl_register_netdev_fd(int fd_rx, int fd_tx);
+
+
+/**
+ * lkl_netdev_tap_init - initialize tap related structure for lkl_netdev.
+ *
+ * @path - the path to open the device.
+ * @offload - offload bits for the device
+ * @ifr - struct ifreq for ioctl.
+ */
+struct lkl_netdev *lkl_netdev_tap_init(const char *path, int offload,
+				       struct ifreq *ifr);
+
+#endif /* _VIRTIO_NET_FD_H */
diff --git a/tools/lkl/lib/virtio_net_macvtap.c b/tools/lkl/lib/virtio_net_macvtap.c
new file mode 100644
index 000000000000..5d6d2c822f2d
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_macvtap.c
@@ -0,0 +1,32 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * macvtap based virtual network interface feature for LKL
+ * Copyright (c) 2016 Hajime Tazaki
+ *
+ * Author: Hajime Tazaki <thehajime@gmail.com>
+ *
+ * Current implementation is linux-specific.
+ */
+
+/*
+ * You need to configure host device in advance.
+ *
+ * sudo ip link add link eth0 name vtap0 type macvtap mode passthru
+ * sudo ip link set dev vtap0 up
+ * sudo chown thehajime /dev/tap22
+ */
+
+#include <net/if.h>
+#include <linux/if_tun.h>
+
+#include "virtio.h"
+#include "virtio_net_fd.h"
+
+struct lkl_netdev *lkl_netdev_macvtap_create(const char *path, int offload)
+{
+	struct ifreq ifr = {
+		.ifr_flags = IFF_TAP | IFF_NO_PI,
+	};
+
+	return lkl_netdev_tap_init(path, offload, &ifr);
+}
diff --git a/tools/lkl/lib/virtio_net_pipe.c b/tools/lkl/lib/virtio_net_pipe.c
new file mode 100644
index 000000000000..c68d4c855499
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_pipe.c
@@ -0,0 +1,76 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * pipe based virtual network interface feature for LKL
+ * Copyright (c) 2017,2016 Motomu Utsumi
+ *
+ * Author: Motomu Utsumi <motomuman@gmail.com>
+ *
+ * Current implementation is linux-specific.
+ */
+#include <stdio.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <string.h>
+#include <fcntl.h>
+
+#include "virtio.h"
+#include "virtio_net_fd.h"
+
+struct lkl_netdev *lkl_netdev_pipe_create(const char *_ifname, int offload)
+{
+	struct lkl_netdev *nd;
+	int fd_rx, fd_tx;
+	char *ifname = strdup(_ifname), *ifname_rx = NULL, *ifname_tx = NULL;
+
+	ifname_rx = strtok(ifname, "|");
+	if (ifname_rx == NULL) {
+		fprintf(stderr, "invalid ifname format: %s\n", ifname);
+		free(ifname);
+		return NULL;
+	}
+
+	ifname_tx = strtok(NULL, "|");
+	if (ifname_tx == NULL) {
+		fprintf(stderr, "invalid ifname format: %s\n", ifname);
+		free(ifname);
+		return NULL;
+	}
+
+	if (strtok(NULL, "|") != NULL) {
+		fprintf(stderr, "invalid ifname format: %s\n", ifname);
+		free(ifname);
+		return NULL;
+	}
+
+	fd_rx = open(ifname_rx, O_RDWR|O_NONBLOCK);
+	if (fd_rx < 0) {
+		perror("can not open ifname_rx pipe");
+		free(ifname);
+		return NULL;
+	}
+
+	fd_tx = open(ifname_tx, O_RDWR|O_NONBLOCK);
+	if (fd_tx < 0) {
+		perror("can not open ifname_tx pipe");
+		close(fd_rx);
+		free(ifname);
+		return NULL;
+	}
+
+	nd = lkl_register_netdev_fd(fd_rx, fd_tx);
+	if (!nd) {
+		perror("failed to register to.");
+		close(fd_rx);
+		close(fd_tx);
+		free(ifname);
+		return NULL;
+	}
+
+	free(ifname);
+	/*
+	 * To avoid a mismatch with the LKL on the other side,
+	 * we always enable the vnet hdr
+	 */
+	nd->has_vnet_hdr = 1;
+	return nd;
+}
diff --git a/tools/lkl/lib/virtio_net_raw.c b/tools/lkl/lib/virtio_net_raw.c
new file mode 100644
index 000000000000..363ccf628569
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_raw.c
@@ -0,0 +1,94 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * raw socket based virtual network interface feature for LKL
+ * Copyright (c) 2015,2016 Ryo Nakamura, Hajime Tazaki
+ *
+ * Author: Ryo Nakamura <upa@wide.ad.jp>
+ *         Hajime Tazaki <thehajime@gmail.com>
+ *
+ * Current implementation is linux-specific.
+ */
+
+#include <stdio.h>
+#include <errno.h>
+#include <string.h>
+#include <unistd.h>
+#include <net/if.h>
+#include <arpa/inet.h>
+#ifdef __linux__
+#include <linux/if_ether.h>
+#include <linux/if_packet.h>
+#elif __FreeBSD__
+#include <netinet/in.h>
+#endif
+#include <fcntl.h>
+
+#include "virtio.h"
+#include "virtio_net_fd.h"
+
+/* since Linux 3.14 (man 7 packet) */
+#ifndef PACKET_QDISC_BYPASS
+#define PACKET_QDISC_BYPASS 20
+#endif
+
+struct lkl_netdev *lkl_netdev_raw_create(const char *ifname)
+{
+#ifdef __linux__
+	int ret;
+	int ifindex =  if_nametoindex(ifname);
+	struct sockaddr_ll ll = {
+		.sll_family = PF_PACKET,
+		.sll_ifindex = ifindex,
+		.sll_protocol = htons(ETH_P_ALL),
+	};
+	struct packet_mreq mreq = {
+		.mr_type = PACKET_MR_PROMISC,
+		.mr_ifindex = ifindex,
+	};
+#endif
+	int fd, fd_flags;
+#ifdef __linux__
+	int val;
+
+	if (ifindex == 0) {
+		perror("if_nametoindex");
+		return NULL;
+	}
+
+	fd = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
+#elif __FreeBSD__
+	fd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
+#endif
+	if (fd < 0) {
+		perror("socket");
+		return NULL;
+	}
+
+#ifdef __linux__
+	ret = bind(fd, (struct sockaddr *)&ll, sizeof(ll));
+	if (ret) {
+		perror("bind");
+		close(fd);
+		return NULL;
+	}
+
+	ret = setsockopt(fd, SOL_PACKET, PACKET_ADD_MEMBERSHIP, &mreq,
+			sizeof(mreq));
+	if (ret) {
+		perror("PACKET_ADD_MEMBERSHIP PACKET_MR_PROMISC");
+		close(fd);
+		return NULL;
+	}
+
+	val = 1;
+	ret = setsockopt(fd, SOL_PACKET, PACKET_QDISC_BYPASS, &val,
+			 sizeof(val));
+	if (ret)
+		perror("PACKET_QDISC_BYPASS, ignoring");
+#endif
+
+	fd_flags = fcntl(fd, F_GETFL, NULL);
+	fcntl(fd, F_SETFL, fd_flags | O_NONBLOCK);
+
+	return lkl_register_netdev_fd(fd, fd);
+}
diff --git a/tools/lkl/lib/virtio_net_tap.c b/tools/lkl/lib/virtio_net_tap.c
new file mode 100644
index 000000000000..f1f64cee9695
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_tap.c
@@ -0,0 +1,111 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * tun/tap based virtual network interface feature for LKL
+ * Copyright (c) 2015,2016 Ryo Nakamura, Hajime Tazaki
+ *
+ * Author: Ryo Nakamura <upa@wide.ad.jp>
+ *         Hajime Tazaki <thehajime@gmail.com>
+ *         Octavian Purdila <octavian.purdila@intel.com>
+ *
+ * Current implementation is linux-specific.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <string.h>
+#include <fcntl.h>
+#include <net/if.h>
+#ifdef __linux__
+#include <linux/if_tun.h>
+#elif __FreeBSD__
+#include <net/if_tun.h>
+#endif
+#include <sys/ioctl.h>
+
+#include "virtio.h"
+#include "virtio_net_fd.h"
+
+#define BIT(x) (1ULL << (x))
+
+struct lkl_netdev *lkl_netdev_tap_init(const char *path, int offload,
+				       struct ifreq *ifr)
+{
+	struct lkl_netdev *nd;
+	int fd, vnet_hdr_sz = 0;
+#ifdef __linux__
+	int ret, tap_arg = 0;
+
+	if (offload & BIT(LKL_VIRTIO_NET_F_GUEST_CSUM))
+		tap_arg |= TUN_F_CSUM;
+	if (offload & (BIT(LKL_VIRTIO_NET_F_GUEST_TSO4) |
+	    BIT(LKL_VIRTIO_NET_F_MRG_RXBUF)))
+		tap_arg |= TUN_F_TSO4 | TUN_F_CSUM;
+	if (offload & (BIT(LKL_VIRTIO_NET_F_GUEST_TSO6)))
+		tap_arg |= TUN_F_TSO6 | TUN_F_CSUM;
+
+	if (tap_arg || (offload & (BIT(LKL_VIRTIO_NET_F_CSUM) |
+				   BIT(LKL_VIRTIO_NET_F_HOST_TSO4) |
+				   BIT(LKL_VIRTIO_NET_F_HOST_TSO6)))) {
+		ifr->ifr_flags |= IFF_VNET_HDR;
+		vnet_hdr_sz = sizeof(struct lkl_virtio_net_hdr_v1);
+	}
+#endif
+	fd = open(path, O_RDWR|O_NONBLOCK);
+	if (fd < 0) {
+		perror("open");
+		return NULL;
+	}
+
+#ifdef __linux__
+	ret = ioctl(fd, TUNSETIFF, ifr);
+	if (ret < 0) {
+		fprintf(stderr, "%s: failed to attach to: %s\n",
+			path, strerror(errno));
+		close(fd);
+		return NULL;
+	}
+	if (vnet_hdr_sz && ioctl(fd, TUNSETVNETHDRSZ, &vnet_hdr_sz) != 0) {
+		fprintf(stderr, "%s: failed to TUNSETVNETHDRSZ to: %s\n",
+			path, strerror(errno));
+		close(fd);
+		return NULL;
+	}
+	if (ioctl(fd, TUNSETOFFLOAD, tap_arg) != 0) {
+		fprintf(stderr, "%s: failed to TUNSETOFFLOAD: %s\n",
+			path, strerror(errno));
+		close(fd);
+		return NULL;
+	}
+#endif
+	nd = lkl_register_netdev_fd(fd, fd);
+	if (!nd) {
+		fprintf(stderr, "failed to register netdev\n");
+		close(fd);
+		return NULL;
+	}
+
+	nd->has_vnet_hdr = (vnet_hdr_sz != 0);
+	return nd;
+}
+
+struct lkl_netdev *lkl_netdev_tap_create(const char *ifname, int offload)
+{
+#ifdef __linux__
+	char *path = "/dev/net/tun";
+#elif __FreeBSD__
+	char path[32];
+
+	snprintf(path, sizeof(path), "/dev/%s", ifname);
+#endif
+
+	struct ifreq ifr = {
+#ifdef __linux__
+		.ifr_flags = IFF_TAP | IFF_NO_PI,
+#endif
+	};
+
+	strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
+
+	return lkl_netdev_tap_init(path, offload, &ifr);
+}
diff --git a/tools/lkl/lib/virtio_net_vde.c b/tools/lkl/lib/virtio_net_vde.c
new file mode 100644
index 000000000000..1d017aba91ae
--- /dev/null
+++ b/tools/lkl/lib/virtio_net_vde.c
@@ -0,0 +1,168 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+#include <poll.h>
+#include <lkl.h>
+#include <lkl_host.h>
+
+#include "virtio.h"
+
+#include <libvdeplug.h>
+
+struct lkl_netdev_vde {
+	struct lkl_netdev dev;
+	VDECONN *conn;
+};
+
+struct lkl_netdev *lkl_netdev_vde_create(char const *switch_path);
+static int net_vde_tx(struct lkl_netdev *nd, struct iovec *iov, int cnt);
+static int net_vde_rx(struct lkl_netdev *nd, struct iovec *iov, int cnt);
+static int net_vde_poll_with_timeout(struct lkl_netdev *nd, int timeout);
+static int net_vde_poll(struct lkl_netdev *nd);
+static void net_vde_poll_hup(struct lkl_netdev *nd);
+static void net_vde_free(struct lkl_netdev *nd);
+
+struct lkl_dev_net_ops vde_net_ops = {
+	.tx = net_vde_tx,
+	.rx = net_vde_rx,
+	.poll = net_vde_poll,
+	.poll_hup = net_vde_poll_hup,
+	.free = net_vde_free,
+};
+
+int net_vde_tx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
+{
+	int ret;
+	struct lkl_netdev_vde *nd_vde =
+		container_of(nd, struct lkl_netdev_vde, dev);
+	void *data = iov[0].iov_base;
+	int len = (int)iov[0].iov_len;
+
+	ret = vde_send(nd_vde->conn, data, len, 0);
+	if (ret <= 0 && errno == EAGAIN)
+		return -1;
+	return ret;
+}
+
+int net_vde_rx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
+{
+	int ret;
+	struct lkl_netdev_vde *nd_vde =
+		container_of(nd, struct lkl_netdev_vde, dev);
+	void *data = iov[0].iov_base;
+	int len = (int)iov[0].iov_len;
+
+	/*
+	 * Due to a bug in libvdeplug we have to first poll to make sure
+	 * that there is data available.
+	 * The correct solution would be to just use
+	 *   ret = vde_recv(nd_vde->conn, data, len, MSG_DONTWAIT);
+	 * This should be changed once libvdeplug is fixed.
+	 */
+	ret = 0;
+	if (net_vde_poll_with_timeout(nd, 0) & LKL_DEV_NET_POLL_RX)
+		ret = vde_recv(nd_vde->conn, data, len, 0);
+	if (ret <= 0)
+		return -1;
+	return ret;
+}
+
+int net_vde_poll_with_timeout(struct lkl_netdev *nd, int timeout)
+{
+	int ret;
+	struct lkl_netdev_vde *nd_vde =
+		container_of(nd, struct lkl_netdev_vde, dev);
+	struct pollfd pollfds[] = {
+			{
+					.fd = vde_datafd(nd_vde->conn),
+					.events = POLLIN | POLLOUT,
+			},
+			{
+					.fd = vde_ctlfd(nd_vde->conn),
+					.events = POLLHUP | POLLIN
+			}
+	};
+
+	while (poll(pollfds, 2, timeout) < 0 && errno == EINTR)
+		;
+
+	ret = 0;
+
+	if (pollfds[1].revents & (POLLHUP | POLLNVAL | POLLIN))
+		return LKL_DEV_NET_POLL_HUP;
+	if (pollfds[0].revents & (POLLHUP | POLLNVAL))
+		return LKL_DEV_NET_POLL_HUP;
+
+	if (pollfds[0].revents & POLLIN)
+		ret |= LKL_DEV_NET_POLL_RX;
+	if (pollfds[0].revents & POLLOUT)
+		ret |= LKL_DEV_NET_POLL_TX;
+
+	return ret;
+}
+
+int net_vde_poll(struct lkl_netdev *nd)
+{
+	return net_vde_poll_with_timeout(nd, -1);
+}
+
+void net_vde_poll_hup(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_vde *nd_vde =
+		container_of(nd, struct lkl_netdev_vde, dev);
+
+	vde_close(nd_vde->conn);
+}
+
+void net_vde_free(struct lkl_netdev *nd)
+{
+	struct lkl_netdev_vde *nd_vde =
+		container_of(nd, struct lkl_netdev_vde, dev);
+
+	free(nd_vde);
+}
+
+struct lkl_netdev *lkl_netdev_vde_create(char const *switch_path)
+{
+	struct lkl_netdev_vde *nd;
+	struct vde_open_args open_args = {.port = 0, .group = 0, .mode = 0700 };
+	char *switch_path_copy = 0;
+
+	nd = malloc(sizeof(*nd));
+	if (!nd) {
+		fprintf(stderr, "Failed to allocate memory.\n");
+		/* TODO: propagate the error state, maybe use errno? */
+		return 0;
+	}
+	nd->dev.ops = &vde_net_ops;
+
+	/* vde_open() allows the null pointer as path which means
+	 * "VDE default path"
+	 */
+	if (switch_path != 0) {
+		/* vde_open() takes a non-const char * which is a bug in their
+		 * function declaration. Even though the implementation does not
+		 * modify the string, we shouldn't just cast away the const.
+		 */
+		size_t switch_path_length = strlen(switch_path);
+
+		switch_path_copy = calloc(switch_path_length + 1, sizeof(char));
+		if (!switch_path_copy) {
+			fprintf(stderr, "Failed to allocate memory.\n");
+			free(nd); /* TODO: propagate the error state */
+			return 0;
+		}
+		strncpy(switch_path_copy, switch_path, switch_path_length);
+	}
+	nd->conn = vde_open(switch_path_copy, "lkl-virtio-net", &open_args);
+	free(switch_path_copy);
+	if (nd->conn == 0) {
+		fprintf(stderr, "Failed to connect to vde switch.\n");
+		free(nd); /* TODO: propagate the error state */
+		return 0;
+	}
+
+	return &nd->dev;
+}
diff --git a/tools/lkl/tests/net-setup.sh b/tools/lkl/tests/net-setup.sh
new file mode 100644
index 000000000000..cc260ed68a7b
--- /dev/null
+++ b/tools/lkl/tests/net-setup.sh
@@ -0,0 +1,134 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
+TEST_TAP_IFNAME=tap
+else
+TEST_TAP_IFNAME=lkl_test_tap
+fi
+TEST_IP_NETWORK=192.168.113.0
+TEST_IP_NETMASK=24
+TEST_IP6_NETWORK=fc03::0
+TEST_IP6_NETMASK=64
+TEST_MAC0="aa:bb:cc:dd:ee:ff"
+TEST_MAC1="aa:bb:cc:dd:ee:aa"
+TEST_NETSERVER_PORT=11223
+
+# $1 - count
+# $2 - netcount
+ip_add()
+{
+    IP_HEX=$(printf '%.2X%.2X%.2X%.2X\n' \
+         `echo $TEST_IP_NETWORK | sed -e 's/\./ /g'`)
+    NET_COUNT=$(( 1 << (32 - $TEST_IP_NETMASK) ))
+    NEXT_IP_HEX=$(printf %.8X `echo $((0x$IP_HEX + $1 + ${2:-0} * $NET_COUNT))`)
+    NEXT_IP=$(printf '%d.%d.%d.%d\n' \
+          `echo $NEXT_IP_HEX | sed -r 's/(..)/0x\1 /g'`)
+    echo -n "$NEXT_IP"
+}
+
+# $1 - count
+# $2 - netcount
+ip6_add()
+{
+    IP6_PREFIX=${TEST_IP6_NETWORK%*::*}
+    IP6_HOST=${TEST_IP6_NETWORK#*::*}
+    echo -n "$(printf "%x" $((0x$IP6_PREFIX+${2:-0})))::$(($IP6_HOST+$1))"
+}
+
+ip_host()
+{
+
+    ip_add 1 $1
+}
+
+ip_lkl()
+{
+    ip_add 2 $1
+}
+
+ip_host_mask()
+{
+    echo -n "$(ip_host $1)/$TEST_IP_NETMASK"
+}
+
+ip_net_mask()
+{
+    echo "$(ip_add 0 $1)/$TEST_IP_NETMASK"
+}
+
+ip6_host()
+{
+    ip6_add 1 $1
+}
+
+ip6_lkl()
+{
+    ip6_add 2 $1
+}
+
+ip6_host_mask()
+{
+    echo -n "$(ip6_host $1)/$TEST_IP6_NETMASK"
+}
+
+ip6_net_mask()
+{
+    echo "$(ip6_add 0 $1)/$TEST_IP6_NETMASK"
+}
+
+tap_ifname()
+{
+    echo -n "$TEST_TAP_IFNAME${1:-0}"
+}
+
+tap_prepare()
+{
+    if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+        if ! lkl_test_cmd test -d /dev/net &>/dev/null; then
+            lkl_test_cmd sudo mkdir /dev/net
+            lkl_test_cmd sudo ln -s /dev/tun /dev/net/tun
+        fi
+        TAP_USER="vpn"
+        ANDROID_USER="vpn,vpn,net_admin,inet"
+        export_vars ANDROID_USER
+    else
+        TAP_USER=$USER
+    fi
+}
+
+tap_setup()
+{
+    if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
+        lkl_test_cmd sudo ifconfig tap create
+        lkl_test_cmd sudo sysctl net.link.tap.up_on_open=1
+        lkl_test_cmd sudo sysctl net.link.tap.user_open=1
+        lkl_test_cmd sudo ifconfig $(tap_ifname) $(ip_host)
+        lkl_test_cmd sudo ifconfig $(tap_ifname) inet6 $(ip6_host)
+        return
+    fi
+
+    lkl_test_cmd sudo ip tuntap add dev $(tap_ifname $1) mode tap user $TAP_USER
+    lkl_test_cmd sudo ip link set dev $(tap_ifname $1) up
+    lkl_test_cmd sudo ip addr add dev $(tap_ifname $1) $(ip_host_mask $1)
+    lkl_test_cmd sudo ip -6 addr add dev $(tap_ifname $1) $(ip6_host_mask $1)
+
+    if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+        lkl_test_cmd sudo ip route add $(ip_net_mask $1) \
+                     dev $(tap_ifname $1) proto kernel scope link \
+                     src $(ip_host $1) table local
+        lkl_test_cmd sudo ip -6 route add $(ip6_net_mask $1) \
+                     dev $(tap_ifname $1) table local
+    fi
+}
+
+tap_cleanup()
+{
+    if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
+        lkl_test_cmd sudo ifconfig $(tap_ifname) destroy
+        return
+    fi
+
+    lkl_test_cmd sudo ip link set dev $(tap_ifname $1) down
+    lkl_test_cmd sudo ip tuntap del dev $(tap_ifname $1) mode tap
+}
diff --git a/tools/lkl/tests/net-test.c b/tools/lkl/tests/net-test.c
new file mode 100644
index 000000000000..d2fd19f1b995
--- /dev/null
+++ b/tools/lkl/tests/net-test.c
@@ -0,0 +1,317 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <errno.h>
+#ifdef __FreeBSD__
+#include <sys/types.h>
+#endif
+#ifdef __MINGW32__
+#include <winsock2.h>
+#else
+#include <sys/socket.h>
+#include <netinet/in.h>
+#include <arpa/inet.h>
+#endif
+
+#include <lkl.h>
+#include <lkl_host.h>
+
+#include "cla.h"
+#include "test.h"
+
+enum {
+	BACKEND_TAP,
+	BACKEND_MACVTAP,
+	BACKEND_RAW,
+	BACKEND_DPDK,
+	BACKEND_PIPE,
+	BACKEND_NONE,
+};
+
+const char *backends[] = { "tap", "macvtap", "raw", "dpdk", "pipe", "loopback",
+			   NULL };
+static struct {
+	int backend;
+	const char *ifname;
+	int dhcp, nmlen;
+	unsigned int ip, dst, gateway, sleep;
+} cla = {
+	.backend = BACKEND_NONE,
+	.ip = INADDR_NONE,
+	.gateway = INADDR_NONE,
+	.dst = INADDR_NONE,
+	.sleep = 0,
+};
+
+
+struct cl_arg args[] = {
+	{"backend", 'b', "network backend type", 1, CL_ARG_STR_SET,
+	 &cla.backend, backends},
+	{"ifname", 'i', "interface name", 1, CL_ARG_STR, &cla.ifname},
+	{"dhcp", 'd', "use dhcp to configure LKL", 0, CL_ARG_BOOL, &cla.dhcp},
+	{"ip", 'I', "IPv4 address to use", 1, CL_ARG_IPV4, &cla.ip},
+	{"netmask-len", 'n', "IPv4 netmask length", 1, CL_ARG_INT,
+	 &cla.nmlen},
+	{"gateway", 'g', "IPv4 gateway to use", 1, CL_ARG_IPV4, &cla.gateway},
+	{"dst", 'D', "IPv4 destination address", 1, CL_ARG_IPV4, &cla.dst},
+	{"sleep", 's', "sleep", 1, CL_ARG_INT, &cla.sleep},
+	{0},
+};
+
+u_short
+in_cksum(const u_short *addr, register int len, u_short csum)
+{
+	int nleft = len;
+	const u_short *w = addr;
+	u_short answer;
+	int sum = csum;
+
+	while (nleft > 1)  {
+		sum += *w++;
+		nleft -= 2;
+	}
+
+	if (nleft == 1)
+		sum += htons(*(u_char *)w << 8);
+
+	sum = (sum >> 16) + (sum & 0xffff);
+	sum += (sum >> 16);
+	answer = ~sum;
+	return answer;
+}
+
+static int lkl_test_sleep(void)
+{
+	struct lkl_timespec ts = {
+		.tv_sec = cla.sleep,
+	};
+	int ret;
+
+	ret = lkl_sys_nanosleep((struct __lkl__kernel_timespec *)&ts, NULL);
+	if (ret < 0) {
+		lkl_test_logf("nanosleep error: %s\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static int lkl_test_icmp(void)
+{
+	int sock, ret;
+	struct lkl_iphdr *iph;
+	struct lkl_icmphdr *icmp;
+	struct lkl_sockaddr_in saddr;
+	struct lkl_pollfd pfd;
+	char buf[32];
+
+	if (cla.dst == INADDR_NONE)
+		return TEST_SKIP;
+
+	memset(&saddr, 0, sizeof(saddr));
+	saddr.sin_family = AF_INET;
+	saddr.sin_addr.lkl_s_addr = cla.dst;
+
+	lkl_test_logf("pinging %s\n",
+		      inet_ntoa(*(struct in_addr *)&saddr.sin_addr));
+
+	sock = lkl_sys_socket(LKL_AF_INET, LKL_SOCK_RAW, LKL_IPPROTO_ICMP);
+	if (sock < 0) {
+		lkl_test_logf("socket error (%s)\n", lkl_strerror(sock));
+		return TEST_FAILURE;
+	}
+
+	icmp = malloc(sizeof(*icmp));
+	icmp->type = LKL_ICMP_ECHO;
+	icmp->code = 0;
+	icmp->checksum = 0;
+	icmp->un.echo.sequence = 0;
+	icmp->un.echo.id = 0;
+	icmp->checksum = in_cksum((u_short *)icmp, sizeof(*icmp), 0);
+
+	ret = lkl_sys_sendto(sock, icmp, sizeof(*icmp), 0,
+			     (struct lkl_sockaddr *)&saddr,
+			     sizeof(saddr));
+	free(icmp);
+	if (ret < 0) {
+		lkl_test_logf("sendto error (%s)\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+
+	pfd.fd = sock;
+	pfd.events = LKL_POLLIN;
+	pfd.revents = 0;
+
+	ret = lkl_sys_poll(&pfd, 1, 1000);
+	if (ret < 0) {
+		lkl_test_logf("poll error (%s)\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	ret = lkl_sys_recv(sock, buf, sizeof(buf), LKL_MSG_DONTWAIT);
+	if (ret < 0) {
+		lkl_test_logf("recv error (%s)\n", lkl_strerror(ret));
+		return TEST_FAILURE;
+	}
+
+	iph = (struct lkl_iphdr *)buf;
+	icmp = (struct lkl_icmphdr *)(buf + iph->ihl * 4);
+	/* DHCP server may issue an ICMP echo request to a dhcp client */
+	if ((icmp->type != LKL_ICMP_ECHOREPLY || icmp->code != 0) &&
+	    (icmp->type != LKL_ICMP_ECHO)) {
+		lkl_test_logf("no ICMP echo reply (type=%d, code=%d)\n",
+			      icmp->type, icmp->code);
+		return TEST_FAILURE;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static struct lkl_netdev *nd;
+
+static int lkl_test_nd_create(void)
+{
+	switch (cla.backend) {
+	case BACKEND_NONE:
+		return TEST_SKIP;
+	case BACKEND_TAP:
+		nd = lkl_netdev_tap_create(cla.ifname, 0);
+		break;
+	case BACKEND_DPDK:
+		nd = lkl_netdev_dpdk_create(cla.ifname, 0, NULL);
+		break;
+	case BACKEND_RAW:
+		nd = lkl_netdev_raw_create(cla.ifname);
+		break;
+	case BACKEND_MACVTAP:
+		nd = lkl_netdev_macvtap_create(cla.ifname, 0);
+		break;
+	case BACKEND_PIPE:
+		nd = lkl_netdev_pipe_create(cla.ifname, 0);
+		break;
+	}
+
+	if (!nd) {
+		lkl_test_logf("failed to create netdev\n");
+		return TEST_BAILOUT;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static int nd_id;
+
+static int lkl_test_nd_add(void)
+{
+	if (cla.backend == BACKEND_NONE)
+		return TEST_SKIP;
+
+	nd_id = lkl_netdev_add(nd, NULL);
+	if (nd_id < 0) {
+		lkl_test_logf("failed to add netdev: %s\n",
+			      lkl_strerror(nd_id));
+		return TEST_BAILOUT;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static int lkl_test_nd_remove(void)
+{
+	if (cla.backend == BACKEND_NONE)
+		return TEST_SKIP;
+
+	lkl_netdev_remove(nd_id);
+	lkl_netdev_free(nd);
+	return TEST_SUCCESS;
+}
+
+LKL_TEST_CALL(start_kernel, lkl_start_kernel, 0, &lkl_host_ops,
+	"mem=16M loglevel=8 %s", cla.dhcp ? "ip=dhcp" : "");
+LKL_TEST_CALL(stop_kernel, lkl_sys_halt, 0);
+
+static int nd_ifindex;
+
+static int lkl_test_nd_ifindex(void)
+{
+	if (cla.backend == BACKEND_NONE)
+		return TEST_SKIP;
+
+	nd_ifindex = lkl_netdev_get_ifindex(nd_id);
+	if (nd_ifindex < 0) {
+		lkl_test_logf("failed to get ifindex for netdev id %d: %s\n",
+			      nd_id, lkl_strerror(nd_ifindex));
+		return TEST_BAILOUT;
+	}
+
+	return TEST_SUCCESS;
+}
+
+LKL_TEST_CALL(if_up, lkl_if_up, 0,
+	      cla.backend == BACKEND_NONE ? 1 : nd_ifindex);
+
+static int lkl_test_set_ipv4(void)
+{
+	int ret;
+
+	if (cla.backend == BACKEND_NONE || cla.ip == LKL_INADDR_NONE)
+		return TEST_SKIP;
+
+	ret = lkl_if_set_ipv4(nd_ifindex, cla.ip, cla.nmlen);
+	if (ret < 0) {
+		lkl_test_logf("failed to set IPv4 address: %s\n",
+			      lkl_strerror(ret));
+		return TEST_BAILOUT;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static int lkl_test_set_gateway(void)
+{
+	int ret;
+
+	if (cla.backend == BACKEND_NONE || cla.gateway == LKL_INADDR_NONE)
+		return TEST_SKIP;
+
+	ret = lkl_set_ipv4_gateway(cla.gateway);
+	if (ret < 0) {
+		lkl_test_logf("failed to set IPv4 gateway: %s\n",
+			      lkl_strerror(ret));
+		return TEST_BAILOUT;
+	}
+
+	return TEST_SUCCESS;
+}
+
+struct lkl_test tests[] = {
+	LKL_TEST(nd_create),
+	LKL_TEST(nd_add),
+	LKL_TEST(start_kernel),
+	LKL_TEST(nd_ifindex),
+	LKL_TEST(if_up),
+	LKL_TEST(set_ipv4),
+	LKL_TEST(set_gateway),
+	LKL_TEST(sleep),
+	LKL_TEST(icmp),
+	LKL_TEST(nd_remove),
+	LKL_TEST(stop_kernel),
+};
+
+int main(int argc, const char **argv)
+{
+	if (parse_args(argc, argv, args) < 0)
+		return -1;
+
+	if (cla.ip != LKL_INADDR_NONE && (cla.nmlen < 0 || cla.nmlen > 32)) {
+		fprintf(stderr, "invalid netmask length %d\n", cla.nmlen);
+		return -1;
+	}
+
+	lkl_host_ops.print = lkl_test_log;
+
+	return lkl_test_run(tests, sizeof(tests)/sizeof(struct lkl_test),
+			    "net %s", backends[cla.backend]);
+}
diff --git a/tools/lkl/tests/net.sh b/tools/lkl/tests/net.sh
new file mode 100755
index 000000000000..cd8de53fe0fd
--- /dev/null
+++ b/tools/lkl/tests/net.sh
@@ -0,0 +1,186 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+
+source $script_dir/test.sh
+source $script_dir/net-setup.sh
+
+cleanup_backend()
+{
+    set -e
+
+    case "$1" in
+    "tap")
+        tap_cleanup
+        ;;
+    "pipe")
+        rm -rf "$work_dir"
+        ;;
+    "raw")
+        ;;
+    "macvtap")
+        sudo ip link del dev $(tap_ifname) type macvtap
+        ;;
+    "loopback")
+        ;;
+    esac
+}
+
+get_test_ip()
+{
+    # DHCP test parameters
+    TEST_HOST=8.8.8.8
+    HOST_IF=$(lkl_test_cmd ip route get $TEST_HOST | head -n1 |cut -d ' ' -f5)
+    HOST_GW=$(lkl_test_cmd ip route get $TEST_HOST | head -n1 | cut -d ' ' -f3)
+    if lkl_test_cmd ping -c1 -w1 $HOST_GW; then
+        TEST_IP_REMOTE=$HOST_GW
+    elif lkl_test_cmd ping -c1 -w1 $TEST_HOST; then
+        TEST_IP_REMOTE=$TEST_HOST
+    else
+        echo "could not find remote test ip"
+        return $TEST_SKIP
+    fi
+
+    export_vars HOST_IF TEST_IP_REMOTE
+}
+
+setup_backend()
+{
+    set -e
+
+    if [ "$LKL_HOST_CONFIG_POSIX" != "y" ] &&
+       [ "$1" != "loopback" ]; then
+        echo "not a posix environment"
+        return $TEST_SKIP
+    fi
+
+    case "$1" in
+    "loopback")
+        ;;
+    "pipe")
+        if [ -z "$(lkl_test_cmd which mkfifo)" ]; then
+            echo "no mkfifo command"
+            return $TEST_SKIP
+        else
+            work_dir=$(lkl_test_cmd mktemp -d)
+        fi
+        fifo1=$work_dir/fifo1
+        fifo2=$work_dir/fifo2
+        lkl_test_cmd mkfifo $fifo1
+        lkl_test_cmd mkfifo $fifo2
+        export_vars work_dir fifo1 fifo2
+        ;;
+    "tap")
+        tap_prepare
+        if ! lkl_test_cmd test -c /dev/net/tun; then
+            if [ -z "$LKL_HOST_CONFIG_BSD" ]; then
+                echo "missing /dev/net/tun"
+                return $TEST_SKIP
+            fi
+        fi
+        tap_setup
+        ;;
+    "raw")
+        if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
+            return $TEST_SKIP
+        fi
+        get_test_ip
+        ;;
+    "macvtap")
+        get_test_ip
+        if ! lkl_test_cmd sudo ip link add link $HOST_IF \
+             name $(tap_ifname) type macvtap mode passthru; then
+            echo "failed to create macvtap, skipping"
+            return $TEST_SKIP
+        fi
+        MACVTAP=/dev/tap$(lkl_test_cmd ip link show dev $(tap_ifname) | \
+                                 grep -o '^[0-9]*')
+        lkl_test_cmd sudo ip link set dev $(tap_ifname) up
+        lkl_test_cmd sudo chown $USER $MACVTAP
+        export_vars MACVTAP
+        ;;
+    "dpdk")
+        if [ -z "$LKL_TEST_NET_DPDK" ]; then
+            echo "DPDK needs user setup"
+            return $TEST_SKIP
+        fi
+        ;;
+    *)
+        echo "don't know how to setup backend $1"
+        return $TEST_FAILURE
+        ;;
+    esac
+}
+
+run_tests()
+{
+    case "$1" in
+    "loopback")
+        lkl_test_exec $script_dir/net-test --dst 127.0.0.1
+        ;;
+    "pipe")
+        VALGRIND="" lkl_test_exec $script_dir/net-test --backend pipe \
+                      --ifname "$fifo1|$fifo2" \
+                      --ip $(ip_host) --netmask-len $TEST_IP_NETMASK \
+                      --sleep 1800 >/dev/null &
+        cp $script_dir/net-test $script_dir/net-test2
+
+        sleep 10
+        lkl_test_exec $script_dir/net-test2 --backend pipe \
+                      --ifname "$fifo2|$fifo1" \
+                      --ip $(ip_lkl) --netmask-len $TEST_IP_NETMASK \
+                      --dst $(ip_host)
+        rm -f $script_dir/net-test2
+        kill $!
+        wait $! 2>/dev/null
+        ;;
+    "tap")
+        lkl_test_exec $script_dir/net-test --backend tap \
+                      --ifname $(tap_ifname) \
+                      --ip $(ip_lkl) --netmask-len $TEST_IP_NETMASK \
+                      --dst $(ip_host)
+        ;;
+    "raw")
+        lkl_test_exec sudo $script_dir/net-test --backend raw \
+                      --ifname $HOST_IF --dhcp --dst $TEST_IP_REMOTE
+        ;;
+    "macvtap")
+        lkl_test_exec $script_dir/net-test --backend macvtap \
+                      --ifname $MACVTAP \
+                      --dhcp --dst $TEST_IP_REMOTE
+        ;;
+    "dpdk")
+        lkl_test_exec sudo $script_dir/net-test --backend dpdk \
+                      --ifname dpdk0 \
+                      --ip $(ip_lkl) --netmask-len $TEST_IP_NETMASK \
+                      --dst $(ip_host)
+        ;;
+    esac
+}
+
+if [ "$1" = "-b" ]; then
+    shift
+    backend=$1
+    shift
+fi
+
+if [ -z "$backend" ]; then
+    backend="loopback"
+fi
+
+lkl_test_plan 1 "net $backend"
+lkl_test_run 1 setup_backend $backend
+
+if [ $? = $TEST_SKIP ]; then
+    exit 0
+fi
+
+trap "cleanup_backend $backend" EXIT
+
+run_tests $backend
+
+trap : EXIT
+lkl_test_plan 1 "net $backend"
+lkl_test_run 1 cleanup_backend $backend
+
-- 
2.20.1 (Apple Git-117)


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um


^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 25/37] checkpatch: avoid showing BIT_ULL warnings for tools/ files
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch

From: Octavian Purdila <tavi.purdila@gmail.com>

Directly using shift operations in userspace compiled code should not
trigger warnings as BIT_ULL macros are not available outside the
kernel.
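
As a minimal sketch of the pattern this check now tolerates under tools/:
userspace code cannot include the kernel's BIT_ULL(), so it spells out the
shift itself. The names DEMO_BIT_ULL, demo_feature, demo_has_feature and
demo_offload_mask, and the bit numbers used, are assumptions made up for
illustration; they are not part of this patch or of LKL.

```c
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the kernel's BIT_ULL().
 * The argument is parenthesised so that DEMO_BIT_ULL(a + b) expands
 * to (1ULL << (a + b)) rather than ((1ULL << a) + b).
 */
#define DEMO_BIT_ULL(nr) (1ULL << (nr))

/* Illustrative feature bit numbers (assumed for this sketch only). */
enum demo_feature { DEMO_F_CSUM = 0, DEMO_F_TSO4 = 11 };

/* Return non-zero if bit 'nr' is set in 'mask'. */
static inline int demo_has_feature(uint64_t mask, unsigned int nr)
{
	return (mask & DEMO_BIT_ULL(nr)) != 0;
}

/* Build a mask with the two illustrative feature bits set. */
static inline uint64_t demo_offload_mask(void)
{
	return DEMO_BIT_ULL(DEMO_F_CSUM) | DEMO_BIT_ULL(DEMO_F_TSO4);
}
```

A definition like this matches the "#define NAME (1ULL << x)" shape that
checkpatch would otherwise flag with a BIT_MACRO check.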

Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 scripts/checkpatch.pl | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 93a7edfe0f05..e739f565497e 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -6313,7 +6313,8 @@ sub process {
 		    $line =~ /#\s*define\s+\w+\s+\(?\s*1\s*([ulUL]*)\s*\<\<\s*(?:\d+|$Ident)\s*\)?/) {
 			my $ull = "";
 			$ull = "_ULL" if (defined($1) && $1 =~ /ll/i);
-			if (CHK("BIT_MACRO",
+			if ($realfile !~ m@\btools/@ &&
+			    CHK("BIT_MACRO",
 				"Prefer using the BIT$ull macro\n" . $herecurr) &&
 			    $fix) {
 				$fixed[$fixlinenr] =~ s/\(?\s*1\s*[ulUL]*\s*<<\s*(\d+|$Ident)\s*\)?/BIT${ull}($1)/;
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 26/37] tools: Add the lkl host library to the common tools Makefile
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Thomas Liebetraut

From: Thomas Liebetraut <thomas@tommie-lie.de>

This patch adds the LKL host library to the kernel tools build system, so
LKL can now be compiled like any other "tool" using:

  $ make tools/lkl ARCH=um UMMODE=library

Signed-off-by: Thomas Liebetraut <thomas@tommie-lie.de>
[Octavian: remove make ARCH=lkl defconfig as it is not (yet) necessary]
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/Makefile | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/tools/Makefile b/tools/Makefile
index 68defd7ecf5d..0506d7dde63f 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -23,6 +23,7 @@ help:
 	@echo '  kvm_stat               - top-like utility for displaying kvm statistics'
 	@echo '  leds                   - LEDs  tools'
 	@echo '  liblockdep             - user-space wrapper for kernel locking-validator'
+	@echo '  lkl                    - The Linux Kernel Library host libraries and tools'
 	@echo '  bpf                    - misc BPF tools'
 	@echo '  pci                    - PCI tools'
 	@echo '  perf                   - Linux performance measurement and analysis tool'
@@ -63,7 +64,7 @@ acpi: FORCE
 cpupower: FORCE
 	$(call descend,power/$@)
 
-cgroup firewire hv guest spi usb virtio vm bpf iio gpio objtool leds wmi pci firmware debugging: FORCE
+cgroup firewire hv guest lkl spi usb virtio vm bpf iio gpio objtool leds wmi pci firmware debugging: FORCE
 	$(call descend,$@)
 
 liblockdep: FORCE
@@ -107,7 +108,7 @@ acpi_install:
 cpupower_install:
 	$(call descend,power/$(@:_install=),install)
 
-cgroup_install firewire_install gpio_install hv_install iio_install perf_install spi_install usb_install virtio_install vm_install bpf_install objtool_install wmi_install pci_install debugging_install:
+cgroup_install firewire_install gpio_install hv_install iio_install lkl_install perf_install spi_install usb_install virtio_install vm_install bpf_install objtool_install wmi_install pci_install debugging_install:
 	$(call descend,$(@:_install=),install)
 
 liblockdep_install:
@@ -133,7 +134,7 @@ install: acpi_install cgroup_install cpupower_install gpio_install \
 		perf_install selftests_install turbostat_install usb_install \
 		virtio_install vm_install bpf_install x86_energy_perf_policy_install \
 		tmon_install freefall_install objtool_install kvm_stat_install \
-		wmi_install pci_install debugging_install intel-speed-select_install
+		wmi_install lkl_install pci_install debugging_install intel-speed-select_install
 
 acpi_clean:
 	$(call descend,power/acpi,clean)
@@ -141,7 +142,7 @@ acpi_clean:
 cpupower_clean:
 	$(call descend,power/cpupower,clean)
 
-cgroup_clean hv_clean firewire_clean spi_clean usb_clean virtio_clean vm_clean wmi_clean bpf_clean iio_clean gpio_clean objtool_clean leds_clean pci_clean firmware_clean debugging_clean:
+cgroup_clean hv_clean firewire_clean lkl_clean spi_clean usb_clean virtio_clean vm_clean wmi_clean bpf_clean iio_clean gpio_clean objtool_clean leds_clean pci_clean firmware_clean debugging_clean:
 	$(call descend,$(@:_clean=),clean)
 
 liblockdep_clean:
@@ -179,7 +180,7 @@ clean: acpi_clean cgroup_clean cpupower_clean hv_clean firewire_clean \
 		perf_clean selftests_clean turbostat_clean spi_clean usb_clean virtio_clean \
 		vm_clean bpf_clean iio_clean x86_energy_perf_policy_clean tmon_clean \
 		freefall_clean build_clean libbpf_clean libsubcmd_clean liblockdep_clean \
-		gpio_clean objtool_clean leds_clean wmi_clean pci_clean firmware_clean debugging_clean \
+		gpio_clean objtool_clean leds_clean wmi_clean lkl_clean pci_clean firmware_clean debugging_clean \
 		intel-speed-select_clean
 
 .PHONY: FORCE
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 26/37] tools: Add the lkl host library to the common tools Makefile
@ 2019-11-08  5:02     ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, linux-kernel-library, linux-arch,
	Thomas Liebetraut, Akira Moroo

From: Thomas Liebetraut <thomas@tommie-lie.de>

This patch includes the lkl host library to the Kernel tools buildsystem.
This also means that lkl can now be compiled like any other "tool" using:

  $ make tools/lkl ARCH=um UMMODE=library

Signed-off-by: Thomas Liebetraut <thomas@tommie-lie.de>
[Octavian: remove make ARCH=lkl defconfig as it is not (yet) necessary]
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/Makefile | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/tools/Makefile b/tools/Makefile
index 68defd7ecf5d..0506d7dde63f 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -23,6 +23,7 @@ help:
 	@echo '  kvm_stat               - top-like utility for displaying kvm statistics'
 	@echo '  leds                   - LEDs  tools'
 	@echo '  liblockdep             - user-space wrapper for kernel locking-validator'
+	@echo '  lkl                    - The Linux Kernel Library host libraries and tools'
 	@echo '  bpf                    - misc BPF tools'
 	@echo '  pci                    - PCI tools'
 	@echo '  perf                   - Linux performance measurement and analysis tool'
@@ -63,7 +64,7 @@ acpi: FORCE
 cpupower: FORCE
 	$(call descend,power/$@)
 
-cgroup firewire hv guest spi usb virtio vm bpf iio gpio objtool leds wmi pci firmware debugging: FORCE
+cgroup firewire hv guest lkl spi usb virtio vm bpf iio gpio objtool leds wmi pci firmware debugging: FORCE
 	$(call descend,$@)
 
 liblockdep: FORCE
@@ -107,7 +108,7 @@ acpi_install:
 cpupower_install:
 	$(call descend,power/$(@:_install=),install)
 
-cgroup_install firewire_install gpio_install hv_install iio_install perf_install spi_install usb_install virtio_install vm_install bpf_install objtool_install wmi_install pci_install debugging_install:
+cgroup_install firewire_install gpio_install hv_install iio_install lkl_install perf_install spi_install usb_install virtio_install vm_install bpf_install objtool_install wmi_install pci_install debugging_install:
 	$(call descend,$(@:_install=),install)
 
 liblockdep_install:
@@ -133,7 +134,7 @@ install: acpi_install cgroup_install cpupower_install gpio_install \
 		perf_install selftests_install turbostat_install usb_install \
 		virtio_install vm_install bpf_install x86_energy_perf_policy_install \
 		tmon_install freefall_install objtool_install kvm_stat_install \
-		wmi_install pci_install debugging_install intel-speed-select_install
+		wmi_install lkl_install pci_install debugging_install intel-speed-select_install
 
 acpi_clean:
 	$(call descend,power/acpi,clean)
@@ -141,7 +142,7 @@ acpi_clean:
 cpupower_clean:
 	$(call descend,power/cpupower,clean)
 
-cgroup_clean hv_clean firewire_clean spi_clean usb_clean virtio_clean vm_clean wmi_clean bpf_clean iio_clean gpio_clean objtool_clean leds_clean pci_clean firmware_clean debugging_clean:
+cgroup_clean hv_clean firewire_clean lkl_clean spi_clean usb_clean virtio_clean vm_clean wmi_clean bpf_clean iio_clean gpio_clean objtool_clean leds_clean pci_clean firmware_clean debugging_clean:
 	$(call descend,$(@:_clean=),clean)
 
 liblockdep_clean:
@@ -179,7 +180,7 @@ clean: acpi_clean cgroup_clean cpupower_clean hv_clean firewire_clean \
 		perf_clean selftests_clean turbostat_clean spi_clean usb_clean virtio_clean \
 		vm_clean bpf_clean iio_clean x86_energy_perf_policy_clean tmon_clean \
 		freefall_clean build_clean libbpf_clean libsubcmd_clean liblockdep_clean \
-		gpio_clean objtool_clean leds_clean wmi_clean pci_clean firmware_clean debugging_clean \
+		gpio_clean objtool_clean leds_clean wmi_clean lkl_clean pci_clean firmware_clean debugging_clean \
 		intel-speed-select_clean
 
 .PHONY: FORCE
-- 
2.20.1 (Apple Git-117)





* [RFC v2 27/37] lkl tools: add lklfuse
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Conrad Meyer, Hajime Tazaki, Quentin Anglade, Rafael Gieschke

From: Octavian Purdila <tavi.purdila@gmail.com>

Add a simple FUSE-based program that can mount a filesystem in userspace
using LKL.

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Quentin Anglade <quentin.anglade@objectif-libre.com>
Signed-off-by: Rafael Gieschke <rafael.gieschke@rz.uni-freiburg.de>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
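A note on the read and write handlers in lklfuse.c: lklfuse_read() and
lklfuse_write() keep calling lkl_sys_pread64()/lkl_sys_pwrite64() until the
request is satisfied, an error is returned, or zero bytes are transferred.
The same pattern can be sketched against the host's POSIX pread().  This is
only an illustration under that assumption; read_full() is a hypothetical
helper and is not part of this patch.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not in the patch): read up to 'size' bytes at
 * 'offset', retrying after short reads.  Returns the number of bytes
 * read, which is smaller than 'size' only at end of file, or a
 * negative errno value in the style FUSE handlers use.
 */
ssize_t read_full(int fd, char *buf, size_t size, off_t offset)
{
	size_t done = 0;

	assert(buf != NULL || size == 0);

	while (done < size) {
		ssize_t ret = pread(fd, buf + done, size - done,
				    offset + (off_t)done);

		if (ret < 0)
			return -errno;	/* error wins over partial data */
		if (ret == 0)
			break;		/* end of file */
		done += (size_t)ret;
	}

	return (ssize_t)done;
}
```

As in the patch, an error takes precedence over bytes that were already
transferred.  A caller that would rather report the partial count has to
check the running total before returning the error.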
 tools/lkl/.gitignore       |   1 +
 tools/lkl/Build            |   3 +
 tools/lkl/Targets          |   3 +
 tools/lkl/lklfuse.c        | 658 +++++++++++++++++++++++++++++++++++++
 tools/lkl/tests/lklfuse.sh | 110 +++++++
 5 files changed, 775 insertions(+)
 create mode 100644 tools/lkl/lklfuse.c
 create mode 100755 tools/lkl/tests/lklfuse.sh

diff --git a/tools/lkl/.gitignore b/tools/lkl/.gitignore
index 138e65efcad2..c78ec268e4b0 100644
--- a/tools/lkl/.gitignore
+++ b/tools/lkl/.gitignore
@@ -10,3 +10,4 @@ tests/valgrind*.xml
 cptofs
 cpfromfs
 fs2tar
+lklfuse
diff --git a/tools/lkl/Build b/tools/lkl/Build
index 73b37363a6de..6048440d0e1b 100644
--- a/tools/lkl/Build
+++ b/tools/lkl/Build
@@ -1,3 +1,6 @@
+CFLAGS_lklfuse.o += -D_FILE_OFFSET_BITS=64
+
 cptofs-$(LKL_HOST_CONFIG_ARCHIVE) += cptofs.o
 fs2tar-$(LKL_HOST_CONFIG_ARCHIVE) += fs2tar.o
+lklfuse-$(LKL_HOST_CONFIG_FUSE) += lklfuse.o
 
diff --git a/tools/lkl/Targets b/tools/lkl/Targets
index 05f5bd1dddcc..5a4b3508f0a2 100644
--- a/tools/lkl/Targets
+++ b/tools/lkl/Targets
@@ -11,3 +11,6 @@ LDLIBS_cptofs-$(LKL_HOST_CONFIG_NEEDS_LARGP) += -largp
 progs-$(LKL_HOST_CONFIG_ARCHIVE) += fs2tar
 LDLIBS_fs2tar-y := -larchive
 LDLIBS_fs2tar-$(LKL_HOST_CONFIG_NEEDS_LARGP) += -largp
+
+progs-$(LKL_HOST_CONFIG_FUSE) += lklfuse
+LDLIBS_lklfuse-y := -lfuse
diff --git a/tools/lkl/lklfuse.c b/tools/lkl/lklfuse.c
new file mode 100644
index 000000000000..4e6c8fe250d0
--- /dev/null
+++ b/tools/lkl/lklfuse.c
@@ -0,0 +1,658 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdbool.h>
+#include <stddef.h>
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <sys/stat.h>
+#include <string.h>
+#include <errno.h>
+#include <unistd.h>
+#define FUSE_USE_VERSION 26
+#include <fuse.h>
+#include <fuse/fuse_opt.h>
+#include <fuse/fuse_lowlevel.h>
+#include <lkl.h>
+#include <lkl_host.h>
+
+#define LKLFUSE_VERSION "0.1"
+
+struct lklfuse {
+	const char *file;
+	const char *log;
+	const char *type;
+	const char *opts;
+	struct lkl_disk disk;
+	int disk_id;
+	int part;
+	int ro;
+	int mb;
+} lklfuse = {
+	.mb = 64,
+};
+
+#define LKLFUSE_OPT(t, p, v) { t, offsetof(struct lklfuse, p), v }
+
+enum {
+	KEY_HELP,
+	KEY_VERSION,
+};
+
+static struct fuse_opt lklfuse_opts[] = {
+	LKLFUSE_OPT("log=%s", log, 0),
+	LKLFUSE_OPT("type=%s", type, 0),
+	LKLFUSE_OPT("mb=%d", mb, 0),
+	LKLFUSE_OPT("opts=%s", opts, 0),
+	LKLFUSE_OPT("part=%d", part, 0),
+	FUSE_OPT_KEY("-h", KEY_HELP),
+	FUSE_OPT_KEY("--help", KEY_HELP),
+	FUSE_OPT_KEY("-V",             KEY_VERSION),
+	FUSE_OPT_KEY("--version",      KEY_VERSION),
+	FUSE_OPT_END
+};
+
+static void usage(void)
+{
+	printf(
+"usage: lklfuse file mountpoint [options]\n"
+"\n"
+"general options:\n"
+"    -o opt,[opt...]        mount options\n"
+"    -h   --help            print help\n"
+"    -V   --version         print version\n"
+"\n"
+"lklfuse options:\n"
+"    -o log=FILE            log file\n"
+"    -o type=fstype         filesystem type\n"
+"    -o mb=memory in mb     amount of memory to allocate\n"
+"    -o part=partition      partition to mount\n"
+"    -o ro                  open file read-only\n"
+"    -o opts=options        mount options (use \\ to escape , and =)\n"
+);
+}
+
+static int lklfuse_opt_proc(void *data, const char *arg, int key,
+			  struct fuse_args *args)
+{
+	switch (key) {
+	case FUSE_OPT_KEY_OPT:
+		if (strcmp(arg, "ro") == 0)
+			lklfuse.ro = 1;
+		return 1;
+
+	case FUSE_OPT_KEY_NONOPT:
+		if (!lklfuse.file) {
+			lklfuse.file = strdup(arg);
+			return 0;
+		}
+		return 1;
+
+	case KEY_HELP:
+		usage();
+		fuse_opt_add_arg(args, "-ho");
+		fuse_main(args->argc, args->argv, NULL, NULL);
+		exit(1);
+
+	case KEY_VERSION:
+		printf("lklfuse version %s\n", LKLFUSE_VERSION);
+		fuse_opt_add_arg(args, "--version");
+		fuse_main(args->argc, args->argv, NULL, NULL);
+		exit(0);
+
+	default:
+		fprintf(stderr, "internal error\n");
+		abort();
+	}
+}
+
+static void lklfuse_xlat_stat(const struct lkl_stat *in, struct stat *st)
+{
+	st->st_dev = in->st_dev;
+	st->st_ino = in->st_ino;
+	st->st_mode = in->st_mode;
+	st->st_nlink = in->st_nlink;
+	st->st_uid = in->st_uid;
+	st->st_gid = in->st_gid;
+	st->st_rdev = in->st_rdev;
+	st->st_size = in->st_size;
+	st->st_blksize = in->st_blksize;
+	st->st_blocks = in->st_blocks;
+	st->st_atim.tv_sec = in->lkl_st_atime;
+	st->st_atim.tv_nsec = in->st_atime_nsec;
+	st->st_mtim.tv_sec = in->lkl_st_mtime;
+	st->st_mtim.tv_nsec = in->st_mtime_nsec;
+	st->st_ctim.tv_sec = in->lkl_st_ctime;
+	st->st_ctim.tv_nsec = in->st_ctime_nsec;
+}
+
+static int lklfuse_fgetattr(const char *path, struct stat *st,
+			    struct fuse_file_info *fi)
+{
+	long ret;
+	struct lkl_stat lkl_stat;
+
+	ret = lkl_sys_fstat(fi->fh, &lkl_stat);
+	if (ret)
+		return ret;
+
+	lklfuse_xlat_stat(&lkl_stat, st);
+	return 0;
+}
+
+static int lklfuse_getattr(const char *path, struct stat *st)
+{
+	long ret;
+	struct lkl_stat lkl_stat;
+
+	ret = lkl_sys_lstat(path, &lkl_stat);
+	if (ret)
+		return ret;
+
+	lklfuse_xlat_stat(&lkl_stat, st);
+	return 0;
+}
+
+static int lklfuse_readlink(const char *path, char *buf, size_t len)
+{
+	long ret;
+
+	ret = lkl_sys_readlink(path, buf, len);
+	if (ret < 0)
+		return ret;
+
+	if ((size_t)ret == len)
+		ret = len - 1;
+
+	buf[ret] = 0;
+
+	return 0;
+}
+
+static int lklfuse_mknod(const char *path, mode_t mode, dev_t dev)
+{
+	return lkl_sys_mknod(path, mode, dev);
+}
+
+static int lklfuse_mkdir(const char *path, mode_t mode)
+{
+	return lkl_sys_mkdir(path, mode);
+}
+
+static int lklfuse_unlink(const char *path)
+{
+	return lkl_sys_unlink(path);
+}
+
+static int lklfuse_rmdir(const char *path)
+{
+	return lkl_sys_rmdir(path);
+}
+
+static int lklfuse_symlink(const char *oldname, const char *newname)
+{
+	return lkl_sys_symlink(oldname, newname);
+}
+
+
+static int lklfuse_rename(const char *oldname, const char *newname)
+{
+	return lkl_sys_rename(oldname, newname);
+}
+
+static int lklfuse_link(const char *oldname, const char *newname)
+{
+	return lkl_sys_link(oldname, newname);
+}
+
+static int lklfuse_chmod(const char *path, mode_t mode)
+{
+	return lkl_sys_chmod(path, mode);
+}
+
+
+static int lklfuse_chown(const char *path, uid_t uid, gid_t gid)
+{
+	return lkl_sys_fchownat(LKL_AT_FDCWD, path, uid, gid,
+				LKL_AT_SYMLINK_NOFOLLOW);
+}
+
+static int lklfuse_truncate(const char *path, off_t off)
+{
+	return lkl_sys_truncate(path, off);
+}
+
+static int lklfuse_open3(const char *path, bool create, mode_t mode,
+			 struct fuse_file_info *fi)
+{
+	long ret;
+	int flags;
+
+	if ((fi->flags & O_ACCMODE) == O_RDONLY)
+		flags = LKL_O_RDONLY;
+	else if ((fi->flags & O_ACCMODE) == O_WRONLY)
+		flags = LKL_O_WRONLY;
+	else if ((fi->flags & O_ACCMODE) == O_RDWR)
+		flags = LKL_O_RDWR;
+	else
+		return -EINVAL;
+
+	if (create)
+		flags |= LKL_O_CREAT;
+
+	ret = lkl_sys_open(path, flags, mode);
+	if (ret < 0)
+		return ret;
+
+	fi->fh = ret;
+
+	return 0;
+}
+
+static int lklfuse_create(const char *path, mode_t mode,
+			  struct fuse_file_info *fi)
+{
+	return lklfuse_open3(path, true, mode, fi);
+}
+
+static int lklfuse_open(const char *path, struct fuse_file_info *fi)
+{
+	return lklfuse_open3(path, false, 0, fi);
+}
+
+static int lklfuse_read(const char *path, char *buf, size_t size, off_t offset,
+		      struct fuse_file_info *fi)
+{
+	long ret;
+	ssize_t orig_size = size;
+
+	do {
+		ret = lkl_sys_pread64(fi->fh, buf, size, offset);
+		if (ret <= 0)
+			break;
+		size -= ret;
+		offset += ret;
+		buf += ret;
+	} while (size > 0);
+
+	return ret < 0 ? ret : orig_size - (ssize_t)size;
+
+}
+
+static int lklfuse_write(const char *path, const char *buf, size_t size,
+		       off_t offset, struct fuse_file_info *fi)
+{
+	long ret;
+	ssize_t orig_size = size;
+
+	do {
+		ret = lkl_sys_pwrite64(fi->fh, buf, size, offset);
+		if (ret <= 0)
+			break;
+		size -= ret;
+		offset += ret;
+		buf += ret;
+	} while (size > 0);
+
+	return ret < 0 ? ret : orig_size - (ssize_t)size;
+}
+
+
+static int lklfuse_statfs(const char *path, struct statvfs *stat)
+{
+	long ret;
+	struct lkl_statfs lkl_statfs;
+
+	ret = lkl_sys_statfs(path, &lkl_statfs);
+	if (ret < 0)
+		return ret;
+
+	stat->f_bsize = lkl_statfs.f_bsize;
+	stat->f_frsize = lkl_statfs.f_frsize;
+	stat->f_blocks = lkl_statfs.f_blocks;
+	stat->f_bfree = lkl_statfs.f_bfree;
+	stat->f_bavail = lkl_statfs.f_bavail;
+	stat->f_files = lkl_statfs.f_files;
+	stat->f_ffree = lkl_statfs.f_ffree;
+	stat->f_favail = stat->f_ffree;
+	stat->f_fsid = *(unsigned long *)&lkl_statfs.f_fsid.val[0];
+	stat->f_flag = lkl_statfs.f_flags;
+	stat->f_namemax = lkl_statfs.f_namelen;
+
+	return 0;
+}
+
+static int lklfuse_flush(const char *path, struct fuse_file_info *fi)
+{
+	return 0;
+}
+
+static int lklfuse_release(const char *path, struct fuse_file_info *fi)
+{
+	return lkl_sys_close(fi->fh);
+}
+
+static int lklfuse_fsync(const char *path, int datasync,
+		       struct fuse_file_info *fi)
+{
+	if (datasync)
+		return lkl_sys_fdatasync(fi->fh);
+	else
+		return lkl_sys_fsync(fi->fh);
+}
+
+static int lklfuse_setxattr(const char *path, const char *name, const char *val,
+		   size_t size, int flags)
+{
+	return lkl_sys_setxattr(path, name, val, size, flags);
+}
+
+static int lklfuse_getxattr(const char *path, const char *name, char *val,
+			  size_t size)
+{
+	return lkl_sys_getxattr(path, name, val, size);
+}
+
+static int lklfuse_listxattr(const char *path, char *list, size_t size)
+{
+	return lkl_sys_listxattr(path, list, size);
+}
+
+static int lklfuse_removexattr(const char *path, const char *name)
+{
+	return lkl_sys_removexattr(path, name);
+}
+
+static int lklfuse_opendir(const char *path, struct fuse_file_info *fi)
+{
+	struct lkl_dir *dir;
+	int err;
+
+	dir = lkl_opendir(path, &err);
+	if (!dir)
+		return err;
+
+	fi->fh = (uintptr_t)dir;
+
+	return 0;
+}
+
+/** Read directory
+ *
+ * This supersedes the old getdir() interface.  New applications
+ * should use this.
+ *
+ * The filesystem may choose between two modes of operation:
+ *
+ * 1) The readdir implementation ignores the offset parameter, and
+ * passes zero to the filler function's offset.  The filler
+ * function will not return '1' (unless an error happens), so the
+ * whole directory is read in a single readdir operation.  This
+ * works just like the old getdir() method.
+ *
+ * 2) The readdir implementation keeps track of the offsets of the
+ * directory entries.  It uses the offset parameter and always
+ * passes non-zero offset to the filler function.  When the buffer
+ * is full (or an error happens) the filler function will return
+ * '1'.
+ *
+ * Introduced in version 2.3
+ */
+static int lklfuse_readdir(const char *path, void *buf, fuse_fill_dir_t fill,
+			 off_t off, struct fuse_file_info *fi)
+{
+	struct lkl_dir *dir = (struct lkl_dir *)(uintptr_t)fi->fh;
+	struct lkl_linux_dirent64 *de;
+
+	while ((de = lkl_readdir(dir))) {
+		struct stat st = { 0, };
+
+		st.st_ino = de->d_ino;
+		st.st_mode = de->d_type << 12;
+
+		if (fill(buf, de->d_name, &st, 0))
+			break;
+	}
+
+	if (!de)
+		return lkl_errdir(dir);
+
+	return 0;
+}
+
+static int lklfuse_releasedir(const char *path, struct fuse_file_info *fi)
+{
+	struct lkl_dir *dir = (struct lkl_dir *)(uintptr_t)fi->fh;
+
+	return lkl_closedir(dir);
+}
+
+static int lklfuse_fsyncdir(const char *path, int datasync,
+			  struct fuse_file_info *fi)
+{
+	struct lkl_dir *dir = (struct lkl_dir *)(uintptr_t)fi->fh;
+	int fd = lkl_dirfd(dir);
+
+	if (datasync)
+		return lkl_sys_fdatasync(fd);
+	else
+		return lkl_sys_fsync(fd);
+}
+
+static int lklfuse_access(const char *path, int mode)
+{
+	return lkl_sys_access(path, mode);
+}
+
+static int lklfuse_utimens(const char *path, const struct timespec tv[2])
+{
+	struct lkl_timespec ts[2];
+
+	ts[0].tv_sec = tv[0].tv_sec;
+	ts[0].tv_nsec = tv[0].tv_nsec;
+	ts[1].tv_sec = tv[1].tv_sec;
+	ts[1].tv_nsec = tv[1].tv_nsec;
+
+	return lkl_sys_utimensat(-1, path, (struct __lkl__kernel_timespec *)ts,
+				 LKL_AT_SYMLINK_NOFOLLOW);
+}
+
+static int lklfuse_fallocate(const char *path, int mode, off_t offset,
+			     off_t len, struct fuse_file_info *fi)
+{
+	return lkl_sys_fallocate(fi->fh, mode, offset, len);
+}
+
+const struct fuse_operations lklfuse_ops = {
+	.flag_nullpath_ok = 1,
+	.flag_nopath = 1,
+	.flag_utime_omit_ok = 1,
+
+	.getattr = lklfuse_getattr,
+	.readlink = lklfuse_readlink,
+	.mknod = lklfuse_mknod,
+	.mkdir = lklfuse_mkdir,
+	.unlink = lklfuse_unlink,
+	.rmdir = lklfuse_rmdir,
+	.symlink = lklfuse_symlink,
+	.rename = lklfuse_rename,
+	.link = lklfuse_link,
+	.chmod = lklfuse_chmod,
+	.chown = lklfuse_chown,
+	.truncate = lklfuse_truncate,
+	.open = lklfuse_open,
+	.read = lklfuse_read,
+	.write = lklfuse_write,
+	.statfs = lklfuse_statfs,
+	.flush = lklfuse_flush,
+	.release = lklfuse_release,
+	.fsync = lklfuse_fsync,
+	.setxattr = lklfuse_setxattr,
+	.getxattr = lklfuse_getxattr,
+	.listxattr = lklfuse_listxattr,
+	.removexattr = lklfuse_removexattr,
+	.opendir = lklfuse_opendir,
+	.readdir = lklfuse_readdir,
+	.releasedir = lklfuse_releasedir,
+	.fsyncdir = lklfuse_fsyncdir,
+	.access = lklfuse_access,
+	.create = lklfuse_create,
+	.fgetattr = lklfuse_fgetattr,
+	/* .lock, */
+	.utimens = lklfuse_utimens,
+	/* .bmap, */
+	/* .ioctl, */
+	/* .poll, */
+	/* .write_buf, (SG io) */
+	/* .read_buf, (SG io) */
+	/* .flock, */
+	.fallocate = lklfuse_fallocate,
+};
+
+static int start_lkl(void)
+{
+	long ret;
+	char mpoint[32], cmdline[16];
+
+	snprintf(cmdline, sizeof(cmdline), "mem=%dM", lklfuse.mb);
+	ret = lkl_start_kernel(&lkl_host_ops, cmdline);
+	if (ret) {
+		fprintf(stderr, "can't start kernel: %s\n", lkl_strerror(ret));
+		goto out;
+	}
+
+	ret = lkl_mount_dev(lklfuse.disk_id, lklfuse.part, lklfuse.type,
+			    lklfuse.ro ? LKL_MS_RDONLY : 0, lklfuse.opts,
+			    mpoint, sizeof(mpoint));
+
+	if (ret) {
+		fprintf(stderr, "can't mount disk: %s\n", lkl_strerror(ret));
+		goto out_halt;
+	}
+
+	ret = lkl_sys_chroot(mpoint);
+	if (ret) {
+		fprintf(stderr, "can't chroot to %s: %s\n", mpoint,
+			lkl_strerror(ret));
+		goto out_umount;
+	}
+
+	return 0;
+
+out_umount:
+	lkl_umount_dev(lklfuse.disk_id, lklfuse.part, 0, 1000);
+
+out_halt:
+	lkl_sys_halt();
+
+out:
+	return ret;
+}
+
+static void stop_lkl(void)
+{
+	int ret;
+
+	ret = lkl_sys_chdir("/");
+	if (ret)
+		fprintf(stderr, "can't chdir to /: %s\n", lkl_strerror(ret));
+	ret = lkl_sys_umount("/", 0);
+	if (ret)
+		fprintf(stderr, "failed to umount disk: %d: %s\n",
+			lklfuse.disk_id, lkl_strerror(ret));
+	lkl_sys_halt();
+}
+
+int main(int argc, char **argv)
+{
+	struct fuse_args args = FUSE_ARGS_INIT(argc, argv);
+	struct fuse_chan *ch;
+	struct fuse *fuse;
+	struct stat st;
+	char *mnt;
+	int fg, mt, ret;
+
+	if (fuse_opt_parse(&args, &lklfuse, lklfuse_opts, lklfuse_opt_proc))
+		return 1;
+
+	if (!lklfuse.file || !lklfuse.type) {
+		fprintf(stderr, "no file or filesystem type specified\n");
+		return 1;
+	}
+
+	if (fuse_parse_cmdline(&args, &mnt, &mt, &fg))
+		return 1;
+
+	ret = stat(mnt, &st);
+	if (ret) {
+		perror(mnt);
+		goto out_free;
+	}
+
+	ret = open(lklfuse.file, lklfuse.ro ? O_RDONLY : O_RDWR);
+	if (ret < 0) {
+		perror(lklfuse.file);
+		goto out_free;
+	}
+
+	lklfuse.disk.fd = ret;
+
+	ret = lkl_disk_add(&lklfuse.disk);
+	if (ret < 0) {
+		fprintf(stderr, "can't add disk: %s\n", lkl_strerror(ret));
+		goto out_close_disk;
+	}
+
+	lklfuse.disk_id = ret;
+
+	ch = fuse_mount(mnt, &args);
+	if (!ch) {
+		ret = -1;
+		goto out_close_disk;
+	}
+
+	fuse = fuse_new(ch, &args, &lklfuse_ops, sizeof(lklfuse_ops), NULL);
+	if (!fuse) {
+		ret = -1;
+		goto out_fuse_unmount;
+	}
+
+	fuse_opt_free_args(&args);
+
+	if (fuse_daemonize(fg) ||
+	    fuse_set_signal_handlers(fuse_get_session(fuse))) {
+		ret = -1;
+		goto out_fuse_destroy;
+	}
+
+	ret = start_lkl();
+	if (ret) {
+		ret = -1;
+		goto out_remove_signals;
+	}
+
+	if (mt)
+		fprintf(stderr, "warning: multithreaded mode not supported\n");
+
+	ret = fuse_loop(fuse);
+
+	stop_lkl();
+
+out_remove_signals:
+	fuse_remove_signal_handlers(fuse_get_session(fuse));
+
+out_fuse_unmount:
+	if (ch)
+		fuse_unmount(mnt, ch);
+
+out_fuse_destroy:
+	if (fuse)
+		fuse_destroy(fuse);
+
+out_close_disk:
+	close(lklfuse.disk.fd);
+
+out_free:
+	free(mnt);
+
+	return ret < 0 ? 1 : 0;
+}
diff --git a/tools/lkl/tests/lklfuse.sh b/tools/lkl/tests/lklfuse.sh
new file mode 100755
index 000000000000..7f35dd53fc4e
--- /dev/null
+++ b/tools/lkl/tests/lklfuse.sh
@@ -0,0 +1,110 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+
+source $script_dir/test.sh
+
+cleanup()
+{
+    set -e
+
+    sleep 1
+    fusermount -u $dir
+    rm $file
+    rmdir $dir
+}
+
+
+# $1 - disk image
+# $2 - fstype
+function prepfs()
+{
+    set -e
+
+    dd if=/dev/zero of=$1 bs=1024 count=102400
+
+    yes | mkfs.$2 $1
+}
+
+# $1 - disk image
+# $2 - mount point
+# $3 - filesystem type
+lklfuse_mount()
+{
+    ${script_dir}/../lklfuse $1 $2 -o type=$3
+}
+
+# $1 - mount point
+lklfuse_basic()
+{
+    set -e
+
+    cd $1
+    touch a
+    if ! [ -e a ]; then exit 1; fi
+    rm a
+    mkdir a
+    if ! [ -d a ]; then exit 1; fi
+    rmdir a
+}
+
+# $1 - dir
+# $2 - filesystem type
+lklfuse_stressng()
+{
+    set -e
+
+    if [ -z "$(which stress-ng)" ]; then
+        echo "missing stress-ng"
+        return $TEST_SKIP
+    fi
+
+    cd $1
+
+    if [ "$2" = "vfat" ]; then
+        exclude="chmod,filename,link,mknod,symlink,xattr"
+    fi
+
+    stress-ng --class filesystem --all 0 --timeout 10 \
+	      --exclude fiemap,$exclude --fallocate-bytes 10m \
+	      --sync-file-bytes 10m
+}
+
+if [ "$1" = "-t" ]; then
+    shift
+    fstype=$1
+    shift
+fi
+
+if [ -z "$fstype" ]; then
+    fstype="ext4"
+fi
+
+if ! [ -x $script_dir/../lklfuse ]; then
+    lkl_test_plan 0 "lklfuse.sh $fstype"
+    echo "lklfuse not available"
+    exit 0
+fi
+
+if ! [ -e /dev/fuse ]; then
+    lkl_test_plan 0 "lklfuse.sh $fstype"
+    echo "/dev/fuse not available"
+    exit 0
+fi
+
+
+file=`mktemp`
+dir=`mktemp -d`
+
+trap cleanup EXIT
+
+lkl_test_plan 4 "lklfuse $fstype"
+
+lkl_test_run 1 prepfs $file $fstype
+lkl_test_run 2 lklfuse_mount $file $dir $fstype
+lkl_test_run 3 lklfuse_basic $dir
+# stress-ng returns 2 with no apparent failures so skip it for now
+#lkl_test_run 4 lklfuse_stressng $dir $fstype
+trap : EXIT
+lkl_test_run 4 cleanup
-- 
2.20.1 (Apple Git-117)
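
On the readdir handler in this patch: lklfuse_readdir() fills st_mode with
`de->d_type << 12`.  That works because, on Linux, the DT_* directory entry
types are the S_IFMT file type bits shifted right by 12; glibc and musl
expose the same mapping as the DTTOIF() macro.  A small check of that
relationship, assuming a Linux libc host.  dtype_to_mode() is a hypothetical
name used only for this sketch.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE	/* DT_* and DTTOIF() on glibc */
#endif
#include <assert.h>
#include <dirent.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Compile-time checks: DT_* values are the S_IF* type bits >> 12. */
static_assert((DT_DIR << 12) == S_IFDIR, "DT_DIR maps to S_IFDIR");
static_assert((DT_REG << 12) == S_IFREG, "DT_REG maps to S_IFREG");
static_assert((DT_LNK << 12) == S_IFLNK, "DT_LNK maps to S_IFLNK");

/*
 * Hypothetical helper (not in the patch): turn a dirent d_type into
 * the file type bits of st_mode, as the readdir handler does inline.
 */
mode_t dtype_to_mode(unsigned char d_type)
{
	return (mode_t)d_type << 12;
}
```

libfuse's directory entry helpers document that only st_ino and bits 12-15
of st_mode are used from the stat buffer passed to the filler, so the shift
is enough to report the entry type without a separate stat call.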


* [RFC v2 27/37] lkl tools: add lklfuse
@ 2019-11-08  5:02     ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: linux-arch, Rafael Gieschke, Conrad Meyer, Octavian Purdila,
	Akira Moroo, Quentin Anglade, linux-kernel-library,
	Hajime Tazaki

From: Octavian Purdila <tavi.purdila@gmail.com>

Add a simple FUSE-based program that can mount a filesystem in userspace
using LKL.

Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Quentin Anglade <quentin.anglade@objectif-libre.com>
Signed-off-by: Rafael Gieschke <rafael.gieschke@rz.uni-freiburg.de>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/.gitignore       |   1 +
 tools/lkl/Build            |   3 +
 tools/lkl/Targets          |   3 +
 tools/lkl/lklfuse.c        | 658 +++++++++++++++++++++++++++++++++++++
 tools/lkl/tests/lklfuse.sh | 110 +++++++
 5 files changed, 775 insertions(+)
 create mode 100644 tools/lkl/lklfuse.c
 create mode 100755 tools/lkl/tests/lklfuse.sh

diff --git a/tools/lkl/.gitignore b/tools/lkl/.gitignore
index 138e65efcad2..c78ec268e4b0 100644
--- a/tools/lkl/.gitignore
+++ b/tools/lkl/.gitignore
@@ -10,3 +10,4 @@ tests/valgrind*.xml
 cptofs
 cpfromfs
 fs2tar
+lklfuse
diff --git a/tools/lkl/Build b/tools/lkl/Build
index 73b37363a6de..6048440d0e1b 100644
--- a/tools/lkl/Build
+++ b/tools/lkl/Build
@@ -1,3 +1,6 @@
+CFLAGS_lklfuse.o += -D_FILE_OFFSET_BITS=64
+
 cptofs-$(LKL_HOST_CONFIG_ARCHIVE) += cptofs.o
 fs2tar-$(LKL_HOST_CONFIG_ARCHIVE) += fs2tar.o
+lklfuse-$(LKL_HOST_CONFIG_FUSE) += lklfuse.o
 
diff --git a/tools/lkl/Targets b/tools/lkl/Targets
index 05f5bd1dddcc..5a4b3508f0a2 100644
--- a/tools/lkl/Targets
+++ b/tools/lkl/Targets
@@ -11,3 +11,6 @@ LDLIBS_cptofs-$(LKL_HOST_CONFIG_NEEDS_LARGP) += -largp
 progs-$(LKL_HOST_CONFIG_ARCHIVE) += fs2tar
 LDLIBS_fs2tar-y := -larchive
 LDLIBS_fs2tar-$(LKL_HOST_CONFIG_NEEDS_LARGP) += -largp
+
+progs-$(LKL_HOST_CONFIG_FUSE) += lklfuse
+LDLIBS_lklfuse-y := -lfuse
diff --git a/tools/lkl/lklfuse.c b/tools/lkl/lklfuse.c
new file mode 100644
index 000000000000..4e6c8fe250d0
--- /dev/null
+++ b/tools/lkl/lklfuse.c
@@ -0,0 +1,658 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdbool.h>
+#include <stddef.h>
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <sys/stat.h>
+#include <string.h>
+#include <errno.h>
+#include <unistd.h>
+#define FUSE_USE_VERSION 26
+#include <fuse.h>
+#include <fuse/fuse_opt.h>
+#include <fuse/fuse_lowlevel.h>
+#include <lkl.h>
+#include <lkl_host.h>
+
+#define LKLFUSE_VERSION "0.1"
+
+struct lklfuse {
+	const char *file;
+	const char *log;
+	const char *type;
+	const char *opts;
+	struct lkl_disk disk;
+	int disk_id;
+	int part;
+	int ro;
+	int mb;
+} lklfuse = {
+	.mb = 64,
+};
+
+#define LKLFUSE_OPT(t, p, v) { t, offsetof(struct lklfuse, p), v }
+
+enum {
+	KEY_HELP,
+	KEY_VERSION,
+};
+
+static struct fuse_opt lklfuse_opts[] = {
+	LKLFUSE_OPT("log=%s", log, 0),
+	LKLFUSE_OPT("type=%s", type, 0),
+	LKLFUSE_OPT("mb=%d", mb, 0),
+	LKLFUSE_OPT("opts=%s", opts, 0),
+	LKLFUSE_OPT("part=%d", part, 0),
+	FUSE_OPT_KEY("-h", KEY_HELP),
+	FUSE_OPT_KEY("--help", KEY_HELP),
+	FUSE_OPT_KEY("-V",             KEY_VERSION),
+	FUSE_OPT_KEY("--version",      KEY_VERSION),
+	FUSE_OPT_END
+};
+
+static void usage(void)
+{
+	printf(
+"usage: lklfuse file mountpoint [options]\n"
+"\n"
+"general options:\n"
+"    -o opt,[opt...]        mount options\n"
+"    -h   --help            print help\n"
+"    -V   --version         print version\n"
+"\n"
+"lklfuse options:\n"
+"    -o log=FILE            log file\n"
+"    -o type=fstype         filesystem type\n"
+"    -o mb=memory in mb     amount of memory to allocate\n"
+"    -o part=partition      partition to mount\n"
+"    -o ro                  open file read-only\n"
+"    -o opts=options        mount options (use \\ to escape , and =)\n"
+);
+}
+
+static int lklfuse_opt_proc(void *data, const char *arg, int key,
+			  struct fuse_args *args)
+{
+	switch (key) {
+	case FUSE_OPT_KEY_OPT:
+		if (strcmp(arg, "ro") == 0)
+			lklfuse.ro = 1;
+		return 1;
+
+	case FUSE_OPT_KEY_NONOPT:
+		if (!lklfuse.file) {
+			lklfuse.file = strdup(arg);
+			return 0;
+		}
+		return 1;
+
+	case KEY_HELP:
+		usage();
+		fuse_opt_add_arg(args, "-ho");
+		fuse_main(args->argc, args->argv, NULL, NULL);
+		exit(1);
+
+	case KEY_VERSION:
+		printf("lklfuse version %s\n", LKLFUSE_VERSION);
+		fuse_opt_add_arg(args, "--version");
+		fuse_main(args->argc, args->argv, NULL, NULL);
+		exit(0);
+
+	default:
+		fprintf(stderr, "internal error\n");
+		abort();
+	}
+}
+
+static void lklfuse_xlat_stat(const struct lkl_stat *in, struct stat *st)
+{
+	st->st_dev = in->st_dev;
+	st->st_ino = in->st_ino;
+	st->st_mode = in->st_mode;
+	st->st_nlink = in->st_nlink;
+	st->st_uid = in->st_uid;
+	st->st_gid = in->st_gid;
+	st->st_rdev = in->st_rdev;
+	st->st_size = in->st_size;
+	st->st_blksize = in->st_blksize;
+	st->st_blocks = in->st_blocks;
+	st->st_atim.tv_sec = in->lkl_st_atime;
+	st->st_atim.tv_nsec = in->st_atime_nsec;
+	st->st_mtim.tv_sec = in->lkl_st_mtime;
+	st->st_mtim.tv_nsec = in->st_mtime_nsec;
+	st->st_ctim.tv_sec = in->lkl_st_ctime;
+	st->st_ctim.tv_nsec = in->st_ctime_nsec;
+}
+
+static int lklfuse_fgetattr(const char *path, struct stat *st,
+			    struct fuse_file_info *fi)
+{
+	long ret;
+	struct lkl_stat lkl_stat;
+
+	ret = lkl_sys_fstat(fi->fh, &lkl_stat);
+	if (ret)
+		return ret;
+
+	lklfuse_xlat_stat(&lkl_stat, st);
+	return 0;
+}
+
+static int lklfuse_getattr(const char *path, struct stat *st)
+{
+	long ret;
+	struct lkl_stat lkl_stat;
+
+	ret = lkl_sys_lstat(path, &lkl_stat);
+	if (ret)
+		return ret;
+
+	lklfuse_xlat_stat(&lkl_stat, st);
+	return 0;
+}
+
+static int lklfuse_readlink(const char *path, char *buf, size_t len)
+{
+	long ret;
+
+	ret = lkl_sys_readlink(path, buf, len);
+	if (ret < 0)
+		return ret;
+
+	if ((size_t)ret == len)
+		ret = len - 1;
+
+	buf[ret] = 0;
+
+	return 0;
+}
+
+static int lklfuse_mknod(const char *path, mode_t mode, dev_t dev)
+{
+	return lkl_sys_mknod(path, mode, dev);
+}
+
+static int lklfuse_mkdir(const char *path, mode_t mode)
+{
+	return lkl_sys_mkdir(path, mode);
+}
+
+static int lklfuse_unlink(const char *path)
+{
+	return lkl_sys_unlink(path);
+}
+
+static int lklfuse_rmdir(const char *path)
+{
+	return lkl_sys_rmdir(path);
+}
+
+static int lklfuse_symlink(const char *oldname, const char *newname)
+{
+	return lkl_sys_symlink(oldname, newname);
+}
+
+
+static int lklfuse_rename(const char *oldname, const char *newname)
+{
+	return lkl_sys_rename(oldname, newname);
+}
+
+static int lklfuse_link(const char *oldname, const char *newname)
+{
+	return lkl_sys_link(oldname, newname);
+}
+
+static int lklfuse_chmod(const char *path, mode_t mode)
+{
+	return lkl_sys_chmod(path, mode);
+}
+
+
+static int lklfuse_chown(const char *path, uid_t uid, gid_t gid)
+{
+	return lkl_sys_fchownat(LKL_AT_FDCWD, path, uid, gid,
+				LKL_AT_SYMLINK_NOFOLLOW);
+}
+
+static int lklfuse_truncate(const char *path, off_t off)
+{
+	return lkl_sys_truncate(path, off);
+}
+
+static int lklfuse_open3(const char *path, bool create, mode_t mode,
+			 struct fuse_file_info *fi)
+{
+	long ret;
+	int flags;
+
+	if ((fi->flags & O_ACCMODE) == O_RDONLY)
+		flags = LKL_O_RDONLY;
+	else if ((fi->flags & O_ACCMODE) == O_WRONLY)
+		flags = LKL_O_WRONLY;
+	else if ((fi->flags & O_ACCMODE) == O_RDWR)
+		flags = LKL_O_RDWR;
+	else
+		return -EINVAL;
+
+	if (create)
+		flags |= LKL_O_CREAT;
+
+	ret = lkl_sys_open(path, flags, mode);
+	if (ret < 0)
+		return ret;
+
+	fi->fh = ret;
+
+	return 0;
+}
+
+static int lklfuse_create(const char *path, mode_t mode,
+			  struct fuse_file_info *fi)
+{
+	return lklfuse_open3(path, true, mode, fi);
+}
+
+static int lklfuse_open(const char *path, struct fuse_file_info *fi)
+{
+	return lklfuse_open3(path, false, 0, fi);
+}
+
+static int lklfuse_read(const char *path, char *buf, size_t size, off_t offset,
+		      struct fuse_file_info *fi)
+{
+	long ret;
+	ssize_t orig_size = size;
+
+	do {
+		ret = lkl_sys_pread64(fi->fh, buf, size, offset);
+		if (ret <= 0)
+			break;
+		size -= ret;
+		offset += ret;
+		buf += ret;
+	} while (size > 0);
+
+	return ret < 0 ? ret : orig_size - (ssize_t)size;
+
+}
+
+static int lklfuse_write(const char *path, const char *buf, size_t size,
+		       off_t offset, struct fuse_file_info *fi)
+{
+	long ret;
+	ssize_t orig_size = size;
+
+	do {
+		ret = lkl_sys_pwrite64(fi->fh, buf, size, offset);
+		if (ret <= 0)
+			break;
+		size -= ret;
+		offset += ret;
+		buf += ret;
+	} while (size > 0);
+
+	return ret < 0 ? ret : orig_size - (ssize_t)size;
+}
+
+
+static int lklfuse_statfs(const char *path, struct statvfs *stat)
+{
+	long ret;
+	struct lkl_statfs lkl_statfs;
+
+	ret = lkl_sys_statfs(path, &lkl_statfs);
+	if (ret < 0)
+		return ret;
+
+	stat->f_bsize = lkl_statfs.f_bsize;
+	stat->f_frsize = lkl_statfs.f_frsize;
+	stat->f_blocks = lkl_statfs.f_blocks;
+	stat->f_bfree = lkl_statfs.f_bfree;
+	stat->f_bavail = lkl_statfs.f_bavail;
+	stat->f_files = lkl_statfs.f_files;
+	stat->f_ffree = lkl_statfs.f_ffree;
+	stat->f_favail = stat->f_ffree;
+	stat->f_fsid = *(unsigned long *)&lkl_statfs.f_fsid.val[0];
+	stat->f_flag = lkl_statfs.f_flags;
+	stat->f_namemax = lkl_statfs.f_namelen;
+
+	return 0;
+}
+
+static int lklfuse_flush(const char *path, struct fuse_file_info *fi)
+{
+	return 0;
+}
+
+static int lklfuse_release(const char *path, struct fuse_file_info *fi)
+{
+	return lkl_sys_close(fi->fh);
+}
+
+static int lklfuse_fsync(const char *path, int datasync,
+		       struct fuse_file_info *fi)
+{
+	if (datasync)
+		return lkl_sys_fdatasync(fi->fh);
+	else
+		return lkl_sys_fsync(fi->fh);
+}
+
+static int lklfuse_setxattr(const char *path, const char *name, const char *val,
+		   size_t size, int flags)
+{
+	return lkl_sys_setxattr(path, name, val, size, flags);
+}
+
+static int lklfuse_getxattr(const char *path, const char *name, char *val,
+			  size_t size)
+{
+	return lkl_sys_getxattr(path, name, val, size);
+}
+
+static int lklfuse_listxattr(const char *path, char *list, size_t size)
+{
+	return lkl_sys_listxattr(path, list, size);
+}
+
+static int lklfuse_removexattr(const char *path, const char *name)
+{
+	return lkl_sys_removexattr(path, name);
+}
+
+static int lklfuse_opendir(const char *path, struct fuse_file_info *fi)
+{
+	struct lkl_dir *dir;
+	int err;
+
+	dir = lkl_opendir(path, &err);
+	if (!dir)
+		return err;
+
+	fi->fh = (uintptr_t)dir;
+
+	return 0;
+}
+
+/** Read directory
+ *
+ * This supersedes the old getdir() interface.  New applications
+ * should use this.
+ *
+ * The filesystem may choose between two modes of operation:
+ *
+ * 1) The readdir implementation ignores the offset parameter, and
+ * passes zero to the filler function's offset.  The filler
+ * function will not return '1' (unless an error happens), so the
+ * whole directory is read in a single readdir operation.  This
+ * works just like the old getdir() method.
+ *
+ * 2) The readdir implementation keeps track of the offsets of the
+ * directory entries.  It uses the offset parameter and always
+ * passes non-zero offset to the filler function.  When the buffer
+ * is full (or an error happens) the filler function will return
+ * '1'.
+ *
+ * Introduced in version 2.3
+ */
+static int lklfuse_readdir(const char *path, void *buf, fuse_fill_dir_t fill,
+			 off_t off, struct fuse_file_info *fi)
+{
+	struct lkl_dir *dir = (struct lkl_dir *)(uintptr_t)fi->fh;
+	struct lkl_linux_dirent64 *de;
+
+	while ((de = lkl_readdir(dir))) {
+		struct stat st = { 0, };
+
+		st.st_ino = de->d_ino;
+		st.st_mode = de->d_type << 12;
+
+		if (fill(buf, de->d_name, &st, 0))
+			break;
+	}
+
+	if (!de)
+		return lkl_errdir(dir);
+
+	return 0;
+}
+
+static int lklfuse_releasedir(const char *path, struct fuse_file_info *fi)
+{
+	struct lkl_dir *dir = (struct lkl_dir *)(uintptr_t)fi->fh;
+
+	return lkl_closedir(dir);
+}
+
+static int lklfuse_fsyncdir(const char *path, int datasync,
+			  struct fuse_file_info *fi)
+{
+	struct lkl_dir *dir = (struct lkl_dir *)(uintptr_t)fi->fh;
+	int fd = lkl_dirfd(dir);
+
+	if (datasync)
+		return lkl_sys_fdatasync(fd);
+	else
+		return lkl_sys_fsync(fd);
+}
+
+static int lklfuse_access(const char *path, int mode)
+{
+	return lkl_sys_access(path, mode);
+}
+
+static int lklfuse_utimens(const char *path, const struct timespec tv[2])
+{
+	struct lkl_timespec ts[2];
+
+	ts[0].tv_sec = tv[0].tv_sec;
+	ts[0].tv_nsec = tv[0].tv_nsec;
+	ts[1].tv_sec = tv[1].tv_sec;
+	ts[1].tv_nsec = tv[1].tv_nsec;
+
+	return lkl_sys_utimensat(-1, path, (struct __lkl__kernel_timespec *)ts,
+				 LKL_AT_SYMLINK_NOFOLLOW);
+}
+
+static int lklfuse_fallocate(const char *path, int mode, off_t offset,
+			     off_t len, struct fuse_file_info *fi)
+{
+	return lkl_sys_fallocate(fi->fh, mode, offset, len);
+}
+
+const struct fuse_operations lklfuse_ops = {
+	.flag_nullpath_ok = 1,
+	.flag_nopath = 1,
+	.flag_utime_omit_ok = 1,
+
+	.getattr = lklfuse_getattr,
+	.readlink = lklfuse_readlink,
+	.mknod = lklfuse_mknod,
+	.mkdir = lklfuse_mkdir,
+	.unlink = lklfuse_unlink,
+	.rmdir = lklfuse_rmdir,
+	.symlink = lklfuse_symlink,
+	.rename = lklfuse_rename,
+	.link = lklfuse_link,
+	.chmod = lklfuse_chmod,
+	.chown = lklfuse_chown,
+	.truncate = lklfuse_truncate,
+	.open = lklfuse_open,
+	.read = lklfuse_read,
+	.write = lklfuse_write,
+	.statfs = lklfuse_statfs,
+	.flush = lklfuse_flush,
+	.release = lklfuse_release,
+	.fsync = lklfuse_fsync,
+	.setxattr = lklfuse_setxattr,
+	.getxattr = lklfuse_getxattr,
+	.listxattr = lklfuse_listxattr,
+	.removexattr = lklfuse_removexattr,
+	.opendir = lklfuse_opendir,
+	.readdir = lklfuse_readdir,
+	.releasedir = lklfuse_releasedir,
+	.fsyncdir = lklfuse_fsyncdir,
+	.access = lklfuse_access,
+	.create = lklfuse_create,
+	.fgetattr = lklfuse_fgetattr,
+	/* .lock, */
+	.utimens = lklfuse_utimens,
+	/* .bmap, */
+	/* .ioctl, */
+	/* .poll, */
+	/* .write_buf, (SG io) */
+	/* .read_buf, (SG io) */
+	/* .flock, */
+	.fallocate = lklfuse_fallocate,
+};
+
+static int start_lkl(void)
+{
+	long ret;
+	char mpoint[32], cmdline[16];
+
+	snprintf(cmdline, sizeof(cmdline), "mem=%dM", lklfuse.mb);
+	ret = lkl_start_kernel(&lkl_host_ops, cmdline);
+	if (ret) {
+		fprintf(stderr, "can't start kernel: %s\n", lkl_strerror(ret));
+		goto out;
+	}
+
+	ret = lkl_mount_dev(lklfuse.disk_id, lklfuse.part, lklfuse.type,
+			    lklfuse.ro ? LKL_MS_RDONLY : 0, lklfuse.opts,
+			    mpoint, sizeof(mpoint));
+
+	if (ret) {
+		fprintf(stderr, "can't mount disk: %s\n", lkl_strerror(ret));
+		goto out_halt;
+	}
+
+	ret = lkl_sys_chroot(mpoint);
+	if (ret) {
+		fprintf(stderr, "can't chroot to %s: %s\n", mpoint,
+			lkl_strerror(ret));
+		goto out_umount;
+	}
+
+	return 0;
+
+out_umount:
+	lkl_umount_dev(lklfuse.disk_id, lklfuse.part, 0, 1000);
+
+out_halt:
+	lkl_sys_halt();
+
+out:
+	return ret;
+}
+
+static void stop_lkl(void)
+{
+	int ret;
+
+	ret = lkl_sys_chdir("/");
+	if (ret)
+		fprintf(stderr, "can't chdir to /: %s\n", lkl_strerror(ret));
+	ret = lkl_sys_umount("/", 0);
+	if (ret)
+		fprintf(stderr, "failed to umount disk: %d: %s\n",
+			lklfuse.disk_id, lkl_strerror(ret));
+	lkl_sys_halt();
+}
+
+int main(int argc, char **argv)
+{
+	struct fuse_args args = FUSE_ARGS_INIT(argc, argv);
+	struct fuse_chan *ch;
+	struct fuse *fuse;
+	struct stat st;
+	char *mnt;
+	int fg, mt, ret;
+
+	if (fuse_opt_parse(&args, &lklfuse, lklfuse_opts, lklfuse_opt_proc))
+		return 1;
+
+	if (!lklfuse.file || !lklfuse.type) {
+		fprintf(stderr, "no file or filesystem type specified\n");
+		return 1;
+	}
+
+	if (fuse_parse_cmdline(&args, &mnt, &mt, &fg))
+		return 1;
+
+	ret = stat(mnt, &st);
+	if (ret) {
+		perror(mnt);
+		goto out_free;
+	}
+
+	ret = open(lklfuse.file, lklfuse.ro ? O_RDONLY : O_RDWR);
+	if (ret < 0) {
+		perror(lklfuse.file);
+		goto out_free;
+	}
+
+	lklfuse.disk.fd = ret;
+
+	ret = lkl_disk_add(&lklfuse.disk);
+	if (ret < 0) {
+		fprintf(stderr, "can't add disk: %s\n", lkl_strerror(ret));
+		goto out_close_disk;
+	}
+
+	lklfuse.disk_id = ret;
+
+	ch = fuse_mount(mnt, &args);
+	if (!ch) {
+		ret = -1;
+		goto out_close_disk;
+	}
+
+	fuse = fuse_new(ch, &args, &lklfuse_ops, sizeof(lklfuse_ops), NULL);
+	if (!fuse) {
+		ret = -1;
+		goto out_fuse_unmount;
+	}
+
+	fuse_opt_free_args(&args);
+
+	if (fuse_daemonize(fg) ||
+	    fuse_set_signal_handlers(fuse_get_session(fuse))) {
+		ret = -1;
+		goto out_fuse_destroy;
+	}
+
+	ret = start_lkl();
+	if (ret) {
+		ret = -1;
+		goto out_remove_signals;
+	}
+
+	if (mt)
+		fprintf(stderr, "warning: multithreaded mode not supported\n");
+
+	ret = fuse_loop(fuse);
+
+	stop_lkl();
+
+out_remove_signals:
+	fuse_remove_signal_handlers(fuse_get_session(fuse));
+
+out_fuse_unmount:
+	if (ch)
+		fuse_unmount(mnt, ch);
+
+out_fuse_destroy:
+	if (fuse)
+		fuse_destroy(fuse);
+
+out_close_disk:
+	close(lklfuse.disk.fd);
+
+out_free:
+	free(mnt);
+
+	return ret < 0 ? 1 : 0;
+}
diff --git a/tools/lkl/tests/lklfuse.sh b/tools/lkl/tests/lklfuse.sh
new file mode 100755
index 000000000000..7f35dd53fc4e
--- /dev/null
+++ b/tools/lkl/tests/lklfuse.sh
@@ -0,0 +1,110 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+
+source $script_dir/test.sh
+
+cleanup()
+{
+    set -e
+
+    sleep 1
+    fusermount -u $dir
+    rm $file
+    rmdir $dir
+}
+
+
+# $1 - disk image
+# $2 - fstype
+function prepfs()
+{
+    set -e
+
+    dd if=/dev/zero of=$1 bs=1024 count=102400
+
+    yes | mkfs.$2 $1
+}
+
+# $1 - disk image
+# $2 - mount point
+# $3 - filesystem type
+lklfuse_mount()
+{
+    ${script_dir}/../lklfuse $1 $2 -o type=$3
+}
+
+# $1 - mount point
+lklfuse_basic()
+{
+    set -e
+
+    cd $1
+    touch a
+    if ! [ -e a ]; then exit 1; fi
+    rm a
+    mkdir a
+    if ! [ -d a ]; then exit 1; fi
+    rmdir a
+}
+
+# $1 - dir
+# $2 - filesystem type
+lklfuse_stressng()
+{
+    set -e
+
+    if [ -z "$(which stress-ng)" ]; then
+        echo "missing stress-ng"
+        return $TEST_SKIP
+    fi
+
+    cd $1
+
+    if [ "$2" = "vfat" ]; then
+        exclude="chmod,filename,link,mknod,symlink,xattr"
+    fi
+
+    stress-ng --class filesystem --all 0 --timeout 10 \
+	      --exclude fiemap,$exclude --fallocate-bytes 10m \
+	      --sync-file-bytes 10m
+}
+
+if [ "$1" = "-t" ]; then
+    shift
+    fstype=$1
+    shift
+fi
+
+if [ -z "$fstype" ]; then
+    fstype="ext4"
+fi
+
+if ! [ -x $script_dir/../lklfuse ]; then
+    lkl_test_plan 0 "lklfuse.sh $fstype"
+    echo "lklfuse not available"
+    exit 0
+fi
+
+if ! [ -e /dev/fuse ]; then
+    lkl_test_plan 0 "lklfuse.sh $fstype"
+    echo "/dev/fuse not available"
+    exit 0
+fi
+
+
+file=`mktemp`
+dir=`mktemp -d`
+
+trap cleanup EXIT
+
+lkl_test_plan 4 "lklfuse $fstype"
+
+lkl_test_run 1 prepfs $file $fstype
+lkl_test_run 2 lklfuse_mount $file $dir $fstype
+lkl_test_run 3 lklfuse_basic $dir
+# stress-ng returns 2 with no apparent failures so skip it for now
+#lkl_test_run 4 lklfuse_stressng $dir $fstype
+trap : EXIT
+lkl_test_run 4 cleanup
-- 
2.20.1 (Apple Git-117)



* [RFC v2 28/37] lkl: add system call hijack support
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Hajime Tazaki, Motomu Utsumi, Patrick Collins, Thomas Liebetraut,
	Xiao Jia, Yuan Liu

This commit introduces initial support for system call hijacking, based
on LD_PRELOAD, for POSIX applications on a host.

Note that hijacking system calls by overriding symbols with LD_PRELOAD
is not a complete solution: it has to work around various issues with
dirty tricks.

Those tricks/issues are:
- introduce a file descriptor offset (i.e., fd + offset)
- path name isolation (i.e., chrooted)
- need to handle a mixture of host fds and LKL fds
- un-hijackable symbols (__socket inside if_nametoindex() of Linux
  glibc) need to be hijacked at the calling function (i.e.,
  if_nametoindex)
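
As a rough illustration of the file descriptor offset trick listed above,
the following is a minimal sketch, not the code from this patch.  The
names EXAMPLE_FD_OFFSET, classify_fd() and classify_fds() are hypothetical
and exist only for this example; the patch itself uses LKL_FD_OFFSET and
is_lklfd(), and the value 1024 is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical value for illustration; the patch uses LKL_FD_OFFSET. */
#define EXAMPLE_FD_OFFSET 1024

enum fd_owner { FD_HOST, FD_LKL };
enum set_kind { SET_HOST_ONLY, SET_LKL_ONLY, SET_MIXED };

/* Descriptors at or above the offset are treated as LKL-owned. */
static enum fd_owner classify_fd(int fd)
{
	return fd >= EXAMPLE_FD_OFFSET ? FD_LKL : FD_HOST;
}

/*
 * Classify a whole descriptor set, as a poll()/select() hook has to.
 * SET_MIXED is the case the patch rejects with EOPNOTSUPP.
 */
static enum set_kind classify_fds(const int *fds, size_t n)
{
	int host = 0, lkl = 0;
	size_t i;

	assert(fds != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (classify_fd(fds[i]) == FD_LKL)
			lkl = 1;
		else
			host = 1;
	}

	if (host && lkl)
		return SET_MIXED;

	/* As in the patch, a set without host descriptors goes to LKL. */
	return host ? SET_HOST_ONLY : SET_LKL_ONLY;
}
```

A hook would forward to the host libc for SET_HOST_ONLY, to LKL for
SET_LKL_ONLY, and fail for SET_MIXED, which is the limitation marked
FIXME in the poll() and select() hooks.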

Nevertheless, it is powerful in some cases, such as replacing the network
stack for a single application only.

It has been tested with socket(AF_INET/AF_INET6/AF_NETLINK) without any
external netdevices, i.e., it only works with localhost (127.0.0.1/::1).
It may need more work on non-Linux hosts.

The following should work on Linux:
% ./tools/lkl/bin/lkl-hijack.sh ip ad

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Thomas Liebetraut <thomas@tommie-lie.de>
Signed-off-by: Xiao Jia <xiaoj@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
[Octavian: use lkl_sys_* calls instead of lkl_sys_wrapper_* calls]
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/Targets              |   9 +
 tools/lkl/bin/lkl-hijack.sh    |  23 +
 tools/lkl/lib/hijack/Build     |   4 +
 tools/lkl/lib/hijack/hijack.c  | 607 +++++++++++++++++++++++++++
 tools/lkl/lib/hijack/init.c    | 241 +++++++++++
 tools/lkl/lib/hijack/init.h    |   8 +
 tools/lkl/lib/hijack/xlate.c   | 613 +++++++++++++++++++++++++++
 tools/lkl/lib/hijack/xlate.h   |  13 +
 tools/lkl/tests/hijack-test.sh | 737 +++++++++++++++++++++++++++++++++
 tools/lkl/tests/run_netperf.sh |  98 +++++
 10 files changed, 2353 insertions(+)
 create mode 100755 tools/lkl/bin/lkl-hijack.sh
 create mode 100644 tools/lkl/lib/hijack/Build
 create mode 100644 tools/lkl/lib/hijack/hijack.c
 create mode 100644 tools/lkl/lib/hijack/init.c
 create mode 100644 tools/lkl/lib/hijack/init.h
 create mode 100644 tools/lkl/lib/hijack/xlate.c
 create mode 100644 tools/lkl/lib/hijack/xlate.h
 create mode 100755 tools/lkl/tests/hijack-test.sh
 create mode 100755 tools/lkl/tests/run_netperf.sh

diff --git a/tools/lkl/Targets b/tools/lkl/Targets
index 5a4b3508f0a2..3ec093af1722 100644
--- a/tools/lkl/Targets
+++ b/tools/lkl/Targets
@@ -14,3 +14,12 @@ LDLIBS_fs2tar-$(LKL_HOST_CONFIG_NEEDS_LARGP) += -largp
 
 progs-$(LKL_HOST_CONFIG_FUSE) += lklfuse
 LDLIBS_lklfuse-y := -lfuse
+
+ifneq ($(LKL_HOST_CONFIG_BSD),y)
+libs-$(LKL_HOST_CONFIG_POSIX) += lib/hijack/liblkl-hijack
+endif
+LDFLAGS_lib/hijack/liblkl-hijack-y += -shared -nodefaultlibs
+LDLIBS_lib/hijack/liblkl-hijack-y += -ldl
+LDLIBS_lib/hijack/liblkl-hijack-$(LKL_HOST_CONFIG_ARM) += -lgcc -lc
+LDLIBS_lib/hijack/liblkl-hijack-$(LKL_HOST_CONFIG_AARCH64) += -lc
+LDLIBS_lib/hijack/liblkl-hijack-$(LKL_HOST_CONFIG_I386) += -lc_nonshared
diff --git a/tools/lkl/bin/lkl-hijack.sh b/tools/lkl/bin/lkl-hijack.sh
new file mode 100755
index 000000000000..7cf92856dfad
--- /dev/null
+++ b/tools/lkl/bin/lkl-hijack.sh
@@ -0,0 +1,23 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+##
+## This wrapper script replaces system call symbols such as
+## socket(2) and recvmsg(2) to redirect them to LKL. Ideally it works
+## with any application, but in practice (tm) it depends on the maturity
+## of the hijack library (liblkl-hijack.so).
+##
+## Since the LD_PRELOAD technique is tricky with setuid/setgid binaries,
+## you may need to use sudo (or equivalents) to do it (e.g., ping).
+##
+## % sudo lkl-hijack.sh ping 127.0.0.1
+##
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+
+export LD_LIBRARY_PATH=${script_dir}/../lib/hijack
+if [ -n "${LKL_HIJACK_DEBUG+x}" ]
+then
+  trap '' TSTP
+fi
+LD_PRELOAD=liblkl-hijack.so "$@"
diff --git a/tools/lkl/lib/hijack/Build b/tools/lkl/lib/hijack/Build
new file mode 100644
index 000000000000..e68e93a3328a
--- /dev/null
+++ b/tools/lkl/lib/hijack/Build
@@ -0,0 +1,4 @@
+liblkl-hijack-y += hijack.o
+liblkl-hijack-y += init.o
+liblkl-hijack-y += xlate.o
+
diff --git a/tools/lkl/lib/hijack/hijack.c b/tools/lkl/lib/hijack/hijack.c
new file mode 100644
index 000000000000..485c15d7c279
--- /dev/null
+++ b/tools/lkl/lib/hijack/hijack.c
@@ -0,0 +1,607 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * system calls hijack code
+ * Copyright (c) 2015 Hajime Tazaki
+ *
+ * Author: Hajime Tazaki <tazaki@sfc.wide.ad.jp>
+ *
+ * Note: some of the code is picked from rumpkernel, written by Antti Kantee.
+ */
+
+#include <unistd.h>
+#include <stdio.h>
+#include <stdarg.h>
+#include <stdlib.h>
+#include <sys/types.h>
+#include <sys/mman.h>
+#define __USE_GNU
+#include <dlfcn.h>
+#include <sys/socket.h>
+#include <sys/select.h>
+#include <sys/epoll.h>
+#include <stdint.h>
+#include <string.h>
+#include <fcntl.h>
+#include <errno.h>
+#include <poll.h>
+#include <sys/ioctl.h>
+#include <assert.h>
+#include <pthread.h>
+#include <lkl.h>
+#include <lkl_host.h>
+
+#include "xlate.h"
+#include "init.h"
+
+static int is_lklfd(int fd)
+{
+	if (fd < LKL_FD_OFFSET)
+		return 0;
+
+	return 1;
+}
+
+static void *resolve_sym(const char *sym)
+{
+	void *resolv;
+
+	resolv = dlsym(RTLD_NEXT, sym);
+	if (!resolv) {
+		fprintf(stderr, "dlsym fail %s (%s)\n", sym, dlerror());
+		assert(0);
+	}
+	return resolv;
+}
+
+typedef long (*host_call)(long p1, long p2, long p3, long p4, long p5, long p6);
+
+static host_call host_calls[__lkl__NR_syscalls];
+/* internally managed fd list for epoll */
+int dual_fds[LKL_FD_OFFSET];
+
+#define HOOK_FD_CALL(name)						\
+	static void __attribute__((constructor(101)))			\
+	init_host_##name(void)						\
+	{								\
+		host_calls[__lkl__NR_##name] = resolve_sym(#name);	\
+	}								\
+									\
+	long name##_hook(long p1, long p2, long p3, long p4, long p5,	\
+			 long p6)					\
+	{								\
+		long p[6] = {p1, p2, p3, p4, p5, p6 };			\
+									\
+		if (!host_calls[__lkl__NR_##name])			\
+			host_calls[__lkl__NR_##name] = resolve_sym(#name); \
+		if (!is_lklfd(p1))					\
+			return host_calls[__lkl__NR_##name](p1, p2, p3,	\
+							    p4, p5, p6); \
+									\
+		return lkl_set_errno(lkl_syscall(__lkl__NR_##name, p));	\
+	}								\
+	asm(".global " #name);						\
+	asm(".set " #name "," #name "_hook")
+
+#define HOOK_CALL_USE_HOST_BEFORE_START(name)				\
+	static void __attribute__((constructor(101)))			\
+	init_host_##name(void)						\
+	{								\
+		host_calls[__lkl__NR_##name] = resolve_sym(#name);	\
+	}								\
+									\
+	long name##_hook(long p1, long p2, long p3, long p4, long p5,	\
+			 long p6)					\
+	{								\
+		long p[6] = {p1, p2, p3, p4, p5, p6 };			\
+									\
+		if (!host_calls[__lkl__NR_##name])			\
+			host_calls[__lkl__NR_##name] = resolve_sym(#name); \
+		if (!lkl_running)					\
+			return host_calls[__lkl__NR_##name](p1, p2, p3,	\
+							    p4, p5, p6); \
+									\
+		return lkl_set_errno(lkl_syscall(__lkl__NR_##name, p));	\
+	}								\
+	asm(".global " #name);						\
+	asm(".set " #name "," #name "_hook")
+
+#define HOST_CALL(name)							\
+	static long (*host_##name)();					\
+	static void __attribute__((constructor(101)))			\
+	init2_host_##name(void)						\
+	{								\
+		host_##name = resolve_sym(#name);			\
+	}
+
+#define HOOK_CALL(name)							\
+	long name##_hook(long p1, long p2, long p3, long p4, long p5,	\
+			 long p6)					\
+	{								\
+		long p[6] = {p1, p2, p3, p4, p5, p6};			\
+									\
+		return lkl_set_errno(lkl_syscall(__lkl__NR_##name, p));	\
+	}								\
+	asm(".global " #name);						\
+	asm(".set " #name "," #name "_hook")
+
+#define CHECK_HOST_CALL(name)					\
+	do {							\
+		if (!host_##name)				\
+			host_##name = resolve_sym(#name);	\
+	} while (0)
+
+static int lkl_call(int nr, int args, ...)
+{
+	long params[6];
+	va_list vl;
+	int i;
+
+	va_start(vl, args);
+	for (i = 0; i < args; i++)
+		params[i] = va_arg(vl, long);
+	va_end(vl);
+
+	return lkl_set_errno(lkl_syscall(nr, params));
+}
+
+HOOK_FD_CALL(recvmsg);
+HOOK_FD_CALL(sendmsg);
+HOOK_FD_CALL(sendmmsg);
+HOOK_FD_CALL(getsockname);
+HOOK_FD_CALL(getpeername);
+HOOK_FD_CALL(bind);
+HOOK_FD_CALL(connect);
+HOOK_FD_CALL(listen);
+HOOK_FD_CALL(shutdown);
+HOOK_FD_CALL(accept);
+HOOK_FD_CALL(write);
+HOOK_FD_CALL(writev);
+HOOK_FD_CALL(sendto);
+HOOK_FD_CALL(read);
+HOOK_FD_CALL(readv);
+HOOK_FD_CALL(recvfrom);
+HOOK_FD_CALL(splice);
+HOOK_FD_CALL(vmsplice);
+
+HOOK_CALL_USE_HOST_BEFORE_START(accept4);
+HOOK_CALL_USE_HOST_BEFORE_START(pipe2);
+
+HOST_CALL(write);
+HOST_CALL(pipe2);
+
+HOST_CALL(setsockopt);
+int setsockopt(int fd, int level, int optname, const void *optval,
+	       socklen_t optlen)
+{
+	CHECK_HOST_CALL(setsockopt);
+	if (!is_lklfd(fd))
+		return host_setsockopt(fd, level, optname, optval, optlen);
+	return lkl_call(__lkl__NR_setsockopt, 5, fd, lkl_solevel_xlate(level),
+			lkl_soname_xlate(optname), (void *)optval, optlen);
+}
+
+HOST_CALL(getsockopt);
+int getsockopt(int fd, int level, int optname, void *optval, socklen_t *optlen)
+{
+	CHECK_HOST_CALL(getsockopt);
+	if (!is_lklfd(fd))
+		return host_getsockopt(fd, level, optname, optval, optlen);
+	return lkl_call(__lkl__NR_getsockopt, 5, fd, lkl_solevel_xlate(level),
+			lkl_soname_xlate(optname), optval, (int *)optlen);
+}
+
+HOST_CALL(socket);
+int socket(int domain, int type, int protocol)
+{
+	CHECK_HOST_CALL(socket);
+	if (domain == AF_UNIX || domain == PF_PACKET)
+		return host_socket(domain, type, protocol);
+
+	if (!lkl_running)
+		return host_socket(domain, type, protocol);
+
+	return lkl_call(__lkl__NR_socket, 3, domain, type, protocol);
+}
+
+HOST_CALL(ioctl);
+int ioctl(int fd, unsigned long req, ...)
+{
+	va_list vl;
+	long arg;
+
+	va_start(vl, req);
+	arg = va_arg(vl, long);
+	va_end(vl);
+
+	CHECK_HOST_CALL(ioctl);
+
+	if (!is_lklfd(fd))
+		return host_ioctl(fd, req, arg);
+	return lkl_call(__lkl__NR_ioctl, 3, fd, lkl_ioctl_req_xlate(req), arg);
+}
+
+
+HOST_CALL(fcntl);
+int fcntl(int fd, int cmd, ...)
+{
+	va_list vl;
+	long arg;
+
+	va_start(vl, cmd);
+	arg = va_arg(vl, long);
+	va_end(vl);
+
+	CHECK_HOST_CALL(fcntl);
+
+	if (!is_lklfd(fd))
+		return host_fcntl(fd, cmd, arg);
+	return lkl_call(__lkl__NR_fcntl, 3, fd, lkl_fcntl_cmd_xlate(cmd), arg);
+}
+
+HOST_CALL(poll);
+int poll(struct pollfd *fds, nfds_t nfds, int timeout)
+{
+	unsigned int i, lklfds = 0, hostfds = 0;
+
+	CHECK_HOST_CALL(poll);
+
+	for (i = 0; i < nfds; i++) {
+		if (is_lklfd(fds[i].fd))
+			lklfds = 1;
+		else
+			hostfds = 1;
+	}
+
+	/* FIXME: need to handle mixed case of hostfd and lklfd. */
+	if (lklfds && hostfds)
+		return lkl_set_errno(-LKL_EOPNOTSUPP);
+
+
+	if (hostfds)
+		return host_poll(fds, nfds, timeout);
+
+	return lkl_sys_poll((struct lkl_pollfd *)fds, nfds, timeout);
+}
+
+int __poll(struct pollfd *, nfds_t, int) __attribute__((alias("poll")));
+
+HOST_CALL(select);
+int select(int nfds, fd_set *r, fd_set *w, fd_set *e, struct timeval *t)
+{
+	int fd, hostfds = 0, lklfds = 0;
+
+	CHECK_HOST_CALL(select);
+
+	for (fd = 0; fd < nfds; fd++) {
+		if (r != 0 && FD_ISSET(fd, r)) {
+			if (is_lklfd(fd))
+				lklfds = 1;
+			else
+				hostfds = 1;
+		}
+		if (w != 0 && FD_ISSET(fd, w)) {
+			if (is_lklfd(fd))
+				lklfds = 1;
+			else
+				hostfds = 1;
+		}
+		if (e != 0 && FD_ISSET(fd, e)) {
+			if (is_lklfd(fd))
+				lklfds = 1;
+			else
+				hostfds = 1;
+		}
+	}
+
+	/* FIXME: handle mixed case of hostfd and lklfd */
+	if (lklfds && hostfds)
+		return lkl_set_errno(-LKL_EOPNOTSUPP);
+
+	if (hostfds)
+		return host_select(nfds, r, w, e, t);
+
+	return lkl_sys_select(nfds, (lkl_fd_set *)r, (lkl_fd_set *)w,
+			      (lkl_fd_set *)e, (struct lkl_timeval *)t);
+}
+
+HOST_CALL(close);
+int close(int fd)
+{
+	CHECK_HOST_CALL(close);
+
+	if (!is_lklfd(fd)) {
+		/* handle epoll's dual_fd */
+		if ((dual_fds[fd] != -1) && lkl_running) {
+			lkl_call(__lkl__NR_close, 1, dual_fds[fd]);
+			dual_fds[fd] = -1;
+		}
+
+		return host_close(fd);
+	}
+
+	return lkl_call(__lkl__NR_close, 1, fd);
+}
+
+HOST_CALL(epoll_create);
+int epoll_create(int size)
+{
+	int host_fd;
+
+	CHECK_HOST_CALL(epoll_create);
+
+	host_fd = host_epoll_create(size);
+	if (host_fd < 0) {
+		fprintf(stderr, "%s fail (%d)\n", __func__, errno);
+		return -1;
+	}
+
+	if (!lkl_running)
+		return host_fd;
+
+	dual_fds[host_fd] = lkl_sys_epoll_create(size);
+
+	/* always returns the host fd */
+	return host_fd;
+}
+
+HOST_CALL(epoll_create1);
+int epoll_create1(int flags)
+{
+	int host_fd;
+
+	CHECK_HOST_CALL(epoll_create1);
+
+	host_fd = host_epoll_create1(flags);
+	if (host_fd < 0) {
+		fprintf(stderr, "%s fail (%d)\n", __func__, errno);
+		return -1;
+	}
+
+	if (!lkl_running)
+		return host_fd;
+
+	dual_fds[host_fd] = lkl_sys_epoll_create1(flags);
+
+	/* always returns the host fd */
+	return host_fd;
+}
+
+
+HOST_CALL(epoll_ctl);
+int epoll_ctl(int epollfd, int op, int fd, struct epoll_event *event)
+{
+	CHECK_HOST_CALL(epoll_ctl);
+
+	if (!is_lklfd(fd))
+		return host_epoll_ctl(epollfd, op, fd, event);
+
+	return lkl_call(__lkl__NR_epoll_ctl, 4, dual_fds[epollfd],
+			op, fd, event);
+}
+
+struct epollarg {
+	int epfd;
+	struct epoll_event *events;
+	int maxevents;
+	int timeout;
+	int pipefd;
+	int errnum;
+};
+
+HOST_CALL(epoll_wait)
+static void *host_epollwait(void *arg)
+{
+	struct epollarg *earg = arg;
+	int ret;
+
+	ret = host_epoll_wait(earg->epfd, earg->events,
+			      earg->maxevents, earg->timeout);
+	if (ret == -1)
+		earg->errnum = errno;
+	lkl_call(__lkl__NR_write, 3, earg->pipefd, &ret, sizeof(ret));
+
+	return (void *)(intptr_t)ret;
+}
+
+int epoll_wait(int epfd, struct epoll_event *events,
+	       int maxevents, int timeout)
+{
+	CHECK_HOST_CALL(epoll_wait);
+	CHECK_HOST_CALL(pipe2);
+
+	int l_pipe[2] = {-1, -1}, h_pipe[2] = {-1, -1};
+	struct epoll_event host_ev, lkl_ev;
+	int ret_events = maxevents;
+	struct epoll_event h_events[ret_events], l_events[ret_events];
+	struct epollarg earg;
+	pthread_t thread;
+	void *trv_val;
+	int i, ret, ret_lkl, ret_host;
+
+	ret = lkl_sys_pipe(l_pipe);
+	if (ret == -1) {
+		fprintf(stderr, "lkl pipe error(errno=%d)\n", errno);
+		return -1;
+	}
+
+	ret = host_pipe2(h_pipe, 0);
+	if (ret == -1) {
+		fprintf(stderr, "host pipe error(errno=%d)\n", errno);
+		return -1;
+	}
+
+	if (dual_fds[epfd] == -1) {
+		fprintf(stderr, "epollfd isn't available (%d)\n", epfd);
+		abort();
+	}
+
+	/* wait pipe at host/lkl epoll_fd */
+	memset(&lkl_ev, 0, sizeof(lkl_ev));
+	lkl_ev.events = EPOLLIN;
+	lkl_ev.data.fd = l_pipe[0];
+	ret = lkl_call(__lkl__NR_epoll_ctl, 4, dual_fds[epfd], EPOLL_CTL_ADD,
+		       l_pipe[0], &lkl_ev);
+	if (ret == -1) {
+		fprintf(stderr, "epoll_ctl error(epfd=%d:%d, fd=%d, err=%d)\n",
+			epfd, dual_fds[epfd], l_pipe[0], errno);
+		return -1;
+	}
+
+	memset(&host_ev, 0, sizeof(host_ev));
+	host_ev.events = EPOLLIN;
+	host_ev.data.fd = h_pipe[0];
+	ret = host_epoll_ctl(epfd, EPOLL_CTL_ADD, h_pipe[0], &host_ev);
+	if (ret == -1) {
+		fprintf(stderr, "host epoll_ctl error(%d, %d, %d, %d)\n",
+			epfd, h_pipe[0], h_pipe[1], errno);
+		return -1;
+	}
+
+
+	/* now wait by epoll_wait on 2 threads */
+	memset(h_events, 0, sizeof(struct epoll_event) * ret_events);
+	memset(l_events, 0, sizeof(struct epoll_event) * ret_events);
+	earg.epfd = epfd;
+	earg.events = h_events;
+	earg.maxevents = maxevents;
+	earg.timeout = timeout;
+	earg.pipefd = l_pipe[1];
+	pthread_create(&thread, NULL, host_epollwait, &earg);
+
+	ret_lkl = lkl_sys_epoll_wait(dual_fds[epfd],
+				     (struct lkl_epoll_event *)l_events,
+				     maxevents, timeout);
+	if (ret_lkl == -1) {
+		fprintf(stderr,
+			"lkl_%s_wait error(epfd=%d:%d, fd=%d, err=%d)\n",
+			__func__, epfd, dual_fds[epfd], l_pipe[0], errno);
+		return -1;
+	}
+	host_write(h_pipe[1], &ret, sizeof(ret));
+	pthread_join(thread, &trv_val);
+	ret_host = (int)(intptr_t)trv_val;
+	if (ret_host == -1) {
+		fprintf(stderr,
+			"host epoll_wait error(%d, %d, %d, %d)\n", epfd,
+			h_pipe[0], h_pipe[1], errno);
+		return -1;
+	}
+
+	ret = lkl_call(__lkl__NR_epoll_ctl, 4, dual_fds[epfd], EPOLL_CTL_DEL,
+		       l_pipe[0], &lkl_ev);
+	if (ret == -1) {
+		fprintf(stderr,
+			"lkl epoll_ctl error(epfd=%d:%d, fd=%d, err=%d)\n",
+			epfd, dual_fds[epfd], l_pipe[0], errno);
+		return -1;
+	}
+
+	ret = host_epoll_ctl(epfd, EPOLL_CTL_DEL, h_pipe[0], &host_ev);
+	if (ret == -1) {
+		fprintf(stderr, "host epoll_ctl error(%d, %d, %d, %d)\n",
+			epfd, h_pipe[0], h_pipe[1], errno);
+		return -1;
+	}
+
+	memset(events, 0, sizeof(struct epoll_event) * maxevents);
+	ret = 0;
+	if (ret_host > 0) {
+		for (i = 0; i < ret_host; i++) {
+			if (h_events[i].data.fd == h_pipe[0])
+				continue;
+			if (is_lklfd(h_events[i].data.fd))
+				continue;
+
+			memcpy(events, &(h_events[i]),
+			       sizeof(struct epoll_event));
+			events++;
+			ret++;
+		}
+	}
+	if (ret_lkl > 0) {
+		for (i = 0; i < ret_lkl; i++) {
+			if (l_events[i].data.fd == l_pipe[0])
+				continue;
+			if (!is_lklfd(l_events[i].data.fd))
+				continue;
+
+			memcpy(events, &(l_events[i]),
+			       sizeof(struct epoll_event));
+			events++;
+			ret++;
+		}
+	}
+
+	lkl_call(__lkl__NR_close, 1, l_pipe[0]);
+	lkl_call(__lkl__NR_close, 1, l_pipe[1]);
+	host_close(h_pipe[0]);
+	host_close(h_pipe[1]);
+
+	return ret;
+}
+
+int eventfd(unsigned int count, int flags)
+{
+	if (!lkl_running) {
+		int (*f)(unsigned int a, int b) = resolve_sym("eventfd");
+
+		return f(count, flags);
+	}
+
+	return lkl_sys_eventfd2(count, flags);
+}
+
+HOST_CALL(eventfd_read);
+int eventfd_read(int fd, uint64_t *value)
+{
+	CHECK_HOST_CALL(eventfd_read);
+
+	if (!is_lklfd(fd))
+		return host_eventfd_read(fd, value);
+
+	return lkl_sys_read(fd, (void *) value,
+			    sizeof(*value)) != sizeof(*value) ? -1 : 0;
+}
+
+HOST_CALL(eventfd_write);
+int eventfd_write(int fd, uint64_t value)
+{
+	CHECK_HOST_CALL(eventfd_write);
+
+	if (!is_lklfd(fd))
+		return host_eventfd_write(fd, value);
+
+	return lkl_sys_write(fd, (void *) &value,
+			     sizeof(value)) != sizeof(value) ? -1 : 0;
+}
+
+HOST_CALL(mmap)
+void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset)
+{
+	CHECK_HOST_CALL(mmap);
+
+	if (addr != NULL || flags != (MAP_ANONYMOUS|MAP_PRIVATE) ||
+	    prot != (PROT_READ|PROT_WRITE) || fd != -1 || offset != 0)
+		return (void *)host_mmap(addr, length, prot, flags, fd, offset);
+	return lkl_sys_mmap(addr, length, prot, flags, fd, offset);
+}
+
+ssize_t send(int fd, const void *buf, size_t len, int flags)
+{
+	return sendto(fd, buf, len, flags, 0, 0);
+}
+
+ssize_t recv(int fd, void *buf, size_t len, int flags)
+{
+	return recvfrom(fd, buf, len, flags, 0, 0);
+}
+
+extern int pipe2(int fd[2], int flag);
+int pipe(int fd[2])
+{
+	if (!lkl_running)
+		return host_calls[__lkl__NR_pipe2]((long)fd, 0, 0, 0, 0, 0);
+
+	return pipe2(fd, 0);
+
+}
diff --git a/tools/lkl/lib/hijack/init.c b/tools/lkl/lib/hijack/init.c
new file mode 100644
index 000000000000..2145fb7ec2cb
--- /dev/null
+++ b/tools/lkl/lib/hijack/init.c
@@ -0,0 +1,241 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * system calls hijack code
+ * Copyright (c) 2015 Hajime Tazaki
+ *
+ * Author: Hajime Tazaki <tazaki@sfc.wide.ad.jp>
+ *
+ * Note: some of the code is picked from rumpkernel, written by Antti Kantee.
+ */
+
+#include <stdio.h>
+#include <net/if.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/types.h>
+#include <fcntl.h>
+#include <errno.h>
+#include <signal.h>
+#include <lkl.h>
+#include <lkl_host.h>
+#include <lkl_config.h>
+
+#include "xlate.h"
+#include "init.h"
+
+#define __USE_GNU
+#include <dlfcn.h>
+
+#define _GNU_SOURCE
+#include <sched.h>
+
+/* Mount points are named after filesystem types so they should never
+ * be longer than ~6 characters.
+ */
+#define MAX_FSTYPE_LEN 50
+
+static void PinToCpus(const cpu_set_t *cpus)
+{
+	if (sched_setaffinity(0, sizeof(cpu_set_t), cpus))
+		perror("sched_setaffinity");
+}
+
+static void PinToFirstCpu(const cpu_set_t *cpus)
+{
+	int j;
+	cpu_set_t pinto;
+
+	CPU_ZERO(&pinto);
+	for (j = 0; j < CPU_SETSIZE; j++) {
+		if (CPU_ISSET(j, cpus)) {
+			lkl_printf("LKL: Pin To CPU %d\n", j);
+			CPU_SET(j, &pinto);
+			PinToCpus(&pinto);
+			return;
+		}
+	}
+}
+
+int lkl_debug, lkl_running;
+
+static struct lkl_config *cfg;
+
+static int config_load(void)
+{
+	int len, ret = -1;
+	char *buf;
+	int fd;
+	char *path = getenv("LKL_HIJACK_CONFIG_FILE");
+
+	cfg = (struct lkl_config *)malloc(sizeof(struct lkl_config));
+	if (!cfg) {
+		perror("config malloc");
+		return -1;
+	}
+	memset(cfg, 0, sizeof(struct lkl_config));
+
+	ret = lkl_load_config_env(cfg);
+	if (ret < 0)
+		return ret;
+
+	if (path)
+		fd = open(path, O_RDONLY, 0);
+	else if (access("lkl-hijack.json", R_OK) == 0)
+		fd = open("lkl-hijack.json", O_RDONLY, 0);
+	else
+		return 0;
+	if (fd < 0) {
+		fprintf(stderr, "config_file open %s: %s\n",
+			path, strerror(errno));
+		return -1;
+	}
+	len = lseek(fd, 0, SEEK_END);
+	lseek(fd, 0, SEEK_SET);
+	if (len < 0) {
+		perror("config size check (lseek)");
+		return -1;
+	} else if (len == 0) {
+		return 0;
+	}
+	buf = (char *)calloc(len + 1, sizeof(char));
+	if (!buf) {
+		perror("config buf malloc");
+		return -1;
+	}
+	ret = read(fd, buf, len);
+	if (ret < 0) {
+		perror("config file read");
+		free(buf);
+		return -1;
+	}
+	ret = lkl_load_config_json(cfg, buf);
+	free(buf);
+	return ret;
+}
+
+void __attribute__((constructor))
+hijack_init(void)
+{
+	int ret, i, dev_null;
+	int single_cpu_mode = 0;
+	cpu_set_t ori_cpu;
+
+	ret = config_load();
+	if (ret < 0)
+		return;
+
+	/* reflect pre-configuration */
+	lkl_load_config_pre(cfg);
+
+	/* hijack library specific configurations */
+	if (cfg->debug)
+		lkl_register_dbg_handler();
+
+	if (lkl_debug & 0x200) {
+		char c;
+
+		printf("press 'enter' to continue\n");
+		if (scanf("%c", &c) <= 0) {
+			fprintf(stderr, "scanf() fails\n");
+			return;
+		}
+	}
+	if (cfg->single_cpu) {
+		single_cpu_mode = atoi(cfg->single_cpu);
+		switch (single_cpu_mode) {
+		case 0:
+		case 1:
+		case 2:
+			break;
+		default:
+			fprintf(stderr, "single cpu mode must be 0~2.\n");
+			single_cpu_mode = 0;
+			break;
+		}
+	}
+
+	if (single_cpu_mode) {
+		if (sched_getaffinity(0, sizeof(cpu_set_t), &ori_cpu)) {
+			perror("sched_getaffinity");
+			single_cpu_mode = 0;
+		}
+	}
+
+	/* Pin to a single cpu.
+	 * Any child threads created afterwards are pinned to the same CPU.
+	 */
+	if (single_cpu_mode == 2)
+		PinToFirstCpu(&ori_cpu);
+
+	if (single_cpu_mode == 1)
+		PinToFirstCpu(&ori_cpu);
+
+	ret = lkl_start_kernel(&lkl_host_ops, cfg->boot_cmdline);
+	if (ret) {
+		fprintf(stderr, "can't start kernel: %s\n", lkl_strerror(ret));
+		return;
+	}
+
+	lkl_running = 1;
+
+	/* initialize epoll manage list */
+	memset(dual_fds, -1, sizeof(int) * LKL_FD_OFFSET);
+
+	/* restore cpu affinity */
+	if (single_cpu_mode)
+		PinToCpus(&ori_cpu);
+
+	ret = lkl_set_fd_limit(65535);
+	if (ret)
+		fprintf(stderr, "lkl_set_fd_limit failed: %s\n",
+			lkl_strerror(ret));
+
+	/* fillup FDs up to LKL_FD_OFFSET */
+	ret = lkl_sys_mknod("/dev_null", LKL_S_IFCHR | 0600, LKL_MKDEV(1, 3));
+	dev_null = lkl_sys_open("/dev_null", LKL_O_RDONLY, 0);
+	if (dev_null < 0) {
+		fprintf(stderr, "failed to open /dev_null: %s\n",
+				lkl_strerror(dev_null));
+		return;
+	}
+
+	for (i = 1; i < LKL_FD_OFFSET; i++)
+		lkl_sys_dup(dev_null);
+
+	/* lo iff_up */
+	lkl_if_up(1);
+
+	/* reflect post-configuration */
+	lkl_load_config_post(cfg);
+}
+
+void __attribute__((destructor))
+hijack_fini(void)
+{
+	int i;
+	int err;
+
+	/* The following pauses the kernel before exiting allowing one
+	 * to debug or collect statistics/diagnosis info from it.
+	 */
+	if (lkl_debug & 0x100) {
+		while (1)
+			pause();
+	}
+
+	if (!cfg)
+		return;
+
+	lkl_unload_config(cfg);
+	free(cfg);
+
+	if (!lkl_running)
+		return;
+
+	for (i = 0; i < LKL_FD_OFFSET; i++)
+		lkl_sys_close(i);
+
+	err = lkl_sys_halt();
+	if (err)
+		fprintf(stderr, "lkl_sys_halt: %s\n", lkl_strerror(err));
+}
diff --git a/tools/lkl/lib/hijack/init.h b/tools/lkl/lib/hijack/init.h
new file mode 100644
index 000000000000..c4039e018b2b
--- /dev/null
+++ b/tools/lkl/lib/hijack/init.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_HIJACK_INIT_H
+#define _LKL_HIJACK_INIT_H
+
+extern int lkl_running;
+extern int dual_fds[];
+
+#endif /*_LKL_HIJACK_INIT_H */
diff --git a/tools/lkl/lib/hijack/xlate.c b/tools/lkl/lib/hijack/xlate.c
new file mode 100644
index 000000000000..b96a0107116a
--- /dev/null
+++ b/tools/lkl/lib/hijack/xlate.c
@@ -0,0 +1,613 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <errno.h>
+#define __USE_GNU
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#include <sys/socket.h>
+#undef st_atime
+#undef st_mtime
+#undef st_ctime
+#include <lkl.h>
+
+#include "xlate.h"
+
+long lkl_set_errno(long err)
+{
+	if (err >= 0)
+		return err;
+
+	switch (err) {
+	case -LKL_EPERM:
+		errno = EPERM;
+		break;
+	case -LKL_ENOENT:
+		errno = ENOENT;
+		break;
+	case -LKL_ESRCH:
+		errno = ESRCH;
+		break;
+	case -LKL_EINTR:
+		errno = EINTR;
+		break;
+	case -LKL_EIO:
+		errno = EIO;
+		break;
+	case -LKL_ENXIO:
+		errno = ENXIO;
+		break;
+	case -LKL_E2BIG:
+		errno = E2BIG;
+		break;
+	case -LKL_ENOEXEC:
+		errno = ENOEXEC;
+		break;
+	case -LKL_EBADF:
+		errno = EBADF;
+		break;
+	case -LKL_ECHILD:
+		errno = ECHILD;
+		break;
+	case -LKL_EAGAIN:
+		errno = EAGAIN;
+		break;
+	case -LKL_ENOMEM:
+		errno = ENOMEM;
+		break;
+	case -LKL_EACCES:
+		errno = EACCES;
+		break;
+	case -LKL_EFAULT:
+		errno = EFAULT;
+		break;
+	case -LKL_ENOTBLK:
+		errno = ENOTBLK;
+		break;
+	case -LKL_EBUSY:
+		errno = EBUSY;
+		break;
+	case -LKL_EEXIST:
+		errno = EEXIST;
+		break;
+	case -LKL_EXDEV:
+		errno = EXDEV;
+		break;
+	case -LKL_ENODEV:
+		errno = ENODEV;
+		break;
+	case -LKL_ENOTDIR:
+		errno = ENOTDIR;
+		break;
+	case -LKL_EISDIR:
+		errno = EISDIR;
+		break;
+	case -LKL_EINVAL:
+		errno = EINVAL;
+		break;
+	case -LKL_ENFILE:
+		errno = ENFILE;
+		break;
+	case -LKL_EMFILE:
+		errno = EMFILE;
+		break;
+	case -LKL_ENOTTY:
+		errno = ENOTTY;
+		break;
+	case -LKL_ETXTBSY:
+		errno = ETXTBSY;
+		break;
+	case -LKL_EFBIG:
+		errno = EFBIG;
+		break;
+	case -LKL_ENOSPC:
+		errno = ENOSPC;
+		break;
+	case -LKL_ESPIPE:
+		errno = ESPIPE;
+		break;
+	case -LKL_EROFS:
+		errno = EROFS;
+		break;
+	case -LKL_EMLINK:
+		errno = EMLINK;
+		break;
+	case -LKL_EPIPE:
+		errno = EPIPE;
+		break;
+	case -LKL_EDOM:
+		errno = EDOM;
+		break;
+	case -LKL_ERANGE:
+		errno = ERANGE;
+		break;
+	case -LKL_EDEADLK:
+		errno = EDEADLK;
+		break;
+	case -LKL_ENAMETOOLONG:
+		errno = ENAMETOOLONG;
+		break;
+	case -LKL_ENOLCK:
+		errno = ENOLCK;
+		break;
+	case -LKL_ENOSYS:
+		errno = ENOSYS;
+		break;
+	case -LKL_ENOTEMPTY:
+		errno = ENOTEMPTY;
+		break;
+	case -LKL_ELOOP:
+		errno = ELOOP;
+		break;
+	case -LKL_ENOMSG:
+		errno = ENOMSG;
+		break;
+	case -LKL_EIDRM:
+		errno = EIDRM;
+		break;
+	case -LKL_ECHRNG:
+		errno = ECHRNG;
+		break;
+	case -LKL_EL2NSYNC:
+		errno = EL2NSYNC;
+		break;
+	case -LKL_EL3HLT:
+		errno = EL3HLT;
+		break;
+	case -LKL_EL3RST:
+		errno = EL3RST;
+		break;
+	case -LKL_ELNRNG:
+		errno = ELNRNG;
+		break;
+	case -LKL_EUNATCH:
+		errno = EUNATCH;
+		break;
+	case -LKL_ENOCSI:
+		errno = ENOCSI;
+		break;
+	case -LKL_EL2HLT:
+		errno = EL2HLT;
+		break;
+	case -LKL_EBADE:
+		errno = EBADE;
+		break;
+	case -LKL_EBADR:
+		errno = EBADR;
+		break;
+	case -LKL_EXFULL:
+		errno = EXFULL;
+		break;
+	case -LKL_ENOANO:
+		errno = ENOANO;
+		break;
+	case -LKL_EBADRQC:
+		errno = EBADRQC;
+		break;
+	case -LKL_EBADSLT:
+		errno = EBADSLT;
+		break;
+	case -LKL_EBFONT:
+		errno = EBFONT;
+		break;
+	case -LKL_ENOSTR:
+		errno = ENOSTR;
+		break;
+	case -LKL_ENODATA:
+		errno = ENODATA;
+		break;
+	case -LKL_ETIME:
+		errno = ETIME;
+		break;
+	case -LKL_ENOSR:
+		errno = ENOSR;
+		break;
+	case -LKL_ENONET:
+		errno = ENONET;
+		break;
+	case -LKL_ENOPKG:
+		errno = ENOPKG;
+		break;
+	case -LKL_EREMOTE:
+		errno = EREMOTE;
+		break;
+	case -LKL_ENOLINK:
+		errno = ENOLINK;
+		break;
+	case -LKL_EADV:
+		errno = EADV;
+		break;
+	case -LKL_ESRMNT:
+		errno = ESRMNT;
+		break;
+	case -LKL_ECOMM:
+		errno = ECOMM;
+		break;
+	case -LKL_EPROTO:
+		errno = EPROTO;
+		break;
+	case -LKL_EMULTIHOP:
+		errno = EMULTIHOP;
+		break;
+	case -LKL_EDOTDOT:
+		errno = EDOTDOT;
+		break;
+	case -LKL_EBADMSG:
+		errno = EBADMSG;
+		break;
+	case -LKL_EOVERFLOW:
+		errno = EOVERFLOW;
+		break;
+	case -LKL_ENOTUNIQ:
+		errno = ENOTUNIQ;
+		break;
+	case -LKL_EBADFD:
+		errno = EBADFD;
+		break;
+	case -LKL_EREMCHG:
+		errno = EREMCHG;
+		break;
+	case -LKL_ELIBACC:
+		errno = ELIBACC;
+		break;
+	case -LKL_ELIBBAD:
+		errno = ELIBBAD;
+		break;
+	case -LKL_ELIBSCN:
+		errno = ELIBSCN;
+		break;
+	case -LKL_ELIBMAX:
+		errno = ELIBMAX;
+		break;
+	case -LKL_ELIBEXEC:
+		errno = ELIBEXEC;
+		break;
+	case -LKL_EILSEQ:
+		errno = EILSEQ;
+		break;
+	case -LKL_ERESTART:
+		errno = ERESTART;
+		break;
+	case -LKL_ESTRPIPE:
+		errno = ESTRPIPE;
+		break;
+	case -LKL_EUSERS:
+		errno = EUSERS;
+		break;
+	case -LKL_ENOTSOCK:
+		errno = ENOTSOCK;
+		break;
+	case -LKL_EDESTADDRREQ:
+		errno = EDESTADDRREQ;
+		break;
+	case -LKL_EMSGSIZE:
+		errno = EMSGSIZE;
+		break;
+	case -LKL_EPROTOTYPE:
+		errno = EPROTOTYPE;
+		break;
+	case -LKL_ENOPROTOOPT:
+		errno = ENOPROTOOPT;
+		break;
+	case -LKL_EPROTONOSUPPORT:
+		errno = EPROTONOSUPPORT;
+		break;
+	case -LKL_ESOCKTNOSUPPORT:
+		errno = ESOCKTNOSUPPORT;
+		break;
+	case -LKL_EOPNOTSUPP:
+		errno = EOPNOTSUPP;
+		break;
+	case -LKL_EPFNOSUPPORT:
+		errno = EPFNOSUPPORT;
+		break;
+	case -LKL_EAFNOSUPPORT:
+		errno = EAFNOSUPPORT;
+		break;
+	case -LKL_EADDRINUSE:
+		errno = EADDRINUSE;
+		break;
+	case -LKL_EADDRNOTAVAIL:
+		errno = EADDRNOTAVAIL;
+		break;
+	case -LKL_ENETDOWN:
+		errno = ENETDOWN;
+		break;
+	case -LKL_ENETUNREACH:
+		errno = ENETUNREACH;
+		break;
+	case -LKL_ENETRESET:
+		errno = ENETRESET;
+		break;
+	case -LKL_ECONNABORTED:
+		errno = ECONNABORTED;
+		break;
+	case -LKL_ECONNRESET:
+		errno = ECONNRESET;
+		break;
+	case -LKL_ENOBUFS:
+		errno = ENOBUFS;
+		break;
+	case -LKL_EISCONN:
+		errno = EISCONN;
+		break;
+	case -LKL_ENOTCONN:
+		errno = ENOTCONN;
+		break;
+	case -LKL_ESHUTDOWN:
+		errno = ESHUTDOWN;
+		break;
+	case -LKL_ETOOMANYREFS:
+		errno = ETOOMANYREFS;
+		break;
+	case -LKL_ETIMEDOUT:
+		errno = ETIMEDOUT;
+		break;
+	case -LKL_ECONNREFUSED:
+		errno = ECONNREFUSED;
+		break;
+	case -LKL_EHOSTDOWN:
+		errno = EHOSTDOWN;
+		break;
+	case -LKL_EHOSTUNREACH:
+		errno = EHOSTUNREACH;
+		break;
+	case -LKL_EALREADY:
+		errno = EALREADY;
+		break;
+	case -LKL_EINPROGRESS:
+		errno = EINPROGRESS;
+		break;
+	case -LKL_ESTALE:
+		errno = ESTALE;
+		break;
+	case -LKL_EUCLEAN:
+		errno = EUCLEAN;
+		break;
+	case -LKL_ENOTNAM:
+		errno = ENOTNAM;
+		break;
+	case -LKL_ENAVAIL:
+		errno = ENAVAIL;
+		break;
+	case -LKL_EISNAM:
+		errno = EISNAM;
+		break;
+	case -LKL_EREMOTEIO:
+		errno = EREMOTEIO;
+		break;
+	case -LKL_EDQUOT:
+		errno = EDQUOT;
+		break;
+	case -LKL_ENOMEDIUM:
+		errno = ENOMEDIUM;
+		break;
+	case -LKL_EMEDIUMTYPE:
+		errno = EMEDIUMTYPE;
+		break;
+	case -LKL_ECANCELED:
+		errno = ECANCELED;
+		break;
+	case -LKL_ENOKEY:
+		errno = ENOKEY;
+		break;
+	case -LKL_EKEYEXPIRED:
+		errno = EKEYEXPIRED;
+		break;
+	case -LKL_EKEYREVOKED:
+		errno = EKEYREVOKED;
+		break;
+	case -LKL_EKEYREJECTED:
+		errno = EKEYREJECTED;
+		break;
+	case -LKL_EOWNERDEAD:
+		errno = EOWNERDEAD;
+		break;
+	case -LKL_ENOTRECOVERABLE:
+		errno = ENOTRECOVERABLE;
+		break;
+	case -LKL_ERFKILL:
+		errno = ERFKILL;
+		break;
+	case -LKL_EHWPOISON:
+		errno = EHWPOISON;
+		break;
+	}
+
+	return -1;
+}
+
+int lkl_soname_xlate(int soname)
+{
+	switch (soname) {
+	case SO_DEBUG:
+		return LKL_SO_DEBUG;
+	case SO_REUSEADDR:
+		return LKL_SO_REUSEADDR;
+	case SO_TYPE:
+		return LKL_SO_TYPE;
+	case SO_ERROR:
+		return LKL_SO_ERROR;
+	case SO_DONTROUTE:
+		return LKL_SO_DONTROUTE;
+	case SO_BROADCAST:
+		return LKL_SO_BROADCAST;
+	case SO_SNDBUF:
+		return LKL_SO_SNDBUF;
+	case SO_RCVBUF:
+		return LKL_SO_RCVBUF;
+	case SO_SNDBUFFORCE:
+		return LKL_SO_SNDBUFFORCE;
+	case SO_RCVBUFFORCE:
+		return LKL_SO_RCVBUFFORCE;
+	case SO_KEEPALIVE:
+		return LKL_SO_KEEPALIVE;
+	case SO_OOBINLINE:
+		return LKL_SO_OOBINLINE;
+	case SO_NO_CHECK:
+		return LKL_SO_NO_CHECK;
+	case SO_PRIORITY:
+		return LKL_SO_PRIORITY;
+	case SO_LINGER:
+		return LKL_SO_LINGER;
+	case SO_BSDCOMPAT:
+		return LKL_SO_BSDCOMPAT;
+#ifdef SO_REUSEPORT
+	case SO_REUSEPORT:
+		return LKL_SO_REUSEPORT;
+#endif
+	case SO_PASSCRED:
+		return LKL_SO_PASSCRED;
+	case SO_PEERCRED:
+		return LKL_SO_PEERCRED;
+	case SO_RCVLOWAT:
+		return LKL_SO_RCVLOWAT;
+	case SO_SNDLOWAT:
+		return LKL_SO_SNDLOWAT;
+	case SO_RCVTIMEO:
+		return LKL_SO_RCVTIMEO;
+	case SO_SNDTIMEO:
+		return LKL_SO_SNDTIMEO;
+	case SO_SECURITY_AUTHENTICATION:
+		return LKL_SO_SECURITY_AUTHENTICATION;
+	case SO_SECURITY_ENCRYPTION_TRANSPORT:
+		return LKL_SO_SECURITY_ENCRYPTION_TRANSPORT;
+	case SO_SECURITY_ENCRYPTION_NETWORK:
+		return LKL_SO_SECURITY_ENCRYPTION_NETWORK;
+	case SO_BINDTODEVICE:
+		return LKL_SO_BINDTODEVICE;
+	case SO_ATTACH_FILTER:
+		return LKL_SO_ATTACH_FILTER;
+	case SO_DETACH_FILTER:
+		return LKL_SO_DETACH_FILTER;
+	case SO_PEERNAME:
+		return LKL_SO_PEERNAME;
+	case SO_TIMESTAMP:
+		return LKL_SO_TIMESTAMP;
+	case SO_ACCEPTCONN:
+		return LKL_SO_ACCEPTCONN;
+	case SO_PEERSEC:
+		return LKL_SO_PEERSEC;
+	case SO_PASSSEC:
+		return LKL_SO_PASSSEC;
+	case SO_TIMESTAMPNS:
+		return LKL_SO_TIMESTAMPNS;
+	case SO_MARK:
+		return LKL_SO_MARK;
+	case SO_TIMESTAMPING:
+		return LKL_SO_TIMESTAMPING;
+	case SO_PROTOCOL:
+		return LKL_SO_PROTOCOL;
+	case SO_DOMAIN:
+		return LKL_SO_DOMAIN;
+	case SO_RXQ_OVFL:
+		return LKL_SO_RXQ_OVFL;
+#ifdef SO_WIFI_STATUS
+	case SO_WIFI_STATUS:
+		return LKL_SO_WIFI_STATUS;
+#endif
+#ifdef SO_PEEK_OFF
+	case SO_PEEK_OFF:
+		return LKL_SO_PEEK_OFF;
+#endif
+#ifdef SO_NOFCS
+	case SO_NOFCS:
+		return LKL_SO_NOFCS;
+#endif
+#ifdef SO_LOCK_FILTER
+	case SO_LOCK_FILTER:
+		return LKL_SO_LOCK_FILTER;
+#endif
+#ifdef SO_SELECT_ERR_QUEUE
+	case SO_SELECT_ERR_QUEUE:
+		return LKL_SO_SELECT_ERR_QUEUE;
+#endif
+#ifdef SO_BUSY_POLL
+	case SO_BUSY_POLL:
+		return LKL_SO_BUSY_POLL;
+#endif
+#ifdef SO_MAX_PACING_RATE
+	case SO_MAX_PACING_RATE:
+		return LKL_SO_MAX_PACING_RATE;
+#endif
+	}
+
+	return soname;
+}
+
+int lkl_solevel_xlate(int solevel)
+{
+	switch (solevel) {
+	case SOL_SOCKET:
+		return LKL_SOL_SOCKET;
+	}
+
+	return solevel;
+}
+
+unsigned long lkl_ioctl_req_xlate(unsigned long req)
+{
+	switch (req) {
+	case FIOSETOWN:
+		return LKL_FIOSETOWN;
+	case SIOCSPGRP:
+		return LKL_SIOCSPGRP;
+	case FIOGETOWN:
+		return LKL_FIOGETOWN;
+	case SIOCGPGRP:
+		return LKL_SIOCGPGRP;
+	case SIOCATMARK:
+		return LKL_SIOCATMARK;
+	case SIOCGSTAMP:
+		return LKL_SIOCGSTAMP;
+	case SIOCGSTAMPNS:
+		return LKL_SIOCGSTAMPNS;
+	}
+
+	/* TODO: asm/termios.h translations */
+
+	return req;
+}
+
+int lkl_fcntl_cmd_xlate(int cmd)
+{
+	switch (cmd) {
+	case F_DUPFD:
+		return LKL_F_DUPFD;
+	case F_GETFD:
+		return LKL_F_GETFD;
+	case F_SETFD:
+		return LKL_F_SETFD;
+	case F_GETFL:
+		return LKL_F_GETFL;
+	case F_SETFL:
+		return LKL_F_SETFL;
+	case F_GETLK:
+		return LKL_F_GETLK;
+	case F_SETLK:
+		return LKL_F_SETLK;
+	case F_SETLKW:
+		return LKL_F_SETLKW;
+	case F_SETOWN:
+		return LKL_F_SETOWN;
+	case F_GETOWN:
+		return LKL_F_GETOWN;
+	case F_SETSIG:
+		return LKL_F_SETSIG;
+	case F_GETSIG:
+		return LKL_F_GETSIG;
+#ifndef LKL_CONFIG_64BIT
+	case F_GETLK64:
+		return LKL_F_GETLK64;
+	case F_SETLK64:
+		return LKL_F_SETLK64;
+	case F_SETLKW64:
+		return LKL_F_SETLKW64;
+#endif
+	case F_SETOWN_EX:
+		return LKL_F_SETOWN_EX;
+	case F_GETOWN_EX:
+		return LKL_F_GETOWN_EX;
+	}
+
+	return cmd;
+}
+
diff --git a/tools/lkl/lib/hijack/xlate.h b/tools/lkl/lib/hijack/xlate.h
new file mode 100644
index 000000000000..0c0281f241a6
--- /dev/null
+++ b/tools/lkl/lib/hijack/xlate.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_HIJACK_XLATE_H
+#define _LKL_HIJACK_XLATE_H
+
+long lkl_set_errno(long err);
+int lkl_soname_xlate(int soname);
+int lkl_solevel_xlate(int solevel);
+unsigned long lkl_ioctl_req_xlate(unsigned long req);
+int lkl_fcntl_cmd_xlate(int cmd);
+
+#define LKL_FD_OFFSET (FD_SETSIZE/2)
+
+#endif /* _LKL_HIJACK_XLATE_H */
diff --git a/tools/lkl/tests/hijack-test.sh b/tools/lkl/tests/hijack-test.sh
new file mode 100755
index 000000000000..097af6cff3ba
--- /dev/null
+++ b/tools/lkl/tests/hijack-test.sh
@@ -0,0 +1,737 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+
+clear_wdir()
+{
+    test -f ${VDESWITCH}.pid && kill $(cat ${VDESWITCH}.pid)
+    rm -rf ${wdir}
+    tap_cleanup
+    tap_cleanup 1
+}
+
+set_cfgjson()
+{
+    cfgjson=${wdir}/hijack-test$1.conf
+
+    cat > ${cfgjson}
+
+    export_vars cfgjson
+}
+
+run_hijack_cfg()
+{
+    lkl_test_cmd LKL_HIJACK_CONFIG_FILE=$cfgjson $hijack $@
+}
+
+run_hijack()
+{
+    lkl_test_cmd $hijack $@
+}
+
+run_netperf()
+{
+    lkl_test_cmd TEST_NETSERVER_PORT=$TEST_NETSERVER_PORT \
+                 LKL_HIJACK_CONFIG_FILE=$cfgjson $netperf $@
+}
+
+test_ping()
+{
+    set -e
+
+    run_hijack ${ping} -c 1 127.0.0.1
+}
+
+test_ping6()
+{
+    set -e
+
+    run_hijack ${ping6} -c 1 ::1
+}
+
+test_mount_and_dump()
+{
+    set -e
+
+    set_cfgjson << EOF
+    {
+        "mount":"proc,sysfs",
+        "dump":"/sysfs/class/net/lo/mtu,/sysfs/class/net/lo/dev_id",
+        "debug": "1"
+    }
+EOF
+
+    ans=$(run_hijack_cfg $(lkl_test_cmd which true))
+    echo "$ans"
+    echo "$ans" | grep "^65536" # lo's MTU
+    echo "$ans" | grep "0x0" # lo's dev_id
+}
+
+test_boot_cmdline()
+{
+    set -e
+
+    set_cfgjson << EOF
+    {
+        "debug":"1",
+        "boot_cmdline":"loglevel=1"
+    }
+EOF
+
+    ans=$(run_hijack_cfg $(lkl_test_cmd which true))
+    echo "$ans"
+    [ $(echo "$ans" | wc -l) = 1 ]
+}
+
+
+test_pipe_setup()
+{
+    set -e
+
+    mkfifo ${fifo1}
+    mkfifo ${fifo2}
+
+    set_cfgjson << EOF
+    {
+        "interfaces":
+        [
+            {
+                "type":"pipe",
+                "param":"${fifo1}|${fifo2}",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "mac":"$TEST_MAC0",
+            }
+        ]
+    }
+EOF
+
+    # Make sure our device has the addresses we expect
+    addr=$(run_hijack_cfg ip addr)
+    echo "$addr" | grep eth0
+    echo "$addr" | grep $(ip_lkl)
+    echo "$addr" | grep "$TEST_MAC0"
+}
+
+test_pipe_ping()
+{
+    set -e
+
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_lkl)",
+        "gateway6":"$(ip6_lkl)",
+        "interfaces":
+        [
+            {
+                "type":"pipe",
+                "param":"${fifo1}|${fifo2}",
+                "ip":"$(ip_host)",
+                "masklen":"$TEST_IP_NETMASK",
+                "mac":"$TEST_MAC0",
+                "ipv6":"$(ip6_host)",
+                "masklen6":"$TEST_IP6_NETMASK"
+            }
+        ]
+    }
+EOF
+
+    run_hijack_cfg $(lkl_test_cmd which sleep) 10 &
+
+    set_cfgjson 2 << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"pipe",
+                "param":"${fifo2}|${fifo1}",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "mac":"$TEST_MAC0",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK"
+            }
+        ]
+    }
+EOF
+
+    # Ping under LKL
+    run_hijack_cfg ${ping} -c 1 -w 10 $(ip_host)
+
+    # Ping 6 under LKL
+    run_hijack_cfg ${ping6} -c 1 -w 10 $(ip6_host)
+
+    wait
+}
+
+test_tap_setup()
+{
+    set -e
+
+    # Set up the TAP device we'd like to use
+    tap_setup
+
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "debug":"1",
+        "interfaces": [
+            {
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "mac": "$TEST_MAC0"
+            }
+        ]
+    }
+EOF
+
+    # Make sure our device has the addresses we expect
+    addr=$(run_hijack_cfg ip addr)
+    echo "$addr" | grep eth0
+    echo "$addr" | grep $(ip_lkl)
+    echo "$addr" | grep "$TEST_MAC0"
+    echo "$addr" | grep "$(ip6_lkl)"
+    ! echo "$addr" | grep "WARN: failed to free"
+}
+
+test_tap_cleanup()
+{
+    tap_cleanup
+    tap_cleanup 1
+}
+
+test_tap_ping_host()
+{
+    set -e
+
+    # Make sure we can ping the host from inside LKL
+    run_hijack_cfg ${ping} -c 1 $(ip_host)
+    run_hijack_cfg ${ping6} -c 1 $(ip6_host)
+}
+
+test_tap_ping_lkl()
+{
+    set -e
+
+    # Now let's check that the host can see LKL.
+    lkl_test_cmd sudo ip -6 neigh del $(ip6_lkl) dev $(tap_ifname)
+    lkl_test_cmd sudo ip neigh del $(ip_lkl) dev $(tap_ifname)
+    run_hijack_cfg $(lkl_test_cmd which sleep) 3 &
+    sleep 2
+    lkl_test_cmd sudo ping -i 0.01 -c 65 $(ip_lkl)
+    lkl_test_cmd sudo ping6 -i 0.01 -c 65 $(ip6_lkl)
+}
+
+test_tap_neighbours()
+{
+    set -e
+
+    neigh1="$(ip_add 100)|12:34:56:78:9a:bc"
+    neigh2="$(ip6_add 100)|12:34:56:78:9a:be"
+
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "neigh":"${neigh1};${neigh2}"
+            }
+        ]
+    }
+EOF
+
+    # add neighbor entries
+    ans=$(run_hijack_cfg ip neighbor show) || true
+    echo "$ans" | tail -n 15 | grep "12:34:56:78:9a:bc"
+    echo "$ans" | tail -n 15 | grep "12:34:56:78:9a:be"
+
+    # gateway
+    ans=$(run_hijack_cfg ip route show) || true
+    echo "$ans" | tail -n 15 | grep "$(ip_host)"
+
+    # gateway v6
+    ans=$(run_hijack_cfg ip -6 route show) || true
+    echo "$ans" | tail -n 15 | grep "$(ip6_host)"
+}
+
+test_tap_netperf_stream_tso_csum()
+{
+    set -e
+
+    # offload
+    # LKL_VIRTIO_NET_F_HOST_TSO4 && LKL_VIRTIO_NET_F_GUEST_TSO4
+    # LKL_VIRTIO_NET_F_CSUM && LKL_VIRTIO_NET_F_GUEST_CSUM
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "offload":"0x883",
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK"
+            }
+        ]
+    }
+EOF
+
+    run_netperf $(ip_host) TCP_STREAM
+}
+
+test_tap_netperf_maerts_csum_tso()
+{
+    run_netperf $(ip_host) TCP_MAERTS
+}
+
+test_tap_netperf_stream_csum_tso_mrgrxbuf()
+{
+    set -e
+
+    # offload
+    # LKL_VIRTIO_NET_F_HOST_TSO4 && LKL_VIRTIO_NET_F_MRG_RXBUF
+    # LKL_VIRTIO_NET_F_CSUM && LKL_VIRTIO_NET_F_GUEST_CSUM
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "offload":"0x8803",
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK"
+            }
+        ]
+    }
+EOF
+
+    run_netperf $(ip_host) TCP_MAERTS
+}
+
+test_tap_netperf_tcp_rr()
+{
+    set -e
+
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK"
+            }
+        ]
+    }
+EOF
+
+    run_netperf $(ip_host) TCP_RR
+}
+
+test_tap_netperf_tcp_stream()
+{
+    set -e
+
+    run_netperf $(ip_host) TCP_STREAM
+}
+
+test_tap_netperf_tcp_maerts()
+{
+    set -e
+
+    run_netperf $(ip_host) TCP_MAERTS
+}
+
+
+test_tap_qdisc()
+{
+    set -e
+
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "mac":"$TEST_MAC0",
+                "qdisc":"root|fq"
+            }
+        ]
+    }
+EOF
+
+    qdisc=$(run_hijack_cfg tc -s -d qdisc show)
+    echo "$qdisc"
+    echo "$qdisc" | grep "qdisc fq" > /dev/null
+    echo "$qdisc" | grep throttled > /dev/null
+}
+
+test_tap_multi_if_setup()
+{
+    set -e
+
+    # Set up 2nd TAP device we'd like to use
+    tap_setup 1
+
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "mac":"$TEST_MAC0"
+            },
+            {
+                "type":"tap",
+                "param":"$(tap_ifname 1)",
+                "ip":"$(ip_lkl 1)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl 1)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "mac":"$TEST_MAC1"
+            }
+        ]
+    }
+EOF
+
+    # Make sure our device has the addresses we expect
+    addr=$(run_hijack_cfg ip addr)
+    echo "$addr" | grep eth0
+    echo "$addr" | grep $(ip_lkl)
+    echo "$addr" | grep "$TEST_MAC0"
+    echo "$addr" | grep "$(ip6_lkl)"
+    echo "$addr" | grep eth1
+    echo "$addr" | grep $(ip_lkl 1)
+    echo "$addr" | grep "$TEST_MAC1"
+    echo "$addr" | grep "$(ip6_lkl 1)"
+    ! echo "$addr" | grep "WARN: failed to free"
+}
+
+test_tap_multi_if_ping()
+{
+    run_hijack_cfg ${ping} -c 1 $(ip_host)
+    run_hijack_cfg ${ping6} -c 1 $(ip6_host)
+    run_hijack_cfg ${ping} -c 1 $(ip_host 1)
+    run_hijack_cfg ${ping6} -c 1 $(ip6_host 1)
+}
+
+test_tap_multi_if_neigh()
+{
+
+    neigh1="$(ip_host)00|12:34:56:78:9a:bc"
+    neigh2="$(ip6_host)00|12:34:56:78:9a:be"
+    neigh3="$(ip_host 1)00|12:34:56:78:9a:bd"
+    neigh4="$(ip6_host 1)00|12:34:56:78:9a:bf"
+
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "mac":"$TEST_MAC0",
+                "neigh":"${neigh1};${neigh2}"
+            },
+            {
+                "type":"tap",
+                "param":"$(tap_ifname 1)",
+                "ip":"$(ip_lkl 1)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl 1)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "mac":"$TEST_MAC1",
+                "neigh":"${neigh3};${neigh4}"
+            }
+        ]
+    }
+EOF
+
+    # add neighbor entries
+    ans=$(run_hijack_cfg ip neighbor show) || true
+    echo "$ans" | tail -n 15 | grep "12:34:56:78:9a:bc"
+    echo "$ans" | tail -n 15 | grep "12:34:56:78:9a:be"
+    echo "$ans" | tail -n 15 | grep "12:34:56:78:9a:bd"
+    echo "$ans" | tail -n 15 | grep "12:34:56:78:9a:bf"
+}
+
+test_tap_multi_if_gateway()
+{
+    ans=$(run_hijack_cfg ip route show) || true
+    echo "$ans" | tail -n 15 | grep "$(ip_host)"
+}
+
+test_tap_multi_if_gateway_v6()
+{
+    ans=$(run_hijack_cfg ip -6 route show) || true
+    echo "$ans" | tail -n 15 | grep "$(ip6_host)"
+}
+
+
+test_tap_multitable_setup()
+{
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ifgateway":"$(ip_host)",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "ifgateway6":"$(ip6_host)",
+                "mac":"$TEST_MAC0",
+                "neigh":"${neigh1};${neigh2}"
+            },
+            {
+                "type":"tap",
+                "param":"$(tap_ifname 1)",
+                "ip":"$(ip_lkl 1)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ifgateway":"$(ip_host 1)",
+                "ipv6":"$(ip6_lkl 1)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "ifgateway6":"$(ip6_host 1)",
+                "mac":"$TEST_MAC1",
+                "neigh":"${neigh3};${neigh4}"
+            }
+        ]
+    }
+EOF
+}
+
+test_tap_multitable_ipv4_rule()
+{
+    addr=$(run_hijack_cfg ip rule show)
+    echo "$addr" | grep $(ip_lkl)
+    echo "$addr" | grep $(ip_lkl 1)
+}
+
+test_tap_multitable_ipv6_rule()
+{
+    addr=$(run_hijack_cfg ip -6 rule show)
+    echo "$addr" | grep $(ip6_lkl)
+    echo "$addr" | grep $(ip6_lkl 1)
+}
+
+test_tap_multitable_ipv4_rule_table_4()
+{
+    addr=$(run_hijack_cfg ip route show table 4)
+    echo "$addr" | grep $(ip_host)
+}
+
+test_tap_multitable_ipv6_rule_table_5()
+{
+    addr=$(run_hijack_cfg ip -6 route show table 5)
+    echo "$addr" | grep fc03::
+    echo "$addr" | grep $(ip6_host)
+}
+
+test_tap_multitable_ipv6_rule_table_6()
+{
+    addr=$(run_hijack_cfg ip route show table 6)
+    echo "$addr" | grep $(ip_host 1)
+}
+
+test_tap_multitable_ipv6_rule_table_7()
+{
+    addr=$(run_hijack_cfg ip -6 route show table 7)
+    echo "$addr" | grep fc04::
+    echo "$addr" | grep $(ip6_host 1)
+}
+
+test_vde_setup()
+{
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"vde",
+                "param":"${VDESWITCH}",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "mac":"$TEST_MAC0",
+                "neigh":"${neigh1};${neigh2}"
+            }
+        ]
+    }
+EOF
+
+    tap_setup
+
+    sleep 2
+    vde_switch -d -t $(tap_ifname) -s ${VDESWITCH} -p ${VDESWITCH}.pid
+
+    # Make sure our device has the addresses we expect
+    addr=$(run_hijack_cfg ip addr)
+    echo "$addr" | grep eth0
+    echo "$addr" | grep $(ip_lkl)
+    echo "$addr" | grep "$TEST_MAC0"
+}
+
+test_vde_cleanup()
+{
+    tap_cleanup
+}
+
+test_vde_ping_host()
+{
+    run_hijack_cfg ${ping} -c 1 $(ip_host)
+}
+
+test_vde_ping_lkl()
+{
+    lkl_test_cmd sudo arp -d $(ip_lkl)
+    lkl_test_cmd sudo ping -i 0.01 -c 65 $(ip_lkl) &
+    run_hijack_cfg sleep 3
+}
+
+source ${script_dir}/test.sh
+source ${script_dir}/net-setup.sh
+
+if [[ ! -e ${basedir}/lib/hijack/liblkl-hijack.so ]]; then
+    lkl_test_plan 0 "hijack tests"
+    echo "missing liblkl-hijack.so"
+    exit 0
+fi
+
+# Make a temporary directory to run tests in, since we'll be copying
+# things there.
+wdir=$(mktemp -d)
+cp `which ping` ${wdir}
+cp `which ping6` ${wdir}
+ping=${wdir}/ping
+ping6=${wdir}/ping6
+hijack=$basedir/bin/lkl-hijack.sh
+netperf=$basedir/tests/run_netperf.sh
+
+fifo1=${wdir}/fifo1
+fifo2=${wdir}/fifo2
+VDESWITCH=${wdir}/vde_switch
+
+# And make sure we clean up when we're done
+trap "clear_wdir &>/dev/null" EXIT
+
+lkl_test_plan 6 "hijack basic tests"
+lkl_test_run 1 run_hijack ip addr
+lkl_test_run 2 run_hijack ip route
+lkl_test_run 3 test_ping
+lkl_test_run 4 test_ping6
+lkl_test_run 5 test_mount_and_dump
+lkl_test_run 6 test_boot_cmdline
+
+if [ -z "$(QUIET=1 lkl_test_cmd which mkfifo)" ]; then
+    lkl_test_plan 0 "hijack pipe backend tests"
+    echo "no mkfifo command"
+else
+    lkl_test_plan 2 "hijack pipe backend tests"
+    lkl_test_run 1 test_pipe_setup
+    lkl_test_run 2 test_pipe_ping
+fi
+
+tap_prepare
+
+if ! lkl_test_cmd test -c /dev/net/tun &>/dev/null; then
+    lkl_test_plan 0 "hijack tap backend tests"
+    echo "missing /dev/net/tun"
+else
+    lkl_test_plan 24 "hijack tap backend tests"
+    lkl_test_run 1 test_tap_setup
+    lkl_test_run 2 test_tap_ping_host
+    lkl_test_run 3 test_tap_ping_lkl
+    lkl_test_run 4 test_tap_neighbours
+    lkl_test_run 5 test_tap_netperf_tcp_rr
+    lkl_test_run 6 test_tap_netperf_tcp_stream
+    lkl_test_run 7 test_tap_netperf_tcp_maerts
+    lkl_test_run 8 test_tap_netperf_stream_tso_csum
+    lkl_test_run 9 test_tap_netperf_maerts_csum_tso
+    lkl_test_run 10 test_tap_netperf_stream_csum_tso_mrgrxbuf
+    lkl_test_run 11 test_tap_qdisc
+    lkl_test_run 12 test_tap_multi_if_setup
+    lkl_test_run 13 test_tap_multi_if_ping
+    lkl_test_run 14 test_tap_multi_if_neigh
+    lkl_test_run 15 test_tap_multi_if_gateway
+    lkl_test_run 16 test_tap_multi_if_gateway_v6
+    lkl_test_run 17 test_tap_multitable_setup
+    lkl_test_run 18 test_tap_multitable_ipv4_rule
+    lkl_test_run 19 test_tap_multitable_ipv6_rule
+    lkl_test_run 20 test_tap_multitable_ipv4_rule_table_4
+    lkl_test_run 21 test_tap_multitable_ipv6_rule_table_5
+    lkl_test_run 22 test_tap_multitable_ipv6_rule_table_6
+    lkl_test_run 23 test_tap_multitable_ipv6_rule_table_7
+    lkl_test_run 24 test_tap_cleanup
+fi
+
+if [ -z "$LKL_HOST_CONFIG_VIRTIO_NET_VDE" ]; then
+    lkl_test_plan 0 "hijack vde tests"
+    echo "vde not supported"
+elif [ ! -x "$(which vde_switch)" ]; then
+    lkl_test_plan 0 "hijack vde tests"
+    echo "could not find a vde_switch executable"
+else
+    lkl_test_plan 4 "hijack vde tests"
+    lkl_test_run 1 test_vde_setup
+    lkl_test_run 2 test_vde_ping_host
+    lkl_test_run 3 test_vde_ping_lkl
+    lkl_test_run 4 test_vde_cleanup
+fi
diff --git a/tools/lkl/tests/run_netperf.sh b/tools/lkl/tests/run_netperf.sh
new file mode 100755
index 000000000000..08c4337b7830
--- /dev/null
+++ b/tools/lkl/tests/run_netperf.sh
@@ -0,0 +1,98 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+# Usage
+#  ./run_netperf.sh [ip] [test_name] [use_taskset] [num_runs]
+
+set -e
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+hijack_script=${script_dir}/../bin/lkl-hijack.sh
+
+num_runs="1"
+test_name="TCP_STREAM"
+use_taskset="0"
+host_ip="localhost"
+taskset_cmd="taskset -c 1"
+test_len=10  # seconds
+
+if [ ! -x "$(which netperf)" ]; then
+    echo "WARNING: Cannot find a netperf executable, skipping netperf tests."
+    exit $TEST_SKIP
+fi
+
+if [ $# -ge 1 ]; then
+    host_ip=$1
+fi
+if [ $# -ge 2 ]; then
+    test_name=$2
+fi
+if [ $# -ge 3 ]; then
+    use_taskset=$3
+fi
+if [ $# -ge 4 ]; then
+    num_runs=$4
+fi
+if [ $# -ge 5 ]; then
+    echo "BAD NUMBER of INPUTS."
+    exit 1
+fi
+
+if [ $use_taskset = "0" ]; then
+  taskset_cmd=""
+fi
+
+clean() {
+    kill %1 || true
+}
+
+clean_with_tap() {
+    tap_cleanup &> /dev/null || true
+    clean
+    rm -rf ${work_dir}
+}
+
+# LKL_HIJACK_CONFIG_FILE is not set, which means it's not called from
+# hijack-test.sh. Needs to set up things first.
+if [ -z ${LKL_HIJACK_CONFIG_FILE+x} ]; then
+
+    # Setting up environmental vars and TAP
+    . $script_dir/net-setup.sh
+    host_ip=$(ip_host)
+
+    work_dir=$(mktemp -d)
+    cfgjson=${work_dir}/hijack-test.conf
+    export LKL_HIJACK_CONFIG_FILE=$cfgjson
+
+    cat <<EOF > ${cfgjson}
+    {
+         "interfaces": [
+               {
+                    "type": "tap",
+                    "param": "$(tap_ifname)",
+                    "ip": "$(ip_lkl)",
+                    "masklen":"$TEST_IP_NETMASK",
+                    "ipv6":"$(ip6_lkl)",
+                    "masklen6":"$TEST_IP6_NETMASK"
+               }
+         ]
+    }
+EOF
+
+    tap_prepare
+    tap_setup
+    trap clean_with_tap EXIT
+fi
+
+netserver -D -N -p $TEST_NETSERVER_PORT &
+
+trap clean EXIT
+
+echo NUM=$num_runs, TEST=$test_name, TASKSET=$use_taskset
+for i in `seq $num_runs`; do
+    echo Test: $i
+    set -x
+    $taskset_cmd ${hijack_script} netperf -p $TEST_NETSERVER_PORT -H $host_ip \
+		         -t $test_name -l $test_len
+    set +x
+done
-- 
2.20.1 (Apple Git-117)

* [RFC v2 28/37] lkl: add system call hijack support
@ 2019-11-08  5:02     ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: linux-arch, Xiao Jia, Octavian Purdila, Motomu Utsumi,
	Akira Moroo, Yuan Liu, Thomas Liebetraut, Patrick Collins,
	linux-kernel-library, Hajime Tazaki

This commit introduces initial support for system call hijacking of
POSIX applications on a host, based on LD_PRELOAD.
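
As a rough illustration of the mechanism (a minimal sketch, not the code
in this patch): a preloaded library can look up the next definition of a
symbol with dlsym(RTLD_NEXT, ...) and forward calls to it. The names
example_resolve() and example_host_socket() are hypothetical; the patch
itself uses resolve_sym() and the HOST_CALL() macros.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* needed for RTLD_NEXT on glibc */
#endif
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: find the next definition of `sym` in link order. */
static void *example_resolve(const char *sym)
{
	void *fn = dlsym(RTLD_NEXT, sym);

	if (!fn)
		fprintf(stderr, "dlsym failed for %s (%s)\n", sym, dlerror());
	return fn;
}

typedef int (*example_socket_fn)(int domain, int type, int protocol);

/* Forward a socket() call to the host implementation found above. */
static int example_host_socket(int domain, int type, int protocol)
{
	example_socket_fn fn = (example_socket_fn)example_resolve("socket");

	assert(fn != NULL);
	return fn(domain, type, protocol);
}
```

A real interposer would export its own socket() symbol and decide per
call whether to forward to the host or to LKL; older glibc versions may
also need -ldl at link time.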

Note that hijacking system calls by overriding symbols via LD_PRELOAD is
not a complete solution: several issues have to be worked around with
dirty tricks.

Those tricks/issues are:
- introducing a file descriptor offset (i.e., fd + offset)
- path name isolation (i.e., chrooted)
- handling a mixture of host fds and LKL fds
- an un-hijackable symbol (__socket inside if_nametoindex() of Linux
  glibc) needs to be hijacked at the caller level (i.e., if_nametoindex)
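
The fd offset and the mixed-fd problem can be sketched as follows. This
is only an illustration under stated assumptions: EXAMPLE_FD_OFFSET is a
made-up value (the patch takes LKL_FD_OFFSET from the LKL headers), and
the example_* names are hypothetical. In the patch, is_lklfd() performs
the same comparison, and poll()/select() reject the mixed case with
EOPNOTSUPP.

```c
#include <stddef.h>

/* Hypothetical value; the patch uses LKL_FD_OFFSET instead. */
#define EXAMPLE_FD_OFFSET 512

enum example_fd_kind {
	EXAMPLE_NONE,		/* empty set */
	EXAMPLE_HOST_ONLY,	/* every fd is below the offset */
	EXAMPLE_LKL_ONLY,	/* every fd is at or above the offset */
	EXAMPLE_MIXED,		/* both kinds present */
};

/* Descriptors at or above the offset are treated as LKL descriptors. */
static int example_is_lklfd(int fd)
{
	return fd >= EXAMPLE_FD_OFFSET;
}

/* Decide which backend a set of descriptors would have to be sent to. */
static enum example_fd_kind example_classify(const int *fds, size_t n)
{
	int host = 0, lkl = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (example_is_lklfd(fds[i]))
			lkl = 1;
		else
			host = 1;
	}

	if (host && lkl)
		return EXAMPLE_MIXED;
	if (lkl)
		return EXAMPLE_LKL_ONLY;
	if (host)
		return EXAMPLE_HOST_ONLY;
	return EXAMPLE_NONE;
}
```

The comparison only works because hijack_init() occupies the LKL
descriptors below the offset (by dup()ing a null device), so that every
descriptor LKL hands out later lands at or above it.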

Nevertheless, it is powerful in some cases, such as replacing only the
network stack of an application.

It has been tested with socket(AF_INET/AF_INET6/AF_NETLINK) without any
external netdevices, i.e., it only works with localhost (127.0.0.1/::1).
It may need more work on non-Linux hosts.

The following should work on Linux:
% ./tools/lkl/bin/lkl-hijack.sh ip ad

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Thomas Liebetraut <thomas@tommie-lie.de>
Signed-off-by: Xiao Jia <xiaoj@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
[Octavian: use lkl_sys_* calls instead of lkl_sys_wrapper_* calls]
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 tools/lkl/Targets              |   9 +
 tools/lkl/bin/lkl-hijack.sh    |  23 +
 tools/lkl/lib/hijack/Build     |   4 +
 tools/lkl/lib/hijack/hijack.c  | 607 +++++++++++++++++++++++++++
 tools/lkl/lib/hijack/init.c    | 241 +++++++++++
 tools/lkl/lib/hijack/init.h    |   8 +
 tools/lkl/lib/hijack/xlate.c   | 613 +++++++++++++++++++++++++++
 tools/lkl/lib/hijack/xlate.h   |  13 +
 tools/lkl/tests/hijack-test.sh | 737 +++++++++++++++++++++++++++++++++
 tools/lkl/tests/run_netperf.sh |  98 +++++
 10 files changed, 2353 insertions(+)
 create mode 100755 tools/lkl/bin/lkl-hijack.sh
 create mode 100644 tools/lkl/lib/hijack/Build
 create mode 100644 tools/lkl/lib/hijack/hijack.c
 create mode 100644 tools/lkl/lib/hijack/init.c
 create mode 100644 tools/lkl/lib/hijack/init.h
 create mode 100644 tools/lkl/lib/hijack/xlate.c
 create mode 100644 tools/lkl/lib/hijack/xlate.h
 create mode 100755 tools/lkl/tests/hijack-test.sh
 create mode 100755 tools/lkl/tests/run_netperf.sh

diff --git a/tools/lkl/Targets b/tools/lkl/Targets
index 5a4b3508f0a2..3ec093af1722 100644
--- a/tools/lkl/Targets
+++ b/tools/lkl/Targets
@@ -14,3 +14,12 @@ LDLIBS_fs2tar-$(LKL_HOST_CONFIG_NEEDS_LARGP) += -largp
 
 progs-$(LKL_HOST_CONFIG_FUSE) += lklfuse
 LDLIBS_lklfuse-y := -lfuse
+
+ifneq ($(LKL_HOST_CONFIG_BSD),y)
+libs-$(LKL_HOST_CONFIG_POSIX) += lib/hijack/liblkl-hijack
+endif
+LDFLAGS_lib/hijack/liblkl-hijack-y += -shared -nodefaultlibs
+LDLIBS_lib/hijack/liblkl-hijack-y += -ldl
+LDLIBS_lib/hijack/liblkl-hijack-$(LKL_HOST_CONFIG_ARM) += -lgcc -lc
+LDLIBS_lib/hijack/liblkl-hijack-$(LKL_HOST_CONFIG_AARCH64) += -lc
+LDLIBS_lib/hijack/liblkl-hijack-$(LKL_HOST_CONFIG_I386) += -lc_nonshared
diff --git a/tools/lkl/bin/lkl-hijack.sh b/tools/lkl/bin/lkl-hijack.sh
new file mode 100755
index 000000000000..7cf92856dfad
--- /dev/null
+++ b/tools/lkl/bin/lkl-hijack.sh
@@ -0,0 +1,23 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+##
+## This wrapper script replaces system call symbols such as socket(2)
+## and recvmsg(2) in order to redirect them to LKL. Ideally it works
+## with any application, but in practice (tm) it depends on the maturity
+## of the hijack library (liblkl-hijack.so).
+##
+## Since the LD_PRELOAD technique is tricky with setuid/setgid binaries,
+## you may need to use sudo (or an equivalent) to run them (e.g., ping).
+##
+## % sudo lkl-hijack.sh ping 127.0.0.1
+##
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+
+export LD_LIBRARY_PATH=${script_dir}/../lib/hijack
+if [ -n "${LKL_HIJACK_DEBUG+x}" ]
+then
+  trap '' TSTP
+fi
+LD_PRELOAD=liblkl-hijack.so "$@"
diff --git a/tools/lkl/lib/hijack/Build b/tools/lkl/lib/hijack/Build
new file mode 100644
index 000000000000..e68e93a3328a
--- /dev/null
+++ b/tools/lkl/lib/hijack/Build
@@ -0,0 +1,4 @@
+liblkl-hijack-y += hijack.o
+liblkl-hijack-y += init.o
+liblkl-hijack-y += xlate.o
+
diff --git a/tools/lkl/lib/hijack/hijack.c b/tools/lkl/lib/hijack/hijack.c
new file mode 100644
index 000000000000..485c15d7c279
--- /dev/null
+++ b/tools/lkl/lib/hijack/hijack.c
@@ -0,0 +1,607 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * system calls hijack code
+ * Copyright (c) 2015 Hajime Tazaki
+ *
+ * Author: Hajime Tazaki <tazaki@sfc.wide.ad.jp>
+ *
+ * Note: some of the code is picked from rumpkernel, written by Antti Kantee.
+ */
+
+#include <unistd.h>
+#include <stdio.h>
+#include <stdarg.h>
+#include <stdlib.h>
+#include <sys/types.h>
+#include <sys/mman.h>
+#define __USE_GNU
+#include <dlfcn.h>
+#include <sys/socket.h>
+#include <sys/select.h>
+#include <sys/epoll.h>
+#include <stdint.h>
+#include <string.h>
+#include <fcntl.h>
+#include <errno.h>
+#include <poll.h>
+#include <sys/ioctl.h>
+#include <assert.h>
+#include <pthread.h>
+#include <lkl.h>
+#include <lkl_host.h>
+
+#include "xlate.h"
+#include "init.h"
+
+static int is_lklfd(int fd)
+{
+	if (fd < LKL_FD_OFFSET)
+		return 0;
+
+	return 1;
+}
+
+static void *resolve_sym(const char *sym)
+{
+	void *resolv;
+
+	resolv = dlsym(RTLD_NEXT, sym);
+	if (!resolv) {
+		fprintf(stderr, "dlsym fail %s (%s)\n", sym, dlerror());
+		assert(0);
+	}
+	return resolv;
+}
+
+typedef long (*host_call)(long p1, long p2, long p3, long p4, long p5, long p6);
+
+static host_call host_calls[__lkl__NR_syscalls];
+/* internally managed fd list for epoll */
+int dual_fds[LKL_FD_OFFSET];
+
+#define HOOK_FD_CALL(name)						\
+	static void __attribute__((constructor(101)))			\
+	init_host_##name(void)						\
+	{								\
+		host_calls[__lkl__NR_##name] = resolve_sym(#name);	\
+	}								\
+									\
+	long name##_hook(long p1, long p2, long p3, long p4, long p5,	\
+			 long p6)					\
+	{								\
+		long p[6] = {p1, p2, p3, p4, p5, p6 };			\
+									\
+		if (!host_calls[__lkl__NR_##name])			\
+			host_calls[__lkl__NR_##name] = resolve_sym(#name); \
+		if (!is_lklfd(p1))					\
+			return host_calls[__lkl__NR_##name](p1, p2, p3,	\
+							    p4, p5, p6); \
+									\
+		return lkl_set_errno(lkl_syscall(__lkl__NR_##name, p));	\
+	}								\
+	asm(".global " #name);						\
+	asm(".set " #name "," #name "_hook")
+
+#define HOOK_CALL_USE_HOST_BEFORE_START(name)				\
+	static void __attribute__((constructor(101)))			\
+	init_host_##name(void)						\
+	{								\
+		host_calls[__lkl__NR_##name] = resolve_sym(#name);	\
+	}								\
+									\
+	long name##_hook(long p1, long p2, long p3, long p4, long p5,	\
+			 long p6)					\
+	{								\
+		long p[6] = {p1, p2, p3, p4, p5, p6 };			\
+									\
+		if (!host_calls[__lkl__NR_##name])			\
+			host_calls[__lkl__NR_##name] = resolve_sym(#name); \
+		if (!lkl_running)					\
+			return host_calls[__lkl__NR_##name](p1, p2, p3,	\
+							    p4, p5, p6); \
+									\
+		return lkl_set_errno(lkl_syscall(__lkl__NR_##name, p));	\
+	}								\
+	asm(".global " #name);						\
+	asm(".set " #name "," #name "_hook")
+
+#define HOST_CALL(name)							\
+	static long (*host_##name)();					\
+	static void __attribute__((constructor(101)))			\
+	init2_host_##name(void)						\
+	{								\
+		host_##name = resolve_sym(#name);			\
+	}
+
+#define HOOK_CALL(name)							\
+	long name##_hook(long p1, long p2, long p3, long p4, long p5,	\
+			 long p6)					\
+	{								\
+		long p[6] = {p1, p2, p3, p4, p5, p6};			\
+									\
+		return lkl_set_errno(lkl_syscall(__lkl__NR_##name, p));	\
+	}								\
+	asm(".global " #name);						\
+	asm(".set " #name "," #name "_hook")
+
+#define CHECK_HOST_CALL(name)					\
+	do {							\
+		if (!host_##name)				\
+			host_##name = resolve_sym(#name);	\
+	} while (0)
+
+static int lkl_call(int nr, int args, ...)
+{
+	long params[6];
+	va_list vl;
+	int i;
+
+	va_start(vl, args);
+	for (i = 0; i < args; i++)
+		params[i] = va_arg(vl, long);
+	va_end(vl);
+
+	return lkl_set_errno(lkl_syscall(nr, params));
+}
+
+HOOK_FD_CALL(recvmsg);
+HOOK_FD_CALL(sendmsg);
+HOOK_FD_CALL(sendmmsg);
+HOOK_FD_CALL(getsockname);
+HOOK_FD_CALL(getpeername);
+HOOK_FD_CALL(bind);
+HOOK_FD_CALL(connect);
+HOOK_FD_CALL(listen);
+HOOK_FD_CALL(shutdown);
+HOOK_FD_CALL(accept);
+HOOK_FD_CALL(write);
+HOOK_FD_CALL(writev);
+HOOK_FD_CALL(sendto);
+HOOK_FD_CALL(read);
+HOOK_FD_CALL(readv);
+HOOK_FD_CALL(recvfrom);
+HOOK_FD_CALL(splice);
+HOOK_FD_CALL(vmsplice);
+
+HOOK_CALL_USE_HOST_BEFORE_START(accept4);
+HOOK_CALL_USE_HOST_BEFORE_START(pipe2);
+
+HOST_CALL(write);
+HOST_CALL(pipe2);
+
+HOST_CALL(setsockopt);
+int setsockopt(int fd, int level, int optname, const void *optval,
+	       socklen_t optlen)
+{
+	CHECK_HOST_CALL(setsockopt);
+	if (!is_lklfd(fd))
+		return host_setsockopt(fd, level, optname, optval, optlen);
+	return lkl_call(__lkl__NR_setsockopt, 5, fd, lkl_solevel_xlate(level),
+			lkl_soname_xlate(optname), (void *)optval, optlen);
+}
+
+HOST_CALL(getsockopt);
+int getsockopt(int fd, int level, int optname, void *optval, socklen_t *optlen)
+{
+	CHECK_HOST_CALL(getsockopt);
+	if (!is_lklfd(fd))
+		return host_getsockopt(fd, level, optname, optval, optlen);
+	return lkl_call(__lkl__NR_getsockopt, 5, fd, lkl_solevel_xlate(level),
+			lkl_soname_xlate(optname), optval, (int *)optlen);
+}
+
+HOST_CALL(socket);
+int socket(int domain, int type, int protocol)
+{
+	CHECK_HOST_CALL(socket);
+	if (domain == AF_UNIX || domain == PF_PACKET)
+		return host_socket(domain, type, protocol);
+
+	if (!lkl_running)
+		return host_socket(domain, type, protocol);
+
+	return lkl_call(__lkl__NR_socket, 3, domain, type, protocol);
+}
+
+HOST_CALL(ioctl);
+int ioctl(int fd, unsigned long req, ...)
+{
+	va_list vl;
+	long arg;
+
+	va_start(vl, req);
+	arg = va_arg(vl, long);
+	va_end(vl);
+
+	CHECK_HOST_CALL(ioctl);
+
+	if (!is_lklfd(fd))
+		return host_ioctl(fd, req, arg);
+	return lkl_call(__lkl__NR_ioctl, 3, fd, lkl_ioctl_req_xlate(req), arg);
+}
+
+
+HOST_CALL(fcntl);
+int fcntl(int fd, int cmd, ...)
+{
+	va_list vl;
+	long arg;
+
+	va_start(vl, cmd);
+	arg = va_arg(vl, long);
+	va_end(vl);
+
+	CHECK_HOST_CALL(fcntl);
+
+	if (!is_lklfd(fd))
+		return host_fcntl(fd, cmd, arg);
+	return lkl_call(__lkl__NR_fcntl, 3, fd, lkl_fcntl_cmd_xlate(cmd), arg);
+}
+
+HOST_CALL(poll);
+int poll(struct pollfd *fds, nfds_t nfds, int timeout)
+{
+	unsigned int i, lklfds = 0, hostfds = 0;
+
+	CHECK_HOST_CALL(poll);
+
+	for (i = 0; i < nfds; i++) {
+		if (is_lklfd(fds[i].fd))
+			lklfds = 1;
+		else
+			hostfds = 1;
+	}
+
+	/* FIXME: need to handle mixed case of hostfd and lklfd. */
+	if (lklfds && hostfds)
+		return lkl_set_errno(-LKL_EOPNOTSUPP);
+
+
+	if (hostfds)
+		return host_poll(fds, nfds, timeout);
+
+	return lkl_sys_poll((struct lkl_pollfd *)fds, nfds, timeout);
+}
+
+int __poll(struct pollfd *, nfds_t, int) __attribute__((alias("poll")));
+
+HOST_CALL(select);
+int select(int nfds, fd_set *r, fd_set *w, fd_set *e, struct timeval *t)
+{
+	int fd, hostfds = 0, lklfds = 0;
+
+	CHECK_HOST_CALL(select);
+
+	for (fd = 0; fd < nfds; fd++) {
+		if (r != 0 && FD_ISSET(fd, r)) {
+			if (is_lklfd(fd))
+				lklfds = 1;
+			else
+				hostfds = 1;
+		}
+		if (w != 0 && FD_ISSET(fd, w)) {
+			if (is_lklfd(fd))
+				lklfds = 1;
+			else
+				hostfds = 1;
+		}
+		if (e != 0 && FD_ISSET(fd, e)) {
+			if (is_lklfd(fd))
+				lklfds = 1;
+			else
+				hostfds = 1;
+		}
+	}
+
+	/* FIXME: handle mixed case of hostfd and lklfd */
+	if (lklfds && hostfds)
+		return lkl_set_errno(-LKL_EOPNOTSUPP);
+
+	if (hostfds)
+		return host_select(nfds, r, w, e, t);
+
+	return lkl_sys_select(nfds, (lkl_fd_set *)r, (lkl_fd_set *)w,
+			      (lkl_fd_set *)e, (struct lkl_timeval *)t);
+}
+
+HOST_CALL(close);
+int close(int fd)
+{
+	CHECK_HOST_CALL(close);
+
+	if (!is_lklfd(fd)) {
+		/* handle epoll's dual_fd */
+		if ((dual_fds[fd] != -1) && lkl_running) {
+			lkl_call(__lkl__NR_close, 1, dual_fds[fd]);
+			dual_fds[fd] = -1;
+		}
+
+		return host_close(fd);
+	}
+
+	return lkl_call(__lkl__NR_close, 1, fd);
+}
+
+HOST_CALL(epoll_create);
+int epoll_create(int size)
+{
+	int host_fd;
+
+	CHECK_HOST_CALL(epoll_create);
+
+	host_fd = host_epoll_create(size);
+	if (host_fd < 0) {
+		fprintf(stderr, "%s fail (%d)\n", __func__, errno);
+		return -1;
+	}
+
+	if (!lkl_running)
+		return host_fd;
+
+	dual_fds[host_fd] = lkl_sys_epoll_create(size);
+
+	/* always returns the host fd */
+	return host_fd;
+}
+
+HOST_CALL(epoll_create1);
+int epoll_create1(int flags)
+{
+	int host_fd;
+
+	CHECK_HOST_CALL(epoll_create1);
+
+	host_fd = host_epoll_create1(flags);
+	if (host_fd < 0) {
+		fprintf(stderr, "%s fail (%d)\n", __func__, errno);
+		return -1;
+	}
+
+	if (!lkl_running)
+		return host_fd;
+
+	dual_fds[host_fd] = lkl_sys_epoll_create1(flags);
+
+	/* always returns the host fd */
+	return host_fd;
+}
+
+
+HOST_CALL(epoll_ctl);
+int epoll_ctl(int epollfd, int op, int fd, struct epoll_event *event)
+{
+	CHECK_HOST_CALL(epoll_ctl);
+
+	if (!is_lklfd(fd))
+		return host_epoll_ctl(epollfd, op, fd, event);
+
+	return lkl_call(__lkl__NR_epoll_ctl, 4, dual_fds[epollfd],
+			op, fd, event);
+}
+
+struct epollarg {
+	int epfd;
+	struct epoll_event *events;
+	int maxevents;
+	int timeout;
+	int pipefd;
+	int errnum;
+};
+
+HOST_CALL(epoll_wait)
+static void *host_epollwait(void *arg)
+{
+	struct epollarg *earg = arg;
+	int ret;
+
+	ret = host_epoll_wait(earg->epfd, earg->events,
+			      earg->maxevents, earg->timeout);
+	if (ret == -1)
+		earg->errnum = errno;
+	lkl_call(__lkl__NR_write, 3, earg->pipefd, &ret, sizeof(ret));
+
+	return (void *)(intptr_t)ret;
+}
+
+int epoll_wait(int epfd, struct epoll_event *events,
+	       int maxevents, int timeout)
+{
+	CHECK_HOST_CALL(epoll_wait);
+	CHECK_HOST_CALL(pipe2);
+
+	int l_pipe[2] = {-1, -1}, h_pipe[2] = {-1, -1};
+	struct epoll_event host_ev, lkl_ev;
+	int ret_events = maxevents;
+	struct epoll_event h_events[ret_events], l_events[ret_events];
+	struct epollarg earg;
+	pthread_t thread;
+	void *trv_val;
+	int i, ret, ret_lkl, ret_host;
+
+	ret = lkl_sys_pipe(l_pipe);
+	if (ret == -1) {
+		fprintf(stderr, "lkl pipe error(errno=%d)\n", errno);
+		return -1;
+	}
+
+	ret = host_pipe2(h_pipe, 0);
+	if (ret == -1) {
+		fprintf(stderr, "host pipe error(errno=%d)\n", errno);
+		return -1;
+	}
+
+	if (dual_fds[epfd] == -1) {
+		fprintf(stderr, "epollfd isn't available (%d)\n", epfd);
+		abort();
+	}
+
+	/* wait pipe at host/lkl epoll_fd */
+	memset(&lkl_ev, 0, sizeof(lkl_ev));
+	lkl_ev.events = EPOLLIN;
+	lkl_ev.data.fd = l_pipe[0];
+	ret = lkl_call(__lkl__NR_epoll_ctl, 4, dual_fds[epfd], EPOLL_CTL_ADD,
+		       l_pipe[0], &lkl_ev);
+	if (ret == -1) {
+		fprintf(stderr, "epoll_ctl error(epfd=%d:%d, fd=%d, err=%d)\n",
+			epfd, dual_fds[epfd], l_pipe[0], errno);
+		return -1;
+	}
+
+	memset(&host_ev, 0, sizeof(host_ev));
+	host_ev.events = EPOLLIN;
+	host_ev.data.fd = h_pipe[0];
+	ret = host_epoll_ctl(epfd, EPOLL_CTL_ADD, h_pipe[0], &host_ev);
+	if (ret == -1) {
+		fprintf(stderr, "host epoll_ctl error(%d, %d, %d, %d)\n",
+			epfd, h_pipe[0], h_pipe[1], errno);
+		return -1;
+	}
+
+
+	/* now wait by epoll_wait on 2 threads */
+	memset(h_events, 0, sizeof(struct epoll_event) * ret_events);
+	memset(l_events, 0, sizeof(struct epoll_event) * ret_events);
+	earg.epfd = epfd;
+	earg.events = h_events;
+	earg.maxevents = maxevents;
+	earg.timeout = timeout;
+	earg.pipefd = l_pipe[1];
+	pthread_create(&thread, NULL, host_epollwait, &earg);
+
+	ret_lkl = lkl_sys_epoll_wait(dual_fds[epfd],
+				     (struct lkl_epoll_event *)l_events,
+				     maxevents, timeout);
+	if (ret_lkl == -1) {
+		fprintf(stderr,
+			"lkl_%s error(epfd=%d:%d, fd=%d, err=%d)\n",
+			__func__, epfd, dual_fds[epfd], l_pipe[0], errno);
+		return -1;
+	}
+	host_write(h_pipe[1], &ret, sizeof(ret));
+	pthread_join(thread, &trv_val);
+	ret_host = (int)(intptr_t)trv_val;
+	if (ret_host == -1) {
+		fprintf(stderr,
+			"host epoll_wait error(%d, %d, %d, %d)\n", epfd,
+			h_pipe[0], h_pipe[1], earg.errnum);
+		return -1;
+	}
+
+	ret = lkl_call(__lkl__NR_epoll_ctl, 4, dual_fds[epfd], EPOLL_CTL_DEL,
+		       l_pipe[0], &lkl_ev);
+	if (ret == -1) {
+		fprintf(stderr,
+			"lkl epoll_ctl error(epfd=%d:%d, fd=%d, err=%d)\n",
+			epfd, dual_fds[epfd], l_pipe[0], errno);
+		return -1;
+	}
+
+	ret = host_epoll_ctl(epfd, EPOLL_CTL_DEL, h_pipe[0], &host_ev);
+	if (ret == -1) {
+		fprintf(stderr, "host epoll_ctl error(%d, %d, %d, %d)\n",
+			epfd, h_pipe[0], h_pipe[1], errno);
+		return -1;
+	}
+
+	memset(events, 0, sizeof(struct epoll_event) * maxevents);
+	ret = 0;
+	if (ret_host > 0) {
+		for (i = 0; i < ret_host; i++) {
+			if (h_events[i].data.fd == h_pipe[0])
+				continue;
+			if (is_lklfd(h_events[i].data.fd))
+				continue;
+
+			memcpy(events, &(h_events[i]),
+			       sizeof(struct epoll_event));
+			events++;
+			ret++;
+		}
+	}
+	if (ret_lkl > 0) {
+		for (i = 0; i < ret_lkl; i++) {
+			if (l_events[i].data.fd == l_pipe[0])
+				continue;
+			if (!is_lklfd(l_events[i].data.fd))
+				continue;
+
+			memcpy(events, &(l_events[i]),
+			       sizeof(struct epoll_event));
+			events++;
+			ret++;
+		}
+	}
+
+	lkl_call(__lkl__NR_close, 1, l_pipe[0]);
+	lkl_call(__lkl__NR_close, 1, l_pipe[1]);
+	host_close(h_pipe[0]);
+	host_close(h_pipe[1]);
+
+	return ret;
+}
+
+int eventfd(unsigned int count, int flags)
+{
+	if (!lkl_running) {
+		int (*f)(unsigned int a, int b) = resolve_sym("eventfd");
+
+		return f(count, flags);
+	}
+
+	return lkl_sys_eventfd2(count, flags);
+}
+
+HOST_CALL(eventfd_read);
+int eventfd_read(int fd, uint64_t *value)
+{
+	CHECK_HOST_CALL(eventfd_read);
+
+	if (!is_lklfd(fd))
+		return host_eventfd_read(fd, value);
+
+	return lkl_sys_read(fd, (void *) value,
+			    sizeof(*value)) != sizeof(*value) ? -1 : 0;
+}
+
+HOST_CALL(eventfd_write);
+int eventfd_write(int fd, uint64_t value)
+{
+	CHECK_HOST_CALL(eventfd_write);
+
+	if (!is_lklfd(fd))
+		return host_eventfd_write(fd, value);
+
+	return lkl_sys_write(fd, (void *) &value,
+			     sizeof(value)) != sizeof(value) ? -1 : 0;
+}
+
+HOST_CALL(mmap)
+void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset)
+{
+	CHECK_HOST_CALL(mmap);
+
+	if (addr != NULL || flags != (MAP_ANONYMOUS|MAP_PRIVATE) ||
+	    prot != (PROT_READ|PROT_WRITE) || fd != -1 || offset != 0)
+		return (void *)host_mmap(addr, length, prot, flags, fd, offset);
+	return lkl_sys_mmap(addr, length, prot, flags, fd, offset);
+}
+
+ssize_t send(int fd, const void *buf, size_t len, int flags)
+{
+	return sendto(fd, buf, len, flags, 0, 0);
+}
+
+ssize_t recv(int fd, void *buf, size_t len, int flags)
+{
+	return recvfrom(fd, buf, len, flags, 0, 0);
+}
+
+extern int pipe2(int fd[2], int flag);
+int pipe(int fd[2])
+{
+	if (!lkl_running)
+		return host_calls[__lkl__NR_pipe2]((long)fd, 0, 0, 0, 0, 0);
+
+	return pipe2(fd, 0);
+
+}
diff --git a/tools/lkl/lib/hijack/init.c b/tools/lkl/lib/hijack/init.c
new file mode 100644
index 000000000000..2145fb7ec2cb
--- /dev/null
+++ b/tools/lkl/lib/hijack/init.c
@@ -0,0 +1,241 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * system calls hijack code
+ * Copyright (c) 2015 Hajime Tazaki
+ *
+ * Author: Hajime Tazaki <tazaki@sfc.wide.ad.jp>
+ *
+ * Note: some of the code is picked from rumpkernel, written by Antti Kantee.
+ */
+
+#include <stdio.h>
+#include <net/if.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/types.h>
+#include <fcntl.h>
+#include <errno.h>
+#include <signal.h>
+#include <lkl.h>
+#include <lkl_host.h>
+#include <lkl_config.h>
+
+#include "xlate.h"
+#include "init.h"
+
+#define __USE_GNU
+#include <dlfcn.h>
+
+#define _GNU_SOURCE
+#include <sched.h>
+
+/* Mount points are named after filesystem types so they should never
+ * be longer than ~6 characters.
+ */
+#define MAX_FSTYPE_LEN 50
+
+static void PinToCpus(const cpu_set_t *cpus)
+{
+	if (sched_setaffinity(0, sizeof(cpu_set_t), cpus))
+		perror("sched_setaffinity");
+}
+
+static void PinToFirstCpu(const cpu_set_t *cpus)
+{
+	int j;
+	cpu_set_t pinto;
+
+	CPU_ZERO(&pinto);
+	for (j = 0; j < CPU_SETSIZE; j++) {
+		if (CPU_ISSET(j, cpus)) {
+			lkl_printf("LKL: Pin To CPU %d\n", j);
+			CPU_SET(j, &pinto);
+			PinToCpus(&pinto);
+			return;
+		}
+	}
+}
+
+int lkl_debug, lkl_running;
+
+static struct lkl_config *cfg;
+
+static int config_load(void)
+{
+	int len, ret = -1;
+	char *buf;
+	int fd;
+	char *path = getenv("LKL_HIJACK_CONFIG_FILE");
+
+	cfg = (struct lkl_config *)malloc(sizeof(struct lkl_config));
+	if (!cfg) {
+		perror("config malloc");
+		return -1;
+	}
+	memset(cfg, 0, sizeof(struct lkl_config));
+
+	ret = lkl_load_config_env(cfg);
+	if (ret < 0)
+		return ret;
+
+	if (path)
+		fd = open(path, O_RDONLY, 0);
+	else if (access("lkl-hijack.json", R_OK) == 0)
+		fd = open("lkl-hijack.json", O_RDONLY, 0);
+	else
+		return 0;
+	if (fd < 0) {
+		fprintf(stderr, "config_file open %s: %s\n",
+			path, strerror(errno));
+		return -1;
+	}
+	len = lseek(fd, 0, SEEK_END);
+	lseek(fd, 0, SEEK_SET);
+	if (len < 0) {
+		perror("config size check (lseek)");
+		return -1;
+	} else if (len == 0) {
+		return 0;
+	}
+	buf = (char *)calloc(1, (size_t)len + 1);
+	if (!buf) {
+		perror("config buf malloc");
+		return -1;
+	}
+	ret = read(fd, buf, len);
+	if (ret < 0) {
+		perror("config file read");
+		free(buf);
+		return -1;
+	}
+	ret = lkl_load_config_json(cfg, buf);
+	free(buf);
+	return ret;
+}
+
+void __attribute__((constructor))
+hijack_init(void)
+{
+	int ret, i, dev_null;
+	int single_cpu_mode = 0;
+	cpu_set_t ori_cpu;
+
+	ret = config_load();
+	if (ret < 0)
+		return;
+
+	/* reflect pre-configuration */
+	lkl_load_config_pre(cfg);
+
+	/* hijack library specific configurations */
+	if (cfg->debug)
+		lkl_register_dbg_handler();
+
+	if (lkl_debug & 0x200) {
+		char c;
+
+		printf("press 'enter' to continue\n");
+		if (scanf("%c", &c) <= 0) {
+			fprintf(stderr, "scanf() fails\n");
+			return;
+		}
+	}
+	if (cfg->single_cpu) {
+		single_cpu_mode = atoi(cfg->single_cpu);
+		switch (single_cpu_mode) {
+		case 0:
+		case 1:
+		case 2:
+			break;
+		default:
+			fprintf(stderr, "single cpu mode must be 0~2.\n");
+			single_cpu_mode = 0;
+			break;
+		}
+	}
+
+	if (single_cpu_mode) {
+		if (sched_getaffinity(0, sizeof(cpu_set_t), &ori_cpu)) {
+			perror("sched_getaffinity");
+			single_cpu_mode = 0;
+		}
+	}
+
+	/* Pin to a single cpu.
+	 * Any child threads created afterwards are pinned to the same CPU.
+	 */
+	if (single_cpu_mode == 2)
+		PinToFirstCpu(&ori_cpu);
+
+	if (single_cpu_mode == 1)
+		PinToFirstCpu(&ori_cpu);
+
+	ret = lkl_start_kernel(&lkl_host_ops, cfg->boot_cmdline);
+	if (ret) {
+		fprintf(stderr, "can't start kernel: %s\n", lkl_strerror(ret));
+		return;
+	}
+
+	lkl_running = 1;
+
+	/* initialize epoll manage list */
+	memset(dual_fds, -1, sizeof(int) * LKL_FD_OFFSET);
+
+	/* restore cpu affinity */
+	if (single_cpu_mode)
+		PinToCpus(&ori_cpu);
+
+	ret = lkl_set_fd_limit(65535);
+	if (ret)
+		fprintf(stderr, "lkl_set_fd_limit failed: %s\n",
+			lkl_strerror(ret));
+
+	/* fillup FDs up to LKL_FD_OFFSET */
+	ret = lkl_sys_mknod("/dev_null", LKL_S_IFCHR | 0600, LKL_MKDEV(1, 3));
+	dev_null = lkl_sys_open("/dev_null", LKL_O_RDONLY, 0);
+	if (dev_null < 0) {
+		fprintf(stderr, "failed to open /dev_null: %s\n",
+				lkl_strerror(dev_null));
+		return;
+	}
+
+	for (i = 1; i < LKL_FD_OFFSET; i++)
+		lkl_sys_dup(dev_null);
+
+	/* lo iff_up */
+	lkl_if_up(1);
+
+	/* reflect post-configuration */
+	lkl_load_config_post(cfg);
+}
+
+void __attribute__((destructor))
+hijack_fini(void)
+{
+	int i;
+	int err;
+
+	/* The following pauses the kernel before exiting allowing one
+	 * to debug or collect statistics/diagnostic info from it.
+	 */
+	if (lkl_debug & 0x100) {
+		while (1)
+			pause();
+	}
+
+	if (!cfg)
+		return;
+
+	lkl_unload_config(cfg);
+	free(cfg);
+
+	if (!lkl_running)
+		return;
+
+	for (i = 0; i < LKL_FD_OFFSET; i++)
+		lkl_sys_close(i);
+
+	err = lkl_sys_halt();
+	if (err)
+		fprintf(stderr, "lkl_sys_halt: %s\n", lkl_strerror(err));
+}
diff --git a/tools/lkl/lib/hijack/init.h b/tools/lkl/lib/hijack/init.h
new file mode 100644
index 000000000000..c4039e018b2b
--- /dev/null
+++ b/tools/lkl/lib/hijack/init.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_HIJACK_INIT_H
+#define _LKL_HIJACK_INIT_H
+
+extern int lkl_running;
+extern int dual_fds[];
+
+#endif /*_LKL_HIJACK_INIT_H */
diff --git a/tools/lkl/lib/hijack/xlate.c b/tools/lkl/lib/hijack/xlate.c
new file mode 100644
index 000000000000..b96a0107116a
--- /dev/null
+++ b/tools/lkl/lib/hijack/xlate.c
@@ -0,0 +1,613 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <errno.h>
+#define __USE_GNU
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#include <sys/socket.h>
+#undef st_atime
+#undef st_mtime
+#undef st_ctime
+#include <lkl.h>
+
+#include "xlate.h"
+
+long lkl_set_errno(long err)
+{
+	if (err >= 0)
+		return err;
+
+	switch (err) {
+	case -LKL_EPERM:
+		errno = EPERM;
+		break;
+	case -LKL_ENOENT:
+		errno = ENOENT;
+		break;
+	case -LKL_ESRCH:
+		errno = ESRCH;
+		break;
+	case -LKL_EINTR:
+		errno = EINTR;
+		break;
+	case -LKL_EIO:
+		errno = EIO;
+		break;
+	case -LKL_ENXIO:
+		errno = ENXIO;
+		break;
+	case -LKL_E2BIG:
+		errno = E2BIG;
+		break;
+	case -LKL_ENOEXEC:
+		errno = ENOEXEC;
+		break;
+	case -LKL_EBADF:
+		errno = EBADF;
+		break;
+	case -LKL_ECHILD:
+		errno = ECHILD;
+		break;
+	case -LKL_EAGAIN:
+		errno = EAGAIN;
+		break;
+	case -LKL_ENOMEM:
+		errno = ENOMEM;
+		break;
+	case -LKL_EACCES:
+		errno = EACCES;
+		break;
+	case -LKL_EFAULT:
+		errno = EFAULT;
+		break;
+	case -LKL_ENOTBLK:
+		errno = ENOTBLK;
+		break;
+	case -LKL_EBUSY:
+		errno = EBUSY;
+		break;
+	case -LKL_EEXIST:
+		errno = EEXIST;
+		break;
+	case -LKL_EXDEV:
+		errno = EXDEV;
+		break;
+	case -LKL_ENODEV:
+		errno = ENODEV;
+		break;
+	case -LKL_ENOTDIR:
+		errno = ENOTDIR;
+		break;
+	case -LKL_EISDIR:
+		errno = EISDIR;
+		break;
+	case -LKL_EINVAL:
+		errno = EINVAL;
+		break;
+	case -LKL_ENFILE:
+		errno = ENFILE;
+		break;
+	case -LKL_EMFILE:
+		errno = EMFILE;
+		break;
+	case -LKL_ENOTTY:
+		errno = ENOTTY;
+		break;
+	case -LKL_ETXTBSY:
+		errno = ETXTBSY;
+		break;
+	case -LKL_EFBIG:
+		errno = EFBIG;
+		break;
+	case -LKL_ENOSPC:
+		errno = ENOSPC;
+		break;
+	case -LKL_ESPIPE:
+		errno = ESPIPE;
+		break;
+	case -LKL_EROFS:
+		errno = EROFS;
+		break;
+	case -LKL_EMLINK:
+		errno = EMLINK;
+		break;
+	case -LKL_EPIPE:
+		errno = EPIPE;
+		break;
+	case -LKL_EDOM:
+		errno = EDOM;
+		break;
+	case -LKL_ERANGE:
+		errno = ERANGE;
+		break;
+	case -LKL_EDEADLK:
+		errno = EDEADLK;
+		break;
+	case -LKL_ENAMETOOLONG:
+		errno = ENAMETOOLONG;
+		break;
+	case -LKL_ENOLCK:
+		errno = ENOLCK;
+		break;
+	case -LKL_ENOSYS:
+		errno = ENOSYS;
+		break;
+	case -LKL_ENOTEMPTY:
+		errno = ENOTEMPTY;
+		break;
+	case -LKL_ELOOP:
+		errno = ELOOP;
+		break;
+	case -LKL_ENOMSG:
+		errno = ENOMSG;
+		break;
+	case -LKL_EIDRM:
+		errno = EIDRM;
+		break;
+	case -LKL_ECHRNG:
+		errno = ECHRNG;
+		break;
+	case -LKL_EL2NSYNC:
+		errno = EL2NSYNC;
+		break;
+	case -LKL_EL3HLT:
+		errno = EL3HLT;
+		break;
+	case -LKL_EL3RST:
+		errno = EL3RST;
+		break;
+	case -LKL_ELNRNG:
+		errno = ELNRNG;
+		break;
+	case -LKL_EUNATCH:
+		errno = EUNATCH;
+		break;
+	case -LKL_ENOCSI:
+		errno = ENOCSI;
+		break;
+	case -LKL_EL2HLT:
+		errno = EL2HLT;
+		break;
+	case -LKL_EBADE:
+		errno = EBADE;
+		break;
+	case -LKL_EBADR:
+		errno = EBADR;
+		break;
+	case -LKL_EXFULL:
+		errno = EXFULL;
+		break;
+	case -LKL_ENOANO:
+		errno = ENOANO;
+		break;
+	case -LKL_EBADRQC:
+		errno = EBADRQC;
+		break;
+	case -LKL_EBADSLT:
+		errno = EBADSLT;
+		break;
+	case -LKL_EBFONT:
+		errno = EBFONT;
+		break;
+	case -LKL_ENOSTR:
+		errno = ENOSTR;
+		break;
+	case -LKL_ENODATA:
+		errno = ENODATA;
+		break;
+	case -LKL_ETIME:
+		errno = ETIME;
+		break;
+	case -LKL_ENOSR:
+		errno = ENOSR;
+		break;
+	case -LKL_ENONET:
+		errno = ENONET;
+		break;
+	case -LKL_ENOPKG:
+		errno = ENOPKG;
+		break;
+	case -LKL_EREMOTE:
+		errno = EREMOTE;
+		break;
+	case -LKL_ENOLINK:
+		errno = ENOLINK;
+		break;
+	case -LKL_EADV:
+		errno = EADV;
+		break;
+	case -LKL_ESRMNT:
+		errno = ESRMNT;
+		break;
+	case -LKL_ECOMM:
+		errno = ECOMM;
+		break;
+	case -LKL_EPROTO:
+		errno = EPROTO;
+		break;
+	case -LKL_EMULTIHOP:
+		errno = EMULTIHOP;
+		break;
+	case -LKL_EDOTDOT:
+		errno = EDOTDOT;
+		break;
+	case -LKL_EBADMSG:
+		errno = EBADMSG;
+		break;
+	case -LKL_EOVERFLOW:
+		errno = EOVERFLOW;
+		break;
+	case -LKL_ENOTUNIQ:
+		errno = ENOTUNIQ;
+		break;
+	case -LKL_EBADFD:
+		errno = EBADFD;
+		break;
+	case -LKL_EREMCHG:
+		errno = EREMCHG;
+		break;
+	case -LKL_ELIBACC:
+		errno = ELIBACC;
+		break;
+	case -LKL_ELIBBAD:
+		errno = ELIBBAD;
+		break;
+	case -LKL_ELIBSCN:
+		errno = ELIBSCN;
+		break;
+	case -LKL_ELIBMAX:
+		errno = ELIBMAX;
+		break;
+	case -LKL_ELIBEXEC:
+		errno = ELIBEXEC;
+		break;
+	case -LKL_EILSEQ:
+		errno = EILSEQ;
+		break;
+	case -LKL_ERESTART:
+		errno = ERESTART;
+		break;
+	case -LKL_ESTRPIPE:
+		errno = ESTRPIPE;
+		break;
+	case -LKL_EUSERS:
+		errno = EUSERS;
+		break;
+	case -LKL_ENOTSOCK:
+		errno = ENOTSOCK;
+		break;
+	case -LKL_EDESTADDRREQ:
+		errno = EDESTADDRREQ;
+		break;
+	case -LKL_EMSGSIZE:
+		errno = EMSGSIZE;
+		break;
+	case -LKL_EPROTOTYPE:
+		errno = EPROTOTYPE;
+		break;
+	case -LKL_ENOPROTOOPT:
+		errno = ENOPROTOOPT;
+		break;
+	case -LKL_EPROTONOSUPPORT:
+		errno = EPROTONOSUPPORT;
+		break;
+	case -LKL_ESOCKTNOSUPPORT:
+		errno = ESOCKTNOSUPPORT;
+		break;
+	case -LKL_EOPNOTSUPP:
+		errno = EOPNOTSUPP;
+		break;
+	case -LKL_EPFNOSUPPORT:
+		errno = EPFNOSUPPORT;
+		break;
+	case -LKL_EAFNOSUPPORT:
+		errno = EAFNOSUPPORT;
+		break;
+	case -LKL_EADDRINUSE:
+		errno = EADDRINUSE;
+		break;
+	case -LKL_EADDRNOTAVAIL:
+		errno = EADDRNOTAVAIL;
+		break;
+	case -LKL_ENETDOWN:
+		errno = ENETDOWN;
+		break;
+	case -LKL_ENETUNREACH:
+		errno = ENETUNREACH;
+		break;
+	case -LKL_ENETRESET:
+		errno = ENETRESET;
+		break;
+	case -LKL_ECONNABORTED:
+		errno = ECONNABORTED;
+		break;
+	case -LKL_ECONNRESET:
+		errno = ECONNRESET;
+		break;
+	case -LKL_ENOBUFS:
+		errno = ENOBUFS;
+		break;
+	case -LKL_EISCONN:
+		errno = EISCONN;
+		break;
+	case -LKL_ENOTCONN:
+		errno = ENOTCONN;
+		break;
+	case -LKL_ESHUTDOWN:
+		errno = ESHUTDOWN;
+		break;
+	case -LKL_ETOOMANYREFS:
+		errno = ETOOMANYREFS;
+		break;
+	case -LKL_ETIMEDOUT:
+		errno = ETIMEDOUT;
+		break;
+	case -LKL_ECONNREFUSED:
+		errno = ECONNREFUSED;
+		break;
+	case -LKL_EHOSTDOWN:
+		errno = EHOSTDOWN;
+		break;
+	case -LKL_EHOSTUNREACH:
+		errno = EHOSTUNREACH;
+		break;
+	case -LKL_EALREADY:
+		errno = EALREADY;
+		break;
+	case -LKL_EINPROGRESS:
+		errno = EINPROGRESS;
+		break;
+	case -LKL_ESTALE:
+		errno = ESTALE;
+		break;
+	case -LKL_EUCLEAN:
+		errno = EUCLEAN;
+		break;
+	case -LKL_ENOTNAM:
+		errno = ENOTNAM;
+		break;
+	case -LKL_ENAVAIL:
+		errno = ENAVAIL;
+		break;
+	case -LKL_EISNAM:
+		errno = EISNAM;
+		break;
+	case -LKL_EREMOTEIO:
+		errno = EREMOTEIO;
+		break;
+	case -LKL_EDQUOT:
+		errno = EDQUOT;
+		break;
+	case -LKL_ENOMEDIUM:
+		errno = ENOMEDIUM;
+		break;
+	case -LKL_EMEDIUMTYPE:
+		errno = EMEDIUMTYPE;
+		break;
+	case -LKL_ECANCELED:
+		errno = ECANCELED;
+		break;
+	case -LKL_ENOKEY:
+		errno = ENOKEY;
+		break;
+	case -LKL_EKEYEXPIRED:
+		errno = EKEYEXPIRED;
+		break;
+	case -LKL_EKEYREVOKED:
+		errno = EKEYREVOKED;
+		break;
+	case -LKL_EKEYREJECTED:
+		errno = EKEYREJECTED;
+		break;
+	case -LKL_EOWNERDEAD:
+		errno = EOWNERDEAD;
+		break;
+	case -LKL_ENOTRECOVERABLE:
+		errno = ENOTRECOVERABLE;
+		break;
+	case -LKL_ERFKILL:
+		errno = ERFKILL;
+		break;
+	case -LKL_EHWPOISON:
+		errno = EHWPOISON;
+		break;
+	}
+
+	return -1;
+}
+
+int lkl_soname_xlate(int soname)
+{
+	switch (soname) {
+	case SO_DEBUG:
+		return LKL_SO_DEBUG;
+	case SO_REUSEADDR:
+		return LKL_SO_REUSEADDR;
+	case SO_TYPE:
+		return LKL_SO_TYPE;
+	case SO_ERROR:
+		return LKL_SO_ERROR;
+	case SO_DONTROUTE:
+		return LKL_SO_DONTROUTE;
+	case SO_BROADCAST:
+		return LKL_SO_BROADCAST;
+	case SO_SNDBUF:
+		return LKL_SO_SNDBUF;
+	case SO_RCVBUF:
+		return LKL_SO_RCVBUF;
+	case SO_SNDBUFFORCE:
+		return LKL_SO_SNDBUFFORCE;
+	case SO_RCVBUFFORCE:
+		return LKL_SO_RCVBUFFORCE;
+	case SO_KEEPALIVE:
+		return LKL_SO_KEEPALIVE;
+	case SO_OOBINLINE:
+		return LKL_SO_OOBINLINE;
+	case SO_NO_CHECK:
+		return LKL_SO_NO_CHECK;
+	case SO_PRIORITY:
+		return LKL_SO_PRIORITY;
+	case SO_LINGER:
+		return LKL_SO_LINGER;
+	case SO_BSDCOMPAT:
+		return LKL_SO_BSDCOMPAT;
+#ifdef SO_REUSEPORT
+	case SO_REUSEPORT:
+		return LKL_SO_REUSEPORT;
+#endif
+	case SO_PASSCRED:
+		return LKL_SO_PASSCRED;
+	case SO_PEERCRED:
+		return LKL_SO_PEERCRED;
+	case SO_RCVLOWAT:
+		return LKL_SO_RCVLOWAT;
+	case SO_SNDLOWAT:
+		return LKL_SO_SNDLOWAT;
+	case SO_RCVTIMEO:
+		return LKL_SO_RCVTIMEO;
+	case SO_SNDTIMEO:
+		return LKL_SO_SNDTIMEO;
+	case SO_SECURITY_AUTHENTICATION:
+		return LKL_SO_SECURITY_AUTHENTICATION;
+	case SO_SECURITY_ENCRYPTION_TRANSPORT:
+		return LKL_SO_SECURITY_ENCRYPTION_TRANSPORT;
+	case SO_SECURITY_ENCRYPTION_NETWORK:
+		return LKL_SO_SECURITY_ENCRYPTION_NETWORK;
+	case SO_BINDTODEVICE:
+		return LKL_SO_BINDTODEVICE;
+	case SO_ATTACH_FILTER:
+		return LKL_SO_ATTACH_FILTER;
+	case SO_DETACH_FILTER:
+		return LKL_SO_DETACH_FILTER;
+	case SO_PEERNAME:
+		return LKL_SO_PEERNAME;
+	case SO_TIMESTAMP:
+		return LKL_SO_TIMESTAMP;
+	case SO_ACCEPTCONN:
+		return LKL_SO_ACCEPTCONN;
+	case SO_PEERSEC:
+		return LKL_SO_PEERSEC;
+	case SO_PASSSEC:
+		return LKL_SO_PASSSEC;
+	case SO_TIMESTAMPNS:
+		return LKL_SO_TIMESTAMPNS;
+	case SO_MARK:
+		return LKL_SO_MARK;
+	case SO_TIMESTAMPING:
+		return LKL_SO_TIMESTAMPING;
+	case SO_PROTOCOL:
+		return LKL_SO_PROTOCOL;
+	case SO_DOMAIN:
+		return LKL_SO_DOMAIN;
+	case SO_RXQ_OVFL:
+		return LKL_SO_RXQ_OVFL;
+#ifdef SO_WIFI_STATUS
+	case SO_WIFI_STATUS:
+		return LKL_SO_WIFI_STATUS;
+#endif
+#ifdef SO_PEEK_OFF
+	case SO_PEEK_OFF:
+		return LKL_SO_PEEK_OFF;
+#endif
+#ifdef SO_NOFCS
+	case SO_NOFCS:
+		return LKL_SO_NOFCS;
+#endif
+#ifdef SO_LOCK_FILTER
+	case SO_LOCK_FILTER:
+		return LKL_SO_LOCK_FILTER;
+#endif
+#ifdef SO_SELECT_ERR_QUEUE
+	case SO_SELECT_ERR_QUEUE:
+		return LKL_SO_SELECT_ERR_QUEUE;
+#endif
+#ifdef SO_BUSY_POLL
+	case SO_BUSY_POLL:
+		return LKL_SO_BUSY_POLL;
+#endif
+#ifdef SO_MAX_PACING_RATE
+	case SO_MAX_PACING_RATE:
+		return LKL_SO_MAX_PACING_RATE;
+#endif
+	}
+
+	return soname;
+}
+
+int lkl_solevel_xlate(int solevel)
+{
+	switch (solevel) {
+	case SOL_SOCKET:
+		return LKL_SOL_SOCKET;
+	}
+
+	return solevel;
+}
+
+unsigned long lkl_ioctl_req_xlate(unsigned long req)
+{
+	switch (req) {
+	case FIOSETOWN:
+		return LKL_FIOSETOWN;
+	case SIOCSPGRP:
+		return LKL_SIOCSPGRP;
+	case FIOGETOWN:
+		return LKL_FIOGETOWN;
+	case SIOCGPGRP:
+		return LKL_SIOCGPGRP;
+	case SIOCATMARK:
+		return LKL_SIOCATMARK;
+	case SIOCGSTAMP:
+		return LKL_SIOCGSTAMP;
+	case SIOCGSTAMPNS:
+		return LKL_SIOCGSTAMPNS;
+	}
+
+	/* TODO: asm/termios.h translations */
+
+	return req;
+}
+
+int lkl_fcntl_cmd_xlate(int cmd)
+{
+	switch (cmd) {
+	case F_DUPFD:
+		return LKL_F_DUPFD;
+	case F_GETFD:
+		return LKL_F_GETFD;
+	case F_SETFD:
+		return LKL_F_SETFD;
+	case F_GETFL:
+		return LKL_F_GETFL;
+	case F_SETFL:
+		return LKL_F_SETFL;
+	case F_GETLK:
+		return LKL_F_GETLK;
+	case F_SETLK:
+		return LKL_F_SETLK;
+	case F_SETLKW:
+		return LKL_F_SETLKW;
+	case F_SETOWN:
+		return LKL_F_SETOWN;
+	case F_GETOWN:
+		return LKL_F_GETOWN;
+	case F_SETSIG:
+		return LKL_F_SETSIG;
+	case F_GETSIG:
+		return LKL_F_GETSIG;
+#ifndef LKL_CONFIG_64BIT
+	case F_GETLK64:
+		return LKL_F_GETLK64;
+	case F_SETLK64:
+		return LKL_F_SETLK64;
+	case F_SETLKW64:
+		return LKL_F_SETLKW64;
+#endif
+	case F_SETOWN_EX:
+		return LKL_F_SETOWN_EX;
+	case F_GETOWN_EX:
+		return LKL_F_GETOWN_EX;
+	}
+
+	return cmd;
+}
+
diff --git a/tools/lkl/lib/hijack/xlate.h b/tools/lkl/lib/hijack/xlate.h
new file mode 100644
index 000000000000..0c0281f241a6
--- /dev/null
+++ b/tools/lkl/lib/hijack/xlate.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LKL_HIJACK_XLATE_H
+#define _LKL_HIJACK_XLATE_H
+
+long lkl_set_errno(long err);
+int lkl_soname_xlate(int soname);
+int lkl_solevel_xlate(int solevel);
+unsigned long lkl_ioctl_req_xlate(unsigned long req);
+int lkl_fcntl_cmd_xlate(int cmd);
+
+#define LKL_FD_OFFSET (FD_SETSIZE/2)
+
+#endif /* _LKL_HIJACK_XLATE_H */
diff --git a/tools/lkl/tests/hijack-test.sh b/tools/lkl/tests/hijack-test.sh
new file mode 100755
index 000000000000..097af6cff3ba
--- /dev/null
+++ b/tools/lkl/tests/hijack-test.sh
@@ -0,0 +1,737 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+
+clear_wdir()
+{
+    test -f ${VDESWITCH}.pid && kill $(cat ${VDESWITCH}.pid)
+    rm -rf ${wdir}
+    tap_cleanup
+    tap_cleanup 1
+}
+
+set_cfgjson()
+{
+    cfgjson=${wdir}/hijack-test$1.conf
+
+    cat > ${cfgjson}
+
+    export_vars cfgjson
+}
+
+run_hijack_cfg()
+{
+    lkl_test_cmd LKL_HIJACK_CONFIG_FILE=$cfgjson $hijack $@
+}
+
+run_hijack()
+{
+    lkl_test_cmd $hijack $@
+}
+
+run_netperf()
+{
+    lkl_test_cmd TEST_NETSERVER_PORT=$TEST_NETSERVER_PORT \
+                 LKL_HIJACK_CONFIG_FILE=$cfgjson $netperf $@
+}
+
+test_ping()
+{
+    set -e
+
+    run_hijack ${ping} -c 1 127.0.0.1
+}
+
+test_ping6()
+{
+    set -e
+
+    run_hijack ${ping6} -c 1 ::1
+}
+
+test_mount_and_dump()
+{
+    set -e
+
+    set_cfgjson << EOF
+    {
+        "mount":"proc,sysfs",
+        "dump":"/sysfs/class/net/lo/mtu,/sysfs/class/net/lo/dev_id",
+        "debug": "1"
+    }
+EOF
+
+    ans=$(run_hijack_cfg $(lkl_test_cmd which true))
+    echo "$ans"
+    echo "$ans" | grep "^65536" # lo's MTU
+    echo "$ans" | grep "0x0" # lo's dev_id
+}
+
+test_boot_cmdline()
+{
+    set -e
+
+    set_cfgjson << EOF
+    {
+        "debug":"1",
+        "boot_cmdline":"loglevel=1"
+    }
+EOF
+
+    ans=$(run_hijack_cfg $(lkl_test_cmd which true))
+    echo "$ans"
+    [ $(echo "$ans" | wc -l) = 1 ]
+}
+
+
+test_pipe_setup()
+{
+    set -e
+
+    mkfifo ${fifo1}
+    mkfifo ${fifo2}
+
+    set_cfgjson << EOF
+    {
+        "interfaces":
+        [
+            {
+                "type":"pipe",
+                "param":"${fifo1}|${fifo2}",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "mac":"$TEST_MAC0",
+            }
+        ]
+    }
+EOF
+
+    # Make sure our device has the addresses we expect
+    addr=$(run_hijack_cfg ip addr)
+    echo "$addr" | grep eth0
+    echo "$addr" | grep $(ip_lkl)
+    echo "$addr" | grep "$TEST_MAC0"
+}
+
+test_pipe_ping()
+{
+    set -e
+
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_lkl)",
+        "gateway6":"$(ip6_lkl)",
+        "interfaces":
+        [
+            {
+                "type":"pipe",
+                "param":"${fifo1}|${fifo2}",
+                "ip":"$(ip_host)",
+                "masklen":"$TEST_IP_NETMASK",
+                "mac":"$TEST_MAC0",
+                "ipv6":"$(ip6_host)",
+                "masklen6":"$TEST_IP6_NETMASK"
+            }
+        ]
+    }
+EOF
+
+    run_hijack_cfg $(lkl_test_cmd which sleep) 10 &
+
+    set_cfgjson 2 << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"pipe",
+                "param":"${fifo2}|${fifo1}",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "mac":"$TEST_MAC0",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK"
+            }
+        ]
+    }
+EOF
+
+    # Ping under LKL
+    run_hijack_cfg ${ping} -c 1 -w 10 $(ip_host)
+
+    # Ping 6 under LKL
+    run_hijack_cfg ${ping6} -c 1 -w 10 $(ip6_host)
+
+    wait
+}
+
+test_tap_setup()
+{
+    set -e
+
+    # Set up the TAP device we'd like to use
+    tap_setup
+
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "debug":"1",
+        "interfaces": [
+            {
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "mac": "$TEST_MAC0"
+            }
+        ]
+    }
+EOF
+
+    # Make sure our device has the addresses we expect
+    addr=$(run_hijack_cfg ip addr)
+    echo "$addr" | grep eth0
+    echo "$addr" | grep $(ip_lkl)
+    echo "$addr" | grep "$TEST_MAC0"
+    echo "$addr" | grep "$(ip6_lkl)"
+    ! echo "$addr" | grep "WARN: failed to free"
+}
+
+test_tap_cleanup()
+{
+    tap_cleanup
+    tap_cleanup 1
+}
+
+test_tap_ping_host()
+{
+    set -e
+
+    # Make sure we can ping the host from inside LKL
+    run_hijack_cfg ${ping} -c 1 $(ip_host)
+    run_hijack_cfg ${ping6} -c 1 $(ip6_host)
+}
+
+test_tap_ping_lkl()
+{
+    set -e
+
+    # Now let's check that the host can see LKL.
+    lkl_test_cmd sudo ip -6 neigh del $(ip6_lkl) dev $(tap_ifname)
+    lkl_test_cmd sudo ip neigh del $(ip_lkl) dev $(tap_ifname)
+    run_hijack_cfg $(lkl_test_cmd which sleep) 3 &
+    sleep 2
+    lkl_test_cmd sudo ping -i 0.01 -c 65 $(ip_lkl)
+    lkl_test_cmd sudo ping6 -i 0.01 -c 65 $(ip6_lkl)
+}
+
+test_tap_neighbours()
+{
+    set -e
+
+    neigh1="$(ip_add 100)|12:34:56:78:9a:bc"
+    neigh2="$(ip6_add 100)|12:34:56:78:9a:be"
+
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "neigh":"${neigh1};${neigh2}"
+            }
+        ]
+    }
+EOF
+
+    # add neighbor entries
+    ans=$(run_hijack_cfg ip neighbor show) || true
+    echo "$ans" | tail -n 15 | grep "12:34:56:78:9a:bc"
+    echo "$ans" | tail -n 15 | grep "12:34:56:78:9a:be"
+
+    # gateway
+    ans=$(run_hijack_cfg ip route show) || true
+    echo "$ans" | tail -n 15 | grep "$(ip_host)"
+
+    # gateway v6
+    ans=$(run_hijack_cfg ip -6 route show) || true
+    echo "$ans" | tail -n 15 | grep "$(ip6_host)"
+}
+
+test_tap_netperf_stream_tso_csum()
+{
+    set -e
+
+    # offload
+    # LKL_VIRTIO_NET_F_HOST_TSO4 && LKL_VIRTIO_NET_F_GUEST_TSO4
+    # LKL_VIRTIO_NET_F_CSUM && LKL_VIRTIO_NET_F_GUEST_CSUM
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "offload":"0x883",
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK"
+            }
+        ]
+    }
+EOF
+
+    run_netperf $(ip_host) TCP_STREAM
+}
+
+test_tap_netperf_maerts_csum_tso()
+{
+    run_netperf $(ip_host) TCP_MAERTS
+}
+
+test_tap_netperf_stream_csum_tso_mrgrxbuf()
+{
+    set -e
+
+    # offload
+    # LKL_VIRTIO_NET_F_HOST_TSO4 && LKL_VIRTIO_NET_F_MRG_RXBUF
+    # LKL_VIRTIO_NET_F_CSUM && LKL_VIRTIO_NET_F_GUEST_CSUM
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "offload":"0x8803",
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK"
+            }
+        ]
+    }
+EOF
+
+    run_netperf $(ip_host) TCP_MAERTS
+}
+
+test_tap_netperf_tcp_rr()
+{
+    set -e
+
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK"
+            }
+        ]
+    }
+EOF
+
+    run_netperf $(ip_host) TCP_RR
+}
+
+test_tap_netperf_tcp_stream()
+{
+    set -e
+
+    run_netperf $(ip_host) TCP_STREAM
+}
+
+test_tap_netperf_tcp_maerts()
+{
+    set -e
+
+    run_netperf $(ip_host) TCP_MAERTS
+}
+
+
+test_tap_qdisc()
+{
+    set -e
+
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "mac":"$TEST_MAC0",
+                "qdisc":"root|fq"
+            }
+        ]
+    }
+EOF
+
+    qdisc=$(run_hijack_cfg tc -s -d qdisc show)
+    echo "$qdisc"
+    echo "$qdisc" | grep "qdisc fq" > /dev/null
+    echo "$qdisc" | grep throttled > /dev/null
+}
+
+test_tap_multi_if_setup()
+{
+    set -e
+
+    # Set up 2nd TAP device we'd like to use
+    tap_setup 1
+
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "mac":"$TEST_MAC0"
+            },
+            {
+                "type":"tap",
+                "param":"$(tap_ifname 1)",
+                "ip":"$(ip_lkl 1)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl 1)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "mac":"$TEST_MAC1"
+            }
+        ]
+    }
+EOF
+
+    # Make sure our device has the addresses we expect
+    addr=$(run_hijack_cfg ip addr)
+    echo "$addr" | grep eth0
+    echo "$addr" | grep $(ip_lkl)
+    echo "$addr" | grep "$TEST_MAC0"
+    echo "$addr" | grep "$(ip6_lkl)"
+    echo "$addr" | grep eth1
+    echo "$addr" | grep $(ip_lkl 1)
+    echo "$addr" | grep "$TEST_MAC1"
+    echo "$addr" | grep "$(ip6_lkl 1)"
+    ! echo "$addr" | grep "WARN: failed to free"
+}
+
+test_tap_multi_if_ping()
+{
+    run_hijack_cfg ${ping} -c 1 $(ip_host)
+    run_hijack_cfg ${ping6} -c 1 $(ip6_host)
+    run_hijack_cfg ${ping} -c 1 $(ip_host 1)
+    run_hijack_cfg ${ping6} -c 1 $(ip6_host 1)
+}
+
+test_tap_multi_if_neigh()
+{
+
+    neigh1="$(ip_host)00|12:34:56:78:9a:bc"
+    neigh2="$(ip6_host)00|12:34:56:78:9a:be"
+    neigh3="$(ip_host 1)00|12:34:56:78:9a:bd"
+    neigh4="$(ip6_host 1)00|12:34:56:78:9a:bf"
+
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "mac":"$TEST_MAC0",
+                "neigh":"${neigh1};${neigh2}"
+            },
+            {
+                "type":"tap",
+                "param":"$(tap_ifname 1)",
+                "ip":"$(ip_lkl 1)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl 1)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "mac":"$TEST_MAC1",
+                "neigh":"${neigh3};${neigh4}"
+            }
+        ]
+    }
+EOF
+
+    # add neighbor entries
+    ans=$(run_hijack_cfg ip neighbor show) || true
+    echo "$ans" | tail -n 15 | grep "12:34:56:78:9a:bc"
+    echo "$ans" | tail -n 15 | grep "12:34:56:78:9a:be"
+    echo "$ans" | tail -n 15 | grep "12:34:56:78:9a:bd"
+    echo "$ans" | tail -n 15 | grep "12:34:56:78:9a:bf"
+}
+
+test_tap_multi_if_gateway()
+{
+    ans=$(run_hijack_cfg ip route show) || true
+    echo "$ans" | tail -n 15 | grep "$(ip_host)"
+}
+
+test_tap_multi_if_gateway_v6()
+{
+    ans=$(run_hijack_cfg ip -6 route show) || true
+    echo "$ans" | tail -n 15 | grep "$(ip6_host)"
+}
+
+
+test_tap_multitable_setup()
+{
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"tap",
+                "param":"$(tap_ifname)",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ifgateway":"$(ip_host)",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "ifgateway6":"$(ip6_host)",
+                "mac":"$TEST_MAC0",
+                "neigh":"${neigh1};${neigh2}"
+            },
+            {
+                "type":"tap",
+                "param":"$(tap_ifname 1)",
+                "ip":"$(ip_lkl 1)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ifgateway":"$(ip_host 1)",
+                "ipv6":"$(ip6_lkl 1)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "ifgateway6":"$(ip6_host 1)",
+                "mac":"$TEST_MAC1",
+                "neigh":"${neigh3};${neigh4}"
+            }
+        ]
+    }
+EOF
+}
+
+test_tap_multitable_ipv4_rule()
+{
+    addr=$(run_hijack_cfg ip rule show)
+    echo "$addr" | grep $(ip_lkl)
+    echo "$addr" | grep $(ip_lkl 1)
+}
+
+test_tap_multitable_ipv6_rule()
+{
+    addr=$(run_hijack_cfg ip -6 rule show)
+    echo "$addr" | grep $(ip6_lkl)
+    echo "$addr" | grep $(ip6_lkl 1)
+}
+
+test_tap_multitable_ipv4_rule_table_4()
+{
+    addr=$(run_hijack_cfg ip route show table 4)
+    echo "$addr" | grep $(ip_host)
+}
+
+test_tap_multitable_ipv6_rule_table_5()
+{
+    addr=$(run_hijack_cfg ip -6 route show table 5)
+    echo "$addr" | grep fc03::
+    echo "$addr" | grep $(ip6_host)
+}
+
+test_tap_multitable_ipv6_rule_table_6()
+{
+    addr=$(run_hijack_cfg ip route show table 6)
+    echo "$addr" | grep $(ip_host 1)
+}
+
+test_tap_multitable_ipv6_rule_table_7()
+{
+    addr=$(run_hijack_cfg ip -6 route show table 7)
+    echo "$addr" | grep fc04::
+    echo "$addr" | grep $(ip6_host 1)
+}
+
+test_vde_setup()
+{
+    set_cfgjson << EOF
+    {
+        "gateway":"$(ip_host)",
+        "gateway6":"$(ip6_host)",
+        "interfaces":
+        [
+            {
+                "type":"vde",
+                "param":"${VDESWITCH}",
+                "ip":"$(ip_lkl)",
+                "masklen":"$TEST_IP_NETMASK",
+                "ipv6":"$(ip6_lkl)",
+                "masklen6":"$TEST_IP6_NETMASK",
+                "mac":"$TEST_MAC0",
+                "neigh":"${neigh1};${neigh2}"
+            }
+        ]
+    }
+EOF
+
+    tap_setup
+
+    sleep 2
+    vde_switch -d -t $(tap_ifname) -s ${VDESWITCH} -p ${VDESWITCH}.pid
+
+    # Make sure our device has the addresses we expect
+    addr=$(run_hijack_cfg ip addr)
+    echo "$addr" | grep eth0
+    echo "$addr" | grep $(ip_lkl)
+    echo "$addr" | grep "$TEST_MAC0"
+}
+
+test_vde_cleanup()
+{
+    tap_cleanup
+}
+
+test_vde_ping_host()
+{
+    run_hijack_cfg ${ping} -c 1 $(ip_host)
+}
+
+test_vde_ping_lkl()
+{
+    lkl_test_cmd sudo arp -d $(ip_lkl)
+    lkl_test_cmd sudo ping -i 0.01 -c 65 $(ip_lkl) &
+    run_hijack_cfg sleep 3
+}
+
+source ${script_dir}/test.sh
+source ${script_dir}/net-setup.sh
+
+if [[ ! -e ${basedir}/lib/hijack/liblkl-hijack.so ]]; then
+    lkl_test_plan 0 "hijack tests"
+    echo "missing liblkl-hijack.so"
+    exit 0
+fi
+
+# Make a temporary directory to run tests in, since we'll be copying
+# things there.
+wdir=$(mktemp -d)
+cp `which ping` ${wdir}
+cp `which ping6` ${wdir}
+ping=${wdir}/ping
+ping6=${wdir}/ping6
+hijack=$basedir/bin/lkl-hijack.sh
+netperf=$basedir/tests/run_netperf.sh
+
+fifo1=${wdir}/fifo1
+fifo2=${wdir}/fifo2
+VDESWITCH=${wdir}/vde_switch
+
+# And make sure we clean up when we're done
+trap "clear_wdir &>/dev/null" EXIT
+
+lkl_test_plan 6 "hijack basic tests"
+lkl_test_run 1 run_hijack ip addr
+lkl_test_run 2 run_hijack ip route
+lkl_test_run 3 test_ping
+lkl_test_run 4 test_ping6
+lkl_test_run 5 test_mount_and_dump
+lkl_test_run 6 test_boot_cmdline
+
+if [ -z "$(QUIET=1 lkl_test_cmd which mkfifo)" ]; then
+    lkl_test_plan 0 "hijack pipe backend tests"
+    echo "no mkfifo command"
+else
+    lkl_test_plan 2 "hijack pipe backend tests"
+    lkl_test_run 1 test_pipe_setup
+    lkl_test_run 2 test_pipe_ping
+fi
+
+tap_prepare
+
+if ! lkl_test_cmd test -c /dev/net/tun &>/dev/null; then
+    lkl_test_plan 0 "hijack tap backend tests"
+    echo "missing /dev/net/tun"
+else
+    lkl_test_plan 24 "hijack tap backend tests"
+    lkl_test_run 1 test_tap_setup
+    lkl_test_run 2 test_tap_ping_host
+    lkl_test_run 3 test_tap_ping_lkl
+    lkl_test_run 4 test_tap_neighbours
+    lkl_test_run 5 test_tap_netperf_tcp_rr
+    lkl_test_run 6 test_tap_netperf_tcp_stream
+    lkl_test_run 7 test_tap_netperf_tcp_maerts
+    lkl_test_run 8 test_tap_netperf_stream_tso_csum
+    lkl_test_run 9 test_tap_netperf_maerts_csum_tso
+    lkl_test_run 10 test_tap_netperf_stream_csum_tso_mrgrxbuf
+    lkl_test_run 11 test_tap_qdisc
+    lkl_test_run 12 test_tap_multi_if_setup
+    lkl_test_run 13 test_tap_multi_if_ping
+    lkl_test_run 14 test_tap_multi_if_neigh
+    lkl_test_run 15 test_tap_multi_if_gateway
+    lkl_test_run 16 test_tap_multi_if_gateway_v6
+    lkl_test_run 17 test_tap_multitable_setup
+    lkl_test_run 18 test_tap_multitable_ipv4_rule
+    lkl_test_run 19 test_tap_multitable_ipv6_rule
+    lkl_test_run 20 test_tap_multitable_ipv4_rule_table_4
+    lkl_test_run 21 test_tap_multitable_ipv6_rule_table_5
+    lkl_test_run 22 test_tap_multitable_ipv6_rule_table_6
+    lkl_test_run 23 test_tap_multitable_ipv6_rule_table_7
+    lkl_test_run 24 test_tap_cleanup
+fi
+
+if [ -z "$LKL_HOST_CONFIG_VIRTIO_NET_VDE" ]; then
+    lkl_test_plan 0 "hijack vde tests"
+    echo "vde not supported"
+elif [ ! -x "$(which vde_switch)" ]; then
+    lkl_test_plan 0 "hijack vde tests"
+    echo "could not find a vde_switch executable"
+else
+    lkl_test_plan 4 "hijack vde tests"
+    lkl_test_run 1 test_vde_setup
+    lkl_test_run 2 test_vde_ping_host
+    lkl_test_run 3 test_vde_ping_lkl
+    lkl_test_run 4 test_vde_cleanup
+fi
diff --git a/tools/lkl/tests/run_netperf.sh b/tools/lkl/tests/run_netperf.sh
new file mode 100755
index 000000000000..08c4337b7830
--- /dev/null
+++ b/tools/lkl/tests/run_netperf.sh
@@ -0,0 +1,98 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+
+# Usage
+#  ./run_netperf.sh [ip] [test_name] [use_taskset] [num_runs]
+
+set -e
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+hijack_script=${script_dir}/../bin/lkl-hijack.sh
+
+num_runs="1"
+test_name="TCP_STREAM"
+use_taskset="0"
+host_ip="localhost"
+taskset_cmd="taskset -c 1"
+test_len=10  # seconds
+
+if [ ! -x "$(which netperf)" ]; then
+    echo "WARNING: Cannot find a netperf executable, skipping netperf tests."
+    exit $TEST_SKIP
+fi
+
+if [ $# -ge 1 ]; then
+    host_ip=$1
+fi
+if [ $# -ge 2 ]; then
+    test_name=$2
+fi
+if [ $# -ge 3 ]; then
+    use_taskset=$3
+fi
+if [ $# -ge 4 ]; then
+    num_runs=$4
+fi
+if [ $# -ge 5 ]; then
+    echo "BAD NUMBER of INPUTS."
+    exit 1
+fi
+
+if [ $use_taskset = "0" ]; then
+  taskset_cmd=""
+fi
+
+clean() {
+    kill %1 || true
+}
+
+clean_with_tap() {
+    tap_cleanup &> /dev/null || true
+    clean
+    rm -rf ${work_dir}
+}
+
+# LKL_HIJACK_CONFIG_FILE is not set, which means we were not called from
+# hijack-test.sh, so things need to be set up first.
+if [ -z ${LKL_HIJACK_CONFIG_FILE+x} ]; then
+    # Set up environment variables and TAP
+    work_dir=$(mktemp -d)
+    cfgjson=${work_dir}/hijack-test.conf
+    export LKL_HIJACK_CONFIG_FILE=$cfgjson
+
+    # net-setup.sh defines the helpers expanded in the here-document below
+    . $script_dir/net-setup.sh
+    host_ip=$(ip_host)
+
+    cat <<EOF > ${cfgjson}
+    {
+         "interfaces": [
+               {
+                    "type": "tap",
+                    "param": "$(tap_ifname)",
+                    "ip": "$(ip_lkl)",
+                    "masklen":"$TEST_IP_NETMASK",
+                    "ipv6":"$(ip6_lkl)",
+                    "masklen6":"$TEST_IP6_NETMASK"
+               }
+         ]
+    }
+EOF
+
+    tap_prepare
+    tap_setup
+    trap clean_with_tap EXIT
+fi
+
+netserver -D -N -p $TEST_NETSERVER_PORT &
+
+trap clean EXIT
+
+echo NUM=$num_runs, TEST=$test_name, TASKSET=$use_taskset
+for i in `seq $num_runs`; do
+    echo Test: $i
+    set -x
+    $taskset_cmd ${hijack_script} netperf -p $TEST_NETSERVER_PORT -H $host_ip \
+		         -t $test_name -l $test_len
+    set +x
+done
-- 
2.20.1 (Apple Git-117)


* [RFC v2 29/37] lkl: add documentation
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Chenyang Zhong, Conrad Meyer, Gustavo Bittencourt, Motomu Utsumi,
	Patrick Collins, Thomas Liebetraut, Yuan Liu

From: Octavian Purdila <tavi.purdila@gmail.com>

Add a document that briefly introduces LKL, explains why it is needed, and
describes how it is used.  It is located under Documentation/virt/uml/.

Signed-off-by: Chenyang Zhong <zhongcy95@gmail.com>
Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Gustavo Bittencourt <gbitten@gmail.com>
Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Thomas Liebetraut <thomas@tommie-lie.de>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
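
The configuration lookup that the new document describes for the hijack
library (LKL_HIJACK_CONFIG_FILE first, then a file named 'lkl-hijack.json',
then the old-style LKL_HIJACK_* environment variables) can be sketched in
shell as below. This is only an illustration: resolve_hijack_config is a
hypothetical helper that does not exist in LKL, and it assumes that the
default file is looked up in the current working directory, which the
document does not state.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of LKL: print which configuration source
# would be used according to the rules described in lkl.txt.
resolve_hijack_config()
{
    if [ -n "${LKL_HIJACK_CONFIG_FILE:-}" ]; then
        # An explicitly named JSON file takes precedence.
        echo "json:${LKL_HIJACK_CONFIG_FILE}"
    elif [ -f "lkl-hijack.json" ]; then
        # Assumption: the default file is searched in the current directory.
        echo "json:lkl-hijack.json"
    else
        # Otherwise fall back to the old-style LKL_HIJACK_* variables.
        echo "env"
    fi
}
```

For example, `LKL_HIJACK_CONFIG_FILE=conf.json resolve_hijack_config` prints
`json:conf.json`.
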
 Documentation/virt/uml/lkl.txt | 453 +++++++++++++++++++++++++++++++++
 1 file changed, 453 insertions(+)
 create mode 100644 Documentation/virt/uml/lkl.txt

diff --git a/Documentation/virt/uml/lkl.txt b/Documentation/virt/uml/lkl.txt
new file mode 100644
index 000000000000..01c7b8bf7edc
--- /dev/null
+++ b/Documentation/virt/uml/lkl.txt
@@ -0,0 +1,453 @@
+
+Introduction
+============
+
+LKL (Linux Kernel Library) aims to allow reusing the Linux kernel code as
+extensively as possible with minimal effort and reduced maintenance overhead.
+
+Examples of how LKL can be used are: creating userspace applications (running on
+Linux and other operating systems) that can read or write Linux filesystems or
+can use the Linux networking stack, creating kernel drivers for other operating
+systems that can read Linux filesystems, bootloader support for reading/writing
+Linux filesystems, etc.
+
+With LKL, the kernel code is compiled into an object file that can be directly
+linked by applications. The API offered by LKL is based on the Linux system call
+interface.
+
+LKL is implemented as one of the modes of UML (arch/um). It uses host operations
+defined by the application or a host library (tools/lkl/lib).
+
+
+Supported hosts
+===============
+
+The supported hosts for now are POSIX and Windows userspace applications.
+
+
+Building LKL, the host library and LKL applications
+===================================================
+
+    $ make -C tools/lkl
+
+builds LKL as an object file, installs it in tools/lkl/lib together with the
+header files in tools/lkl/include, and then builds the host library, the tests
+and a few example applications:
+
+* tests/boot - a simple application that uses LKL and exercises the basic LKL
+  APIs
+
+* tests/net-test - a simple application that uses the network features of LKL
+  and exercises the basic network-related APIs
+
+* fs2tar - a tool that converts a filesystem image to a tar archive
+
+* cptofs/cpfromfs - tools that copy files to/from a filesystem image
+
+* lklfuse - a tool that can mount a filesystem image in userspace,
+  without root privileges, using FUSE
+
+
+Building LKL on FreeBSD
+-----------------------
+
+    $ pkg install binutils gcc gnubc gmake gsed coreutils bison flex python argp-standalone
+
+    #Prefer ports binutils and GNU bc(1):
+    $ export PATH=/sbin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/lib64/ccache
+
+    $ gmake -C tools/lkl
+
+Building LKL on Ubuntu
+----------------------
+
+    $ sudo apt-get install libfuse-dev libarchive-dev xfsprogs
+
+    # Optional, if you would like to be able to run tests
+    $ sudo apt-get install btrfs-tools
+    $ pip install yamlish junit_xml
+
+    $ make -C tools/lkl
+
+    # To check that everything works:
+    $ cd tools/lkl
+    $ make run-tests
+
+
+Building LKL for Windows
+------------------------
+
+In order to build LKL for Win32 the mingw cross compiler needs to be installed
+on the host (e.g. on Ubuntu the following packages are required:
+binutils-mingw-w64-i686, gcc-mingw-w64-base, gcc-mingw-w64-i686,
+mingw-w64-common, mingw-w64-i686-dev).
+
+Due to a bug in mingw regarding weak symbols the following patches need to be
+applied to mingw-binutils:
+
+https://sourceware.org/ml/binutils/2015-10/msg00234.html
+
+and i686-w64-mingw32-gas, i686-w64-mingw32-ld and i686-w64-mingw32-objcopy need
+to be recompiled.
+
+With those prerequisites fulfilled, you can now build LKL for Win32 with the
+following command:
+
+    $ make CROSS_COMPILE=i686-w64-mingw32- -C tools/lkl
+
+
+
+Building LKL on Windows
+-----------------------
+
+To build on Windows, certain GNU tools need to be installed. These tools can come
+from several different projects, such as cygwin, unxutils, gnu-win32 or busybox-w32.
+Below is one minimal/modular set-up based on msys2.
+
+### Common build dependencies:
+* [MSYS2](https://sourceforge.net/projects/msys2/) (provides GNU bash and many other utilities)
+* Extra utilities from MSYS2/pacman: bc, base-devel
+
+### General considerations:
+* No spaces in pathnames (source, prefix, destination,...)!
+* Make sure that all utilities are in the PATH.
+* Win64 (and the MinGW 64-bit CRT) is LLP64, which conflicts with the size of "long"
+that the Linux source assumes. Linux (and LKL) currently cannot
+be built on LLP64.
+* Cygwin (and msys2) are LP64, like Linux.
+
+### For MSYS2 (and Cygwin):
+Msys2 will install a gcc tool chain as part of the base-devel bundle. Binutils (2.26) is already
+patched for NT weak externals. Using the msys2 shell, cd to the lkl sources and run:
+
+    $ make -C tools/lkl
+
+### For MinGW:
+Install mingw-w64-i686-toolchain via pacman; mingw-w64-i686-binutils (2.26) is already patched
+for NT weak externals. Start a MinGW Win32 shell (64-bit will not work, see above)
+and run:
+
+    $ make -C tools/lkl
+
+
+LKL hijack library
+==================
+
+The LKL hijack library (liblkl-hijack.so) replaces system calls used by an
+application on the fly so that the application uses LKL instead of the kernel
+of the host operating system. LD_PRELOAD is used to dynamically override system
+calls with this library when you execute a program.
+
+You can usually use this library via a wrapper script.
+
+    $ cd tools/lkl
+    $ ./bin/lkl-hijack.sh ip address show
+
+The behavior of LKL can be configured with a JSON file. You can specify the
+JSON file with the environment variable LKL_HIJACK_CONFIG_FILE. If it is
+not set, LKL looks for a configuration file named 'lkl-hijack.json'.
+You can also use the old-style configuration with environment variables
+(e.g., LKL_HIJACK_NET_IFTYPE), but those are overridden if a JSON file is
+specified.
+
+```
+     $ cat conf.json
+     {
+       "gateway":"192.168.0.1",
+       "gateway6":"2001:db8:0:f101::1",
+       "debug":"1",
+       "singlecpu":"1",
+       "sysctl":"net.ipv4.tcp_wmem=4096 87380 2147483647",
+       "boot_cmdline":"ip=dhcp",
+       "interfaces":[
+               {
+                       "mac":"12:34:56:78:9a:bc",
+                       "type":"tap",
+                       "param":"tap7",
+                       "ip":"192.168.0.2",
+                       "masklen":"24",
+                       "ifgateway":"192.168.0.1",
+                       "ipv6":"2001:db8:0:f101::2",
+                       "masklen6":"64",
+                       "ifgateway6":"2001:db8:0:f101::1",
+                       "offload":"0xc803"
+               },
+               {
+                       "mac":"12:34:56:78:9a:bd",
+                       "type":"tap",
+                       "param":"tap77",
+                       "ip":"192.168.1.2",
+                       "masklen":"24",
+                       "ifgateway":"192.168.1.1",
+                       "ipv6":"2001:db8:0:f102::2",
+                       "masklen6":"64",
+                       "ifgateway6":"2001:db8:0:f102::1",
+                       "offload":"0xc803"
+               }
+       ]
+     }
+     $ LKL_HIJACK_CONFIG_FILE="conf.json" lkl-hijack.sh ip addr s
+```
+
+The following keys can be used in the JSON file.
+
+* IPv4 gateway address
+
+  key: "gateway"
+  value type: string
+
+  The IPv4 gateway address of the LKL network stack.
+```
+     "gateway":"192.168.0.1"
+```
+
+* IPv6 gateway address
+
+  key: "gateway6"
+  value type: string
+
+  The IPv6 gateway address of the LKL network stack.
+```
+     "gateway6":"2001:db8:0:f101::1"
+```
+
+* Debug
+
+  key: "debug"
+  value type: string
+
+  Setting it enables some debug information (both from the kernel and the
+  LKL library).  If zero is specified, it is disabled.
+  It is also used as a bit mask to turn on specific debugging facilities.
+  E.g., setting it to "0x100" will cause the LKL kernel to pause after
+  the hijacked app exits. This allows one to debug or collect info from
+  the LKL kernel before it quits.
+```
+     "debug":"1"
+```
+
+* Single CPU pinning
+
+  key: "singlecpu"
+  value type: string
+
+  Pin LKL kernel threads onto a single host CPU. The value "1" pins
+  only LKL kernel threads, while the value "2" also pins polling
+  threads.
+```
+     "singlecpu":"1"
+```
+
+* SYSCTL
+
+  key: "sysctl"
+  value type: string
+
+  Configure sysctl values of the booted kernel via the hijack library. Multiple
+  entries can be specified.
+```
+     "sysctl":"net.ipv4.tcp_wmem=4096 87380 2147483647"
+```
+
+* Boot command line
+
+  key: "boot_cmdline"
+  value type: string
+
+  Specify the kernel boot command line to change the configuration of
+  a kernel instance.  For instance, you can change the memory size as
+  shown below.
+```
+     "boot_cmdline": "mem=1G"
+```
+
+* Mount
+
+  key: "mount"
+  value type: string
+
+```
+     "mount": "proc,sysfs"
+```
+
+* Network Interface Configuration
+
+  key: "interfaces"
+  value type: array of objects
+
+  This key takes an array of objects, each holding a set of sub-keys that configures a single interface. Each sub-key is defined as follows.
+  ```
+       "interfaces":[{....},{....}]
+  ```
+
+
+	* Interface type
+
+	  key: "type"
+	  value type: string
+
+	  The type of interface in the host operating system to connect to LKL.
+	  The following example specifies a tap interface.
+	```
+	     "type":"tap"
+	```
+
+	* Interface parameter
+
+	  key: "param"
+	  value type: string
+
+	  Additional configuration parameters for the interface specified by Interface type (type).
+	  The parameters depend on the interface type.
+	```
+	     "type":"tap",
+	     "param":"tap0"
+	```
+
+	* Interface MTU size
+
+	  key: "mtu"
+	  value type: string
+
+	  the MTU size of the interface.
+	```
+	     "mtu":"1280"
+	```
+
+	* Interface IPv4 address
+
+	  key: "ip"
+	  value type: string
+
+	  the IPv4 address of the interface.
+	  If you want to use DHCP for the IP address assignment,
+	  use "boot_cmdline" with "ip=dhcp" option.
+	```
+	     "ip":"192.168.0.2"
+	```
+	```
+	     "boot_cmdline":"ip=dhcp"
+	```
+
+	* Interface IPv4 netmask length
+
+	  key: "masklen"
+	  value type: string
+
+	  the network mask length of the interface.
+	```
+	     "ip":"192.168.0.2",
+	     "masklen":"24"
+	```
+
+	* Interface IPv4 gateway on routing policy table
+
+	  key: "ifgateway"
+	  value type: string
+
+	  If you specify this parameter, LKL adds a policy routing table
+	  and creates link-local and gateway routes in this table.
+	  The table SELECTOR is "from" and PREFIX is the address you assigned to this interface.
+	  The table id is 2 * (interface index).
+	  This parameter can be used to configure LKL for mptcp, for example.
+
+	```
+	     "ip":"192.168.0.2",
+	     "masklen":"24",
+	     "ifgateway":"192.168.0.1"
+	```
+
+	* Interface IPv6 address
+
+	  key: "ipv6"
+	  value type: string
+
+	  the IPv6 address of the interface.
+	```
+	     "ipv6":"2001:db8:0:f101::2"
+	```
+
+	* Interface IPv6 netmask length
+
+	  key: "masklen6"
+	  value type: string
+
+	  the IPv6 network mask length of the interface.
+	```
+	     "ipv6":"2001:db8:0:f101::2",
+	     "masklen6":"64"
+	```
+
+	* Interface IPv6 gateway on routing policy table
+
+	  key: "ifgateway6"
+	  value type: string
+
+	  If you specify this parameter, LKL adds a policy routing table
+	  and creates link-local and gateway routes in this table.
+	  The table SELECTOR is "from" and PREFIX is the address you assigned to this interface.
+	  The table id is 2 * (interface index) + 1.
+	  This parameter can be used to configure LKL for mptcp, for example.
+	```
+	     "ipv6":"2001:db8:0:f101::2",
+	     "masklen6":"64",
+	     "ifgateway6":"2001:db8:0:f101::1"
+	```
+
+	* Interface MAC address
+
+	  key: "mac"
+	  value type: string
+
+	  the MAC address of the interface.
+	```
+	     "mac":"12:34:56:78:9a:bc"
+	```
+
+	* Interface neighbor entries
+
+	  key: "neigh"
+	  value type: string
+
+	  Add a list of permanent neighbor entries in the form of "ip|mac;ip|mac;...". IPv6 addresses are supported.
+	```
+	     "neigh":"192.168.0.1|12:34:56:78:9a:bc;2001:db8:0:f101::1|12:34:56:78:9a:be"
+	```
+
+	* Interface qdisc entries
+
+	  key: "qdisc"
+	  value type: string
+
+	  Add a qdisc entry in the form of "root|type;root|type;...".
+	```
+	     "qdisc":"root|fq"
+	```
+
+	* Interface offload
+
+	  key: "offload"
+	  value type: string
+
+	  Works as a bit mask to selectively enable device offload features. E.g.,
+	  to enable the "mergeable RX buffer" (LKL_VIRTIO_NET_F_MRG_RXBUF) and
+	  "guest csum" (LKL_VIRTIO_NET_F_GUEST_CSUM) device features, simply set
+	  it to 0x8002.
+	  See virtio_net.h for a list of offload features and their bit masks.
+	```
+	     "offload":"0x8002"
+	```
+
+* Delay
+
+  key: "delay_main"
+  value type: string
+
+  The delay before calling the main() function of the application after the
+  initialization of LKL.  Some subsystems in the Linux tree require a certain
+  amount of time before accepting a request from an application, such as
+  the delivery of an address assignment to a network interface.  This parameter
+  is used in such cases.  The value is given in microseconds.
+```
+     "delay_main":"500000"
+```
-- 
2.20.1 (Apple Git-117)
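
Reader's note on the "offload" key documented above: the value is an OR of
single-bit flags. The following is a minimal Python sketch, not part of the
patch, that rebuilds the 0x8002 example. The bit positions (1 for "guest csum",
15 for "mergeable RX buffer") are the ones the virtio-net feature list uses;
the helper name offload_mask is made up for illustration.

```python
# Bit positions of two virtio-net feature flags (see virtio_net.h).
# Only the two features named in the document are listed here.
VIRTIO_NET_F_GUEST_CSUM = 1   # "guest csum"
VIRTIO_NET_F_MRG_RXBUF = 15   # "mergeable RX buffer"


def offload_mask(*feature_bits):
    """Return the string to put in the "offload" key for the given bits.

    Hypothetical helper: each argument is a bit position, and the result
    is the hexadecimal OR of the corresponding single-bit masks.
    """
    mask = 0
    for bit in feature_bits:
        mask |= 1 << bit
    return hex(mask)


if __name__ == "__main__":
    # Prints 0x8002, the value used in the document's example.
    print(offload_mask(VIRTIO_NET_F_MRG_RXBUF, VIRTIO_NET_F_GUEST_CSUM))
```

The resulting string can be used as-is for "offload", since every value in the
configuration file is written as a JSON string.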


* [RFC v2 29/37] lkl: add documentation
@ 2019-11-08  5:02     ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: linux-arch, Conrad Meyer, Octavian Purdila, Motomu Utsumi,
	Akira Moroo, Thomas Liebetraut, Patrick Collins,
	linux-kernel-library, Chenyang Zhong, Yuan Liu,
	Gustavo Bittencourt

From: Octavian Purdila <tavi.purdila@gmail.com>

Add a document that gives a brief introduction to LKL, why it is needed, and
how it is used.  The document is located under the uml/ directory.

Signed-off-by: Chenyang Zhong <zhongcy95@gmail.com>
Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
Signed-off-by: Gustavo Bittencourt <gbitten@gmail.com>
Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Thomas Liebetraut <thomas@tommie-lie.de>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
 Documentation/virt/uml/lkl.txt | 453 +++++++++++++++++++++++++++++++++
 1 file changed, 453 insertions(+)
 create mode 100644 Documentation/virt/uml/lkl.txt

diff --git a/Documentation/virt/uml/lkl.txt b/Documentation/virt/uml/lkl.txt
new file mode 100644
index 000000000000..01c7b8bf7edc
--- /dev/null
+++ b/Documentation/virt/uml/lkl.txt
@@ -0,0 +1,453 @@
+
+Introduction
+============
+
+LKL (Linux Kernel Library) is aiming to allow reusing the Linux kernel code as
+extensively as possible with minimal effort and reduced maintenance overhead.
+
+Examples of how LKL can be used are: creating userspace applications (running on
+Linux and other operating systems) that can read or write Linux filesystems or
+can use the Linux networking stack, creating kernel drivers for other operating
+systems that can read Linux filesystems, bootloaders support for reading/writing
+Linux filesystems, etc.
+
+With LKL, the kernel code is compiled into an object file that can be directly
+linked by applications. The API offered by LKL is based on the Linux system call
+interface.
+
+LKL is implemented as one of the modes of UML (arch/um). It uses host operations
+defined by the application or a host library (tools/lkl/lib).
+
+
+Supported hosts
+===============
+
+The supported hosts for now are POSIX and Windows userspace applications.
+
+
+Building LKL, the host library and LKL applications
+===================================================
+
+    $ make -C tools/lkl
+
+will build LKL as an object file, install it in tools/lkl/lib together
+with the header files in tools/lkl/include, and then build the host library,
+tests and a few example applications:
+
+* tests/boot - a simple application that uses LKL and exercises the basic LKL
+  APIs
+
+* tests/net-test - a simple application that uses the network feature of LKL and
+  exercises the basic network-related APIs
+
+* fs2tar - a tool that converts a filesystem image to a tar archive
+
+* cptofs/cpfromfs - a tool that copies files to/from a filesystem image
+
+* lklfuse - a tool that can mount a filesystem image in userspace,
+  without root privileges, using FUSE
+
+
+Building LKL on FreeBSD
+-----------------------
+
+    $ pkg install binutils gcc gnubc gmake gsed coreutils bison flex python argp-standalone
+
+    # Prefer ports binutils and GNU bc(1):
+    $ export PATH=/sbin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/lib64/ccache
+
+    $ gmake -C tools/lkl
+
+Building LKL on Ubuntu
+-----------------------
+
+    $ sudo apt-get install libfuse-dev libarchive-dev xfsprogs
+
+    # Optional, if you would like to be able to run tests
+    $ sudo apt-get install btrfs-tools
+    $ pip install yamlish junit_xml
+
+    $ make -C tools/lkl
+
+    # To check that everything works:
+    $ cd tools/lkl
+    $ make run-tests
+
+
+Building LKL for Windows
+------------------------
+
+In order to build LKL for Win32, the mingw cross compiler needs to be installed
+on the host (e.g. on Ubuntu the following packages are required:
+binutils-mingw-w64-i686, gcc-mingw-w64-base, gcc-mingw-w64-i686,
+mingw-w64-common, mingw-w64-i686-dev).
+
+Due to a bug in mingw regarding weak symbols, the following patch needs to be
+applied to mingw-binutils:
+
+https://sourceware.org/ml/binutils/2015-10/msg00234.html
+
+and i686-w64-mingw32-gas, i686-w64-mingw32-ld and i686-w64-mingw32-objcopy need
+to be recompiled.
+
+With those prerequisites fulfilled, you can now build LKL for Win32 with the
+following command:
+
+    $ make CROSS_COMPILE=i686-w64-mingw32- -C tools/lkl
+
+
+
+Building LKL on Windows
+------------------------
+
+To build on Windows, certain GNU tools need to be installed. These tools can come
+from several different projects, such as cygwin, unxutils, gnu-win32 or busybox-w32.
+Below is one minimal/modular set-up based on msys2.
+
+### Common build dependencies:
+* [MSYS2](https://sourceforge.net/projects/msys2/) (provides GNU bash and many other utilities)
+* Extra utilities from MSYS2/pacman: bc, base-devel
+
+### General considerations:
+* No spaces in pathnames (source, prefix, destination,...)!
+* Make sure that all utilities are in the PATH.
+* Win64 (and the MinGW 64-bit crt) is LLP64, which conflicts with the size of
+"long" assumed by the Linux source. Linux (and LKL) currently cannot
+be built on LLP64.
+* Cygwin (and msys2) are LP64, like Linux.
+
+### For MSYS2 (and Cygwin):
+Msys2 will install a gcc tool chain as part of the base-devel bundle. Binutils (2.26) is already
+patched for NT weak externals. Using the msys2 shell, cd to the lkl sources and run:
+
+    $ make -C tools/lkl
+
+### For MinGW:
+Install mingw-w64-i686-toolchain via pacman; mingw-w64-i686-binutils (2.26) is already patched
+for NT weak externals. Start a MinGW Win32 shell (64-bit will not work, see above)
+and run:
+
+    $ make -C tools/lkl
+
+
+LKL hijack library
+==================
+
+The LKL hijack library (liblkl-hijack.so) replaces system calls used by an
+application on the fly so that the application uses LKL instead of the kernel
+of the host operating system. LD_PRELOAD is used to dynamically override system
+calls with this library when you execute a program.
+
+You can usually use this library via a wrapper script.
+
+    $ cd tools/lkl
+    $ ./bin/lkl-hijack.sh ip address show
+
+The behavior of LKL can be configured with a JSON file. You can specify the
+JSON file with the environment variable LKL_HIJACK_CONFIG_FILE. If it is
+not set, LKL looks for a configuration file named 'lkl-hijack.json'.
+You can also use the old-style configuration with environment variables
+(e.g., LKL_HIJACK_NET_IFTYPE), but those are overridden if a JSON file is
+specified.
+
+```
+     $ cat conf.json
+     {
+       "gateway":"192.168.0.1",
+       "gateway6":"2001:db8:0:f101::1",
+       "debug":"1",
+       "singlecpu":"1",
+       "sysctl":"net.ipv4.tcp_wmem=4096 87380 2147483647",
+       "boot_cmdline":"ip=dhcp",
+       "interfaces":[
+               {
+                       "mac":"12:34:56:78:9a:bc",
+                       "type":"tap",
+                       "param":"tap7",
+                       "ip":"192.168.0.2",
+                       "masklen":"24",
+                       "ifgateway":"192.168.0.1",
+                       "ipv6":"2001:db8:0:f101::2",
+                       "masklen6":"64",
+                       "ifgateway6":"2001:db8:0:f101::1",
+                       "offload":"0xc803"
+               },
+               {
+                       "mac":"12:34:56:78:9a:bd",
+                       "type":"tap",
+                       "param":"tap77",
+                       "ip":"192.168.1.2",
+                       "masklen":"24",
+                       "ifgateway":"192.168.1.1",
+                       "ipv6":"2001:db8:0:f102::2",
+                       "masklen6":"64",
+                       "ifgateway6":"2001:db8:0:f102::1",
+                       "offload":"0xc803"
+               }
+       ]
+     }
+     $ LKL_HIJACK_CONFIG_FILE="conf.json" lkl-hijack.sh ip addr s
+```
+
+The following keys can be used in the JSON file.
+
+* IPv4 gateway address
+
+  key: "gateway"
+  value type: string
+
+  The IPv4 gateway address of the LKL network stack.
+```
+     "gateway":"192.168.0.1"
+```
+
+* IPv6 gateway address
+
+  key: "gateway6"
+  value type: string
+
+  The IPv6 gateway address of the LKL network stack.
+```
+     "gateway6":"2001:db8:0:f101::1"
+```
+
+* Debug
+
+  key: "debug"
+  value type: string
+
+  Setting it enables some debug information (both from the kernel and the
+  LKL library).  If zero is specified, it is disabled.
+  It is also used as a bit mask to turn on specific debugging facilities.
+  E.g., setting it to "0x100" will cause the LKL kernel to pause after
+  the hijacked app exits. This allows one to debug or collect info from
+  the LKL kernel before it quits.
+```
+     "debug":"1"
+```
+
+* Single CPU pinning
+
+  key: "singlecpu"
+  value type: string
+
+  Pin LKL kernel threads onto a single host CPU. The value "1" pins
+  only LKL kernel threads, while the value "2" also pins polling
+  threads.
+```
+     "singlecpu":"1"
+```
+
+* SYSCTL
+
+  key: "sysctl"
+  value type: string
+
+  Configure sysctl values of the booted kernel via the hijack library. Multiple
+  entries can be specified.
+```
+     "sysctl":"net.ipv4.tcp_wmem=4096 87380 2147483647"
+```
+
+* Boot command line
+
+  key: "boot_cmdline"
+  value type: string
+
+  Specify the kernel boot command line to change the configuration of
+  a kernel instance.  For instance, you can change the memory size as
+  shown below.
+```
+     "boot_cmdline": "mem=1G"
+```
+
+* Mount
+
+  key: "mount"
+  value type: string
+
+```
+     "mount": "proc,sysfs"
+```
+
+* Network Interface Configuration
+
+  key: "interfaces"
+  value type: array of objects
+
+  This key takes an array of objects, each holding a set of sub-keys that configures a single interface. Each sub-key is defined as follows.
+  ```
+       "interfaces":[{....},{....}]
+  ```
+
+
+	* Interface type
+
+	  key: "type"
+	  value type: string
+
+	  The type of interface in the host operating system to connect to LKL.
+	  The following example specifies a tap interface.
+	```
+	     "type":"tap"
+	```
+
+	* Interface parameter
+
+	  key: "param"
+	  value type: string
+
+	  Additional configuration parameters for the interface specified by Interface type (type).
+	  The parameters depend on the interface type.
+	```
+	     "type":"tap",
+	     "param":"tap0"
+	```
+
+	* Interface MTU size
+
+	  key: "mtu"
+	  value type: string
+
+	  the MTU size of the interface.
+	```
+	     "mtu":"1280"
+	```
+
+	* Interface IPv4 address
+
+	  key: "ip"
+	  value type: string
+
+	  the IPv4 address of the interface.
+	  If you want to use DHCP for the IP address assignment,
+	  use "boot_cmdline" with "ip=dhcp" option.
+	```
+	     "ip":"192.168.0.2"
+	```
+	```
+	     "boot_cmdline":"ip=dhcp"
+	```
+
+	* Interface IPv4 netmask length
+
+	  key: "masklen"
+	  value type: string
+
+	  the network mask length of the interface.
+	```
+	     "ip":"192.168.0.2",
+	     "masklen":"24"
+	```
+
+	* Interface IPv4 gateway on routing policy table
+
+	  key: "ifgateway"
+	  value type: string
+
+	  If you specify this parameter, LKL adds a policy routing table
+	  and creates link-local and gateway routes in this table.
+	  The table SELECTOR is "from" and PREFIX is the address you assigned to this interface.
+	  The table id is 2 * (interface index).
+	  This parameter can be used to configure LKL for mptcp, for example.
+
+	```
+	     "ip":"192.168.0.2",
+	     "masklen":"24",
+	     "ifgateway":"192.168.0.1"
+	```
+
+	* Interface IPv6 address
+
+	  key: "ipv6"
+	  value type: string
+
+	  the IPv6 address of the interface.
+	```
+	     "ipv6":"2001:db8:0:f101::2"
+	```
+
+	* Interface IPv6 netmask length
+
+	  key: "masklen6"
+	  value type: string
+
+	  the IPv6 network mask length of the interface.
+	```
+	     "ipv6":"2001:db8:0:f101::2",
+	     "masklen6":"64"
+	```
+
+	* Interface IPv6 gateway on routing policy table
+
+	  key: "ifgateway6"
+	  value type: string
+
+	  If you specify this parameter, LKL adds a policy routing table
+	  and creates link-local and gateway routes in this table.
+	  The table SELECTOR is "from" and PREFIX is the address you assigned to this interface.
+	  The table id is 2 * (interface index) + 1.
+	  This parameter can be used to configure LKL for mptcp, for example.
+	```
+	     "ipv6":"2001:db8:0:f101::2",
+	     "masklen6":"64",
+	     "ifgateway6":"2001:db8:0:f101::1"
+	```
+
+	* Interface MAC address
+
+	  key: "mac"
+	  value type: string
+
+	  the MAC address of the interface.
+	```
+	     "mac":"12:34:56:78:9a:bc"
+	```
+
+	* Interface neighbor entries
+
+	  key: "neigh"
+	  value type: string
+
+	  Add a list of permanent neighbor entries in the form of "ip|mac;ip|mac;...". IPv6 addresses are supported.
+	```
+	     "neigh":"192.168.0.1|12:34:56:78:9a:bc;2001:db8:0:f101::1|12:34:56:78:9a:be"
+	```
+
+	* Interface qdisc entries
+
+	  key: "qdisc"
+	  value type: string
+
+	  Add a qdisc entry in the form of "root|type;root|type;...".
+	```
+	     "qdisc":"root|fq"
+	```
+
+	* Interface offload
+
+	  key: "offload"
+	  value type: string
+
+	  Works as a bit mask to selectively enable device offload features. E.g.,
+	  to enable the "mergeable RX buffer" (LKL_VIRTIO_NET_F_MRG_RXBUF) and
+	  "guest csum" (LKL_VIRTIO_NET_F_GUEST_CSUM) device features, simply set
+	  it to 0x8002.
+	  See virtio_net.h for a list of offload features and their bit masks.
+	```
+	     "offload":"0x8002"
+	```
+
+* Delay
+
+  key: "delay_main"
+  value type: string
+
+  The delay before calling the main() function of the application after the
+  initialization of LKL.  Some subsystems in the Linux tree require a certain
+  amount of time before accepting a request from an application, such as
+  the delivery of an address assignment to a network interface.  This parameter
+  is used in such cases.  The value is given in microseconds.
+```
+     "delay_main":"500000"
+```
-- 
2.20.1 (Apple Git-117)
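
Reader's note on two string formats from the document above: the "neigh" value
("ip|mac;ip|mac;...") and the policy routing table ids derived from an
interface index (2 * index for "ifgateway", 2 * index + 1 for "ifgateway6").
The Python sketch below is only an illustration of those two rules and is not
how the hijack library implements them; parse_neigh and policy_table_ids are
hypothetical names, and the interface index is taken as given since the
document does not say how it is assigned.

```python
def parse_neigh(value):
    """Split a "neigh" string into (ip, mac) pairs.

    Entries are separated by ';' and each entry is "ip|mac". Splitting on
    '|' rather than ':' matters because both IPv6 and MAC addresses
    contain colons.
    """
    pairs = []
    for entry in value.split(";"):
        if not entry:
            continue  # tolerate a trailing ';'
        ip, mac = entry.split("|", 1)
        pairs.append((ip, mac))
    return pairs


def policy_table_ids(ifindex):
    """Return (ipv4_table_id, ipv6_table_id) for an interface index."""
    return 2 * ifindex, 2 * ifindex + 1


if __name__ == "__main__":
    neigh = ("192.168.0.1|12:34:56:78:9a:bc;"
             "2001:db8:0:f101::1|12:34:56:78:9a:be")
    for ip, mac in parse_neigh(neigh):
        print(ip, "->", mac)
    print(policy_table_ids(3))
```

Because the IPv4 table ids are even and the IPv6 ids are odd, the two tables of
one interface never collide with each other or with those of another index.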


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um



* [RFC v2 30/37] scripts: revert CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX patches
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Hajime Tazaki

LKL requires this config option in order to build and run on a Windows host
because the Windows compiler (mingw32) prepends '_' to each symbol.
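
As a rough illustration of what the restored option does (a Python sketch for
readers, not kernel code): when CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX is set,
the KSYM() and VMLINUX_SYMBOL() macros map a C-level name to the name with a
leading underscore, and otherwise leave it unchanged. The function name ksym
below is hypothetical.

```python
def ksym(name, have_underscore_symbol_prefix):
    """Return the linker-visible name for a C symbol.

    Mirrors the two branches of the KSYM(name) macro in this patch:
    "_name" when the prefix option is enabled, "name" otherwise.
    """
    if have_underscore_symbol_prefix:
        return "_" + name
    return name


if __name__ == "__main__":
    for enabled in (False, True):
        print(enabled, ksym("system_certificate_list", enabled))
```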

This commit reverts:
commit 7953002a7c65 ("vmlinux.lds.h: remove stale <linux/export.h>
include")
commit c4df32c80d04 ("export.h: remove VMLINUX_SYMBOL() and
VMLINUX_SYMBOL_STR()")
commit 00979ce4fcc9 ("linux/linkage.h: replace VMLINUX_SYMBOL_STR()
with __stringify()")
commit a6b04f0ed5e9 ("checkpatch: remove VMLINUX_SYMBOL() check")
commit a62143850053 ("vmlinux.lds.h: remove no-op macro
VMLINUX_SYMBOL()")
commit 704db5433fb4 ("kbuild: remove
CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX")
commit 94e58e0ac312 ("export.h: remove code for prefixing symbols
with underscore")
commit 5a144a1acd0b ("depmod.sh: remove symbol prefix support")
commit 534c9f2ec4c9 ("kallsyms: remove symbol prefix support")
commit 74d931716151 ("genksyms: remove symbol prefix support")
commit b2c5cdcfd4bc ("modpost: remove symbol prefix support")

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
---
 Makefile                          |   2 +-
 arch/Kconfig                      |   6 +
 certs/system_certificates.S       |  16 +-
 include/asm-generic/export.h      |  34 ++--
 include/asm-generic/vmlinux.lds.h | 279 +++++++++++++++---------------
 include/linux/export.h            |  23 ++-
 include/linux/linkage.h           |  12 +-
 scripts/Makefile.build            |   9 +-
 scripts/adjust_autoksyms.sh       |   6 +
 scripts/checkpatch.pl             |  10 ++
 scripts/depmod.sh                 |  25 ++-
 scripts/genksyms/genksyms.c       |  11 +-
 scripts/kallsyms.c                |  47 +++--
 scripts/link-vmlinux.sh           |   4 +
 scripts/mod/modpost.c             |  30 ++--
 usr/initramfs_data.S              |   4 +-
 16 files changed, 318 insertions(+), 200 deletions(-)

diff --git a/Makefile b/Makefile
index 874c0aec0f9c..d2c9e3a420f6 100644
--- a/Makefile
+++ b/Makefile
@@ -1805,7 +1805,7 @@ quiet_cmd_rmfiles = $(if $(wildcard $(rm-files)),CLEAN   $(wildcard $(rm-files))
 # Run depmod only if we have System.map and depmod is executable
 quiet_cmd_depmod = DEPMOD  $(KERNELRELEASE)
       cmd_depmod = $(CONFIG_SHELL) $(srctree)/scripts/depmod.sh $(DEPMOD) \
-                   $(KERNELRELEASE)
+                   $(KERNELRELEASE) "$(patsubst y,_,$(CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX))"
 
 # read saved command lines for existing targets
 existing-targets := $(wildcard $(sort $(targets)))
diff --git a/arch/Kconfig b/arch/Kconfig
index a7b57dd42c26..a01df2ae6a1b 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -594,6 +594,12 @@ config MODULES_USE_ELF_REL
 	  Modules only use ELF REL relocations.  Modules with ELF RELA
 	  relocations will give an error.
 
+config HAVE_UNDERSCORE_SYMBOL_PREFIX
+	bool
+	help
+	  Some architectures generate an _ in front of C symbols; things like
+	  module loading and assembly files need to know about this.
+
 config HAVE_IRQ_EXIT_ON_IRQ_STACK
 	bool
 	help
diff --git a/certs/system_certificates.S b/certs/system_certificates.S
index 8f29058adf93..3918ff7235ed 100644
--- a/certs/system_certificates.S
+++ b/certs/system_certificates.S
@@ -5,8 +5,8 @@
 	__INITRODATA
 
 	.align 8
-	.globl system_certificate_list
-system_certificate_list:
+	.globl VMLINUX_SYMBOL(system_certificate_list)
+VMLINUX_SYMBOL(system_certificate_list):
 __cert_list_start:
 #ifdef CONFIG_MODULE_SIG
 	.incbin "certs/signing_key.x509"
@@ -15,21 +15,21 @@ __cert_list_start:
 __cert_list_end:
 
 #ifdef CONFIG_SYSTEM_EXTRA_CERTIFICATE
-	.globl system_extra_cert
+	.globl VMLINUX_SYMBOL(system_extra_cert)
 	.size system_extra_cert, CONFIG_SYSTEM_EXTRA_CERTIFICATE_SIZE
-system_extra_cert:
+VMLINUX_SYMBOL(system_extra_cert):
 	.fill CONFIG_SYSTEM_EXTRA_CERTIFICATE_SIZE, 1, 0
 
 	.align 4
-	.globl system_extra_cert_used
-system_extra_cert_used:
+	.globl VMLINUX_SYMBOL(system_extra_cert_used)
+VMLINUX_SYMBOL(system_extra_cert_used):
 	.int 0
 
 #endif /* CONFIG_SYSTEM_EXTRA_CERTIFICATE */
 
 	.align 8
-	.globl system_certificate_list_size
-system_certificate_list_size:
+	.globl VMLINUX_SYMBOL(system_certificate_list_size)
+VMLINUX_SYMBOL(system_certificate_list_size):
 #ifdef CONFIG_64BIT
 	.quad __cert_list_end - __cert_list_start
 #else
diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h
index 294d6ae785d4..69ce0914b025 100644
--- a/include/asm-generic/export.h
+++ b/include/asm-generic/export.h
@@ -27,32 +27,42 @@
 #endif
 .endm
 
+#ifdef CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX
+#define KSYM(name) _##name
+#else
+#define KSYM(name) name
+#endif
+
 /*
  * note on .section use: @progbits vs %progbits nastiness doesn't matter,
  * since we immediately emit into those sections anyway.
  */
 .macro ___EXPORT_SYMBOL name,val,sec
 #ifdef CONFIG_MODULES
-	.globl __ksymtab_\name
+	.globl KSYM(__ksymtab_\name)
 	.section ___ksymtab\sec+\name,"a"
 	.balign KSYM_ALIGN
-__ksymtab_\name:
-	__put \val, __kstrtab_\name
+KSYM(__ksymtab_\name):
+	__put \val, KSYM(__kstrtab_\name)
 	.previous
 	.section __ksymtab_strings,"a"
-__kstrtab_\name:
+KSYM(__kstrtab_\name):
+#ifdef CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX
+	.asciz "_\name"
+#else
 	.asciz "\name"
+#endif
 	.previous
 #ifdef CONFIG_MODVERSIONS
 	.section ___kcrctab\sec+\name,"a"
 	.balign KCRC_ALIGN
-__kcrctab_\name:
+KSYM(__kcrctab_\name):
 #if defined(CONFIG_MODULE_REL_CRCS)
-	.long __crc_\name - .
+	.long KSYM(__crc_\name) - .
 #else
-	.long __crc_\name
+	.long KSYM(__crc_\name)
 #endif
-	.weak __crc_\name
+	.weak KSYM(__crc_\name)
 	.previous
 #endif
 #endif
@@ -85,12 +95,12 @@ __ksym_marker_\sym:
 #endif
 
 #define EXPORT_SYMBOL(name)					\
-	__EXPORT_SYMBOL(name, KSYM_FUNC(name),)
+	__EXPORT_SYMBOL(name, KSYM_FUNC(KSYM(name)),)
 #define EXPORT_SYMBOL_GPL(name) 				\
-	__EXPORT_SYMBOL(name, KSYM_FUNC(name), _gpl)
+	__EXPORT_SYMBOL(name, KSYM_FUNC(KSYM(name)), _gpl)
 #define EXPORT_DATA_SYMBOL(name)				\
-	__EXPORT_SYMBOL(name, name,)
+	__EXPORT_SYMBOL(name, KSYM(name),)
 #define EXPORT_DATA_SYMBOL_GPL(name)				\
-	__EXPORT_SYMBOL(name, name,_gpl)
+	__EXPORT_SYMBOL(name, KSYM(name),_gpl)
 
 #endif
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index cd28f63bfbc7..d0043449d2d3 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -54,6 +54,8 @@
 #define LOAD_OFFSET 0
 #endif
 
+#include <linux/export.h>
+
 /* Align . to a 8 byte boundary equals to maximum function alignment. */
 #define ALIGN_FUNCTION()  . = ALIGN(8)
 
@@ -117,67 +119,67 @@
 			__stop_mcount_loc = .;
 #else
 #define MCOUNT_REC()	. = ALIGN(8);				\
-			__start_mcount_loc = .;			\
+			VMLINUX_SYMBOL(__start_mcount_loc) = .; \
 			KEEP(*(__mcount_loc))			\
-			__stop_mcount_loc = .;
+			VMLINUX_SYMBOL(__stop_mcount_loc) = .;
 #endif
 #else
 #define MCOUNT_REC()
 #endif
 
 #ifdef CONFIG_TRACE_BRANCH_PROFILING
-#define LIKELY_PROFILE()	__start_annotated_branch_profile = .;	\
-				KEEP(*(_ftrace_annotated_branch))	\
-				__stop_annotated_branch_profile = .;
+#define LIKELY_PROFILE()	VMLINUX_SYMBOL(__start_annotated_branch_profile) = .; \
+				KEEP(*(_ftrace_annotated_branch))		      \
+				VMLINUX_SYMBOL(__stop_annotated_branch_profile) = .;
 #else
 #define LIKELY_PROFILE()
 #endif
 
 #ifdef CONFIG_PROFILE_ALL_BRANCHES
-#define BRANCH_PROFILE()	__start_branch_profile = .;		\
-				KEEP(*(_ftrace_branch))			\
-				__stop_branch_profile = .;
+#define BRANCH_PROFILE()	VMLINUX_SYMBOL(__start_branch_profile) = .;   \
+				KEEP(*(_ftrace_branch))			      \
+				VMLINUX_SYMBOL(__stop_branch_profile) = .;
 #else
 #define BRANCH_PROFILE()
 #endif
 
 #ifdef CONFIG_KPROBES
 #define KPROBE_BLACKLIST()	. = ALIGN(8);				      \
-				__start_kprobe_blacklist = .;		      \
+				VMLINUX_SYMBOL(__start_kprobe_blacklist) = .; \
 				KEEP(*(_kprobe_blacklist))		      \
-				__stop_kprobe_blacklist = .;
+				VMLINUX_SYMBOL(__stop_kprobe_blacklist) = .;
 #else
 #define KPROBE_BLACKLIST()
 #endif
 
 #ifdef CONFIG_FUNCTION_ERROR_INJECTION
 #define ERROR_INJECT_WHITELIST()	STRUCT_ALIGN();			      \
-			__start_error_injection_whitelist = .;		      \
+			VMLINUX_SYMBOL(__start_error_injection_whitelist) = .;\
 			KEEP(*(_error_injection_whitelist))		      \
-			__stop_error_injection_whitelist = .;
+			VMLINUX_SYMBOL(__stop_error_injection_whitelist) = .;
 #else
 #define ERROR_INJECT_WHITELIST()
 #endif
 
 #ifdef CONFIG_EVENT_TRACING
 #define FTRACE_EVENTS()	. = ALIGN(8);					\
-			__start_ftrace_events = .;			\
+			VMLINUX_SYMBOL(__start_ftrace_events) = .;	\
 			KEEP(*(_ftrace_events))				\
-			__stop_ftrace_events = .;			\
-			__start_ftrace_eval_maps = .;			\
+			VMLINUX_SYMBOL(__stop_ftrace_events) = .;	\
+			VMLINUX_SYMBOL(__start_ftrace_eval_maps) = .;	\
 			KEEP(*(_ftrace_eval_map))			\
-			__stop_ftrace_eval_maps = .;
+			VMLINUX_SYMBOL(__stop_ftrace_eval_maps) = .;
 #else
 #define FTRACE_EVENTS()
 #endif
 
 #ifdef CONFIG_TRACING
-#define TRACE_PRINTKS()	 __start___trace_bprintk_fmt = .;      \
+#define TRACE_PRINTKS() VMLINUX_SYMBOL(__start___trace_bprintk_fmt) = .;      \
 			 KEEP(*(__trace_printk_fmt)) /* Trace_printk fmt' pointer */ \
-			 __stop___trace_bprintk_fmt = .;
-#define TRACEPOINT_STR() __start___tracepoint_str = .;	\
+			 VMLINUX_SYMBOL(__stop___trace_bprintk_fmt) = .;
+#define TRACEPOINT_STR() VMLINUX_SYMBOL(__start___tracepoint_str) = .;	\
 			 KEEP(*(__tracepoint_str)) /* Trace_printk fmt' pointer */ \
-			 __stop___tracepoint_str = .;
+			 VMLINUX_SYMBOL(__stop___tracepoint_str) = .;
 #else
 #define TRACE_PRINTKS()
 #define TRACEPOINT_STR()
@@ -185,27 +187,27 @@
 
 #ifdef CONFIG_FTRACE_SYSCALLS
 #define TRACE_SYSCALLS() . = ALIGN(8);					\
-			 __start_syscalls_metadata = .;			\
+			 VMLINUX_SYMBOL(__start_syscalls_metadata) = .;	\
 			 KEEP(*(__syscalls_metadata))			\
-			 __stop_syscalls_metadata = .;
+			 VMLINUX_SYMBOL(__stop_syscalls_metadata) = .;
 #else
 #define TRACE_SYSCALLS()
 #endif
 
 #ifdef CONFIG_BPF_EVENTS
 #define BPF_RAW_TP() STRUCT_ALIGN();					\
-			 __start__bpf_raw_tp = .;			\
+			 VMLINUX_SYMBOL(__start__bpf_raw_tp) = .;	\
 			 KEEP(*(__bpf_raw_tp_map))			\
-			 __stop__bpf_raw_tp = .;
+			 VMLINUX_SYMBOL(__stop__bpf_raw_tp) = .;
 #else
 #define BPF_RAW_TP()
 #endif
 
 #ifdef CONFIG_SERIAL_EARLYCON
 #define EARLYCON_TABLE() . = ALIGN(8);				\
-			 __earlycon_table = .;			\
+			 VMLINUX_SYMBOL(__earlycon_table) = .;	\
 			 KEEP(*(__earlycon_table))		\
-			 __earlycon_table_end = .;
+			 VMLINUX_SYMBOL(__earlycon_table_end) = .;
 #else
 #define EARLYCON_TABLE()
 #endif
@@ -225,7 +227,7 @@
 #define _OF_TABLE_0(name)
 #define _OF_TABLE_1(name)						\
 	. = ALIGN(8);							\
-	__##name##_of_table = .;					\
+	VMLINUX_SYMBOL(__##name##_of_table) = .;			\
 	KEEP(*(__##name##_of_table))					\
 	KEEP(*(__##name##_of_table_end))
 
@@ -239,9 +241,9 @@
 #ifdef CONFIG_ACPI
 #define ACPI_PROBE_TABLE(name)						\
 	. = ALIGN(8);							\
-	__##name##_acpi_probe_table = .;				\
+	VMLINUX_SYMBOL(__##name##_acpi_probe_table) = .;		\
 	KEEP(*(__##name##_acpi_probe_table))				\
-	__##name##_acpi_probe_table_end = .;
+	VMLINUX_SYMBOL(__##name##_acpi_probe_table_end) = .;
 #else
 #define ACPI_PROBE_TABLE(name)
 #endif
@@ -258,9 +260,9 @@
 
 #define KERNEL_DTB()							\
 	STRUCT_ALIGN();							\
-	__dtb_start = .;						\
+	VMLINUX_SYMBOL(__dtb_start) = .;				\
 	KEEP(*(.dtb.init.rodata))					\
-	__dtb_end = .;
+	VMLINUX_SYMBOL(__dtb_end) = .;
 
 /*
  * .data section
@@ -273,16 +275,16 @@
 	MEM_KEEP(init.data*)						\
 	MEM_KEEP(exit.data*)						\
 	*(.data.unlikely)						\
-	__start_once = .;						\
+	VMLINUX_SYMBOL(__start_once) = .;				\
 	*(.data.once)							\
-	__end_once = .;							\
+	VMLINUX_SYMBOL(__end_once) = .;					\
 	STRUCT_ALIGN();							\
 	*(__tracepoints)						\
 	/* implement dynamic printk debug */				\
 	. = ALIGN(8);							\
-	__start___verbose = .;						\
+	VMLINUX_SYMBOL(__start___verbose) = .;				\
 	KEEP(*(__verbose))                                              \
-	__stop___verbose = .;						\
+	VMLINUX_SYMBOL(__stop___verbose) = .;				\
 	LIKELY_PROFILE()		       				\
 	BRANCH_PROFILE()						\
 	TRACE_PRINTKS()							\
@@ -294,10 +296,10 @@
  */
 #define NOSAVE_DATA							\
 	. = ALIGN(PAGE_SIZE);						\
-	__nosave_begin = .;						\
+	VMLINUX_SYMBOL(__nosave_begin) = .;				\
 	*(.data..nosave)						\
 	. = ALIGN(PAGE_SIZE);						\
-	__nosave_end = .;
+	VMLINUX_SYMBOL(__nosave_end) = .;
 
 #define PAGE_ALIGNED_DATA(page_align)					\
 	. = ALIGN(page_align);						\
@@ -314,13 +316,13 @@
 
 #define INIT_TASK_DATA(align)						\
 	. = ALIGN(align);						\
-	__start_init_task = .;						\
-	init_thread_union = .;						\
-	init_stack = .;							\
+	VMLINUX_SYMBOL(__start_init_task) = .;				\
+	VMLINUX_SYMBOL(init_thread_union) = .;				\
+	VMLINUX_SYMBOL(init_stack) = .;					\
 	KEEP(*(.data..init_task))					\
 	KEEP(*(.data..init_thread_info))				\
-	. = __start_init_task + THREAD_SIZE;				\
-	__end_init_task = .;
+	. = VMLINUX_SYMBOL(__start_init_task) + THREAD_SIZE;		\
+	VMLINUX_SYMBOL(__end_init_task) = .;
 
 #define JUMP_TABLE_DATA							\
 	. = ALIGN(8);							\
@@ -334,10 +336,10 @@
  */
 #ifndef RO_AFTER_INIT_DATA
 #define RO_AFTER_INIT_DATA						\
-	__start_ro_after_init = .;					\
+	VMLINUX_SYMBOL(__start_ro_after_init) = .;			\
 	*(.data..ro_after_init)						\
 	JUMP_TABLE_DATA							\
-	__end_ro_after_init = .;
+	VMLINUX_SYMBOL(__end_ro_after_init) = .;
 #endif
 
 /*
@@ -346,13 +348,13 @@
 #define RO_DATA_SECTION(align)						\
 	. = ALIGN((align));						\
 	.rodata           : AT(ADDR(.rodata) - LOAD_OFFSET) {		\
-		__start_rodata = .;					\
+		VMLINUX_SYMBOL(__start_rodata) = .;			\
 		*(.rodata) *(.rodata.*)					\
 		RO_AFTER_INIT_DATA	/* Read only after init */	\
 		. = ALIGN(8);						\
-		__start___tracepoints_ptrs = .;				\
+		VMLINUX_SYMBOL(__start___tracepoints_ptrs) = .;		\
 		KEEP(*(__tracepoints_ptrs)) /* Tracepoints: pointer array */ \
-		__stop___tracepoints_ptrs = .;				\
+		VMLINUX_SYMBOL(__stop___tracepoints_ptrs) = .;		\
 		*(__tracepoints_strings)/* Tracepoints: strings */	\
 	}								\
 									\
@@ -362,109 +364,109 @@
 									\
 	/* PCI quirks */						\
 	.pci_fixup        : AT(ADDR(.pci_fixup) - LOAD_OFFSET) {	\
-		__start_pci_fixups_early = .;				\
+		VMLINUX_SYMBOL(__start_pci_fixups_early) = .;		\
 		KEEP(*(.pci_fixup_early))				\
-		__end_pci_fixups_early = .;				\
-		__start_pci_fixups_header = .;				\
+		VMLINUX_SYMBOL(__end_pci_fixups_early) = .;		\
+		VMLINUX_SYMBOL(__start_pci_fixups_header) = .;		\
 		KEEP(*(.pci_fixup_header))				\
-		__end_pci_fixups_header = .;				\
-		__start_pci_fixups_final = .;				\
+		VMLINUX_SYMBOL(__end_pci_fixups_header) = .;		\
+		VMLINUX_SYMBOL(__start_pci_fixups_final) = .;		\
 		KEEP(*(.pci_fixup_final))				\
-		__end_pci_fixups_final = .;				\
-		__start_pci_fixups_enable = .;				\
+		VMLINUX_SYMBOL(__end_pci_fixups_final) = .;		\
+		VMLINUX_SYMBOL(__start_pci_fixups_enable) = .;		\
 		KEEP(*(.pci_fixup_enable))				\
-		__end_pci_fixups_enable = .;				\
-		__start_pci_fixups_resume = .;				\
+		VMLINUX_SYMBOL(__end_pci_fixups_enable) = .;		\
+		VMLINUX_SYMBOL(__start_pci_fixups_resume) = .;		\
 		KEEP(*(.pci_fixup_resume))				\
-		__end_pci_fixups_resume = .;				\
-		__start_pci_fixups_resume_early = .;			\
+		VMLINUX_SYMBOL(__end_pci_fixups_resume) = .;		\
+		VMLINUX_SYMBOL(__start_pci_fixups_resume_early) = .;	\
 		KEEP(*(.pci_fixup_resume_early))			\
-		__end_pci_fixups_resume_early = .;			\
-		__start_pci_fixups_suspend = .;				\
+		VMLINUX_SYMBOL(__end_pci_fixups_resume_early) = .;	\
+		VMLINUX_SYMBOL(__start_pci_fixups_suspend) = .;		\
 		KEEP(*(.pci_fixup_suspend))				\
-		__end_pci_fixups_suspend = .;				\
-		__start_pci_fixups_suspend_late = .;			\
+		VMLINUX_SYMBOL(__end_pci_fixups_suspend) = .;		\
+		VMLINUX_SYMBOL(__start_pci_fixups_suspend_late) = .;	\
 		KEEP(*(.pci_fixup_suspend_late))			\
-		__end_pci_fixups_suspend_late = .;			\
+		VMLINUX_SYMBOL(__end_pci_fixups_suspend_late) = .;	\
 	}								\
 									\
 	/* Built-in firmware blobs */					\
 	.builtin_fw        : AT(ADDR(.builtin_fw) - LOAD_OFFSET) {	\
-		__start_builtin_fw = .;					\
+		VMLINUX_SYMBOL(__start_builtin_fw) = .;			\
 		KEEP(*(.builtin_fw))					\
-		__end_builtin_fw = .;					\
+		VMLINUX_SYMBOL(__end_builtin_fw) = .;			\
 	}								\
 									\
 	TRACEDATA							\
 									\
 	/* Kernel symbol table: Normal symbols */			\
 	__ksymtab         : AT(ADDR(__ksymtab) - LOAD_OFFSET) {		\
-		__start___ksymtab = .;					\
+		VMLINUX_SYMBOL(__start___ksymtab) = .;			\
 		KEEP(*(SORT(___ksymtab+*)))				\
-		__stop___ksymtab = .;					\
+		VMLINUX_SYMBOL(__stop___ksymtab) = .;			\
 	}								\
 									\
 	/* Kernel symbol table: GPL-only symbols */			\
 	__ksymtab_gpl     : AT(ADDR(__ksymtab_gpl) - LOAD_OFFSET) {	\
-		__start___ksymtab_gpl = .;				\
+		VMLINUX_SYMBOL(__start___ksymtab_gpl) = .;		\
 		KEEP(*(SORT(___ksymtab_gpl+*)))				\
-		__stop___ksymtab_gpl = .;				\
+		VMLINUX_SYMBOL(__stop___ksymtab_gpl) = .;		\
 	}								\
 									\
 	/* Kernel symbol table: Normal unused symbols */		\
 	__ksymtab_unused  : AT(ADDR(__ksymtab_unused) - LOAD_OFFSET) {	\
-		__start___ksymtab_unused = .;				\
+		VMLINUX_SYMBOL(__start___ksymtab_unused) = .;		\
 		KEEP(*(SORT(___ksymtab_unused+*)))			\
-		__stop___ksymtab_unused = .;				\
+		VMLINUX_SYMBOL(__stop___ksymtab_unused) = .;		\
 	}								\
 									\
 	/* Kernel symbol table: GPL-only unused symbols */		\
 	__ksymtab_unused_gpl : AT(ADDR(__ksymtab_unused_gpl) - LOAD_OFFSET) { \
-		__start___ksymtab_unused_gpl = .;			\
+		VMLINUX_SYMBOL(__start___ksymtab_unused_gpl) = .;	\
 		KEEP(*(SORT(___ksymtab_unused_gpl+*)))			\
-		__stop___ksymtab_unused_gpl = .;			\
+		VMLINUX_SYMBOL(__stop___ksymtab_unused_gpl) = .;	\
 	}								\
 									\
 	/* Kernel symbol table: GPL-future-only symbols */		\
 	__ksymtab_gpl_future : AT(ADDR(__ksymtab_gpl_future) - LOAD_OFFSET) { \
-		__start___ksymtab_gpl_future = .;			\
+		VMLINUX_SYMBOL(__start___ksymtab_gpl_future) = .;	\
 		KEEP(*(SORT(___ksymtab_gpl_future+*)))			\
-		__stop___ksymtab_gpl_future = .;			\
+		VMLINUX_SYMBOL(__stop___ksymtab_gpl_future) = .;	\
 	}								\
 									\
 	/* Kernel symbol table: Normal symbols */			\
 	__kcrctab         : AT(ADDR(__kcrctab) - LOAD_OFFSET) {		\
-		__start___kcrctab = .;					\
+		VMLINUX_SYMBOL(__start___kcrctab) = .;			\
 		KEEP(*(SORT(___kcrctab+*)))				\
-		__stop___kcrctab = .;					\
+		VMLINUX_SYMBOL(__stop___kcrctab) = .;			\
 	}								\
 									\
 	/* Kernel symbol table: GPL-only symbols */			\
 	__kcrctab_gpl     : AT(ADDR(__kcrctab_gpl) - LOAD_OFFSET) {	\
-		__start___kcrctab_gpl = .;				\
+		VMLINUX_SYMBOL(__start___kcrctab_gpl) = .;		\
 		KEEP(*(SORT(___kcrctab_gpl+*)))				\
-		__stop___kcrctab_gpl = .;				\
+		VMLINUX_SYMBOL(__stop___kcrctab_gpl) = .;		\
 	}								\
 									\
 	/* Kernel symbol table: Normal unused symbols */		\
 	__kcrctab_unused  : AT(ADDR(__kcrctab_unused) - LOAD_OFFSET) {	\
-		__start___kcrctab_unused = .;				\
+		VMLINUX_SYMBOL(__start___kcrctab_unused) = .;		\
 		KEEP(*(SORT(___kcrctab_unused+*)))			\
-		__stop___kcrctab_unused = .;				\
+		VMLINUX_SYMBOL(__stop___kcrctab_unused) = .;		\
 	}								\
 									\
 	/* Kernel symbol table: GPL-only unused symbols */		\
 	__kcrctab_unused_gpl : AT(ADDR(__kcrctab_unused_gpl) - LOAD_OFFSET) { \
-		__start___kcrctab_unused_gpl = .;			\
+		VMLINUX_SYMBOL(__start___kcrctab_unused_gpl) = .;	\
 		KEEP(*(SORT(___kcrctab_unused_gpl+*)))			\
-		__stop___kcrctab_unused_gpl = .;			\
+		VMLINUX_SYMBOL(__stop___kcrctab_unused_gpl) = .;	\
 	}								\
 									\
 	/* Kernel symbol table: GPL-future-only symbols */		\
 	__kcrctab_gpl_future : AT(ADDR(__kcrctab_gpl_future) - LOAD_OFFSET) { \
-		__start___kcrctab_gpl_future = .;			\
+		VMLINUX_SYMBOL(__start___kcrctab_gpl_future) = .;	\
 		KEEP(*(SORT(___kcrctab_gpl_future+*)))			\
-		__stop___kcrctab_gpl_future = .;			\
+		VMLINUX_SYMBOL(__stop___kcrctab_gpl_future) = .;	\
 	}								\
 									\
 	/* Kernel symbol table: strings */				\
@@ -481,18 +483,18 @@
 									\
 	/* Built-in module parameters. */				\
 	__param : AT(ADDR(__param) - LOAD_OFFSET) {			\
-		__start___param = .;					\
+		VMLINUX_SYMBOL(__start___param) = .;			\
 		KEEP(*(__param))					\
-		__stop___param = .;					\
+		VMLINUX_SYMBOL(__stop___param) = .;			\
 	}								\
 									\
 	/* Built-in module versions. */					\
 	__modver : AT(ADDR(__modver) - LOAD_OFFSET) {			\
-		__start___modver = .;					\
+		VMLINUX_SYMBOL(__start___modver) = .;			\
 		KEEP(*(__modver))					\
-		__stop___modver = .;					\
+		VMLINUX_SYMBOL(__stop___modver) = .;			\
 		. = ALIGN((align));					\
-		__end_rodata = .;					\
+		VMLINUX_SYMBOL(__end_rodata) = .;			\
 	}								\
 	. = ALIGN((align));
 
@@ -522,47 +524,47 @@
  * address even at second ld pass when generating System.map */
 #define SCHED_TEXT							\
 		ALIGN_FUNCTION();					\
-		__sched_text_start = .;					\
+		VMLINUX_SYMBOL(__sched_text_start) = .;			\
 		*(.sched.text)						\
-		__sched_text_end = .;
+		VMLINUX_SYMBOL(__sched_text_end) = .;
 
 /* spinlock.text is aling to function alignment to secure we have same
  * address even at second ld pass when generating System.map */
 #define LOCK_TEXT							\
 		ALIGN_FUNCTION();					\
-		__lock_text_start = .;					\
+		VMLINUX_SYMBOL(__lock_text_start) = .;			\
 		*(.spinlock.text)					\
-		__lock_text_end = .;
+		VMLINUX_SYMBOL(__lock_text_end) = .;
 
 #define CPUIDLE_TEXT							\
 		ALIGN_FUNCTION();					\
-		__cpuidle_text_start = .;				\
+		VMLINUX_SYMBOL(__cpuidle_text_start) = .;		\
 		*(.cpuidle.text)					\
-		__cpuidle_text_end = .;
+		VMLINUX_SYMBOL(__cpuidle_text_end) = .;
 
 #define KPROBES_TEXT							\
 		ALIGN_FUNCTION();					\
-		__kprobes_text_start = .;				\
+		VMLINUX_SYMBOL(__kprobes_text_start) = .;		\
 		*(.kprobes.text)					\
-		__kprobes_text_end = .;
+		VMLINUX_SYMBOL(__kprobes_text_end) = .;
 
 #define ENTRY_TEXT							\
 		ALIGN_FUNCTION();					\
-		__entry_text_start = .;					\
+		VMLINUX_SYMBOL(__entry_text_start) = .;			\
 		*(.entry.text)						\
-		__entry_text_end = .;
+		VMLINUX_SYMBOL(__entry_text_end) = .;
 
 #define IRQENTRY_TEXT							\
 		ALIGN_FUNCTION();					\
-		__irqentry_text_start = .;				\
+		VMLINUX_SYMBOL(__irqentry_text_start) = .;		\
 		*(.irqentry.text)					\
-		__irqentry_text_end = .;
+		VMLINUX_SYMBOL(__irqentry_text_end) = .;
 
 #define SOFTIRQENTRY_TEXT						\
 		ALIGN_FUNCTION();					\
-		__softirqentry_text_start = .;				\
+		VMLINUX_SYMBOL(__softirqentry_text_start) = .;		\
 		*(.softirqentry.text)					\
-		__softirqentry_text_end = .;
+		VMLINUX_SYMBOL(__softirqentry_text_end) = .;
 
 /* Section used for early init (in .S files) */
 #define HEAD_TEXT  KEEP(*(.head.text))
@@ -578,9 +580,9 @@
 #define EXCEPTION_TABLE(align)						\
 	. = ALIGN(align);						\
 	__ex_table : AT(ADDR(__ex_table) - LOAD_OFFSET) {		\
-		__start___ex_table = .;					\
+		VMLINUX_SYMBOL(__start___ex_table) = .;			\
 		KEEP(*(__ex_table))					\
-		__stop___ex_table = .;					\
+		VMLINUX_SYMBOL(__stop___ex_table) = .;			\
 	}
 
 /*
@@ -594,11 +596,11 @@
 
 #ifdef CONFIG_CONSTRUCTORS
 #define KERNEL_CTORS()	. = ALIGN(8);			   \
-			__ctors_start = .;		   \
+			VMLINUX_SYMBOL(__ctors_start) = .; \
 			KEEP(*(.ctors))			   \
 			KEEP(*(SORT(.init_array.*)))	   \
 			KEEP(*(.init_array))		   \
-			__ctors_end = .;
+			VMLINUX_SYMBOL(__ctors_end) = .;
 #else
 #define KERNEL_CTORS()
 #endif
@@ -734,9 +736,9 @@
 #define BUG_TABLE							\
 	. = ALIGN(8);							\
 	__bug_table : AT(ADDR(__bug_table) - LOAD_OFFSET) {		\
-		__start___bug_table = .;				\
+		VMLINUX_SYMBOL(__start___bug_table) = .;		\
 		KEEP(*(__bug_table))					\
-		__stop___bug_table = .;					\
+		VMLINUX_SYMBOL(__stop___bug_table) = .;			\
 	}
 #else
 #define BUG_TABLE
@@ -746,22 +748,22 @@
 #define ORC_UNWIND_TABLE						\
 	. = ALIGN(4);							\
 	.orc_unwind_ip : AT(ADDR(.orc_unwind_ip) - LOAD_OFFSET) {	\
-		__start_orc_unwind_ip = .;				\
+		VMLINUX_SYMBOL(__start_orc_unwind_ip) = .;		\
 		KEEP(*(.orc_unwind_ip))					\
-		__stop_orc_unwind_ip = .;				\
+		VMLINUX_SYMBOL(__stop_orc_unwind_ip) = .;		\
 	}								\
 	. = ALIGN(2);							\
 	.orc_unwind : AT(ADDR(.orc_unwind) - LOAD_OFFSET) {		\
-		__start_orc_unwind = .;					\
+		VMLINUX_SYMBOL(__start_orc_unwind) = .;			\
 		KEEP(*(.orc_unwind))					\
-		__stop_orc_unwind = .;					\
+		VMLINUX_SYMBOL(__stop_orc_unwind) = .;			\
 	}								\
 	. = ALIGN(4);							\
 	.orc_lookup : AT(ADDR(.orc_lookup) - LOAD_OFFSET) {		\
-		orc_lookup = .;						\
+		VMLINUX_SYMBOL(orc_lookup) = .;				\
 		. += (((SIZEOF(.text) + LOOKUP_BLOCK_SIZE - 1) /	\
 			LOOKUP_BLOCK_SIZE) + 1) * 4;			\
-		orc_lookup_end = .;					\
+		VMLINUX_SYMBOL(orc_lookup_end) = .;			\
 	}
 #else
 #define ORC_UNWIND_TABLE
@@ -771,9 +773,9 @@
 #define TRACEDATA							\
 	. = ALIGN(4);							\
 	.tracedata : AT(ADDR(.tracedata) - LOAD_OFFSET) {		\
-		__tracedata_start = .;					\
+		VMLINUX_SYMBOL(__tracedata_start) = .;			\
 		KEEP(*(.tracedata))					\
-		__tracedata_end = .;					\
+		VMLINUX_SYMBOL(__tracedata_end) = .;			\
 	}
 #else
 #define TRACEDATA
@@ -781,24 +783,24 @@
 
 #define NOTES								\
 	.notes : AT(ADDR(.notes) - LOAD_OFFSET) {			\
-		__start_notes = .;					\
+		VMLINUX_SYMBOL(__start_notes) = .;			\
 		KEEP(*(.note.*))					\
-		__stop_notes = .;					\
+		VMLINUX_SYMBOL(__stop_notes) = .;			\
 	}
 
 #define INIT_SETUP(initsetup_align)					\
 		. = ALIGN(initsetup_align);				\
-		__setup_start = .;					\
+		VMLINUX_SYMBOL(__setup_start) = .;			\
 		KEEP(*(.init.setup))					\
-		__setup_end = .;
+		VMLINUX_SYMBOL(__setup_end) = .;
 
 #define INIT_CALLS_LEVEL(level)						\
-		__initcall##level##_start = .;				\
+		VMLINUX_SYMBOL(__initcall##level##_start) = .;		\
 		KEEP(*(.initcall##level##.init))			\
 		KEEP(*(.initcall##level##s.init))			\
 
 #define INIT_CALLS							\
-		__initcall_start = .;					\
+		VMLINUX_SYMBOL(__initcall_start) = .;			\
 		KEEP(*(.initcallearly.init))				\
 		INIT_CALLS_LEVEL(0)					\
 		INIT_CALLS_LEVEL(1)					\
@@ -809,17 +811,17 @@
 		INIT_CALLS_LEVEL(rootfs)				\
 		INIT_CALLS_LEVEL(6)					\
 		INIT_CALLS_LEVEL(7)					\
-		__initcall_end = .;
+		VMLINUX_SYMBOL(__initcall_end) = .;
 
 #define CON_INITCALL							\
-		__con_initcall_start = .;				\
+		VMLINUX_SYMBOL(__con_initcall_start) = .;		\
 		KEEP(*(.con_initcall.init))				\
-		__con_initcall_end = .;
+		VMLINUX_SYMBOL(__con_initcall_end) = .;
 
 #ifdef CONFIG_BLK_DEV_INITRD
 #define INIT_RAM_FS							\
 	. = ALIGN(4);							\
-	__initramfs_start = .;						\
+	VMLINUX_SYMBOL(__initramfs_start) = .;				\
 	KEEP(*(.init.ramfs))						\
 	. = ALIGN(8);							\
 	KEEP(*(.init.ramfs.info))
@@ -875,7 +877,7 @@
  * sharing between subsections for different purposes.
  */
 #define PERCPU_INPUT(cacheline)						\
-	__per_cpu_start = .;						\
+	VMLINUX_SYMBOL(__per_cpu_start) = .;				\
 	*(.data..percpu..first)						\
 	. = ALIGN(PAGE_SIZE);						\
 	*(.data..percpu..page_aligned)					\
@@ -885,7 +887,7 @@
 	*(.data..percpu)						\
 	*(.data..percpu..shared_aligned)				\
 	PERCPU_DECRYPTED_SECTION					\
-	__per_cpu_end = .;
+	VMLINUX_SYMBOL(__per_cpu_end) = .;
 
 /**
  * PERCPU_VADDR - define output section for percpu area
@@ -912,11 +914,12 @@
  * address, use PERCPU_SECTION.
  */
 #define PERCPU_VADDR(cacheline, vaddr, phdr)				\
-	__per_cpu_load = .;						\
-	.data..percpu vaddr : AT(__per_cpu_load - LOAD_OFFSET) {	\
+	VMLINUX_SYMBOL(__per_cpu_load) = .;				\
+	.data..percpu vaddr : AT(VMLINUX_SYMBOL(__per_cpu_load)		\
+				- LOAD_OFFSET) {			\
 		PERCPU_INPUT(cacheline)					\
 	} phdr								\
-	. = __per_cpu_load + SIZEOF(.data..percpu);
+	. = VMLINUX_SYMBOL(__per_cpu_load) + SIZEOF(.data..percpu);
 
 /**
  * PERCPU_SECTION - define output section for percpu area, simple version
@@ -933,7 +936,7 @@
 #define PERCPU_SECTION(cacheline)					\
 	. = ALIGN(PAGE_SIZE);						\
 	.data..percpu	: AT(ADDR(.data..percpu) - LOAD_OFFSET) {	\
-		__per_cpu_load = .;					\
+		VMLINUX_SYMBOL(__per_cpu_load) = .;			\
 		PERCPU_INPUT(cacheline)					\
 	}
 
@@ -972,9 +975,9 @@
 #define INIT_TEXT_SECTION(inittext_align)				\
 	. = ALIGN(inittext_align);					\
 	.init.text : AT(ADDR(.init.text) - LOAD_OFFSET) {		\
-		_sinittext = .;						\
+		VMLINUX_SYMBOL(_sinittext) = .;				\
 		INIT_TEXT						\
-		_einittext = .;						\
+		VMLINUX_SYMBOL(_einittext) = .;				\
 	}
 
 #define INIT_DATA_SECTION(initsetup_align)				\
@@ -988,8 +991,8 @@
 
 #define BSS_SECTION(sbss_align, bss_align, stop_align)			\
 	. = ALIGN(sbss_align);						\
-	__bss_start = .;						\
+	VMLINUX_SYMBOL(__bss_start) = .;				\
 	SBSS(sbss_align)						\
 	BSS(bss_align)							\
 	. = ALIGN(stop_align);						\
-	__bss_stop = .;
+	VMLINUX_SYMBOL(__bss_stop) = .;
diff --git a/include/linux/export.h b/include/linux/export.h
index fd8711ed9ac4..34c34d09103c 100644
--- a/include/linux/export.h
+++ b/include/linux/export.h
@@ -10,6 +10,19 @@
  * hackers place grumpy comments in header files.
  */
 
+/* Some toolchains use a `_' prefix for all user symbols. */
+#ifdef CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX
+#define __VMLINUX_SYMBOL(x) _##x
+#define __VMLINUX_SYMBOL_STR(x) "_" #x
+#else
+#define __VMLINUX_SYMBOL(x) x
+#define __VMLINUX_SYMBOL_STR(x) #x
+#endif
+
+/* Indirect, so macros are expanded before pasting. */
+#define VMLINUX_SYMBOL(x) __VMLINUX_SYMBOL(x)
+#define VMLINUX_SYMBOL_STR(x) __VMLINUX_SYMBOL_STR(x)
+
 #ifndef __ASSEMBLY__
 #ifdef MODULE
 extern struct module __this_module;
@@ -27,14 +40,14 @@ extern struct module __this_module;
 #if defined(CONFIG_MODULE_REL_CRCS)
 #define __CRC_SYMBOL(sym, sec)						\
 	asm("	.section \"___kcrctab" sec "+" #sym "\", \"a\"	\n"	\
-	    "	.weak	__crc_" #sym "				\n"	\
-	    "	.long	__crc_" #sym " - .			\n"	\
+	    "	.weak	" VMLINUX_SYMBOL_STR(__crc_##sym) "	\n"	\
+	    "	.long	" VMLINUX_SYMBOL_STR(__crc_##sym) " - .	\n"	\
 	    "	.previous					\n");
 #else
 #define __CRC_SYMBOL(sym, sec)						\
 	asm("	.section \"___kcrctab" sec "+" #sym "\", \"a\"	\n"	\
-	    "	.weak	__crc_" #sym "				\n"	\
-	    "	.long	__crc_" #sym "				\n"	\
+	    "	.weak	" VMLINUX_SYMBOL_STR(__crc_##sym) "	\n"	\
+	    "	.long	" VMLINUX_SYMBOL_STR(__crc_##sym) "	\n"	\
 	    "	.previous					\n");
 #endif
 #else
@@ -80,7 +93,7 @@ struct kernel_symbol {
 	__CRC_SYMBOL(sym, sec)						\
 	static const char __kstrtab_##sym[]				\
 	__attribute__((section("__ksymtab_strings"), used, aligned(1)))	\
-	= #sym;								\
+	= VMLINUX_SYMBOL_STR(sym);					\
 	__KSYMTAB_ENTRY(sym, sec)
 
 #if defined(__DISABLE_EXPORTS)
diff --git a/include/linux/linkage.h b/include/linux/linkage.h
index 7e020782ade2..d287823ee947 100644
--- a/include/linux/linkage.h
+++ b/include/linux/linkage.h
@@ -24,16 +24,16 @@
 
 #ifndef cond_syscall
 #define cond_syscall(x)	asm(				\
-	".weak " __stringify(x) "\n\t"			\
-	".set  " __stringify(x) ","			\
-		 __stringify(sys_ni_syscall))
+	".weak " VMLINUX_SYMBOL_STR(x) "\n\t"		\
+	".set  " VMLINUX_SYMBOL_STR(x) ","		\
+		 VMLINUX_SYMBOL_STR(sys_ni_syscall))
 #endif
 
 #ifndef SYSCALL_ALIAS
 #define SYSCALL_ALIAS(alias, name) asm(			\
-	".globl " __stringify(alias) "\n\t"		\
-	".set   " __stringify(alias) ","		\
-		  __stringify(name))
+	".globl " VMLINUX_SYMBOL_STR(alias) "\n\t"	\
+	".set   " VMLINUX_SYMBOL_STR(alias) ","		\
+		  VMLINUX_SYMBOL_STR(name))
 #endif
 
 #define __page_aligned_data	__section(.data..page_aligned) __aligned(PAGE_SIZE)
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 2f66ed388d1c..aa4dac94e563 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -123,6 +123,7 @@ $(obj)/%.i: $(src)/%.c FORCE
 cmd_gensymtypes_c =                                                         \
     $(CPP) -D__GENKSYMS__ $(c_flags) $< |                                   \
     scripts/genksyms/genksyms $(if $(1), -T $(2))                           \
+     $(patsubst y,-s _,$(CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX))             \
      $(patsubst y,-R,$(CONFIG_MODULE_REL_CRCS))                             \
      $(if $(KBUILD_PRESERVE),-p)                                            \
      -r $(firstword $(wildcard $(2:.symtypes=.symref) /dev/null))
@@ -334,6 +335,7 @@ cmd_gensymtypes_S =                                                         \
      sed 's/.*___EXPORT_SYMBOL[[:space:]]*\([a-zA-Z0-9_]*\)[[:space:]]*,.*/EXPORT_SYMBOL(\1);/' ; } | \
     $(CPP) -D__GENKSYMS__ $(c_flags) -xc - |                                \
     scripts/genksyms/genksyms $(if $(1), -T $(2))                           \
+     $(patsubst y,-s _,$(CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX))             \
      $(patsubst y,-R,$(CONFIG_MODULE_REL_CRCS))                             \
      $(if $(KBUILD_PRESERVE),-p)                                            \
      -r $(firstword $(wildcard $(2:.symtypes=.symref) /dev/null))
@@ -444,10 +446,15 @@ targets += $(lib-target)
 
 dummy-object = $(obj)/.lib_exports.o
 ksyms-lds = $(dot-target).lds
+ifdef CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX
+ref_prefix = EXTERN(_
+else
+ref_prefix = EXTERN(
+endif
 
 quiet_cmd_export_list = EXPORTS $@
 cmd_export_list = $(OBJDUMP) -h $< | \
-	sed -ne '/___ksymtab/s/.*+\([^ ]*\).*/EXTERN(\1)/p' >$(ksyms-lds);\
+	sed -ne '/___ksymtab/s/.*+\([^ ]*\).*/$(ref_prefix)\1)/p' >$(ksyms-lds);\
 	rm -f $(dummy-object);\
 	echo | $(CC) $(a_flags) -c -o $(dummy-object) -x assembler -;\
 	$(LD) $(ld_flags) -r -o $@ -T $(ksyms-lds) $(dummy-object);\
diff --git a/scripts/adjust_autoksyms.sh b/scripts/adjust_autoksyms.sh
index a904bf1f5e67..e6023399ceb0 100755
--- a/scripts/adjust_autoksyms.sh
+++ b/scripts/adjust_autoksyms.sh
@@ -49,6 +49,12 @@ EOT
 sed 's/ko$/mod/' modules.order |
 xargs -n1 sed -n -e '2{s/ /\n/g;/^$/!p;}' -- |
 sort -u |
+while read -r sym; do
+	if [ -n "$CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX" ]; then
+		sym="${sym#_}"
+	fi
+	echo "${sym}"
+done |
 sed -e 's/\(.*\)/#define __KSYM_\1 1/' >> "$new_ksyms_file"
 
 # Special case for modversions (see modpost.c)
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index e739f565497e..c65f5647d2af 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -5274,6 +5274,16 @@ sub process {
 			}
 		}
 
+# make sure symbols are always wrapped with VMLINUX_SYMBOL() ...
+# either side of an assignment may be only one of the following:
+#	.
+#	ALIGN(...)
+#	VMLINUX_SYMBOL(...)
+		if ($realfile eq 'vmlinux.lds.h' && $line =~ /(?:(?:^|\s)$Ident\s*=|=\s*$Ident(?:\s|$))/) {
+			WARN("MISSING_VMLINUX_SYMBOL",
+			     "vmlinux.lds.h needs VMLINUX_SYMBOL() around C-visible symbols\n" . $herecurr);
+		}
+
 # check for redundant bracing round if etc
 		if ($line =~ /(^.*)\bif\b/ && $1 !~ /else\s*$/) {
 			my ($level, $endln, @chunks) =
diff --git a/scripts/depmod.sh b/scripts/depmod.sh
index e083bcae343f..805b6d5b36a2 100755
--- a/scripts/depmod.sh
+++ b/scripts/depmod.sh
@@ -3,12 +3,13 @@
 #
 # A depmod wrapper used by the toplevel Makefile
 
-if test $# -ne 2; then
-	echo "Usage: $0 /sbin/depmod <kernelrelease>" >&2
+if test $# -ne 3; then
+	echo "Usage: $0 /sbin/depmod <kernelrelease> <symbolprefix>" >&2
 	exit 1
 fi
 DEPMOD=$1
 KERNELRELEASE=$2
+SYMBOL_PREFIX=$3
 
 if ! test -r System.map ; then
 	echo "Warning: modules_install: missing 'System.map' file. Skipping depmod." >&2
@@ -21,6 +22,24 @@ if [ -z $(command -v $DEPMOD) ]; then
 	exit 0
 fi
 
+# older versions of depmod don't support -P <symbol-prefix>
+# support was added in module-init-tools 3.13
+if test -n "$SYMBOL_PREFIX"; then
+	release=$("$DEPMOD" --version)
+	package=$(echo "$release" | cut -d' ' -f 1)
+	if test "$package" = "module-init-tools"; then
+		version=$(echo "$release" | cut -d' ' -f 2)
+		later=$(printf '%s\n' "$version" "3.13" | sort -V | tail -n 1)
+		if test "$later" != "$version"; then
+			# module-init-tools < 3.13, drop the symbol prefix
+			SYMBOL_PREFIX=""
+		fi
+	fi
+	if test -n "$SYMBOL_PREFIX"; then
+		SYMBOL_PREFIX="-P $SYMBOL_PREFIX"
+	fi
+fi
+
 # older versions of depmod require the version string to start with three
 # numbers, so we cheat with a symlink here
 depmod_hack_needed=true
@@ -43,7 +62,7 @@ set -- -ae -F System.map
 if test -n "$INSTALL_MOD_PATH"; then
 	set -- "$@" -b "$INSTALL_MOD_PATH"
 fi
-"$DEPMOD" "$@" "$KERNELRELEASE"
+"$DEPMOD" "$@" "$KERNELRELEASE" $SYMBOL_PREFIX
 ret=$?
 
 if $depmod_hack_needed; then
diff --git a/scripts/genksyms/genksyms.c b/scripts/genksyms/genksyms.c
index 23eff234184f..139bf698a7a8 100644
--- a/scripts/genksyms/genksyms.c
+++ b/scripts/genksyms/genksyms.c
@@ -34,6 +34,7 @@ int in_source_file;
 
 static int flag_debug, flag_dump_defs, flag_reference, flag_dump_types,
 	   flag_preserve, flag_warnings, flag_rel_crcs;
+static const char *mod_prefix = "";
 
 static int errors;
 static int nsyms;
@@ -681,10 +682,10 @@ void export_symbol(const char *name)
 			fputs(">\n", debugfile);
 
 		/* Used as a linker script. */
-		printf(!flag_rel_crcs ? "__crc_%s = 0x%08lx;\n" :
+		printf(!flag_rel_crcs ? "%s__crc_%s = 0x%08lx;\n" :
 		       "SECTIONS { .rodata : ALIGN(4) { "
-		       "__crc_%s = .; LONG(0x%08lx); } }\n",
-		       name, crc);
+		       "%s__crc_%s = .; LONG(0x%08lx); } }\n",
+		       mod_prefix, name, crc);
 	}
 }
 
@@ -757,6 +758,7 @@ int main(int argc, char **argv)
 
 #ifdef __GNU_LIBRARY__
 	struct option long_opts[] = {
+		{"symbol-prefix", 1, 0, 's'},
 		{"debug", 0, 0, 'd'},
 		{"warnings", 0, 0, 'w'},
 		{"quiet", 0, 0, 'q'},
@@ -776,6 +778,9 @@ int main(int argc, char **argv)
 	while ((o = getopt(argc, argv, "s:dwqVDr:T:phR")) != EOF)
 #endif				/* __GNU_LIBRARY__ */
 		switch (o) {
+		case 's':
+			mod_prefix = optarg;
+			break;
 		case 'd':
 			flag_debug++;
 			break;
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index ae6504d07fd6..75ec25554111 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -60,6 +60,7 @@ static struct sym_entry *table;
 static unsigned int table_size, table_cnt;
 static int all_symbols = 0;
 static int absolute_percpu = 0;
+static char symbol_prefix_char = '\0';
 static int base_relative = 0;
 
 static int token_profit[0x10000];
@@ -72,6 +73,7 @@ static unsigned char best_table_len[256];
 static void usage(void)
 {
 	fprintf(stderr, "Usage: kallsyms [--all-symbols] "
+			"[--symbol-prefix=<prefix char>] "
 			"[--base-relative] < in.map > out.S\n");
 	exit(1);
 }
@@ -109,22 +111,28 @@ static int check_symbol_range(const char *sym, unsigned long long addr,
 
 static int read_symbol(FILE *in, struct sym_entry *s)
 {
-	char sym[500], stype;
+	char str[500];
+	char *sym, stype;
 	int rc;
 
-	rc = fscanf(in, "%llx %c %499s\n", &s->addr, &stype, sym);
+	rc = fscanf(in, "%llx %c %499s\n", &s->addr, &stype, str);
 	if (rc != 3) {
-		if (rc != EOF && fgets(sym, 500, in) == NULL)
+		if (rc != EOF && fgets(str, 500, in) == NULL)
 			fprintf(stderr, "Read error or end of file.\n");
 		return -1;
 	}
-	if (strlen(sym) >= KSYM_NAME_LEN) {
+	if (strlen(str) >= KSYM_NAME_LEN) {
 		fprintf(stderr, "Symbol %s too long for kallsyms (%zu >= %d).\n"
 				"Please increase KSYM_NAME_LEN both in kernel and kallsyms.c\n",
-			sym, strlen(sym), KSYM_NAME_LEN);
+			str, strlen(str), KSYM_NAME_LEN);
 		return -1;
 	}
 
+	sym = str;
+	/* skip prefix char */
+	if (symbol_prefix_char && str[0] == symbol_prefix_char)
+		sym++;
+
 	/* Ignore most absolute/undefined (?) symbols. */
 	if (strcmp(sym, "_text") == 0)
 		_text = s->addr;
@@ -145,7 +153,7 @@ static int read_symbol(FILE *in, struct sym_entry *s)
 		 is_arm_mapping_symbol(sym))
 		return -1;
 	/* exclude also MIPS ELF local symbols ($L123 instead of .L123) */
-	else if (sym[0] == '$')
+	else if (str[0] == '$')
 		return -1;
 	/* exclude debugging symbols */
 	else if (stype == 'N' || stype == 'n')
@@ -156,14 +164,14 @@ static int read_symbol(FILE *in, struct sym_entry *s)
 
 	/* include the type field in the symbol name, so that it gets
 	 * compressed together */
-	s->len = strlen(sym) + 1;
+	s->len = strlen(str) + 1;
 	s->sym = malloc(s->len + 1);
 	if (!s->sym) {
 		fprintf(stderr, "kallsyms failure: "
 			"unable to allocate required amount of memory\n");
 		exit(EXIT_FAILURE);
 	}
-	strcpy((char *)s->sym + 1, sym);
+	strcpy((char *)s->sym + 1, str);
 	s->sym[0] = stype;
 
 	s->percpu_absolute = 0;
@@ -226,6 +234,11 @@ static int symbol_valid(struct sym_entry *s)
 	int i;
 	char *sym_name = (char *)s->sym + 1;
 
+	/* skip prefix char */
+	if (symbol_prefix_char && *sym_name == symbol_prefix_char)
+		sym_name++;
+
+
 	/* if --all-symbols is not specified, then symbols outside the text
 	 * and inittext sections are discarded */
 	if (!all_symbols) {
@@ -290,9 +303,15 @@ static void read_map(FILE *in)
 
 static void output_label(char *label)
 {
-	printf(".globl %s\n", label);
+	if (symbol_prefix_char)
+		printf(".globl %c%s\n", symbol_prefix_char, label);
+	else
+		printf(".globl %s\n", label);
 	printf("\tALGN\n");
-	printf("%s:\n", label);
+	if (symbol_prefix_char)
+		printf("%c%s:\n", symbol_prefix_char, label);
+	else
+		printf("%s:\n", label);
 }
 
 /* uncompress a compressed symbol. When this function is called, the best table
@@ -749,7 +768,13 @@ int main(int argc, char **argv)
 				absolute_percpu = 1;
 			else if (strcmp(argv[i], "--base-relative") == 0)
 				base_relative = 1;
-			else
+			else if (strncmp(argv[i], "--symbol-prefix=", 16) == 0) {
+				char *p = &argv[i][16];
+				/* skip quote */
+				if ((*p == '"' && *(p+2) == '"') || (*p == '\'' && *(p+2) == '\''))
+					p++;
+				symbol_prefix_char = *p;
+			} else
 				usage();
 		}
 	} else if (argc != 1)
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 915775eb2921..c3c5758ed7d6 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -117,6 +117,10 @@ kallsyms()
 	info KSYM ${2}
 	local kallsymopt;
 
+	if [ -n "${CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX}" ]; then
+		kallsymopt="${kallsymopt} --symbol-prefix=_"
+	fi
+
 	if [ -n "${CONFIG_KALLSYMS_ALL}" ]; then
 		kallsymopt="${kallsymopt} --all-symbols"
 	fi
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index f277e116e0eb..555f152bbebe 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -19,7 +19,9 @@
 #include <stdbool.h>
 #include <errno.h>
 #include "modpost.h"
+#include "../../include/generated/autoconf.h"
 #include "../../include/linux/license.h"
+#include "../../include/linux/export.h"
 
 /* Are we using CONFIG_MODVERSIONS? */
 static int modversions = 0;
@@ -588,7 +590,7 @@ static void parse_elf_finish(struct elf_info *info)
 static int ignore_undef_symbol(struct elf_info *info, const char *symname)
 {
 	/* ignore __this_module, it will be resolved shortly */
-	if (strcmp(symname, "__this_module") == 0)
+	if (strcmp(symname, VMLINUX_SYMBOL_STR(__this_module)) == 0)
 		return 1;
 	/* ignore global offset table */
 	if (strcmp(symname, "_GLOBAL_OFFSET_TABLE_") == 0)
@@ -614,6 +616,9 @@ static int ignore_undef_symbol(struct elf_info *info, const char *symname)
 	return 0;
 }
 
+#define CRC_PFX     VMLINUX_SYMBOL_STR(__crc_)
+#define KSYMTAB_PFX VMLINUX_SYMBOL_STR(__ksymtab_)
+
 static void handle_modversions(struct module *mod, struct elf_info *info,
 			       Elf_Sym *sym, const char *symname)
 {
@@ -628,7 +633,7 @@ static void handle_modversions(struct module *mod, struct elf_info *info,
 		export = export_from_sec(info, get_secindex(info, sym));
 
 	/* CRC'd symbol */
-	if (strstarts(symname, "__crc_")) {
+	if (strstarts(symname, CRC_PFX)) {
 		is_crc = true;
 		crc = (unsigned int) sym->st_value;
 		if (sym->st_shndx != SHN_UNDEF && sym->st_shndx != SHN_ABS) {
@@ -641,7 +646,7 @@ static void handle_modversions(struct module *mod, struct elf_info *info,
 				info->sechdrs[sym->st_shndx].sh_addr : 0);
 			crc = TO_NATIVE(*crcp);
 		}
-		sym_update_crc(symname + strlen("__crc_"), mod, crc,
+		sym_update_crc(symname + strlen(CRC_PFX), mod, crc,
 				export);
 	}
 
@@ -679,10 +684,15 @@ static void handle_modversions(struct module *mod, struct elf_info *info,
 		}
 #endif
 
+#ifdef CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX
+		if (symname[0] != '_')
+			break;
+		else
+			symname++;
+#endif
 		if (is_crc) {
 			const char *e = is_vmlinux(mod->name) ?"":".ko";
-			warn("EXPORT symbol \"%s\" [%s%s] version generation failed, symbol will not be versioned.\n",
-			     symname + strlen("__crc_"), mod->name, e);
+			warn("EXPORT symbol \"%s\" [%s%s] version generation failed, symbol will not be versioned.\n", symname + strlen(CRC_PFX), mod->name, e);
 		}
 		mod->unres = alloc_symbol(symname,
 					  ELF_ST_BIND(sym->st_info) == STB_WEAK,
@@ -690,13 +700,13 @@ static void handle_modversions(struct module *mod, struct elf_info *info,
 		break;
 	default:
 		/* All exported symbols */
-		if (strstarts(symname, "__ksymtab_")) {
-			sym_add_exported(symname + strlen("__ksymtab_"), mod,
+		if (strstarts(symname, KSYMTAB_PFX)) {
+			sym_add_exported(symname + strlen(KSYMTAB_PFX), mod,
 					export);
 		}
-		if (strcmp(symname, "init_module") == 0)
+		if (strcmp(symname, VMLINUX_SYMBOL_STR(init_module)) == 0)
 			mod->has_init = 1;
-		if (strcmp(symname, "cleanup_module") == 0)
+		if (strcmp(symname, VMLINUX_SYMBOL_STR(cleanup_module)) == 0)
 			mod->has_cleanup = 1;
 		break;
 	}
@@ -2230,7 +2240,7 @@ static int add_versions(struct buffer *b, struct module *mod)
 			err = 1;
 			break;
 		}
-		buf_printf(b, "\t{ %#8x, \"%s\" },\n",
+		buf_printf(b, "\t{ %#8x, __VMLINUX_SYMBOL_STR(%s) },\n",
 			   s->crc, s->name);
 	}
 
diff --git a/usr/initramfs_data.S b/usr/initramfs_data.S
index d07648f05bbf..b28da799f6a6 100644
--- a/usr/initramfs_data.S
+++ b/usr/initramfs_data.S
@@ -30,8 +30,8 @@ __irf_start:
 .incbin __stringify(INITRAMFS_IMAGE)
 __irf_end:
 .section .init.ramfs.info,"a"
-.globl __initramfs_size
-__initramfs_size:
+.globl VMLINUX_SYMBOL(__initramfs_size)
+VMLINUX_SYMBOL(__initramfs_size):
 #ifdef CONFIG_64BIT
 	.quad __irf_end - __irf_start
 #else
-- 
2.20.1 (Apple Git-117)
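
Note on the scripts/kallsyms.c hunks above: the prefix handling can be
read in isolation as two small steps, namely parsing --symbol-prefix=
(optionally quoted) and skipping one leading prefix character before a
name is compared. The following is only a minimal sketch of that idea,
not the code in the patch. The sketch_* names are hypothetical, and
unlike the hunk above it checks that the quoted form is long enough
before looking at the closing quote.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: parse "--symbol-prefix=X", where X may also be
 * written as "X" or 'X'. Returns the prefix character, or '\0' when
 * the argument is not a --symbol-prefix= option.
 */
static char sketch_parse_prefix(const char *arg)
{
	static const char opt[] = "--symbol-prefix=";
	const size_t n = sizeof(opt) - 1;	/* 16 characters */
	const char *p;

	assert(arg != NULL);
	if (strncmp(arg, opt, n) != 0)
		return '\0';

	p = arg + n;
	/* skip an opening quote only if a matching closing quote follows */
	if ((p[0] == '"' || p[0] == '\'') && p[1] != '\0' && p[2] == p[0])
		p++;
	return *p;
}

/*
 * Hypothetical helper: return sym without its leading prefix character.
 * A prefix of '\0' means "no prefix configured" and leaves sym as is.
 */
static const char *sketch_skip_prefix(const char *sym, char prefix)
{
	assert(sym != NULL);
	if (prefix && sym[0] == prefix)
		return sym + 1;
	return sym;
}
```

With a prefix of '_', a map entry named "__text" is then compared as
"_text", which is the spelling the existing strcmp() checks expect.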


* [RFC v2 30/37] scripts: revert CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX patches
@ 2019-11-08  5:02     ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, linux-kernel-library, linux-arch,
	Hajime Tazaki, Akira Moroo

LKL requires CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX in order to build and
run on a Windows host, because the Windows compiler (mingw32) prepends
'_' to each symbol.

This commit reverts:
commit 7953002a7c65 ("vmlinux.lds.h: remove stale <linux/export.h>
include")
commit c4df32c80d04 ("export.h: remove VMLINUX_SYMBOL() and
VMLINUX_SYMBOL_STR()")
commit 00979ce4fcc9 ("linux/linkage.h: replace VMLINUX_SYMBOL_STR()
with __stringify()")
commit a6b04f0ed5e9 ("checkpatch: remove VMLINUX_SYMBOL() check")
commit a62143850053 ("vmlinux.lds.h: remove no-op macro
VMLINUX_SYMBOL()")
commit 704db5433fb4 ("kbuild: remove
CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX")
commit 94e58e0ac312 ("export.h: remove code for prefixing symbols
with underscore")
commit 5a144a1acd0b ("depmod.sh: remove symbol prefix support")
commit 534c9f2ec4c9 ("kallsyms: remove symbol prefix support")
commit 74d931716151 ("genksyms: remove symbol prefix support")
commit b2c5cdcfd4bc ("modpost: remove symbol prefix support")

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
---
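Illustration only, not part of the commit: the effect of the symbol
prefix on string comparisons such as those in scripts/mod/modpost.c can
be sketched with a pair of stringifying macros. All SKETCH_* and
sketch_* names below are hypothetical stand-ins. SKETCH_UNDERSCORE_PREFIX
plays the role of CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX, and
SKETCH_SYM_STR() is assumed to behave like VMLINUX_SYMBOL_STR().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX. */
#define SKETCH_UNDERSCORE_PREFIX 1

#if SKETCH_UNDERSCORE_PREFIX
#define SKETCH_SYM_STR_(x) "_" #x
#else
#define SKETCH_SYM_STR_(x) #x
#endif
/* Indirection so that a macro argument is expanded before stringifying. */
#define SKETCH_SYM_STR(x) SKETCH_SYM_STR_(x)

/* Prefix of CRC symbols as the toolchain emits them. */
#define SKETCH_CRC_PFX SKETCH_SYM_STR(__crc_)

/* Return 1 if symname is the toolchain's spelling of __this_module. */
static int sketch_is_this_module(const char *symname)
{
	assert(symname != NULL);
	return strcmp(symname, SKETCH_SYM_STR(__this_module)) == 0;
}

/*
 * Return the name following the CRC prefix, or NULL if symname is not
 * a CRC symbol. The prefix length is taken from the macro, so it grows
 * by one character when the underscore prefix is enabled.
 */
static const char *sketch_crc_target(const char *symname)
{
	const size_t n = strlen(SKETCH_CRC_PFX);

	assert(symname != NULL);
	if (strncmp(symname, SKETCH_CRC_PFX, n) != 0)
		return NULL;
	return symname + n;
}
```

With the prefix enabled, the C-level name __this_module appears in the
object file as "___this_module", so a comparison against the literal
"__this_module" no longer matches. This is why the modpost.c hunks
replace string literals with the macro form.
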
 Makefile                          |   2 +-
 arch/Kconfig                      |   6 +
 certs/system_certificates.S       |  16 +-
 include/asm-generic/export.h      |  34 ++--
 include/asm-generic/vmlinux.lds.h | 279 +++++++++++++++---------------
 include/linux/export.h            |  23 ++-
 include/linux/linkage.h           |  12 +-
 scripts/Makefile.build            |   9 +-
 scripts/adjust_autoksyms.sh       |   6 +
 scripts/checkpatch.pl             |  10 ++
 scripts/depmod.sh                 |  25 ++-
 scripts/genksyms/genksyms.c       |  11 +-
 scripts/kallsyms.c                |  47 +++--
 scripts/link-vmlinux.sh           |   4 +
 scripts/mod/modpost.c             |  30 ++--
 usr/initramfs_data.S              |   4 +-
 16 files changed, 318 insertions(+), 200 deletions(-)

diff --git a/Makefile b/Makefile
index 874c0aec0f9c..d2c9e3a420f6 100644
--- a/Makefile
+++ b/Makefile
@@ -1805,7 +1805,7 @@ quiet_cmd_rmfiles = $(if $(wildcard $(rm-files)),CLEAN   $(wildcard $(rm-files))
 # Run depmod only if we have System.map and depmod is executable
 quiet_cmd_depmod = DEPMOD  $(KERNELRELEASE)
       cmd_depmod = $(CONFIG_SHELL) $(srctree)/scripts/depmod.sh $(DEPMOD) \
-                   $(KERNELRELEASE)
+                   $(KERNELRELEASE) "$(patsubst y,_,$(CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX))"
 
 # read saved command lines for existing targets
 existing-targets := $(wildcard $(sort $(targets)))
diff --git a/arch/Kconfig b/arch/Kconfig
index a7b57dd42c26..a01df2ae6a1b 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -594,6 +594,12 @@ config MODULES_USE_ELF_REL
 	  Modules only use ELF REL relocations.  Modules with ELF RELA
 	  relocations will give an error.
 
+config HAVE_UNDERSCORE_SYMBOL_PREFIX
+	bool
+	help
+	  Some architectures generate an _ in front of C symbols; things like
+	  module loading and assembly files need to know about this.
+
 config HAVE_IRQ_EXIT_ON_IRQ_STACK
 	bool
 	help
diff --git a/certs/system_certificates.S b/certs/system_certificates.S
index 8f29058adf93..3918ff7235ed 100644
--- a/certs/system_certificates.S
+++ b/certs/system_certificates.S
@@ -5,8 +5,8 @@
 	__INITRODATA
 
 	.align 8
-	.globl system_certificate_list
-system_certificate_list:
+	.globl VMLINUX_SYMBOL(system_certificate_list)
+VMLINUX_SYMBOL(system_certificate_list):
 __cert_list_start:
 #ifdef CONFIG_MODULE_SIG
 	.incbin "certs/signing_key.x509"
@@ -15,21 +15,21 @@ __cert_list_start:
 __cert_list_end:
 
 #ifdef CONFIG_SYSTEM_EXTRA_CERTIFICATE
-	.globl system_extra_cert
+	.globl VMLINUX_SYMBOL(system_extra_cert)
 	.size system_extra_cert, CONFIG_SYSTEM_EXTRA_CERTIFICATE_SIZE
-system_extra_cert:
+VMLINUX_SYMBOL(system_extra_cert):
 	.fill CONFIG_SYSTEM_EXTRA_CERTIFICATE_SIZE, 1, 0
 
 	.align 4
-	.globl system_extra_cert_used
-system_extra_cert_used:
+	.globl VMLINUX_SYMBOL(system_extra_cert_used)
+VMLINUX_SYMBOL(system_extra_cert_used):
 	.int 0
 
 #endif /* CONFIG_SYSTEM_EXTRA_CERTIFICATE */
 
 	.align 8
-	.globl system_certificate_list_size
-system_certificate_list_size:
+	.globl VMLINUX_SYMBOL(system_certificate_list_size)
+VMLINUX_SYMBOL(system_certificate_list_size):
 #ifdef CONFIG_64BIT
 	.quad __cert_list_end - __cert_list_start
 #else
diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h
index 294d6ae785d4..69ce0914b025 100644
--- a/include/asm-generic/export.h
+++ b/include/asm-generic/export.h
@@ -27,32 +27,42 @@
 #endif
 .endm
 
+#ifdef CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX
+#define KSYM(name) _##name
+#else
+#define KSYM(name) name
+#endif
+
 /*
  * note on .section use: @progbits vs %progbits nastiness doesn't matter,
  * since we immediately emit into those sections anyway.
  */
 .macro ___EXPORT_SYMBOL name,val,sec
 #ifdef CONFIG_MODULES
-	.globl __ksymtab_\name
+	.globl KSYM(__ksymtab_\name)
 	.section ___ksymtab\sec+\name,"a"
 	.balign KSYM_ALIGN
-__ksymtab_\name:
-	__put \val, __kstrtab_\name
+KSYM(__ksymtab_\name):
+	__put \val, KSYM(__kstrtab_\name)
 	.previous
 	.section __ksymtab_strings,"a"
-__kstrtab_\name:
+KSYM(__kstrtab_\name):
+#ifdef CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX
+	.asciz "_\name"
+#else
 	.asciz "\name"
+#endif
 	.previous
 #ifdef CONFIG_MODVERSIONS
 	.section ___kcrctab\sec+\name,"a"
 	.balign KCRC_ALIGN
-__kcrctab_\name:
+KSYM(__kcrctab_\name):
 #if defined(CONFIG_MODULE_REL_CRCS)
-	.long __crc_\name - .
+	.long KSYM(__crc_\name) - .
 #else
-	.long __crc_\name
+	.long KSYM(__crc_\name)
 #endif
-	.weak __crc_\name
+	.weak KSYM(__crc_\name)
 	.previous
 #endif
 #endif
@@ -85,12 +95,12 @@ __ksym_marker_\sym:
 #endif
 
 #define EXPORT_SYMBOL(name)					\
-	__EXPORT_SYMBOL(name, KSYM_FUNC(name),)
+	__EXPORT_SYMBOL(name, KSYM_FUNC(KSYM(name)),)
 #define EXPORT_SYMBOL_GPL(name) 				\
-	__EXPORT_SYMBOL(name, KSYM_FUNC(name), _gpl)
+	__EXPORT_SYMBOL(name, KSYM_FUNC(KSYM(name)), _gpl)
 #define EXPORT_DATA_SYMBOL(name)				\
-	__EXPORT_SYMBOL(name, name,)
+	__EXPORT_SYMBOL(name, KSYM(name),)
 #define EXPORT_DATA_SYMBOL_GPL(name)				\
-	__EXPORT_SYMBOL(name, name,_gpl)
+	__EXPORT_SYMBOL(name, KSYM(name),_gpl)
 
 #endif
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index cd28f63bfbc7..d0043449d2d3 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -54,6 +54,8 @@
 #define LOAD_OFFSET 0
 #endif
 
+#include <linux/export.h>
+
 /* Align . to a 8 byte boundary equals to maximum function alignment. */
 #define ALIGN_FUNCTION()  . = ALIGN(8)
 
@@ -117,67 +119,67 @@
 			__stop_mcount_loc = .;
 #else
 #define MCOUNT_REC()	. = ALIGN(8);				\
-			__start_mcount_loc = .;			\
+			VMLINUX_SYMBOL(__start_mcount_loc) = .; \
 			KEEP(*(__mcount_loc))			\
-			__stop_mcount_loc = .;
+			VMLINUX_SYMBOL(__stop_mcount_loc) = .;
 #endif
 #else
 #define MCOUNT_REC()
 #endif
 
 #ifdef CONFIG_TRACE_BRANCH_PROFILING
-#define LIKELY_PROFILE()	__start_annotated_branch_profile = .;	\
-				KEEP(*(_ftrace_annotated_branch))	\
-				__stop_annotated_branch_profile = .;
+#define LIKELY_PROFILE()	VMLINUX_SYMBOL(__start_annotated_branch_profile) = .; \
+				KEEP(*(_ftrace_annotated_branch))		      \
+				VMLINUX_SYMBOL(__stop_annotated_branch_profile) = .;
 #else
 #define LIKELY_PROFILE()
 #endif
 
 #ifdef CONFIG_PROFILE_ALL_BRANCHES
-#define BRANCH_PROFILE()	__start_branch_profile = .;		\
-				KEEP(*(_ftrace_branch))			\
-				__stop_branch_profile = .;
+#define BRANCH_PROFILE()	VMLINUX_SYMBOL(__start_branch_profile) = .;   \
+				KEEP(*(_ftrace_branch))			      \
+				VMLINUX_SYMBOL(__stop_branch_profile) = .;
 #else
 #define BRANCH_PROFILE()
 #endif
 
 #ifdef CONFIG_KPROBES
 #define KPROBE_BLACKLIST()	. = ALIGN(8);				      \
-				__start_kprobe_blacklist = .;		      \
+				VMLINUX_SYMBOL(__start_kprobe_blacklist) = .; \
 				KEEP(*(_kprobe_blacklist))		      \
-				__stop_kprobe_blacklist = .;
+				VMLINUX_SYMBOL(__stop_kprobe_blacklist) = .;
 #else
 #define KPROBE_BLACKLIST()
 #endif
 
 #ifdef CONFIG_FUNCTION_ERROR_INJECTION
 #define ERROR_INJECT_WHITELIST()	STRUCT_ALIGN();			      \
-			__start_error_injection_whitelist = .;		      \
+			VMLINUX_SYMBOL(__start_error_injection_whitelist) = .;\
 			KEEP(*(_error_injection_whitelist))		      \
-			__stop_error_injection_whitelist = .;
+			VMLINUX_SYMBOL(__stop_error_injection_whitelist) = .;
 #else
 #define ERROR_INJECT_WHITELIST()
 #endif
 
 #ifdef CONFIG_EVENT_TRACING
 #define FTRACE_EVENTS()	. = ALIGN(8);					\
-			__start_ftrace_events = .;			\
+			VMLINUX_SYMBOL(__start_ftrace_events) = .;	\
 			KEEP(*(_ftrace_events))				\
-			__stop_ftrace_events = .;			\
-			__start_ftrace_eval_maps = .;			\
+			VMLINUX_SYMBOL(__stop_ftrace_events) = .;	\
+			VMLINUX_SYMBOL(__start_ftrace_eval_maps) = .;	\
 			KEEP(*(_ftrace_eval_map))			\
-			__stop_ftrace_eval_maps = .;
+			VMLINUX_SYMBOL(__stop_ftrace_eval_maps) = .;
 #else
 #define FTRACE_EVENTS()
 #endif
 
 #ifdef CONFIG_TRACING
-#define TRACE_PRINTKS()	 __start___trace_bprintk_fmt = .;      \
+#define TRACE_PRINTKS() VMLINUX_SYMBOL(__start___trace_bprintk_fmt) = .;      \
 			 KEEP(*(__trace_printk_fmt)) /* Trace_printk fmt' pointer */ \
-			 __stop___trace_bprintk_fmt = .;
-#define TRACEPOINT_STR() __start___tracepoint_str = .;	\
+			 VMLINUX_SYMBOL(__stop___trace_bprintk_fmt) = .;
+#define TRACEPOINT_STR() VMLINUX_SYMBOL(__start___tracepoint_str) = .;	\
 			 KEEP(*(__tracepoint_str)) /* Trace_printk fmt' pointer */ \
-			 __stop___tracepoint_str = .;
+			 VMLINUX_SYMBOL(__stop___tracepoint_str) = .;
 #else
 #define TRACE_PRINTKS()
 #define TRACEPOINT_STR()
@@ -185,27 +187,27 @@
 
 #ifdef CONFIG_FTRACE_SYSCALLS
 #define TRACE_SYSCALLS() . = ALIGN(8);					\
-			 __start_syscalls_metadata = .;			\
+			 VMLINUX_SYMBOL(__start_syscalls_metadata) = .;	\
 			 KEEP(*(__syscalls_metadata))			\
-			 __stop_syscalls_metadata = .;
+			 VMLINUX_SYMBOL(__stop_syscalls_metadata) = .;
 #else
 #define TRACE_SYSCALLS()
 #endif
 
 #ifdef CONFIG_BPF_EVENTS
 #define BPF_RAW_TP() STRUCT_ALIGN();					\
-			 __start__bpf_raw_tp = .;			\
+			 VMLINUX_SYMBOL(__start__bpf_raw_tp) = .;	\
 			 KEEP(*(__bpf_raw_tp_map))			\
-			 __stop__bpf_raw_tp = .;
+			 VMLINUX_SYMBOL(__stop__bpf_raw_tp) = .;
 #else
 #define BPF_RAW_TP()
 #endif
 
 #ifdef CONFIG_SERIAL_EARLYCON
 #define EARLYCON_TABLE() . = ALIGN(8);				\
-			 __earlycon_table = .;			\
+			 VMLINUX_SYMBOL(__earlycon_table) = .;	\
 			 KEEP(*(__earlycon_table))		\
-			 __earlycon_table_end = .;
+			 VMLINUX_SYMBOL(__earlycon_table_end) = .;
 #else
 #define EARLYCON_TABLE()
 #endif
@@ -225,7 +227,7 @@
 #define _OF_TABLE_0(name)
 #define _OF_TABLE_1(name)						\
 	. = ALIGN(8);							\
-	__##name##_of_table = .;					\
+	VMLINUX_SYMBOL(__##name##_of_table) = .;			\
 	KEEP(*(__##name##_of_table))					\
 	KEEP(*(__##name##_of_table_end))
 
@@ -239,9 +241,9 @@
 #ifdef CONFIG_ACPI
 #define ACPI_PROBE_TABLE(name)						\
 	. = ALIGN(8);							\
-	__##name##_acpi_probe_table = .;				\
+	VMLINUX_SYMBOL(__##name##_acpi_probe_table) = .;		\
 	KEEP(*(__##name##_acpi_probe_table))				\
-	__##name##_acpi_probe_table_end = .;
+	VMLINUX_SYMBOL(__##name##_acpi_probe_table_end) = .;
 #else
 #define ACPI_PROBE_TABLE(name)
 #endif
@@ -258,9 +260,9 @@
 
 #define KERNEL_DTB()							\
 	STRUCT_ALIGN();							\
-	__dtb_start = .;						\
+	VMLINUX_SYMBOL(__dtb_start) = .;				\
 	KEEP(*(.dtb.init.rodata))					\
-	__dtb_end = .;
+	VMLINUX_SYMBOL(__dtb_end) = .;
 
 /*
  * .data section
@@ -273,16 +275,16 @@
 	MEM_KEEP(init.data*)						\
 	MEM_KEEP(exit.data*)						\
 	*(.data.unlikely)						\
-	__start_once = .;						\
+	VMLINUX_SYMBOL(__start_once) = .;				\
 	*(.data.once)							\
-	__end_once = .;							\
+	VMLINUX_SYMBOL(__end_once) = .;					\
 	STRUCT_ALIGN();							\
 	*(__tracepoints)						\
 	/* implement dynamic printk debug */				\
 	. = ALIGN(8);							\
-	__start___verbose = .;						\
+	VMLINUX_SYMBOL(__start___verbose) = .;                          \
 	KEEP(*(__verbose))                                              \
-	__stop___verbose = .;						\
+	VMLINUX_SYMBOL(__stop___verbose) = .;				\
 	LIKELY_PROFILE()		       				\
 	BRANCH_PROFILE()						\
 	TRACE_PRINTKS()							\
@@ -294,10 +296,10 @@
  */
 #define NOSAVE_DATA							\
 	. = ALIGN(PAGE_SIZE);						\
-	__nosave_begin = .;						\
+	VMLINUX_SYMBOL(__nosave_begin) = .;				\
 	*(.data..nosave)						\
 	. = ALIGN(PAGE_SIZE);						\
-	__nosave_end = .;
+	VMLINUX_SYMBOL(__nosave_end) = .;
 
 #define PAGE_ALIGNED_DATA(page_align)					\
 	. = ALIGN(page_align);						\
@@ -314,13 +316,13 @@
 
 #define INIT_TASK_DATA(align)						\
 	. = ALIGN(align);						\
-	__start_init_task = .;						\
-	init_thread_union = .;						\
-	init_stack = .;							\
+	VMLINUX_SYMBOL(__start_init_task) = .;				\
+	VMLINUX_SYMBOL(init_thread_union) = .;				\
+	VMLINUX_SYMBOL(init_stack) = .;					\
 	KEEP(*(.data..init_task))					\
 	KEEP(*(.data..init_thread_info))				\
-	. = __start_init_task + THREAD_SIZE;				\
-	__end_init_task = .;
+	. = VMLINUX_SYMBOL(__start_init_task) + THREAD_SIZE;		\
+	VMLINUX_SYMBOL(__end_init_task) = .;
 
 #define JUMP_TABLE_DATA							\
 	. = ALIGN(8);							\
@@ -334,10 +336,10 @@
  */
 #ifndef RO_AFTER_INIT_DATA
 #define RO_AFTER_INIT_DATA						\
-	__start_ro_after_init = .;					\
+	VMLINUX_SYMBOL(__start_ro_after_init) = .;			\
 	*(.data..ro_after_init)						\
 	JUMP_TABLE_DATA							\
-	__end_ro_after_init = .;
+	VMLINUX_SYMBOL(__end_ro_after_init) = .;
 #endif
 
 /*
@@ -346,13 +348,13 @@
 #define RO_DATA_SECTION(align)						\
 	. = ALIGN((align));						\
 	.rodata           : AT(ADDR(.rodata) - LOAD_OFFSET) {		\
-		__start_rodata = .;					\
+		VMLINUX_SYMBOL(__start_rodata) = .;			\
 		*(.rodata) *(.rodata.*)					\
 		RO_AFTER_INIT_DATA	/* Read only after init */	\
 		. = ALIGN(8);						\
-		__start___tracepoints_ptrs = .;				\
+		VMLINUX_SYMBOL(__start___tracepoints_ptrs) = .;		\
 		KEEP(*(__tracepoints_ptrs)) /* Tracepoints: pointer array */ \
-		__stop___tracepoints_ptrs = .;				\
+		VMLINUX_SYMBOL(__stop___tracepoints_ptrs) = .;		\
 		*(__tracepoints_strings)/* Tracepoints: strings */	\
 	}								\
 									\
@@ -362,109 +364,109 @@
 									\
 	/* PCI quirks */						\
 	.pci_fixup        : AT(ADDR(.pci_fixup) - LOAD_OFFSET) {	\
-		__start_pci_fixups_early = .;				\
+		VMLINUX_SYMBOL(__start_pci_fixups_early) = .;		\
 		KEEP(*(.pci_fixup_early))				\
-		__end_pci_fixups_early = .;				\
-		__start_pci_fixups_header = .;				\
+		VMLINUX_SYMBOL(__end_pci_fixups_early) = .;		\
+		VMLINUX_SYMBOL(__start_pci_fixups_header) = .;		\
 		KEEP(*(.pci_fixup_header))				\
-		__end_pci_fixups_header = .;				\
-		__start_pci_fixups_final = .;				\
+		VMLINUX_SYMBOL(__end_pci_fixups_header) = .;		\
+		VMLINUX_SYMBOL(__start_pci_fixups_final) = .;		\
 		KEEP(*(.pci_fixup_final))				\
-		__end_pci_fixups_final = .;				\
-		__start_pci_fixups_enable = .;				\
+		VMLINUX_SYMBOL(__end_pci_fixups_final) = .;		\
+		VMLINUX_SYMBOL(__start_pci_fixups_enable) = .;		\
 		KEEP(*(.pci_fixup_enable))				\
-		__end_pci_fixups_enable = .;				\
-		__start_pci_fixups_resume = .;				\
+		VMLINUX_SYMBOL(__end_pci_fixups_enable) = .;		\
+		VMLINUX_SYMBOL(__start_pci_fixups_resume) = .;		\
 		KEEP(*(.pci_fixup_resume))				\
-		__end_pci_fixups_resume = .;				\
-		__start_pci_fixups_resume_early = .;			\
+		VMLINUX_SYMBOL(__end_pci_fixups_resume) = .;		\
+		VMLINUX_SYMBOL(__start_pci_fixups_resume_early) = .;	\
 		KEEP(*(.pci_fixup_resume_early))			\
-		__end_pci_fixups_resume_early = .;			\
-		__start_pci_fixups_suspend = .;				\
+		VMLINUX_SYMBOL(__end_pci_fixups_resume_early) = .;	\
+		VMLINUX_SYMBOL(__start_pci_fixups_suspend) = .;		\
 		KEEP(*(.pci_fixup_suspend))				\
-		__end_pci_fixups_suspend = .;				\
-		__start_pci_fixups_suspend_late = .;			\
+		VMLINUX_SYMBOL(__end_pci_fixups_suspend) = .;		\
+		VMLINUX_SYMBOL(__start_pci_fixups_suspend_late) = .;	\
 		KEEP(*(.pci_fixup_suspend_late))			\
-		__end_pci_fixups_suspend_late = .;			\
+		VMLINUX_SYMBOL(__end_pci_fixups_suspend_late) = .;	\
 	}								\
 									\
 	/* Built-in firmware blobs */					\
 	.builtin_fw        : AT(ADDR(.builtin_fw) - LOAD_OFFSET) {	\
-		__start_builtin_fw = .;					\
+		VMLINUX_SYMBOL(__start_builtin_fw) = .;			\
 		KEEP(*(.builtin_fw))					\
-		__end_builtin_fw = .;					\
+		VMLINUX_SYMBOL(__end_builtin_fw) = .;			\
 	}								\
 									\
 	TRACEDATA							\
 									\
 	/* Kernel symbol table: Normal symbols */			\
 	__ksymtab         : AT(ADDR(__ksymtab) - LOAD_OFFSET) {		\
-		__start___ksymtab = .;					\
+		VMLINUX_SYMBOL(__start___ksymtab) = .;			\
 		KEEP(*(SORT(___ksymtab+*)))				\
-		__stop___ksymtab = .;					\
+		VMLINUX_SYMBOL(__stop___ksymtab) = .;			\
 	}								\
 									\
 	/* Kernel symbol table: GPL-only symbols */			\
 	__ksymtab_gpl     : AT(ADDR(__ksymtab_gpl) - LOAD_OFFSET) {	\
-		__start___ksymtab_gpl = .;				\
+		VMLINUX_SYMBOL(__start___ksymtab_gpl) = .;		\
 		KEEP(*(SORT(___ksymtab_gpl+*)))				\
-		__stop___ksymtab_gpl = .;				\
+		VMLINUX_SYMBOL(__stop___ksymtab_gpl) = .;		\
 	}								\
 									\
 	/* Kernel symbol table: Normal unused symbols */		\
 	__ksymtab_unused  : AT(ADDR(__ksymtab_unused) - LOAD_OFFSET) {	\
-		__start___ksymtab_unused = .;				\
+		VMLINUX_SYMBOL(__start___ksymtab_unused) = .;		\
 		KEEP(*(SORT(___ksymtab_unused+*)))			\
-		__stop___ksymtab_unused = .;				\
+		VMLINUX_SYMBOL(__stop___ksymtab_unused) = .;		\
 	}								\
 									\
 	/* Kernel symbol table: GPL-only unused symbols */		\
 	__ksymtab_unused_gpl : AT(ADDR(__ksymtab_unused_gpl) - LOAD_OFFSET) { \
-		__start___ksymtab_unused_gpl = .;			\
+		VMLINUX_SYMBOL(__start___ksymtab_unused_gpl) = .;	\
 		KEEP(*(SORT(___ksymtab_unused_gpl+*)))			\
-		__stop___ksymtab_unused_gpl = .;			\
+		VMLINUX_SYMBOL(__stop___ksymtab_unused_gpl) = .;	\
 	}								\
 									\
 	/* Kernel symbol table: GPL-future-only symbols */		\
 	__ksymtab_gpl_future : AT(ADDR(__ksymtab_gpl_future) - LOAD_OFFSET) { \
-		__start___ksymtab_gpl_future = .;			\
+		VMLINUX_SYMBOL(__start___ksymtab_gpl_future) = .;	\
 		KEEP(*(SORT(___ksymtab_gpl_future+*)))			\
-		__stop___ksymtab_gpl_future = .;			\
+		VMLINUX_SYMBOL(__stop___ksymtab_gpl_future) = .;	\
 	}								\
 									\
 	/* Kernel symbol table: Normal symbols */			\
 	__kcrctab         : AT(ADDR(__kcrctab) - LOAD_OFFSET) {		\
-		__start___kcrctab = .;					\
+		VMLINUX_SYMBOL(__start___kcrctab) = .;			\
 		KEEP(*(SORT(___kcrctab+*)))				\
-		__stop___kcrctab = .;					\
+		VMLINUX_SYMBOL(__stop___kcrctab) = .;			\
 	}								\
 									\
 	/* Kernel symbol table: GPL-only symbols */			\
 	__kcrctab_gpl     : AT(ADDR(__kcrctab_gpl) - LOAD_OFFSET) {	\
-		__start___kcrctab_gpl = .;				\
+		VMLINUX_SYMBOL(__start___kcrctab_gpl) = .;		\
 		KEEP(*(SORT(___kcrctab_gpl+*)))				\
-		__stop___kcrctab_gpl = .;				\
+		VMLINUX_SYMBOL(__stop___kcrctab_gpl) = .;		\
 	}								\
 									\
 	/* Kernel symbol table: Normal unused symbols */		\
 	__kcrctab_unused  : AT(ADDR(__kcrctab_unused) - LOAD_OFFSET) {	\
-		__start___kcrctab_unused = .;				\
+		VMLINUX_SYMBOL(__start___kcrctab_unused) = .;		\
 		KEEP(*(SORT(___kcrctab_unused+*)))			\
-		__stop___kcrctab_unused = .;				\
+		VMLINUX_SYMBOL(__stop___kcrctab_unused) = .;		\
 	}								\
 									\
 	/* Kernel symbol table: GPL-only unused symbols */		\
 	__kcrctab_unused_gpl : AT(ADDR(__kcrctab_unused_gpl) - LOAD_OFFSET) { \
-		__start___kcrctab_unused_gpl = .;			\
+		VMLINUX_SYMBOL(__start___kcrctab_unused_gpl) = .;	\
 		KEEP(*(SORT(___kcrctab_unused_gpl+*)))			\
-		__stop___kcrctab_unused_gpl = .;			\
+		VMLINUX_SYMBOL(__stop___kcrctab_unused_gpl) = .;	\
 	}								\
 									\
 	/* Kernel symbol table: GPL-future-only symbols */		\
 	__kcrctab_gpl_future : AT(ADDR(__kcrctab_gpl_future) - LOAD_OFFSET) { \
-		__start___kcrctab_gpl_future = .;			\
+		VMLINUX_SYMBOL(__start___kcrctab_gpl_future) = .;	\
 		KEEP(*(SORT(___kcrctab_gpl_future+*)))			\
-		__stop___kcrctab_gpl_future = .;			\
+		VMLINUX_SYMBOL(__stop___kcrctab_gpl_future) = .;	\
 	}								\
 									\
 	/* Kernel symbol table: strings */				\
@@ -481,18 +483,18 @@
 									\
 	/* Built-in module parameters. */				\
 	__param : AT(ADDR(__param) - LOAD_OFFSET) {			\
-		__start___param = .;					\
+		VMLINUX_SYMBOL(__start___param) = .;			\
 		KEEP(*(__param))					\
-		__stop___param = .;					\
+		VMLINUX_SYMBOL(__stop___param) = .;			\
 	}								\
 									\
 	/* Built-in module versions. */					\
 	__modver : AT(ADDR(__modver) - LOAD_OFFSET) {			\
-		__start___modver = .;					\
+		VMLINUX_SYMBOL(__start___modver) = .;			\
 		KEEP(*(__modver))					\
-		__stop___modver = .;					\
+		VMLINUX_SYMBOL(__stop___modver) = .;			\
 		. = ALIGN((align));					\
-		__end_rodata = .;					\
+		VMLINUX_SYMBOL(__end_rodata) = .;			\
 	}								\
 	. = ALIGN((align));
 
@@ -522,47 +524,47 @@
  * address even at second ld pass when generating System.map */
 #define SCHED_TEXT							\
 		ALIGN_FUNCTION();					\
-		__sched_text_start = .;					\
+		VMLINUX_SYMBOL(__sched_text_start) = .;			\
 		*(.sched.text)						\
-		__sched_text_end = .;
+		VMLINUX_SYMBOL(__sched_text_end) = .;
 
 /* spinlock.text is aling to function alignment to secure we have same
  * address even at second ld pass when generating System.map */
 #define LOCK_TEXT							\
 		ALIGN_FUNCTION();					\
-		__lock_text_start = .;					\
+		VMLINUX_SYMBOL(__lock_text_start) = .;			\
 		*(.spinlock.text)					\
-		__lock_text_end = .;
+		VMLINUX_SYMBOL(__lock_text_end) = .;
 
 #define CPUIDLE_TEXT							\
 		ALIGN_FUNCTION();					\
-		__cpuidle_text_start = .;				\
+		VMLINUX_SYMBOL(__cpuidle_text_start) = .;		\
 		*(.cpuidle.text)					\
-		__cpuidle_text_end = .;
+		VMLINUX_SYMBOL(__cpuidle_text_end) = .;
 
 #define KPROBES_TEXT							\
 		ALIGN_FUNCTION();					\
-		__kprobes_text_start = .;				\
+		VMLINUX_SYMBOL(__kprobes_text_start) = .;		\
 		*(.kprobes.text)					\
-		__kprobes_text_end = .;
+		VMLINUX_SYMBOL(__kprobes_text_end) = .;
 
 #define ENTRY_TEXT							\
 		ALIGN_FUNCTION();					\
-		__entry_text_start = .;					\
+		VMLINUX_SYMBOL(__entry_text_start) = .;			\
 		*(.entry.text)						\
-		__entry_text_end = .;
+		VMLINUX_SYMBOL(__entry_text_end) = .;
 
 #define IRQENTRY_TEXT							\
 		ALIGN_FUNCTION();					\
-		__irqentry_text_start = .;				\
+		VMLINUX_SYMBOL(__irqentry_text_start) = .;		\
 		*(.irqentry.text)					\
-		__irqentry_text_end = .;
+		VMLINUX_SYMBOL(__irqentry_text_end) = .;
 
 #define SOFTIRQENTRY_TEXT						\
 		ALIGN_FUNCTION();					\
-		__softirqentry_text_start = .;				\
+		VMLINUX_SYMBOL(__softirqentry_text_start) = .;		\
 		*(.softirqentry.text)					\
-		__softirqentry_text_end = .;
+		VMLINUX_SYMBOL(__softirqentry_text_end) = .;
 
 /* Section used for early init (in .S files) */
 #define HEAD_TEXT  KEEP(*(.head.text))
@@ -578,9 +580,9 @@
 #define EXCEPTION_TABLE(align)						\
 	. = ALIGN(align);						\
 	__ex_table : AT(ADDR(__ex_table) - LOAD_OFFSET) {		\
-		__start___ex_table = .;					\
+		VMLINUX_SYMBOL(__start___ex_table) = .;			\
 		KEEP(*(__ex_table))					\
-		__stop___ex_table = .;					\
+		VMLINUX_SYMBOL(__stop___ex_table) = .;			\
 	}
 
 /*
@@ -594,11 +596,11 @@
 
 #ifdef CONFIG_CONSTRUCTORS
 #define KERNEL_CTORS()	. = ALIGN(8);			   \
-			__ctors_start = .;		   \
+			VMLINUX_SYMBOL(__ctors_start) = .; \
 			KEEP(*(.ctors))			   \
 			KEEP(*(SORT(.init_array.*)))	   \
 			KEEP(*(.init_array))		   \
-			__ctors_end = .;
+			VMLINUX_SYMBOL(__ctors_end) = .;
 #else
 #define KERNEL_CTORS()
 #endif
@@ -734,9 +736,9 @@
 #define BUG_TABLE							\
 	. = ALIGN(8);							\
 	__bug_table : AT(ADDR(__bug_table) - LOAD_OFFSET) {		\
-		__start___bug_table = .;				\
+		VMLINUX_SYMBOL(__start___bug_table) = .;		\
 		KEEP(*(__bug_table))					\
-		__stop___bug_table = .;					\
+		VMLINUX_SYMBOL(__stop___bug_table) = .;			\
 	}
 #else
 #define BUG_TABLE
@@ -746,22 +748,22 @@
 #define ORC_UNWIND_TABLE						\
 	. = ALIGN(4);							\
 	.orc_unwind_ip : AT(ADDR(.orc_unwind_ip) - LOAD_OFFSET) {	\
-		__start_orc_unwind_ip = .;				\
+		VMLINUX_SYMBOL(__start_orc_unwind_ip) = .;		\
 		KEEP(*(.orc_unwind_ip))					\
-		__stop_orc_unwind_ip = .;				\
+		VMLINUX_SYMBOL(__stop_orc_unwind_ip) = .;		\
 	}								\
 	. = ALIGN(2);							\
 	.orc_unwind : AT(ADDR(.orc_unwind) - LOAD_OFFSET) {		\
-		__start_orc_unwind = .;					\
+		VMLINUX_SYMBOL(__start_orc_unwind) = .;			\
 		KEEP(*(.orc_unwind))					\
-		__stop_orc_unwind = .;					\
+		VMLINUX_SYMBOL(__stop_orc_unwind) = .;			\
 	}								\
 	. = ALIGN(4);							\
 	.orc_lookup : AT(ADDR(.orc_lookup) - LOAD_OFFSET) {		\
-		orc_lookup = .;						\
+		VMLINUX_SYMBOL(orc_lookup) = .;				\
 		. += (((SIZEOF(.text) + LOOKUP_BLOCK_SIZE - 1) /	\
 			LOOKUP_BLOCK_SIZE) + 1) * 4;			\
-		orc_lookup_end = .;					\
+		VMLINUX_SYMBOL(orc_lookup_end) = .;			\
 	}
 #else
 #define ORC_UNWIND_TABLE
@@ -771,9 +773,9 @@
 #define TRACEDATA							\
 	. = ALIGN(4);							\
 	.tracedata : AT(ADDR(.tracedata) - LOAD_OFFSET) {		\
-		__tracedata_start = .;					\
+		VMLINUX_SYMBOL(__tracedata_start) = .;			\
 		KEEP(*(.tracedata))					\
-		__tracedata_end = .;					\
+		VMLINUX_SYMBOL(__tracedata_end) = .;			\
 	}
 #else
 #define TRACEDATA
@@ -781,24 +783,24 @@
 
 #define NOTES								\
 	.notes : AT(ADDR(.notes) - LOAD_OFFSET) {			\
-		__start_notes = .;					\
+		VMLINUX_SYMBOL(__start_notes) = .;			\
 		KEEP(*(.note.*))					\
-		__stop_notes = .;					\
+		VMLINUX_SYMBOL(__stop_notes) = .;			\
 	}
 
 #define INIT_SETUP(initsetup_align)					\
 		. = ALIGN(initsetup_align);				\
-		__setup_start = .;					\
+		VMLINUX_SYMBOL(__setup_start) = .;			\
 		KEEP(*(.init.setup))					\
-		__setup_end = .;
+		VMLINUX_SYMBOL(__setup_end) = .;
 
 #define INIT_CALLS_LEVEL(level)						\
-		__initcall##level##_start = .;				\
+		VMLINUX_SYMBOL(__initcall##level##_start) = .;		\
 		KEEP(*(.initcall##level##.init))			\
 		KEEP(*(.initcall##level##s.init))			\
 
 #define INIT_CALLS							\
-		__initcall_start = .;					\
+		VMLINUX_SYMBOL(__initcall_start) = .;			\
 		KEEP(*(.initcallearly.init))				\
 		INIT_CALLS_LEVEL(0)					\
 		INIT_CALLS_LEVEL(1)					\
@@ -809,17 +811,17 @@
 		INIT_CALLS_LEVEL(rootfs)				\
 		INIT_CALLS_LEVEL(6)					\
 		INIT_CALLS_LEVEL(7)					\
-		__initcall_end = .;
+		VMLINUX_SYMBOL(__initcall_end) = .;
 
 #define CON_INITCALL							\
-		__con_initcall_start = .;				\
+		VMLINUX_SYMBOL(__con_initcall_start) = .;		\
 		KEEP(*(.con_initcall.init))				\
-		__con_initcall_end = .;
+		VMLINUX_SYMBOL(__con_initcall_end) = .;
 
 #ifdef CONFIG_BLK_DEV_INITRD
 #define INIT_RAM_FS							\
 	. = ALIGN(4);							\
-	__initramfs_start = .;						\
+	VMLINUX_SYMBOL(__initramfs_start) = .;				\
 	KEEP(*(.init.ramfs))						\
 	. = ALIGN(8);							\
 	KEEP(*(.init.ramfs.info))
@@ -875,7 +877,7 @@
  * sharing between subsections for different purposes.
  */
 #define PERCPU_INPUT(cacheline)						\
-	__per_cpu_start = .;						\
+	VMLINUX_SYMBOL(__per_cpu_start) = .;				\
 	*(.data..percpu..first)						\
 	. = ALIGN(PAGE_SIZE);						\
 	*(.data..percpu..page_aligned)					\
@@ -885,7 +887,7 @@
 	*(.data..percpu)						\
 	*(.data..percpu..shared_aligned)				\
 	PERCPU_DECRYPTED_SECTION					\
-	__per_cpu_end = .;
+	VMLINUX_SYMBOL(__per_cpu_end) = .;
 
 /**
  * PERCPU_VADDR - define output section for percpu area
@@ -912,11 +914,12 @@
  * address, use PERCPU_SECTION.
  */
 #define PERCPU_VADDR(cacheline, vaddr, phdr)				\
-	__per_cpu_load = .;						\
-	.data..percpu vaddr : AT(__per_cpu_load - LOAD_OFFSET) {	\
+	VMLINUX_SYMBOL(__per_cpu_load) = .;				\
+	.data..percpu vaddr : AT(VMLINUX_SYMBOL(__per_cpu_load)		\
+				- LOAD_OFFSET) {			\
 		PERCPU_INPUT(cacheline)					\
 	} phdr								\
-	. = __per_cpu_load + SIZEOF(.data..percpu);
+	. = VMLINUX_SYMBOL(__per_cpu_load) + SIZEOF(.data..percpu);
 
 /**
  * PERCPU_SECTION - define output section for percpu area, simple version
@@ -933,7 +936,7 @@
 #define PERCPU_SECTION(cacheline)					\
 	. = ALIGN(PAGE_SIZE);						\
 	.data..percpu	: AT(ADDR(.data..percpu) - LOAD_OFFSET) {	\
-		__per_cpu_load = .;					\
+		VMLINUX_SYMBOL(__per_cpu_load) = .;			\
 		PERCPU_INPUT(cacheline)					\
 	}
 
@@ -972,9 +975,9 @@
 #define INIT_TEXT_SECTION(inittext_align)				\
 	. = ALIGN(inittext_align);					\
 	.init.text : AT(ADDR(.init.text) - LOAD_OFFSET) {		\
-		_sinittext = .;						\
+		VMLINUX_SYMBOL(_sinittext) = .;				\
 		INIT_TEXT						\
-		_einittext = .;						\
+		VMLINUX_SYMBOL(_einittext) = .;				\
 	}
 
 #define INIT_DATA_SECTION(initsetup_align)				\
@@ -988,8 +991,8 @@
 
 #define BSS_SECTION(sbss_align, bss_align, stop_align)			\
 	. = ALIGN(sbss_align);						\
-	__bss_start = .;						\
+	VMLINUX_SYMBOL(__bss_start) = .;				\
 	SBSS(sbss_align)						\
 	BSS(bss_align)							\
 	. = ALIGN(stop_align);						\
-	__bss_stop = .;
+	VMLINUX_SYMBOL(__bss_stop) = .;
diff --git a/include/linux/export.h b/include/linux/export.h
index fd8711ed9ac4..34c34d09103c 100644
--- a/include/linux/export.h
+++ b/include/linux/export.h
@@ -10,6 +10,19 @@
  * hackers place grumpy comments in header files.
  */
 
+/* Some toolchains use a `_' prefix for all user symbols. */
+#ifdef CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX
+#define __VMLINUX_SYMBOL(x) _##x
+#define __VMLINUX_SYMBOL_STR(x) "_" #x
+#else
+#define __VMLINUX_SYMBOL(x) x
+#define __VMLINUX_SYMBOL_STR(x) #x
+#endif
+
+/* Indirect, so macros are expanded before pasting. */
+#define VMLINUX_SYMBOL(x) __VMLINUX_SYMBOL(x)
+#define VMLINUX_SYMBOL_STR(x) __VMLINUX_SYMBOL_STR(x)
+
 #ifndef __ASSEMBLY__
 #ifdef MODULE
 extern struct module __this_module;
@@ -27,14 +40,14 @@ extern struct module __this_module;
 #if defined(CONFIG_MODULE_REL_CRCS)
 #define __CRC_SYMBOL(sym, sec)						\
 	asm("	.section \"___kcrctab" sec "+" #sym "\", \"a\"	\n"	\
-	    "	.weak	__crc_" #sym "				\n"	\
-	    "	.long	__crc_" #sym " - .			\n"	\
+	    "	.weak	" VMLINUX_SYMBOL_STR(__crc_##sym) "	\n"	\
+	    "	.long	" VMLINUX_SYMBOL_STR(__crc_##sym) " - .	\n"	\
 	    "	.previous					\n");
 #else
 #define __CRC_SYMBOL(sym, sec)						\
 	asm("	.section \"___kcrctab" sec "+" #sym "\", \"a\"	\n"	\
-	    "	.weak	__crc_" #sym "				\n"	\
-	    "	.long	__crc_" #sym "				\n"	\
+	    "	.weak	" VMLINUX_SYMBOL_STR(__crc_##sym) "	\n"	\
+	    "	.long	" VMLINUX_SYMBOL_STR(__crc_##sym) "	\n"	\
 	    "	.previous					\n");
 #endif
 #else
@@ -80,7 +93,7 @@ struct kernel_symbol {
 	__CRC_SYMBOL(sym, sec)						\
 	static const char __kstrtab_##sym[]				\
 	__attribute__((section("__ksymtab_strings"), used, aligned(1)))	\
-	= #sym;								\
+	= VMLINUX_SYMBOL_STR(sym);					\
 	__KSYMTAB_ENTRY(sym, sec)
 
 #if defined(__DISABLE_EXPORTS)
diff --git a/include/linux/linkage.h b/include/linux/linkage.h
index 7e020782ade2..d287823ee947 100644
--- a/include/linux/linkage.h
+++ b/include/linux/linkage.h
@@ -24,16 +24,16 @@
 
 #ifndef cond_syscall
 #define cond_syscall(x)	asm(				\
-	".weak " __stringify(x) "\n\t"			\
-	".set  " __stringify(x) ","			\
-		 __stringify(sys_ni_syscall))
+	".weak " VMLINUX_SYMBOL_STR(x) "\n\t"		\
+	".set  " VMLINUX_SYMBOL_STR(x) ","		\
+		 VMLINUX_SYMBOL_STR(sys_ni_syscall))
 #endif
 
 #ifndef SYSCALL_ALIAS
 #define SYSCALL_ALIAS(alias, name) asm(			\
-	".globl " __stringify(alias) "\n\t"		\
-	".set   " __stringify(alias) ","		\
-		  __stringify(name))
+	".globl " VMLINUX_SYMBOL_STR(alias) "\n\t"	\
+	".set   " VMLINUX_SYMBOL_STR(alias) ","		\
+		  VMLINUX_SYMBOL_STR(name))
 #endif
 
 #define __page_aligned_data	__section(.data..page_aligned) __aligned(PAGE_SIZE)
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 2f66ed388d1c..aa4dac94e563 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -123,6 +123,7 @@ $(obj)/%.i: $(src)/%.c FORCE
 cmd_gensymtypes_c =                                                         \
     $(CPP) -D__GENKSYMS__ $(c_flags) $< |                                   \
     scripts/genksyms/genksyms $(if $(1), -T $(2))                           \
+     $(patsubst y,-s _,$(CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX))             \
      $(patsubst y,-R,$(CONFIG_MODULE_REL_CRCS))                             \
      $(if $(KBUILD_PRESERVE),-p)                                            \
      -r $(firstword $(wildcard $(2:.symtypes=.symref) /dev/null))
@@ -334,6 +335,7 @@ cmd_gensymtypes_S =                                                         \
      sed 's/.*___EXPORT_SYMBOL[[:space:]]*\([a-zA-Z0-9_]*\)[[:space:]]*,.*/EXPORT_SYMBOL(\1);/' ; } | \
     $(CPP) -D__GENKSYMS__ $(c_flags) -xc - |                                \
     scripts/genksyms/genksyms $(if $(1), -T $(2))                           \
+     $(patsubst y,-s _,$(CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX))             \
      $(patsubst y,-R,$(CONFIG_MODULE_REL_CRCS))                             \
      $(if $(KBUILD_PRESERVE),-p)                                            \
      -r $(firstword $(wildcard $(2:.symtypes=.symref) /dev/null))
@@ -444,10 +446,15 @@ targets += $(lib-target)
 
 dummy-object = $(obj)/.lib_exports.o
 ksyms-lds = $(dot-target).lds
+ifdef CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX
+ref_prefix = EXTERN(_
+else
+ref_prefix = EXTERN(
+endif
 
 quiet_cmd_export_list = EXPORTS $@
 cmd_export_list = $(OBJDUMP) -h $< | \
-	sed -ne '/___ksymtab/s/.*+\([^ ]*\).*/EXTERN(\1)/p' >$(ksyms-lds);\
+	sed -ne '/___ksymtab/s/.*+\([^ ]*\).*/$(ref_prefix)\1)/p' >$(ksyms-lds);\
 	rm -f $(dummy-object);\
 	echo | $(CC) $(a_flags) -c -o $(dummy-object) -x assembler -;\
 	$(LD) $(ld_flags) -r -o $@ -T $(ksyms-lds) $(dummy-object);\
diff --git a/scripts/adjust_autoksyms.sh b/scripts/adjust_autoksyms.sh
index a904bf1f5e67..e6023399ceb0 100755
--- a/scripts/adjust_autoksyms.sh
+++ b/scripts/adjust_autoksyms.sh
@@ -49,6 +49,12 @@ EOT
 sed 's/ko$/mod/' modules.order |
 xargs -n1 sed -n -e '2{s/ /\n/g;/^$/!p;}' -- |
 sort -u |
+while read sym; do
+	if [ -n "$CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX" ]; then
+		sym="${sym#_}"
+	fi
+	echo ${sym}
+done |
 sed -e 's/\(.*\)/#define __KSYM_\1 1/' >> "$new_ksyms_file"
 
 # Special case for modversions (see modpost.c)
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index e739f565497e..c65f5647d2af 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -5274,6 +5274,16 @@ sub process {
 			}
 		}
 
+# make sure symbols are always wrapped with VMLINUX_SYMBOL() ...
+# all assignments may have only one of the following with an assignment:
+#	.
+#	ALIGN(...)
+#	VMLINUX_SYMBOL(...)
+		if ($realfile eq 'vmlinux.lds.h' && $line =~ /(?:(?:^|\s)$Ident\s*=|=\s*$Ident(?:\s|$))/) {
+			WARN("MISSING_VMLINUX_SYMBOL",
+			     "vmlinux.lds.h needs VMLINUX_SYMBOL() around C-visible symbols\n" . $herecurr);
+		}
+
 # check for redundant bracing round if etc
 		if ($line =~ /(^.*)\bif\b/ && $1 !~ /else\s*$/) {
 			my ($level, $endln, @chunks) =
diff --git a/scripts/depmod.sh b/scripts/depmod.sh
index e083bcae343f..805b6d5b36a2 100755
--- a/scripts/depmod.sh
+++ b/scripts/depmod.sh
@@ -3,12 +3,13 @@
 #
 # A depmod wrapper used by the toplevel Makefile
 
-if test $# -ne 2; then
-	echo "Usage: $0 /sbin/depmod <kernelrelease>" >&2
+if test $# -ne 3; then
+	echo "Usage: $0 /sbin/depmod <kernelrelease> <symbolprefix>" >&2
 	exit 1
 fi
 DEPMOD=$1
 KERNELRELEASE=$2
+SYMBOL_PREFIX=$3
 
 if ! test -r System.map ; then
 	echo "Warning: modules_install: missing 'System.map' file. Skipping depmod." >&2
@@ -21,6 +22,24 @@ if [ -z $(command -v $DEPMOD) ]; then
 	exit 0
 fi
 
+# older versions of depmod don't support -P <symbol-prefix>
+# support was added in module-init-tools 3.13
+if test -n "$SYMBOL_PREFIX"; then
+	release=$("$DEPMOD" --version)
+	package=$(echo "$release" | cut -d' ' -f 1)
+	if test "$package" = "module-init-tools"; then
+		version=$(echo "$release" | cut -d' ' -f 2)
+		later=$(printf '%s\n' "$version" "3.13" | sort -V | tail -n 1)
+		if test "$later" != "$version"; then
+			# module-init-tools < 3.13, drop the symbol prefix
+			SYMBOL_PREFIX=""
+		fi
+	fi
+	if test -n "$SYMBOL_PREFIX"; then
+		SYMBOL_PREFIX="-P $SYMBOL_PREFIX"
+	fi
+fi
+
 # older versions of depmod require the version string to start with three
 # numbers, so we cheat with a symlink here
 depmod_hack_needed=true
@@ -43,7 +62,7 @@ set -- -ae -F System.map
 if test -n "$INSTALL_MOD_PATH"; then
 	set -- "$@" -b "$INSTALL_MOD_PATH"
 fi
-"$DEPMOD" "$@" "$KERNELRELEASE"
+"$DEPMOD" "$@" "$KERNELRELEASE" $SYMBOL_PREFIX
 ret=$?
 
 if $depmod_hack_needed; then
diff --git a/scripts/genksyms/genksyms.c b/scripts/genksyms/genksyms.c
index 23eff234184f..139bf698a7a8 100644
--- a/scripts/genksyms/genksyms.c
+++ b/scripts/genksyms/genksyms.c
@@ -34,6 +34,7 @@ int in_source_file;
 
 static int flag_debug, flag_dump_defs, flag_reference, flag_dump_types,
 	   flag_preserve, flag_warnings, flag_rel_crcs;
+static const char *mod_prefix = "";
 
 static int errors;
 static int nsyms;
@@ -681,10 +682,10 @@ void export_symbol(const char *name)
 			fputs(">\n", debugfile);
 
 		/* Used as a linker script. */
-		printf(!flag_rel_crcs ? "__crc_%s = 0x%08lx;\n" :
+		printf(!flag_rel_crcs ? "%s__crc_%s = 0x%08lx;\n" :
 		       "SECTIONS { .rodata : ALIGN(4) { "
-		       "__crc_%s = .; LONG(0x%08lx); } }\n",
-		       name, crc);
+		       "%s__crc_%s = .; LONG(0x%08lx); } }\n",
+		       mod_prefix, name, crc);
 	}
 }
 
@@ -757,6 +758,7 @@ int main(int argc, char **argv)
 
 #ifdef __GNU_LIBRARY__
 	struct option long_opts[] = {
+		{"symbol-prefix", 1, 0, 's'},
 		{"debug", 0, 0, 'd'},
 		{"warnings", 0, 0, 'w'},
 		{"quiet", 0, 0, 'q'},
@@ -776,6 +778,9 @@ int main(int argc, char **argv)
 	while ((o = getopt(argc, argv, "s:dwqVDr:T:phR")) != EOF)
 #endif				/* __GNU_LIBRARY__ */
 		switch (o) {
+		case 's':
+			mod_prefix = optarg;
+			break;
 		case 'd':
 			flag_debug++;
 			break;
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index ae6504d07fd6..75ec25554111 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -60,6 +60,7 @@ static struct sym_entry *table;
 static unsigned int table_size, table_cnt;
 static int all_symbols = 0;
 static int absolute_percpu = 0;
+static char symbol_prefix_char = '\0';
 static int base_relative = 0;
 
 static int token_profit[0x10000];
@@ -72,6 +73,7 @@ static unsigned char best_table_len[256];
 static void usage(void)
 {
 	fprintf(stderr, "Usage: kallsyms [--all-symbols] "
+			"[--symbol-prefix=<prefix char>] "
 			"[--base-relative] < in.map > out.S\n");
 	exit(1);
 }
@@ -109,22 +111,28 @@ static int check_symbol_range(const char *sym, unsigned long long addr,
 
 static int read_symbol(FILE *in, struct sym_entry *s)
 {
-	char sym[500], stype;
+	char str[500];
+	char *sym, stype;
 	int rc;
 
-	rc = fscanf(in, "%llx %c %499s\n", &s->addr, &stype, sym);
+	rc = fscanf(in, "%llx %c %499s\n", &s->addr, &stype, str);
 	if (rc != 3) {
-		if (rc != EOF && fgets(sym, 500, in) == NULL)
+		if (rc != EOF && fgets(str, 500, in) == NULL)
 			fprintf(stderr, "Read error or end of file.\n");
 		return -1;
 	}
-	if (strlen(sym) >= KSYM_NAME_LEN) {
+	if (strlen(str) >= KSYM_NAME_LEN) {
 		fprintf(stderr, "Symbol %s too long for kallsyms (%zu >= %d).\n"
 				"Please increase KSYM_NAME_LEN both in kernel and kallsyms.c\n",
-			sym, strlen(sym), KSYM_NAME_LEN);
+			str, strlen(str), KSYM_NAME_LEN);
 		return -1;
 	}
 
+	sym = str;
+	/* skip prefix char */
+	if (symbol_prefix_char && str[0] == symbol_prefix_char)
+		sym++;
+
 	/* Ignore most absolute/undefined (?) symbols. */
 	if (strcmp(sym, "_text") == 0)
 		_text = s->addr;
@@ -145,7 +153,7 @@ static int read_symbol(FILE *in, struct sym_entry *s)
 		 is_arm_mapping_symbol(sym))
 		return -1;
 	/* exclude also MIPS ELF local symbols ($L123 instead of .L123) */
-	else if (sym[0] == '$')
+	else if (str[0] == '$')
 		return -1;
 	/* exclude debugging symbols */
 	else if (stype == 'N' || stype == 'n')
@@ -156,14 +164,14 @@ static int read_symbol(FILE *in, struct sym_entry *s)
 
 	/* include the type field in the symbol name, so that it gets
 	 * compressed together */
-	s->len = strlen(sym) + 1;
+	s->len = strlen(str) + 1;
 	s->sym = malloc(s->len + 1);
 	if (!s->sym) {
 		fprintf(stderr, "kallsyms failure: "
 			"unable to allocate required amount of memory\n");
 		exit(EXIT_FAILURE);
 	}
-	strcpy((char *)s->sym + 1, sym);
+	strcpy((char *)s->sym + 1, str);
 	s->sym[0] = stype;
 
 	s->percpu_absolute = 0;
@@ -226,6 +234,11 @@ static int symbol_valid(struct sym_entry *s)
 	int i;
 	char *sym_name = (char *)s->sym + 1;
 
+	/* skip prefix char */
+	if (symbol_prefix_char && *sym_name == symbol_prefix_char)
+		sym_name++;
+
+
 	/* if --all-symbols is not specified, then symbols outside the text
 	 * and inittext sections are discarded */
 	if (!all_symbols) {
@@ -290,9 +303,15 @@ static void read_map(FILE *in)
 
 static void output_label(char *label)
 {
-	printf(".globl %s\n", label);
+	if (symbol_prefix_char)
+		printf(".globl %c%s\n", symbol_prefix_char, label);
+	else
+		printf(".globl %s\n", label);
 	printf("\tALGN\n");
-	printf("%s:\n", label);
+	if (symbol_prefix_char)
+		printf("%c%s:\n", symbol_prefix_char, label);
+	else
+		printf("%s:\n", label);
 }
 
 /* uncompress a compressed symbol. When this function is called, the best table
@@ -749,7 +768,13 @@ int main(int argc, char **argv)
 				absolute_percpu = 1;
 			else if (strcmp(argv[i], "--base-relative") == 0)
 				base_relative = 1;
-			else
+			else if (strncmp(argv[i], "--symbol-prefix=", 16) == 0) {
+				char *p = &argv[i][16];
+				/* skip quote */
+				if ((*p == '"' && *(p+2) == '"') || (*p == '\'' && *(p+2) == '\''))
+					p++;
+				symbol_prefix_char = *p;
+			} else
 				usage();
 		}
 	} else if (argc != 1)
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 915775eb2921..c3c5758ed7d6 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -117,6 +117,10 @@ kallsyms()
 	info KSYM ${2}
 	local kallsymopt;
 
+	if [ -n "${CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX}" ]; then
+		kallsymopt="${kallsymopt} --symbol-prefix=_"
+	fi
+
 	if [ -n "${CONFIG_KALLSYMS_ALL}" ]; then
 		kallsymopt="${kallsymopt} --all-symbols"
 	fi
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index f277e116e0eb..555f152bbebe 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -19,7 +19,9 @@
 #include <stdbool.h>
 #include <errno.h>
 #include "modpost.h"
+#include "../../include/generated/autoconf.h"
 #include "../../include/linux/license.h"
+#include "../../include/linux/export.h"
 
 /* Are we using CONFIG_MODVERSIONS? */
 static int modversions = 0;
@@ -588,7 +590,7 @@ static void parse_elf_finish(struct elf_info *info)
 static int ignore_undef_symbol(struct elf_info *info, const char *symname)
 {
 	/* ignore __this_module, it will be resolved shortly */
-	if (strcmp(symname, "__this_module") == 0)
+	if (strcmp(symname, VMLINUX_SYMBOL_STR(__this_module)) == 0)
 		return 1;
 	/* ignore global offset table */
 	if (strcmp(symname, "_GLOBAL_OFFSET_TABLE_") == 0)
@@ -614,6 +616,9 @@ static int ignore_undef_symbol(struct elf_info *info, const char *symname)
 	return 0;
 }
 
+#define CRC_PFX     VMLINUX_SYMBOL_STR(__crc_)
+#define KSYMTAB_PFX VMLINUX_SYMBOL_STR(__ksymtab_)
+
 static void handle_modversions(struct module *mod, struct elf_info *info,
 			       Elf_Sym *sym, const char *symname)
 {
@@ -628,7 +633,7 @@ static void handle_modversions(struct module *mod, struct elf_info *info,
 		export = export_from_sec(info, get_secindex(info, sym));
 
 	/* CRC'd symbol */
-	if (strstarts(symname, "__crc_")) {
+	if (strstarts(symname, CRC_PFX)) {
 		is_crc = true;
 		crc = (unsigned int) sym->st_value;
 		if (sym->st_shndx != SHN_UNDEF && sym->st_shndx != SHN_ABS) {
@@ -641,7 +646,7 @@ static void handle_modversions(struct module *mod, struct elf_info *info,
 				info->sechdrs[sym->st_shndx].sh_addr : 0);
 			crc = TO_NATIVE(*crcp);
 		}
-		sym_update_crc(symname + strlen("__crc_"), mod, crc,
+		sym_update_crc(symname + strlen(CRC_PFX), mod, crc,
 				export);
 	}
 
@@ -679,10 +684,15 @@ static void handle_modversions(struct module *mod, struct elf_info *info,
 		}
 #endif
 
+#ifdef CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX
+		if (symname[0] != '_')
+			break;
+		else
+			symname++;
+#endif
 		if (is_crc) {
 			const char *e = is_vmlinux(mod->name) ?"":".ko";
-			warn("EXPORT symbol \"%s\" [%s%s] version generation failed, symbol will not be versioned.\n",
-			     symname + strlen("__crc_"), mod->name, e);
+			warn("EXPORT symbol \"%s\" [%s%s] version generation failed, symbol will not be versioned.\n", symname + strlen(CRC_PFX), mod->name, e);
 		}
 		mod->unres = alloc_symbol(symname,
 					  ELF_ST_BIND(sym->st_info) == STB_WEAK,
@@ -690,13 +700,13 @@ static void handle_modversions(struct module *mod, struct elf_info *info,
 		break;
 	default:
 		/* All exported symbols */
-		if (strstarts(symname, "__ksymtab_")) {
-			sym_add_exported(symname + strlen("__ksymtab_"), mod,
+		if (strstarts(symname, KSYMTAB_PFX)) {
+			sym_add_exported(symname + strlen(KSYMTAB_PFX), mod,
 					export);
 		}
-		if (strcmp(symname, "init_module") == 0)
+		if (strcmp(symname, VMLINUX_SYMBOL_STR(init_module)) == 0)
 			mod->has_init = 1;
-		if (strcmp(symname, "cleanup_module") == 0)
+		if (strcmp(symname, VMLINUX_SYMBOL_STR(cleanup_module)) == 0)
 			mod->has_cleanup = 1;
 		break;
 	}
@@ -2230,7 +2240,7 @@ static int add_versions(struct buffer *b, struct module *mod)
 			err = 1;
 			break;
 		}
-		buf_printf(b, "\t{ %#8x, \"%s\" },\n",
+		buf_printf(b, "\t{ %#8x, __VMLINUX_SYMBOL_STR(%s) },\n",
 			   s->crc, s->name);
 	}
 
diff --git a/usr/initramfs_data.S b/usr/initramfs_data.S
index d07648f05bbf..b28da799f6a6 100644
--- a/usr/initramfs_data.S
+++ b/usr/initramfs_data.S
@@ -30,8 +30,8 @@ __irf_start:
 .incbin __stringify(INITRAMFS_IMAGE)
 __irf_end:
 .section .init.ramfs.info,"a"
-.globl __initramfs_size
-__initramfs_size:
+.globl VMLINUX_SYMBOL(__initramfs_size)
+VMLINUX_SYMBOL(__initramfs_size):
 #ifdef CONFIG_64BIT
 	.quad __irf_end - __irf_start
 #else
-- 
2.20.1 (Apple Git-117)
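
For readers unfamiliar with the two-level macro trick used by this
patch, here is a minimal standalone sketch (not part of the patch). It
copies the VMLINUX_SYMBOL()/VMLINUX_SYMBOL_STR() definitions from the
include/linux/export.h hunk and forces the underscore-prefix case on so
the effect is visible on any toolchain. DEMO_SYM, demo_value and the
demo helper functions are hypothetical names used only for illustration.

```c
#include <string.h>

/*
 * Illustration only: pretend the toolchain prepends '_' to C symbols.
 * In the kernel this would come from Kconfig, not from a #define here.
 */
#define CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX 1

#ifdef CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX
#define __VMLINUX_SYMBOL(x) _##x
#define __VMLINUX_SYMBOL_STR(x) "_" #x
#else
#define __VMLINUX_SYMBOL(x) x
#define __VMLINUX_SYMBOL_STR(x) #x
#endif

/* Indirect, so macros are expanded before pasting. */
#define VMLINUX_SYMBOL(x) __VMLINUX_SYMBOL(x)
#define VMLINUX_SYMBOL_STR(x) __VMLINUX_SYMBOL_STR(x)

/* Hypothetical alias: a macro that names a linker symbol. */
#define DEMO_SYM __bss_start

/* Token pasting: this declares a variable named _demo_value. */
int VMLINUX_SYMBOL(demo_value) = 42;

/* Reads the variable through its pasted (prefixed) name. */
int read_demo_value(void)
{
	return _demo_value;
}

/* Direct use stringifies the argument before it is macro-expanded. */
const char *demo_direct(void)
{
	return __VMLINUX_SYMBOL_STR(DEMO_SYM);
}

/* The indirect form expands DEMO_SYM first, then adds the prefix. */
const char *demo_indirect(void)
{
	return VMLINUX_SYMBOL_STR(DEMO_SYM);
}
```

With the prefix enabled, demo_indirect() yields "___bss_start" (one
prefix underscore plus the two in the name), while demo_direct() yields
"_DEMO_SYM" because the alias is never expanded.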


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um


^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 31/37] lkl: add support for Windows hosts
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Hajime Tazaki, Jens Staal

From: Octavian Purdila <tavi.purdila@gmail.com>

This patch allows LKL to be compiled for Windows hosts with the MinGW
toolchain. Note that the binutils patches [1] that fix weak symbol
linking are required to successfully compile LKL with MinGW.

The patch disables the modpost pass over vmlinux since modpost only
works with ELF objects.
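
As a rough illustration of why an ELF-only tool cannot process MinGW
output, the sketch below checks a buffer for the ELF magic number. It is
not modpost's actual code, and looks_like_elf() is a hypothetical name.
An ELF file starts with the bytes 0x7f 'E' 'L' 'F', whereas a COFF
object starts with a little-endian machine-type field (for example
0x64 0x86 for x86-64).

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if buf (the first bytes of a file)
 * begins with the ELF magic number, 0 otherwise.
 */
int looks_like_elf(const unsigned char *buf, size_t len)
{
	static const unsigned char elf_magic[4] = { 0x7f, 'E', 'L', 'F' };

	if (len < sizeof(elf_magic))
		return 0;
	return memcmp(buf, elf_magic, sizeof(elf_magic)) == 0;
}
```

A tool that depends on such a check has nothing to work with on a
pe-i386 or pe-x86-64 build, which is why the pass is skipped there.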

It also adds a workaround for an #include_next <stdarg.h> error which
is apparently caused by using -nostdinc.

[1] https://sourceware.org/ml/binutils/2015-10/msg00234.html

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Jens Staal <staal1978@gmail.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
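
Note on the compiler_attributes.h hunk: per the GCC documentation, the
plain "printf" format archetype refers to the formats accepted by the
target's C runtime, while "gnu_printf" always refers to the formats
accepted by glibc. The sketch below is a standalone illustration, not
kernel code. DEMO_PRINTF_ATTR stands in for the kernel's __printf
macro, and demo_format() is a hypothetical helper.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the kernel's __printf(a, b); same selection logic. */
#ifdef __MINGW32__
#define DEMO_PRINTF_ATTR(a, b) __attribute__((__format__(gnu_printf, a, b)))
#else
#define DEMO_PRINTF_ATTR(a, b) __attribute__((__format__(printf, a, b)))
#endif

/*
 * Hypothetical helper: formats into buf and returns what vsnprintf()
 * returns. The attribute lets the compiler check fmt against the
 * variadic arguments (format string is argument 3, varargs start at 4).
 */
DEMO_PRINTF_ATTR(3, 4)
int demo_format(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);
	return n;
}
```

On a non-MinGW host, calling demo_format(buf, sizeof(buf), "%zu:%lld",
(size_t)7, 42LL) fills buf with "7:42". On MinGW the gnu_printf variant
is meant to keep the compiler's format checking aligned with the
glibc-style specifiers that kernel code uses.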
 arch/um/lkl/Kconfig                 | 2 ++
 arch/um/lkl/include/system/stdarg.h | 2 ++
 include/linux/compiler_attributes.h | 4 ++++
 lib/.gitignore                      | 2 ++
 lib/raid6/.gitignore                | 1 +
 scripts/.gitignore                  | 2 ++
 scripts/basic/.gitignore            | 1 +
 scripts/kconfig/.gitignore          | 1 +
 scripts/link-vmlinux.sh             | 2 ++
 scripts/mod/.gitignore              | 1 +
 10 files changed, 18 insertions(+)
 create mode 100644 arch/um/lkl/include/system/stdarg.h

diff --git a/arch/um/lkl/Kconfig b/arch/um/lkl/Kconfig
index 07b3699095ae..1629e2679b75 100644
--- a/arch/um/lkl/Kconfig
+++ b/arch/um/lkl/Kconfig
@@ -20,6 +20,8 @@ config LKL
        select DMA_DIRECT_OPS
        select PHYS_ADDR_T_64BIT if 64BIT
        select 64BIT if "$(OUTPUT_FORMAT)" = "elf64-x86-64"
+       select 64BIT if "$(OUTPUT_FORMAT)" = "pe-x86-64"
+       select HAVE_UNDERSCORE_SYMBOL_PREFIX if "$(OUTPUT_FORMAT)" = "pe-i386"
        select 64BIT if "$(OUTPUT_FORMAT)" = "elf64-x86-64-freebsd"
        select NET
        select MULTIUSER
diff --git a/arch/um/lkl/include/system/stdarg.h b/arch/um/lkl/include/system/stdarg.h
new file mode 100644
index 000000000000..12077a36828c
--- /dev/null
+++ b/arch/um/lkl/include/system/stdarg.h
@@ -0,0 +1,2 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* empty file to avoid #include_next<stdarg.h> error */
diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h
index 6b318efd8a74..1981b1c323c1 100644
--- a/include/linux/compiler_attributes.h
+++ b/include/linux/compiler_attributes.h
@@ -154,7 +154,11 @@
  *   gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-format-function-attribute
  * clang: https://clang.llvm.org/docs/AttributeReference.html#format
  */
+#ifdef __MINGW32__
+#define __printf(a, b)             __attribute__((__format__(gnu_printf, a, b)))
+#else
 #define __printf(a, b)                  __attribute__((__format__(printf, a, b)))
+#endif
 #define __scanf(a, b)                   __attribute__((__format__(scanf, a, b)))
 
 /*
diff --git a/lib/.gitignore b/lib/.gitignore
index f2a39c9e5485..eb9f11b81fe1 100644
--- a/lib/.gitignore
+++ b/lib/.gitignore
@@ -2,7 +2,9 @@
 # Generated files
 #
 gen_crc32table
+gen_crc32table.exe
 gen_crc64table
+gen_crc64table.exe
 crc32table.h
 crc64table.h
 oid_registry_data.c
diff --git a/lib/raid6/.gitignore b/lib/raid6/.gitignore
index 3de0d8921286..80e3566535aa 100644
--- a/lib/raid6/.gitignore
+++ b/lib/raid6/.gitignore
@@ -1,4 +1,5 @@
 mktables
+mktables.exe
 altivec*.c
 int*.c
 tables.c
diff --git a/scripts/.gitignore b/scripts/.gitignore
index 17f8cef88fa8..ec9138a39b25 100644
--- a/scripts/.gitignore
+++ b/scripts/.gitignore
@@ -4,8 +4,10 @@
 bin2c
 conmakehash
 kallsyms
+kallsyms.exe
 pnmtologo
 unifdef
+unifdef.exe
 recordmcount
 sortextable
 asn1_compiler
diff --git a/scripts/basic/.gitignore b/scripts/basic/.gitignore
index a776371a3502..77ce153243fa 100644
--- a/scripts/basic/.gitignore
+++ b/scripts/basic/.gitignore
@@ -1 +1,2 @@
 fixdep
+fixdep.exe
diff --git a/scripts/kconfig/.gitignore b/scripts/kconfig/.gitignore
index b5bf92f66d11..aa27000d896f 100644
--- a/scripts/kconfig/.gitignore
+++ b/scripts/kconfig/.gitignore
@@ -8,6 +8,7 @@
 # configuration programs
 #
 conf
+conf.exe
 mconf
 nconf
 qconf
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index c3c5758ed7d6..553d966a1986 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -209,6 +209,7 @@ fi;
 # final build of init/
 ${MAKE} -f "${srctree}/scripts/Makefile.build" obj=init
 
+if [ -e scripts/mod/modpost ]; then
 #link vmlinux.o
 info LD vmlinux.o
 modpost_link vmlinux.o
@@ -218,6 +219,7 @@ ${MAKE} -f "${srctree}/scripts/Makefile.modpost" MODPOST_VMLINUX=1
 
 info MODINFO modules.builtin.modinfo
 ${OBJCOPY} -j .modinfo -O binary vmlinux.o modules.builtin.modinfo
+fi
 
 kallsymso=""
 kallsyms_vmlinux=""
diff --git a/scripts/mod/.gitignore b/scripts/mod/.gitignore
index 3bd11b603173..cd67845e326d 100644
--- a/scripts/mod/.gitignore
+++ b/scripts/mod/.gitignore
@@ -1,4 +1,5 @@
 elfconfig.h
 mk_elfconfig
 modpost
+modpost.exe
 devicetable-offsets.h
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 32/37] lkl tools: add support for Windows host
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Hajime Tazaki, Patrick Collins, Yuan Liu

From: Octavian Purdila <tavi.purdila@gmail.com>

Add host operations for Windows host and virtio disk support.

Signed-off-by: Andreas Gnau <andreas.gnau@intel.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Signed-off-by: Patrick Collins <pscollins@google.com>
Signed-off-by: Yuan Liu <liuyuan@google.com>
Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
---
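
For orientation, the host selection done in Makefile.autoconf and
arch/um/lkl/Kconfig can be summarized as a lookup on the linker's
output format string. The C sketch below only restates the rules that
are visible in the hunks of this series; struct demo_host and
classify_host() are hypothetical names and do not exist in the tree.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical summary of what the build derives from the format. */
struct demo_host {
	bool posix;             /* listed in POSIX_HOSTS */
	bool nt;                /* listed in NT_HOSTS */
	bool is_64bit;          /* Kconfig selects 64BIT */
	bool underscore_prefix; /* selects HAVE_UNDERSCORE_SYMBOL_PREFIX */
};

/* Maps an output format such as "pe-x86-64" to its host properties. */
struct demo_host classify_host(const char *fmt)
{
	struct demo_host h = { false, false, false, false };

	if (!strcmp(fmt, "elf64-x86-64") ||
	    !strcmp(fmt, "elf64-x86-64-freebsd")) {
		h.posix = true;
		h.is_64bit = true;
	} else if (!strcmp(fmt, "elf32-i386")) {
		h.posix = true;
	} else if (!strcmp(fmt, "pe-x86-64")) {
		h.nt = true;
		h.is_64bit = true;
	} else if (!strcmp(fmt, "pe-i386")) {
		h.nt = true;
		h.underscore_prefix = true;
	}
	return h;
}
```

Any format not listed leaves all fields false, which matches the
Makefile behaviour of calling neither posix_host nor nt_host.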
 tools/lkl/.gitignore                   |   2 +
 tools/lkl/Makefile.autoconf            |  22 ++
 tools/lkl/include/mingw32/sys/socket.h |   4 +
 tools/lkl/lib/Build                    |   2 +
 tools/lkl/lib/nt-host.c                | 375 +++++++++++++++++++++++++
 tools/lkl/tests/test.sh                |   4 +-
 6 files changed, 408 insertions(+), 1 deletion(-)
 create mode 100644 tools/lkl/include/mingw32/sys/socket.h
 create mode 100644 tools/lkl/lib/nt-host.c

diff --git a/tools/lkl/.gitignore b/tools/lkl/.gitignore
index c78ec268e4b0..48232402224d 100644
--- a/tools/lkl/.gitignore
+++ b/tools/lkl/.gitignore
@@ -1,3 +1,5 @@
+*.exe
+*.dll
 Makefile.conf
 include/lkl_autoconf.h
 tests/autoconf.sh
diff --git a/tools/lkl/Makefile.autoconf b/tools/lkl/Makefile.autoconf
index fd63b8aa5c77..1631f5cc25ac 100644
--- a/tools/lkl/Makefile.autoconf
+++ b/tools/lkl/Makefile.autoconf
@@ -1,4 +1,5 @@
 POSIX_HOSTS=elf64-x86-64 elf32-i386 elf64-x86-64-freebsd
+NT_HOSTS=pe-i386 pe-x86-64
 
 define set_autoconf_var
   $(shell echo "#define LKL_HOST_CONFIG_$(1) $(2)" \
@@ -65,6 +66,26 @@ define posix_host
   $(if $(filter $(1),elf32-i386),$(call set_autoconf_var,I386,y))
 endef
 
+define nt64_host
+  $(call set_autoconf_var,NEEDS_LARGP,y)
+  CFLAGS += -Wl,--enable-auto-image-base -Wl,--image-base -Wl,0x10000000 \
+	 -Wl,--out-implib=$(OUTPUT)liblkl.dll.a -Wl,--export-all-symbols \
+	 -Wl,--enable-auto-import
+  LDFLAGS +=-Wl,--image-base -Wl,0x10000000 -Wl,--enable-auto-image-base \
+	   -Wl,--out-implib=$(OUTPUT)liblkl.dll.a -Wl,--export-all-symbols \
+	   -Wl,--enable-auto-import
+endef
+
+define nt_host
+  $(call set_autoconf_var,NT,y)
+  KOPT = "KALLSYMS_EXTRA_PASS=1"
+  LDLIBS += -lws2_32
+  EXESUF := .exe
+  SOSUF := .dll
+  CFLAGS += -Iinclude/mingw32
+  $(if $(filter $(1),pe-x86-64),$(call nt64_host))
+endef
+
 define do_autoconf
   export CROSS_COMPILE := $(CROSS_COMPILE)
   export CC := $(CROSS_COMPILE)gcc
@@ -74,6 +95,7 @@ define do_autoconf
   $(eval CC := $(CROSS_COMPILE)gcc)
   $(eval LD_FMT := $(shell $(LD) -r -print-output-format))
   $(if $(filter $(LD_FMT),$(POSIX_HOSTS)),$(call posix_host,$(LD_FMT)))
+  $(if $(filter $(LD_FMT),$(NT_HOSTS)),$(call nt_host,$(LD_FMT)))
 endef
 
 export do_autoconf
diff --git a/tools/lkl/include/mingw32/sys/socket.h b/tools/lkl/include/mingw32/sys/socket.h
new file mode 100644
index 000000000000..f9ede3170d03
--- /dev/null
+++ b/tools/lkl/include/mingw32/sys/socket.h
@@ -0,0 +1,4 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* fake file to avoid #include <sys/socket.h> error on non-posix
+ * host (e.g., mingw32)
+ */
diff --git a/tools/lkl/lib/Build b/tools/lkl/lib/Build
index 1f1d55f259a3..76a1e62dfca8 100644
--- a/tools/lkl/lib/Build
+++ b/tools/lkl/lib/Build
@@ -1,12 +1,14 @@
 CFLAGS_config.o += -I$(srctree)/tools/perf/pmu-events
 CFLAGS_posix-host.o += -D_FILE_OFFSET_BITS=64
 CFLAGS_virtio_net_vde.o += $(pkg-config --cflags vdeplug 2>/dev/null)
+CFLAGS_nt-host.o += -D_WIN32_WINNT=0x0600
 
 liblkl-y += fs.o
 liblkl-y += iomem.o
 liblkl-y += net.o
 liblkl-y += jmp_buf.o
 liblkl-$(LKL_HOST_CONFIG_POSIX) += posix-host.o
+liblkl-$(LKL_HOST_CONFIG_NT) += nt-host.o
 liblkl-y += utils.o
 liblkl-y += virtio_blk.o
 liblkl-y += virtio.o
diff --git a/tools/lkl/lib/nt-host.c b/tools/lkl/lib/nt-host.c
new file mode 100644
index 000000000000..c7613272be3b
--- /dev/null
+++ b/tools/lkl/lib/nt-host.c
@@ -0,0 +1,375 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <windows.h>
+#include <assert.h>
+#include <unistd.h>
+#undef s_addr
+#include <lkl_host.h>
+#include "iomem.h"
+#include "jmp_buf.h"
+
+#define DIFF_1601_TO_1970_IN_100NS (11644473600L * 10000000L)
+
+struct lkl_mutex {
+	int recursive;
+	HANDLE handle;
+};
+
+struct lkl_sem {
+	HANDLE sem;
+};
+
+struct lkl_tls_key {
+	DWORD key;
+};
+
+static struct lkl_sem *sem_alloc(int count)
+{
+	struct lkl_sem *sem = malloc(sizeof(struct lkl_sem));
+
+	sem->sem = CreateSemaphore(NULL, count, 100, NULL);
+	return sem;
+}
+
+static void sem_up(struct lkl_sem *sem)
+{
+	ReleaseSemaphore(sem->sem, 1, NULL);
+}
+
+static void sem_down(struct lkl_sem *sem)
+{
+	WaitForSingleObject(sem->sem, INFINITE);
+}
+
+static void sem_free(struct lkl_sem *sem)
+{
+	CloseHandle(sem->sem);
+	free(sem);
+}
+
+static struct lkl_mutex *mutex_alloc(int recursive)
+{
+	struct lkl_mutex *_mutex = malloc(sizeof(struct lkl_mutex));
+
+	if (!_mutex)
+		return NULL;
+
+	if (recursive)
+		_mutex->handle = CreateMutex(0, FALSE, 0);
+	else
+		_mutex->handle = CreateSemaphore(NULL, 1, 100, NULL);
+	_mutex->recursive = recursive;
+	return _mutex;
+}
+
+static void mutex_lock(struct lkl_mutex *mutex)
+{
+	WaitForSingleObject(mutex->handle, INFINITE);
+}
+
+static void mutex_unlock(struct lkl_mutex *_mutex)
+{
+	if (_mutex->recursive)
+		ReleaseMutex(_mutex->handle);
+	else
+		ReleaseSemaphore(_mutex->handle, 1, NULL);
+}
+
+static void mutex_free(struct lkl_mutex *_mutex)
+{
+	CloseHandle(_mutex->handle);
+	free(_mutex);
+}
+
+static lkl_thread_t thread_create(void (*fn)(void *), void *arg)
+{
+	DWORD WINAPI(*win_fn)(LPVOID arg) = (DWORD WINAPI(*)(LPVOID))fn;
+	HANDLE h = CreateThread(NULL, 0, win_fn, arg, 0, NULL);
+
+	if (!h)
+		return 0;
+
+	return GetThreadId(h);
+}
+
+static void thread_detach(void)
+{
+}
+
+static void thread_exit(void)
+{
+	ExitThread(0);
+}
+
+static int thread_join(lkl_thread_t tid)
+{
+	int ret;
+	HANDLE h;
+
+	h = OpenThread(SYNCHRONIZE, FALSE, tid);
+	if (!h)
+		lkl_printf("%s: can't get thread handle\n", __func__);
+
+	ret = WaitForSingleObject(h, INFINITE);
+	if (ret)
+		lkl_printf("%s: %d\n", __func__, ret);
+
+	CloseHandle(h);
+
+	return ret ? -1 : 0;
+}
+
+static lkl_thread_t thread_self(void)
+{
+	return GetThreadId(GetCurrentThread());
+}
+
+static int thread_equal(lkl_thread_t a, lkl_thread_t b)
+{
+	return a == b;
+}
+
+static struct lkl_tls_key *tls_alloc(void (*destructor)(void *))
+{
+	struct lkl_tls_key *ret = malloc(sizeof(struct lkl_tls_key));
+
+	ret->key = FlsAlloc((PFLS_CALLBACK_FUNCTION)destructor);
+	if (ret->key == TLS_OUT_OF_INDEXES) {
+		free(ret);
+		return NULL;
+	}
+	return ret;
+}
+
+static void tls_free(struct lkl_tls_key *key)
+{
+	/* setting to NULL first to prevent the callback from being called */
+	FlsSetValue(key->key, NULL);
+	FlsFree(key->key);
+	free(key);
+}
+
+static int tls_set(struct lkl_tls_key *key, void *data)
+{
+	return FlsSetValue(key->key, data) ? 0 : -1;
+}
+
+static void *tls_get(struct lkl_tls_key *key)
+{
+	return FlsGetValue(key->key);
+}
+
+
+/*
+ * With 64 bits, we can cover about 584 years at a nanosecond resolution.
+ * FILETIME counts from 1601, but the value is rebased to the Unix epoch
+ * (1970) before scaling, so the result does not overflow until about 2554.
+ */
+static unsigned long long time_ns(void)
+{
+	SYSTEMTIME st;
+	FILETIME ft;
+	ULARGE_INTEGER uli;
+
+	GetSystemTime(&st);
+	SystemTimeToFileTime(&st, &ft);
+	uli.LowPart = ft.dwLowDateTime;
+	uli.HighPart = ft.dwHighDateTime;
+
+	return (uli.QuadPart - DIFF_1601_TO_1970_IN_100NS) * 100;
+}
+
+struct timer {
+	HANDLE queue;
+	void (*callback)(void *arg);
+	void *arg;
+};
+
+static void *timer_alloc(void (*fn)(void *), void *arg)
+{
+	struct timer *t;
+
+	t = malloc(sizeof(*t));
+	if (!t)
+		return NULL;
+
+	t->queue = CreateTimerQueue();
+	if (!t->queue) {
+		free(t);
+		return NULL;
+	}
+
+	t->callback = fn;
+	t->arg = arg;
+
+	return t;
+}
+
+static void CALLBACK timer_callback(void *arg, BOOLEAN TimerOrWaitFired)
+{
+	struct timer *t = (struct timer *)arg;
+
+	if (TimerOrWaitFired)
+		t->callback(t->arg);
+}
+
+static int timer_set_oneshot(void *timer, unsigned long ns)
+{
+	struct timer *t = (struct timer *)timer;
+	HANDLE tmp;
+
+	return !CreateTimerQueueTimer(&tmp, t->queue, timer_callback, t,
+				      ns / 1000000, 0, 0);
+}
+
+static void timer_free(void *timer)
+{
+	struct timer *t = (struct timer *)timer;
+	HANDLE completion;
+
+	completion = CreateEvent(NULL, FALSE, FALSE, NULL);
+	DeleteTimerQueueEx(t->queue, completion);
+	WaitForSingleObject(completion, INFINITE);
+	free(t);
+}
+
+static void panic(void)
+{
+	int *x = NULL;
+
+	*x = 1;
+	assert(0);
+}
+
+static void print(const char *str, int len)
+{
+	write(1, str, len);
+}
+
+static long gettid(void)
+{
+	return GetCurrentThreadId();
+}
+
+static void *mem_alloc(unsigned long size)
+{
+	return malloc(size);
+}
+
+struct lkl_host_operations lkl_host_ops = {
+	.panic = panic,
+	.thread_create = thread_create,
+	.thread_detach = thread_detach,
+	.thread_exit = thread_exit,
+	.thread_join = thread_join,
+	.thread_self = thread_self,
+	.thread_equal = thread_equal,
+	.sem_alloc = sem_alloc,
+	.sem_free = sem_free,
+	.sem_up = sem_up,
+	.sem_down = sem_down,
+	.mutex_alloc = mutex_alloc,
+	.mutex_free = mutex_free,
+	.mutex_lock = mutex_lock,
+	.mutex_unlock = mutex_unlock,
+	.tls_alloc = tls_alloc,
+	.tls_free = tls_free,
+	.tls_set = tls_set,
+	.tls_get = tls_get,
+	.time = time_ns,
+	.timer_alloc = timer_alloc,
+	.timer_set_oneshot = timer_set_oneshot,
+	.timer_free = timer_free,
+	.print = print,
+	.mem_alloc = mem_alloc,
+	.mem_free = free,
+	.ioremap = lkl_ioremap,
+	.iomem_access = lkl_iomem_access,
+	.virtio_devices = lkl_virtio_devs,
+	.gettid = gettid,
+	.jmp_buf_set = jmp_buf_set,
+	.jmp_buf_longjmp = jmp_buf_longjmp,
+};
+
+int handle_get_capacity(struct lkl_disk disk, unsigned long long *res)
+{
+	LARGE_INTEGER tmp;
+
+	if (!GetFileSizeEx(disk.handle, &tmp))
+		return -1;
+
+	*res = tmp.QuadPart;
+	return 0;
+}
+
+static int blk_request(struct lkl_disk disk, struct lkl_blk_req *req)
+{
+	unsigned long long offset = req->sector * 512;
+	OVERLAPPED ov = { 0, };
+	int err = 0, ret;
+
+	switch (req->type) {
+	case LKL_DEV_BLK_TYPE_READ:
+	case LKL_DEV_BLK_TYPE_WRITE:
+	{
+		int i;
+
+		for (i = 0; i < req->count; i++) {
+			DWORD res;
+			struct iovec *buf = &req->buf[i];
+
+			ov.Offset = offset & 0xffffffff;
+			ov.OffsetHigh = offset >> 32;
+
+			if (req->type == LKL_DEV_BLK_TYPE_READ)
+				ret = ReadFile(disk.handle, buf->iov_base,
+					       buf->iov_len, &res, &ov);
+			else
+				ret = WriteFile(disk.handle, buf->iov_base,
+						buf->iov_len, &res, &ov);
+			if (!ret) {
+				lkl_printf("%s: I/O error: %d\n", __func__,
+					   GetLastError());
+				err = -1;
+				goto out;
+			}
+
+			if (res != buf->iov_len) {
+				lkl_printf("%s: I/O error: short: %d %d\n",
+					   __func__, res, buf->iov_len);
+				err = -1;
+				goto out;
+			}
+
+			offset += buf->iov_len;
+		}
+		break;
+	}
+	case LKL_DEV_BLK_TYPE_FLUSH:
+	case LKL_DEV_BLK_TYPE_FLUSH_OUT:
+		ret = FlushFileBuffers(disk.handle);
+		if (!ret)
+			err = -1;
+		break;
+	default:
+		return LKL_DEV_BLK_STATUS_UNSUP;
+	}
+
+out:
+	if (err < 0)
+		return LKL_DEV_BLK_STATUS_IOERR;
+
+	return LKL_DEV_BLK_STATUS_OK;
+}
+
+struct lkl_dev_blk_ops lkl_dev_blk_ops = {
+	.get_capacity = handle_get_capacity,
+	.request = blk_request,
+};
+
+/* Needed to resolve linker error on Win32. We don't really support
+ * any network IO on Windows, anyway, so there's no loss here.
+ */
+int lkl_netdevs_remove(void)
+{
+	return 0;
+}
diff --git a/tools/lkl/tests/test.sh b/tools/lkl/tests/test.sh
index 1a5619aed735..cda932b98058 100644
--- a/tools/lkl/tests/test.sh
+++ b/tools/lkl/tests/test.sh
@@ -98,7 +98,9 @@ lkl_test_exec()
         file=$file.exe
     fi
 
-    if file $file | grep ARM; then
+    if file $file | grep PE32; then
+        WRAPPER="wine"
+    elif file $file | grep ARM; then
         WRAPPER="qemu-arm-static"
     elif file $file | grep "FreeBSD" ; then
         ssh_copy "$file" $BSD_WDIR
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 33/37] kallsyms: Add a config option to select section for kallsyms
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um; +Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch

From: Andreas Abel <aabel@google.com>

This commit adds a kernel config option to select whether the
kallsyms data should be in the .rodata section (the default for
non-LKL builds), or in the .data section (the default for LKL).

This is to avoid relocations in the text segment (TEXTRELs) that
would otherwise occur with LKL when the .rodata and the .text
section end up in the same segment.

Having TEXTRELs can lead to a number of issues:

1. If a shared library contains a TEXTREL, the corresponding memory
pages cannot be shared.

2. Android >=6 and SELinux do not support binaries with TEXTRELs
(http://android-developers.blogspot.com/2016/06/android-changes-for-ndk-developers.html).

3. If a program has a TEXTREL, uses an ifunc, and is compiled with
early binding, this can lead to a segmentation fault when processing
the relocation for the ifunc during dynamic linking because the text
segment is made temporarily non-executable to process the TEXTREL
(line 248 in dl_reloc.c).
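
The selection logic can be sketched in isolation as follows. This is only
an illustration under assumptions, not the code from scripts/kallsyms.c:
the helper names kallsyms_section_directive() and wants_data_section() are
hypothetical, and only the two directive strings and the flag name are
taken from the patch itself.

```c
#include <string.h>

/*
 * Hypothetical helper: return the assembler directive that opens the
 * section holding the kallsyms tables.
 */
static const char *kallsyms_section_directive(int use_data_section)
{
	if (use_data_section)
		return "\t.section .data\n";
	return "\t.section .rodata, \"a\"\n";
}

/*
 * Hypothetical helper: scan the command line for --use-data-section.
 * Returns 1 when the flag is present and 0 otherwise.
 */
static int wants_data_section(int argc, char **argv)
{
	int i;

	for (i = 1; i < argc; i++)
		if (strcmp(argv[i], "--use-data-section") == 0)
			return 1;
	return 0;
}
```

A generator would print the returned directive once, before emitting the
symbol tables, so every table that follows lands in the chosen section.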

Signed-off-by: Andreas Abel <aabel@google.com>
---
 init/Kconfig            | 12 ++++++++++++
 scripts/kallsyms.c      | 11 +++++++++--
 scripts/link-vmlinux.sh |  4 ++++
 3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 81293d78a6ad..bd1a846e0ee0 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1385,6 +1385,18 @@ config POSIX_TIMERS
 
 	  If unsure say y.
 
+config KALLSYMS_USE_DATA_SECTION
+	bool "Use .data instead of .rodata section for kallsyms"
+	depends on KALLSYMS
+	default n
+	help
+	  Enabling this option will put the kallsyms data in the .data section
+	  instead of the .rodata section.
+
+	  This is useful when building the kernel as a library, as it avoids
+	  relocations in the text segment that could otherwise occur if the
+	  .rodata section is in the same segment as the .text section.
+
 config PRINTK
 	default y
 	bool "Enable support for printk" if EXPERT
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 75ec25554111..5e4f270c3904 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -59,6 +59,7 @@ static struct addr_range percpu_range = {
 static struct sym_entry *table;
 static unsigned int table_size, table_cnt;
 static int all_symbols = 0;
+static int use_data_section;
 static int absolute_percpu = 0;
 static char symbol_prefix_char = '\0';
 static int base_relative = 0;
@@ -74,6 +75,7 @@ static void usage(void)
 {
 	fprintf(stderr, "Usage: kallsyms [--all-symbols] "
 			"[--symbol-prefix=<prefix char>] "
+			"[--use-data-section] "
 			"[--base-relative] < in.map > out.S\n");
 	exit(1);
 }
@@ -362,7 +364,10 @@ static void write_src(void)
 	printf("#define ALGN .balign 4\n");
 	printf("#endif\n");
 
-	printf("\t.section .rodata, \"a\"\n");
+	if (use_data_section)
+		printf("\t.section .data\n");
+	else
+		printf("\t.section .rodata, \"a\"\n");
 
 	/* Provide proper symbols relocatability by their relativeness
 	 * to a fixed anchor point in the runtime image, either '_text'
@@ -774,7 +779,9 @@ int main(int argc, char **argv)
 				if ((*p == '"' && *(p+2) == '"') || (*p == '\'' && *(p+2) == '\''))
 					p++;
 				symbol_prefix_char = *p;
-			} else
+			} else if (strcmp(argv[i], "--use-data-section") == 0)
+				use_data_section = 1;
+			else
 				usage();
 		}
 	} else if (argc != 1)
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 553d966a1986..3fc1fc406b38 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -133,6 +133,10 @@ kallsyms()
 		kallsymopt="${kallsymopt} --base-relative"
 	fi
 
+	if [ -n "${CONFIG_KALLSYMS_USE_DATA_SECTION}" ]; then
+		kallsymopt="${kallsymopt} --use-data-section"
+	fi
+
 	local aflags="${KBUILD_AFLAGS} ${KBUILD_AFLAGS_KERNEL}               \
 		      ${NOSTDINC_FLAGS} ${LINUXINCLUDE} ${KBUILD_CPPFLAGS}"
 
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 34/37] lkl: Android ARM (arm/arm64) support
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Hajime Tazaki

This is an initial attempt to run an application with the hijack library
on the Android platform.  It has been tested mostly on Android 6.x and 7.x.

The build process assumes that the Android NDK toolchain is installed on
the host system. The arm32 build needs an alternate linker in order to
avoid a link issue during the build (described in *1).

*1
https://github.com/lkl/linux/issues/59#issuecomment-308961122
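
For illustration only (not part of this patch): the build detects Android
by checking whether the compiler predefines __ANDROID__, and the same
macro can be tested from C at compile time. The function name
lkl_host_flavor() and the strings it returns are hypothetical.

```c
#include <string.h>

/*
 * Hypothetical helper: report which host flavor this translation unit
 * was compiled for, based on the compiler's predefined __ANDROID__ macro.
 */
static const char *lkl_host_flavor(void)
{
#ifdef __ANDROID__
	return "android";
#else
	return "posix";
#endif
}
```

Because the macro comes from the toolchain rather than from the build
system, the same check works for both the arm and arm64 NDK compilers.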

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
---
 arch/um/lkl/Kconfig            |  1 +
 tools/lkl/Makefile.autoconf    |  7 +++-
 tools/lkl/lib/hijack/hijack.c  | 13 ++++++++
 tools/lkl/lib/hijack/init.c    | 11 ++++++
 tools/lkl/tests/disk.sh        | 11 +++++-
 tools/lkl/tests/hijack-test.sh | 43 ++++++++++++++++++------
 tools/lkl/tests/test.sh        | 61 +++++++++++++++++++++++++++++++++-
 7 files changed, 134 insertions(+), 13 deletions(-)

diff --git a/arch/um/lkl/Kconfig b/arch/um/lkl/Kconfig
index 1629e2679b75..fc501b64a2af 100644
--- a/arch/um/lkl/Kconfig
+++ b/arch/um/lkl/Kconfig
@@ -23,6 +23,7 @@ config LKL
        select 64BIT if "$(OUTPUT_FORMAT)" = "pe-x86-64"
        select HAVE_UNDERSCORE_SYMBOL_PREFIX if "$(OUTPUT_FORMAT)" = "pe-i386"
        select 64BIT if "$(OUTPUT_FORMAT)" = "elf64-x86-64-freebsd"
+       select 64BIT if "$(OUTPUT_FORMAT)" = "elf64-littleaarch64"
        select NET
        select MULTIUSER
        select INET
diff --git a/tools/lkl/Makefile.autoconf b/tools/lkl/Makefile.autoconf
index 1631f5cc25ac..7222a95c314f 100644
--- a/tools/lkl/Makefile.autoconf
+++ b/tools/lkl/Makefile.autoconf
@@ -1,4 +1,4 @@
-POSIX_HOSTS=elf64-x86-64 elf32-i386 elf64-x86-64-freebsd
+POSIX_HOSTS=elf64-x86-64 elf32-i386 elf64-x86-64-freebsd elf32-littlearm elf64-littleaarch64
 NT_HOSTS=pe-i386 pe-x86-64
 
 define set_autoconf_var
@@ -17,6 +17,10 @@ define is_defined
 $(shell $(CC) -dM -E - </dev/null | grep $(1))
 endef
 
+define android_host
+  $(call set_autoconf_var,ANDROID,y)
+endef
+
 define bsd_host
   $(call set_autoconf_var,BSD,y)
 endef
@@ -54,6 +58,7 @@ define posix_host
   LDFLAGS += -pie
   CFLAGS += -fPIC -pthread
   SOSUF := .so
+  $(if $(call is_defined,__ANDROID__),$(call android_host),LDLIBS += -lrt -lpthread)
   $(if $(filter $(1),elf64-x86-64-freebsd),$(call bsd_host))
   $(if $(filter $(1),elf32-littlearm),$(call arm_host))
   $(if $(filter $(1),elf64-littleaarch64),$(call aarch64_host))
diff --git a/tools/lkl/lib/hijack/hijack.c b/tools/lkl/lib/hijack/hijack.c
index 485c15d7c279..3a95b9dbe88b 100644
--- a/tools/lkl/lib/hijack/hijack.c
+++ b/tools/lkl/lib/hijack/hijack.c
@@ -204,7 +204,11 @@ int socket(int domain, int type, int protocol)
 }
 
 HOST_CALL(ioctl);
+#ifdef __ANDROID__
+int ioctl(int fd, int req, ...)
+#else
 int ioctl(int fd, unsigned long req, ...)
+#endif
 {
 	va_list vl;
 	long arg;
@@ -586,6 +590,15 @@ void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset)
 	return lkl_sys_mmap(addr, length, prot, flags, fd, offset);
 }
 
+#ifndef __ANDROID__
+HOST_CALL(__xstat64)
+int stat(const char *pathname, struct stat *buf)
+{
+	CHECK_HOST_CALL(__xstat64);
+	return host___xstat64(0, pathname, buf);
+}
+#endif
+
 ssize_t send(int fd, const void *buf, size_t len, int flags)
 {
 	return sendto(fd, buf, len, flags, 0, 0);
diff --git a/tools/lkl/lib/hijack/init.c b/tools/lkl/lib/hijack/init.c
index 2145fb7ec2cb..de00f2018e59 100644
--- a/tools/lkl/lib/hijack/init.c
+++ b/tools/lkl/lib/hijack/init.c
@@ -170,6 +170,17 @@ hijack_init(void)
 	if (single_cpu_mode == 1)
 		PinToFirstCpu(&ori_cpu);
 
+#ifdef __ANDROID__
+	struct sigaction sa;
+
+	sa.sa_handler = SIG_IGN;
+	sa.sa_flags = 0;
+	if (sigaction(32, &sa, 0) == -1) {
+		perror("sigaction");
+		exit(1);
+	}
+#endif
+
 	ret = lkl_start_kernel(&lkl_host_ops, cfg->boot_cmdline);
 	if (ret) {
 		fprintf(stderr, "can't start kernel: %s\n", lkl_strerror(ret));
diff --git a/tools/lkl/tests/disk.sh b/tools/lkl/tests/disk.sh
index 9bdcb16f2d5c..e2ec6cf69d4b 100755
--- a/tools/lkl/tests/disk.sh
+++ b/tools/lkl/tests/disk.sh
@@ -15,6 +15,12 @@ function prepfs()
 
     yes | mkfs.$1 $file
 
+    if ! [ -z $ANDROID_WDIR ]; then
+        adb shell mkdir -p $ANDROID_WDIR
+        adb push $file $ANDROID_WDIR
+        rm $file
+        file=$ANDROID_WDIR/$(basename $file)
+    fi
     if ! [ -z $BSD_WDIR ]; then
         $MYSSH mkdir -p $BSD_WDIR
         ssh_copy $file $BSD_WDIR
@@ -29,7 +35,10 @@ function cleanfs()
 {
     set -e
 
-    if ! [ -z $BSD_WDIR ]; then
+    if ! [ -z $ANDROID_WDIR ]; then
+        adb shell rm $1
+        adb shell rm $ANDROID_WDIR/disk
+    elif ! [ -z $BSD_WDIR ]; then
         $MYSSH rm $1
         $MYSSH rm $BSD_WDIR/disk
     else
diff --git a/tools/lkl/tests/hijack-test.sh b/tools/lkl/tests/hijack-test.sh
index 097af6cff3ba..a62aa5b251e0 100755
--- a/tools/lkl/tests/hijack-test.sh
+++ b/tools/lkl/tests/hijack-test.sh
@@ -15,7 +15,11 @@ set_cfgjson()
 {
     cfgjson=${wdir}/hijack-test$1.conf
 
-    cat > ${cfgjson}
+    if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+        adb shell cat \> ${cfgjson}
+    else
+        cat > ${cfgjson}
+    fi
 
     export_vars cfgjson
 }
@@ -54,6 +58,11 @@ test_mount_and_dump()
 {
     set -e
 
+    if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+        echo "TODO: android-23 doesn't call destructor..."
+        return $TEST_SKIP
+    fi
+
     set_cfgjson << EOF
     {
         "mount":"proc,sysfs",
@@ -377,6 +386,10 @@ test_tap_qdisc()
 {
     set -e
 
+    if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+        return $TEST_SKIP
+    fi
+
     set_cfgjson << EOF
     {
         "gateway":"$(ip_host)",
@@ -655,15 +668,25 @@ if [[ ! -e ${basedir}/lib/hijack/liblkl-hijack.so ]]; then
     exit 0
 fi
 
-# Make a temporary directory to run tests in, since we'll be copying
-# things there.
-wdir=$(mktemp -d)
-cp `which ping` ${wdir}
-cp `which ping6` ${wdir}
-ping=${wdir}/ping
-ping6=${wdir}/ping6
-hijack=$basedir/bin/lkl-hijack.sh
-netperf=$basedir/tests/run_netperf.sh
+if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+    wdir=$ANDROID_WDIR
+    adb_push lib/hijack/liblkl-hijack.so bin/lkl-hijack.sh tests/net-setup.sh \
+             tests/run_netperf.sh tests/hijack-test.sh
+    ping="ping"
+    ping6="ping6"
+    hijack="$wdir/bin/lkl-hijack.sh"
+    netperf="$wdir/tests/run_netperf.sh"
+else
+    # Make a temporary directory to run tests in, since we'll be copying
+    # things there.
+    wdir=$(mktemp -d)
+    cp `which ping` ${wdir}
+    cp `which ping6` ${wdir}
+    ping=${wdir}/ping
+    ping6=${wdir}/ping6
+    hijack=$basedir/bin/lkl-hijack.sh
+    netperf=$basedir/tests/run_netperf.sh
+fi
 
 fifo1=${wdir}/fifo1
 fifo2=${wdir}/fifo2
diff --git a/tools/lkl/tests/test.sh b/tools/lkl/tests/test.sh
index cda932b98058..a40d08fd6185 100644
--- a/tools/lkl/tests/test.sh
+++ b/tools/lkl/tests/test.sh
@@ -100,6 +100,19 @@ lkl_test_exec()
 
     if file $file | grep PE32; then
         WRAPPER="wine"
+    elif file $file | grep "interpreter /system/bin/linker" ; then
+        adb push "$file" $ANDROID_WDIR
+        if [ -n "$SUDO" ]; then
+            ANDROID_USER=root
+            SUDO=""
+        fi
+        if [ -n "$ANDROID_USER" ]; then
+            SU="su $ANDROID_USER"
+        else
+            SU=""
+        fi
+        WRAPPER="adb shell $SU"
+        file=$ANDROID_WDIR/$(basename $file)
     elif file $file | grep ARM; then
         WRAPPER="qemu-arm-static"
     elif file $file | grep "FreeBSD" ; then
@@ -134,13 +147,48 @@ lkl_test_cmd()
         SHOPTS="-x"
     fi
 
-    if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
+    if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+        if [ "$1" = "sudo" ]; then
+            ANDROID_USER=root
+            shift
+        fi
+        if [ -n "$ANDROID_USER" ]; then
+            SU="su $ANDROID_USER"
+        else
+            SU=""
+        fi
+        WRAPPER="adb shell $SU"
+    elif [ -n "$LKL_HOST_CONFIG_BSD" ]; then
         WRAPPER="$MYSSH $SU"
     fi
 
     echo "$@" | $WRAPPER sh $SHOPTS
 }
 
+adb_push()
+{
+    while [ -n "$1" ]; do
+        if [[ "$1" = *.sh ]]; then
+            type="script"
+        else
+            type="file"
+        fi
+
+        dir=$(dirname $1)
+        adb shell mkdir -p $ANDROID_WDIR/$dir
+
+        if [ "$type" = "script" ]; then
+            sed "s/\/usr\/bin\/env bash/\/system\/bin\/sh/" $basedir/$1 | \
+                adb shell cat \> $ANDROID_WDIR/$1
+            adb shell chmod a+x $ANDROID_WDIR/$1
+        else
+            adb push $basedir/$1 $ANDROID_WDIR/$dir
+        fi
+
+        shift
+    done
+}
+
 # XXX: $MYSSH and $MYSCP are defined in a circleci docker image.
 # see the definitions in lkl/lkl-docker:circleci/freebsd11/Dockerfile
 ssh_push()
@@ -169,11 +217,22 @@ ssh_copy()
     $MYSCP -P 7722 -r $1 root@localhost:$2
 }
 
+lkl_test_android_cleanup()
+{
+    adb shell rm -rf $ANDROID_WDIR
+}
+
 lkl_test_bsd_cleanup()
 {
     $MYSSH rm -rf $BSD_WDIR
 }
 
+if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+    trap lkl_test_android_cleanup EXIT
+    export ANDROID_WDIR=/data/local/tmp/lkl
+    adb shell mkdir -p $ANDROID_WDIR
+fi
+
 if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
     trap lkl_test_bsd_cleanup EXIT
     export BSD_WDIR=/root/lkl
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 34/37] lkl: Android ARM (arm/arm64) support
@ 2019-11-08  5:02     ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, linux-kernel-library, linux-arch,
	Hajime Tazaki, Akira Moroo

Initial attempt to run an application with the hijack library on the
Android platform.  Tested mostly on Android 6.x and 7.x.

The build process assumes that the Android NDK toolchain is installed on
the host system.  The arm32 build requires an alternate linker in order
to avoid a link issue during the build (described in *1).

*1
https://github.com/lkl/linux/issues/59#issuecomment-308961122

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
---
 arch/um/lkl/Kconfig            |  1 +
 tools/lkl/Makefile.autoconf    |  7 +++-
 tools/lkl/lib/hijack/hijack.c  | 13 ++++++++
 tools/lkl/lib/hijack/init.c    | 11 ++++++
 tools/lkl/tests/disk.sh        | 11 +++++-
 tools/lkl/tests/hijack-test.sh | 43 ++++++++++++++++++------
 tools/lkl/tests/test.sh        | 61 +++++++++++++++++++++++++++++++++-
 7 files changed, 134 insertions(+), 13 deletions(-)

diff --git a/arch/um/lkl/Kconfig b/arch/um/lkl/Kconfig
index 1629e2679b75..fc501b64a2af 100644
--- a/arch/um/lkl/Kconfig
+++ b/arch/um/lkl/Kconfig
@@ -23,6 +23,7 @@ config LKL
        select 64BIT if "$(OUTPUT_FORMAT)" = "pe-x86-64"
        select HAVE_UNDERSCORE_SYMBOL_PREFIX if "$(OUTPUT_FORMAT)" = "pe-i386"
        select 64BIT if "$(OUTPUT_FORMAT)" = "elf64-x86-64-freebsd"
+       select 64BIT if "$(OUTPUT_FORMAT)" = "elf64-littleaarch64"
        select NET
        select MULTIUSER
        select INET
diff --git a/tools/lkl/Makefile.autoconf b/tools/lkl/Makefile.autoconf
index 1631f5cc25ac..7222a95c314f 100644
--- a/tools/lkl/Makefile.autoconf
+++ b/tools/lkl/Makefile.autoconf
@@ -1,4 +1,4 @@
-POSIX_HOSTS=elf64-x86-64 elf32-i386 elf64-x86-64-freebsd
+POSIX_HOSTS=elf64-x86-64 elf32-i386 elf64-x86-64-freebsd elf32-littlearm elf64-littleaarch64
 NT_HOSTS=pe-i386 pe-x86-64
 
 define set_autoconf_var
@@ -17,6 +17,10 @@ define is_defined
 $(shell $(CC) -dM -E - </dev/null | grep $(1))
 endef
 
+define android_host
+  $(call set_autoconf_var,ANDROID,y)
+endef
+
 define bsd_host
   $(call set_autoconf_var,BSD,y)
 endef
@@ -54,6 +58,7 @@ define posix_host
   LDFLAGS += -pie
   CFLAGS += -fPIC -pthread
   SOSUF := .so
+  $(if $(call is_defined,__ANDROID__),$(call android_host),LDLIBS += -lrt -lpthread)
   $(if $(filter $(1),elf64-x86-64-freebsd),$(call bsd_host))
   $(if $(filter $(1),elf32-littlearm),$(call arm_host))
   $(if $(filter $(1),elf64-littleaarch64),$(call aarch64_host))
diff --git a/tools/lkl/lib/hijack/hijack.c b/tools/lkl/lib/hijack/hijack.c
index 485c15d7c279..3a95b9dbe88b 100644
--- a/tools/lkl/lib/hijack/hijack.c
+++ b/tools/lkl/lib/hijack/hijack.c
@@ -204,7 +204,11 @@ int socket(int domain, int type, int protocol)
 }
 
 HOST_CALL(ioctl);
+#ifdef __ANDROID__
+int ioctl(int fd, int req, ...)
+#else
 int ioctl(int fd, unsigned long req, ...)
+#endif
 {
 	va_list vl;
 	long arg;
@@ -586,6 +590,15 @@ void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset)
 	return lkl_sys_mmap(addr, length, prot, flags, fd, offset);
 }
 
+#ifndef __ANDROID__
+HOST_CALL(__xstat64)
+int stat(const char *pathname, struct stat *buf)
+{
+	CHECK_HOST_CALL(__xstat64);
+	return host___xstat64(0, pathname, buf);
+}
+#endif
+
 ssize_t send(int fd, const void *buf, size_t len, int flags)
 {
 	return sendto(fd, buf, len, flags, 0, 0);
diff --git a/tools/lkl/lib/hijack/init.c b/tools/lkl/lib/hijack/init.c
index 2145fb7ec2cb..de00f2018e59 100644
--- a/tools/lkl/lib/hijack/init.c
+++ b/tools/lkl/lib/hijack/init.c
@@ -170,6 +170,17 @@ hijack_init(void)
 	if (single_cpu_mode == 1)
 		PinToFirstCpu(&ori_cpu);
 
+#ifdef __ANDROID__
+	struct sigaction sa;
+
+	sa.sa_handler = SIG_IGN;
+	sa.sa_flags = 0;
+	if (sigaction(32, &sa, 0) == -1) {
+		perror("sigaction");
+		exit(1);
+	}
+#endif
+
 	ret = lkl_start_kernel(&lkl_host_ops, cfg->boot_cmdline);
 	if (ret) {
 		fprintf(stderr, "can't start kernel: %s\n", lkl_strerror(ret));
diff --git a/tools/lkl/tests/disk.sh b/tools/lkl/tests/disk.sh
index 9bdcb16f2d5c..e2ec6cf69d4b 100755
--- a/tools/lkl/tests/disk.sh
+++ b/tools/lkl/tests/disk.sh
@@ -15,6 +15,12 @@ function prepfs()
 
     yes | mkfs.$1 $file
 
+    if ! [ -z $ANDROID_WDIR ]; then
+        adb shell mkdir -p $ANDROID_WDIR
+        adb push $file $ANDROID_WDIR
+        rm $file
+        file=$ANDROID_WDIR/$(basename $file)
+    fi
     if ! [ -z $BSD_WDIR ]; then
         $MYSSH mkdir -p $BSD_WDIR
         ssh_copy $file $BSD_WDIR
@@ -29,7 +35,10 @@ function cleanfs()
 {
     set -e
 
-    if ! [ -z $BSD_WDIR ]; then
+    if ! [ -z $ANDROID_WDIR ]; then
+        adb shell rm $1
+        adb shell rm $ANDROID_WDIR/disk
+    elif ! [ -z $BSD_WDIR ]; then
         $MYSSH rm $1
         $MYSSH rm $BSD_WDIR/disk
     else
diff --git a/tools/lkl/tests/hijack-test.sh b/tools/lkl/tests/hijack-test.sh
index 097af6cff3ba..a62aa5b251e0 100755
--- a/tools/lkl/tests/hijack-test.sh
+++ b/tools/lkl/tests/hijack-test.sh
@@ -15,7 +15,11 @@ set_cfgjson()
 {
     cfgjson=${wdir}/hijack-test$1.conf
 
-    cat > ${cfgjson}
+    if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+        adb shell cat \> ${cfgjson}
+    else
+        cat > ${cfgjson}
+    fi
 
     export_vars cfgjson
 }
@@ -54,6 +58,11 @@ test_mount_and_dump()
 {
     set -e
 
+    if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+        echo "TODO: android-23 doesn't call destructor..."
+        return $TEST_SKIP
+    fi
+
     set_cfgjson << EOF
     {
         "mount":"proc,sysfs",
@@ -377,6 +386,10 @@ test_tap_qdisc()
 {
     set -e
 
+    if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+        return $TEST_SKIP
+    fi
+
     set_cfgjson << EOF
     {
         "gateway":"$(ip_host)",
@@ -655,15 +668,25 @@ if [[ ! -e ${basedir}/lib/hijack/liblkl-hijack.so ]]; then
     exit 0
 fi
 
-# Make a temporary directory to run tests in, since we'll be copying
-# things there.
-wdir=$(mktemp -d)
-cp `which ping` ${wdir}
-cp `which ping6` ${wdir}
-ping=${wdir}/ping
-ping6=${wdir}/ping6
-hijack=$basedir/bin/lkl-hijack.sh
-netperf=$basedir/tests/run_netperf.sh
+if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+    wdir=$ANDROID_WDIR
+    adb_push lib/hijack/liblkl-hijack.so bin/lkl-hijack.sh tests/net-setup.sh \
+             tests/run_netperf.sh tests/hijack-test.sh
+    ping="ping"
+    ping6="ping6"
+    hijack="$wdir/bin/lkl-hijack.sh"
+    netperf="$wdir/tests/run_netperf.sh"
+else
+    # Make a temporary directory to run tests in, since we'll be copying
+    # things there.
+    wdir=$(mktemp -d)
+    cp `which ping` ${wdir}
+    cp `which ping6` ${wdir}
+    ping=${wdir}/ping
+    ping6=${wdir}/ping6
+    hijack=$basedir/bin/lkl-hijack.sh
+    netperf=$basedir/tests/run_netperf.sh
+fi
 
 fifo1=${wdir}/fifo1
 fifo2=${wdir}/fifo2
diff --git a/tools/lkl/tests/test.sh b/tools/lkl/tests/test.sh
index cda932b98058..a40d08fd6185 100644
--- a/tools/lkl/tests/test.sh
+++ b/tools/lkl/tests/test.sh
@@ -100,6 +100,19 @@ lkl_test_exec()
 
     if file $file | grep PE32; then
         WRAPPER="wine"
+    elif file $file | grep "interpreter /system/bin/linker" ; then
+        adb push "$file" $ANDROID_WDIR
+        if [ -n "$SUDO" ]; then
+            ANDROID_USER=root
+            SUDO=""
+        fi
+        if [ -n "$ANDROID_USER" ]; then
+            SU="su $ANDROID_USER"
+        else
+            SU=""
+        fi
+        WRAPPER="adb shell $SU"
+        file=$ANDROID_WDIR/$(basename $file)
     elif file $file | grep ARM; then
         WRAPPER="qemu-arm-static"
     elif file $file | grep "FreeBSD" ; then
@@ -134,13 +147,48 @@ lkl_test_cmd()
         SHOPTS="-x"
     fi
 
-    if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
+    if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+        if [ "$1" = "sudo" ]; then
+            ANDROID_USER=root
+            shift
+        fi
+        if [ -n "$ANDROID_USER" ]; then
+            SU="su $ANDROID_USER"
+        else
+            SU=""
+        fi
+        WRAPPER="adb shell $SU"
+    elif [ -n "$LKL_HOST_CONFIG_BSD" ]; then
         WRAPPER="$MYSSH $SU"
     fi
 
     echo "$@" | $WRAPPER sh $SHOPTS
 }
 
+adb_push()
+{
+    while [ -n "$1" ]; do
+        if [[ "$1" = *.sh ]]; then
+            type="script"
+        else
+            type="file"
+        fi
+
+        dir=$(dirname $1)
+        adb shell mkdir -p $ANDROID_WDIR/$dir
+
+        if [ "$type" = "script" ]; then
+            sed "s/\/usr\/bin\/env bash/\/system\/bin\/sh/" $basedir/$1 | \
+                adb shell cat \> $ANDROID_WDIR/$1
+            adb shell chmod a+x $ANDROID_WDIR/$1
+        else
+            adb push $basedir/$1 $ANDROID_WDIR/$dir
+        fi
+
+        shift
+    done
+}
+
 # XXX: $MYSSH and $MYSCP are defined in a circleci docker image.
 # see the definitions in lkl/lkl-docker:circleci/freebsd11/Dockerfile
 ssh_push()
@@ -169,11 +217,22 @@ ssh_copy()
     $MYSCP -P 7722 -r $1 root@localhost:$2
 }
 
+lkl_test_android_cleanup()
+{
+    adb shell rm -rf $ANDROID_WDIR
+}
+
 lkl_test_bsd_cleanup()
 {
     $MYSSH rm -rf $BSD_WDIR
 }
 
+if [ -n "$LKL_HOST_CONFIG_ANDROID" ]; then
+    trap lkl_test_android_cleanup EXIT
+    export ANDROID_WDIR=/data/local/tmp/lkl
+    adb shell mkdir -p $ANDROID_WDIR
+fi
+
 if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
     trap lkl_test_bsd_cleanup EXIT
     export BSD_WDIR=/root/lkl
-- 
2.20.1 (Apple Git-117)


^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 35/37] um lkl: add CI tests
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Hajime Tazaki

We use CircleCI to run the tests, so that regressions are caught before
merging.

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
---
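The UML test step in .circleci/config.yml treats exit status 134
(128 + SIGABRT, per the comment in that step) as the expected result.  A
minimal sketch of that check as a standalone function follows; the name
uml_exit_ok is hypothetical and not part of the patch.  Quoting the status
keeps the comparison well-formed even when no status was recorded.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): succeed only if the UML
# run ended with SIGABRT.  The shell reports a process killed by signal N
# as exit status 128 + N, and SIGABRT is signal 6, hence 134.
uml_exit_ok()
{
    status="$1"
    # Quoted on purpose: an empty or missing status must compare as
    # "not 134" instead of producing a malformed test expression.
    [ "$status" = "134" ]
}
```

A CI step could record the status of the ./linux invocation in a variable
and then call uml_exit_ok "$status" || exit 1.
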
 .circleci/config.yml             | 274 +++++++++++++++++++++++++++++++
 tools/lkl/scripts/checkpatch.sh  |  60 +++++++
 tools/lkl/scripts/lkl-jenkins.sh |  21 +++
 3 files changed, 355 insertions(+)
 create mode 100644 .circleci/config.yml
 create mode 100755 tools/lkl/scripts/checkpatch.sh
 create mode 100755 tools/lkl/scripts/lkl-jenkins.sh

diff --git a/.circleci/config.yml b/.circleci/config.yml
new file mode 100644
index 000000000000..5c7b2fbad703
--- /dev/null
+++ b/.circleci/config.yml
@@ -0,0 +1,274 @@
+version: 2
+general:
+  artifacts:
+
+do_steps: &do_steps
+ steps:
+  - run: echo "$CROSS_COMPILE" > ~/_cross_compile
+  - restore_cache:
+      key: code-tree-shallow-{{ .Environment.CACHE_VERSION }}
+  - run:
+      name: checkout build tree
+      command: |
+        mkdir -p ~/.ssh/
+        ssh-keyscan -H github.com >> ~/.ssh/known_hosts
+        if ! [ -d .git ]; then
+          git clone --depth=1 $CIRCLE_REPOSITORY_URL .;
+        fi
+        if [[ $CIRCLE_BRANCH == pull/* ]]; then
+           git fetch --depth=1 origin $CIRCLE_BRANCH/head;
+        else
+           git fetch --depth=1 origin $CIRCLE_BRANCH;
+        fi
+        git reset --hard $CIRCLE_SHA1
+  - save_cache:
+      key: code-tree-shallow-{{ .Environment.CACHE_VERSION }}-{{ epoch }}
+      paths:
+        - /home/ubuntu/project/.git
+  - run:
+      name: clean
+      command: |
+        make mrproper
+        cd tools/lkl && make clean-conf
+        rm -rf ~/junit
+  - run: mkdir -p /home/ubuntu/.ccache
+  - restore_cache:
+      key: compiler-cache-{{ checksum "~/_cross_compile" }}-{{ .Environment.CACHE_VERSION }}
+  - run:
+      name: build DPDK
+      command: |
+        if [ "$MKARG" = "dpdk=yes" ]; then
+          sudo apt-get update
+          if ! sudo apt-get install -y linux-headers-$(uname -r) ; then
+             cd /lib/modules && sudo ln -sf 4.4.0-97-generic `uname -r` && \
+               cd /home/ubuntu/project
+          fi
+          cd tools/lkl && ./scripts/dpdk-sdk-build.sh;
+        fi
+  - run:
+      name: copy mingw binutils
+      command: |
+        if [ "$CROSS_COMPILE" = "i686-w64-mingw32-" ]; then
+          wget https://github.com/lkl/linux/raw/master/tools/lkl/bin/i686-w64-mingw32-as
+          wget https://github.com/lkl/linux/raw/master/tools/lkl/bin/i686-w64-mingw32-ld
+          wget https://github.com/lkl/linux/raw/master/tools/lkl/bin/i686-w64-mingw32-objcopy
+          sudo cp i686-w64-mingw32-* /usr/bin;
+        elif [ "$CROSS_COMPILE" = "arm-linux-androideabi-" ]; then
+          wget https://github.com/lkl/linux/raw/master/tools/lkl/bin/arm-linux-androideabi-ld.gold
+          sudo cp arm-linux-androideabi-ld.gold /usr/bin/arm-linux-androideabi-ld;
+        fi
+  - run:
+      name: start emulator
+      command: |
+        if [[ $CROSS_COMPILE == *android* ]]; then
+          emulator -avd Nexus5_API24 -no-window -no-audio -no-boot-anim;
+        elif [[ $CROSS_COMPILE == *freebsd* ]]; then
+          cd /home/ubuntu && $QEMU
+        fi
+      background: true
+  - run: cd tools/lkl && make -j8 ${MKARG}
+  - run: mkdir -p ~/destdir && cd tools/lkl && make DESTDIR=~/destdir
+  - save_cache:
+     paths:
+       - /home/ubuntu/.ccache
+     key: compiler-cache-{{ checksum "~/_cross_compile" }}-{{ .Environment.CACHE_VERSION }}-{{ epoch }}
+  - run:
+      name: wait for emulator to boot
+      command: |
+        if [[ $CROSS_COMPILE == *android* ]]; then
+          /home/ubuntu/circle-android.sh wait-for-boot;
+        elif [[ $CROSS_COMPILE == *freebsd* ]]; then
+          while ! $MYSSH -o ConnectTimeout=1 exit 2> /dev/null
+          do
+             sleep 5
+          done
+        fi
+  - run:
+      name: run tests
+      command: |
+        mkdir -p ~/junit
+        make -C tools/lkl run-tests tests="--junit-dir ~/junit"
+        find ./tools/lkl/ -type f -name "*.xml" -exec mv {} ~/junit/ \;
+      no_output_timeout: "90m"
+  - store_test_results:
+      path: ~/junit
+  - store_artifacts:
+      path: ~/junit
+
+
+do_uml_steps: &do_uml_steps
+ steps:
+  - run: echo "$CROSS_COMPILE" > ~/_cross_compile
+  - restore_cache:
+      key: code-tree-shallow-{{ .Environment.CACHE_VERSION }}
+  - run:
+      name: checkout build tree
+      command: |
+        mkdir -p ~/.ssh/
+        ssh-keyscan -H github.com >> ~/.ssh/known_hosts
+        if ! [ -d .git ]; then
+          git clone --depth=1 $CIRCLE_REPOSITORY_URL .;
+        fi
+        if [[ $CIRCLE_BRANCH == pull/* ]]; then
+           git fetch --depth=1 origin $CIRCLE_BRANCH/head;
+        else
+           git fetch --depth=1 origin $CIRCLE_BRANCH;
+        fi
+        git reset --hard $CIRCLE_SHA1
+  - save_cache:
+      key: code-tree-shallow-{{ .Environment.CACHE_VERSION }}-{{ epoch }}
+      paths:
+        - /home/ubuntu/project/.git
+  - run: mkdir -p /home/ubuntu/.ccache
+  - restore_cache:
+      key: compiler-cache-{{ checksum "~/_cross_compile" }}-{{ .Environment.CACHE_VERSION }}
+  - run:
+      name: build
+      command: |
+        sudo apt-get update
+        sudo apt-get install -y gcc-multilib g++-multilib
+        make -C tools/lkl/
+        make defconfig ARCH=um SUBARCH=$SUBARCH
+        make ARCH=um SUBARCH=$SUBARCH
+  - save_cache:
+     paths:
+       - /home/ubuntu/.ccache
+     key: compiler-cache-{{ checksum "~/_cross_compile" }}-{{ .Environment.CACHE_VERSION }}-{{ epoch }}
+  - run:
+      name: test
+      command: |
+        # XXX: i386 build doesn't work with the test
+        if [ $CIRCLE_STAGE = "i386_uml" ] || [ $CIRCLE_STAGE = "i386_uml_on_x86_64" ]; then
+          exit 0
+        fi
+        ./linux rootfstype=hostfs ro mem=1g loglevel=10 init="/bin/bash -c exit" || export RETVAL=$?
+        # SIGABRT=6 => 128+6
+        if [ "$RETVAL" != "134" ]; then
+          exit 1
+        fi
+
+## Customize the test machine
+jobs:
+  x86_64:
+   docker:
+     - image: lkldocker/circleci-x86_64:0.7
+   environment:
+     CROSS_COMPILE: ""
+     MKARG: "dpdk=no"
+   <<: *do_steps
+
+  i386:
+   docker:
+     - image: lkldocker/circleci-i386:0.1
+   environment:
+     CROSS_COMPILE: ""
+   <<: *do_steps
+
+  mingw32:
+   docker:
+     - image: lkldocker/circleci-mingw:0.6
+   environment:
+     CROSS_COMPILE: "i686-w64-mingw32-"
+   <<: *do_steps
+
+  android-arm32:
+   docker:
+     - image: lkldocker/circleci-android-arm32:0.6
+   environment:
+     CROSS_COMPILE: "arm-linux-androideabi-"
+     LKL_ANDROID_TEST: 1
+     ANDROID_SDK_ROOT: /home/ubuntu/android-sdk
+   <<: *do_steps
+
+  android-aarch64:
+   docker:
+     - image: lkldocker/circleci-android-arm64:0.6
+   environment:
+     CROSS_COMPILE: "aarch64-linux-android-"
+     LKL_ANDROID_TEST: 1
+     ANDROID_SDK_ROOT: /home/ubuntu/android-sdk
+   <<: *do_steps
+
+  freebsd11_x86_64:
+   docker:
+     - image: lkldocker/circleci-freebsd11-x86_64:0.4
+   environment:
+     CROSS_COMPILE: "x86_64-pc-freebsd11-"
+   <<: *do_steps
+
+  x86_64_valgrind:
+   docker:
+     - image: lkldocker/circleci-x86_64:0.7
+   environment:
+     CROSS_COMPILE: ""
+     MKARG: "dpdk=no"
+     VALGRIND: 1
+   <<: *do_steps
+
+  x86_64_uml:
+   docker:
+     - image: lkldocker/circleci-x86_64:0.7
+   environment:
+     CROSS_COMPILE: ""
+     TMPDIR: "/tmp" # required for not using /dev/shm
+     SUBARCH: "x86_64"
+   <<: *do_uml_steps
+
+  i386_uml:
+   docker:
+     - image: lkldocker/circleci-i386:0.1
+   environment:
+     CROSS_COMPILE: ""
+     SUBARCH: "i386"
+     TMPDIR: "/tmp" # required for not using /dev/shm
+   <<: *do_uml_steps
+
+  i386_uml_on_x86_64:
+   docker:
+     - image: lkldocker/circleci-x86_64:0.7
+   environment:
+     CROSS_COMPILE: ""
+     TMPDIR: "/tmp" # required for not using /dev/shm
+     SUBARCH: "i386"
+   <<: *do_uml_steps
+
+  checkpatch:
+   docker:
+     - image: lkldocker/circleci:0.5
+   environment:
+   steps:
+     - restore_cache:
+        key: code-tree-full-history-{{ .Environment.CACHE_VERSION }}
+     - checkout
+     - run: sudo pip install ply
+     - run: tools/lkl/scripts/checkpatch.sh
+     - save_cache:
+        key: code-tree-full-history-{{ .Environment.CACHE_VERSION }}-{{ epoch }}
+        paths:
+          - /home/ubuntu/project/.git
+        when: always
+
+workflows:
+  version: 2
+  build:
+    jobs:
+     - x86_64
+     - mingw32
+     - android-arm32
+     - android-aarch64
+     - freebsd11_x86_64
+     - checkpatch
+     - i386
+     - x86_64_uml
+     - i386_uml
+     - i386_uml_on_x86_64
+  nightly:
+    triggers:
+      - schedule:
+          cron: "0 0 * * *"
+          filters:
+            branches:
+              only:
+                - master
+    jobs:
+      - x86_64_valgrind
diff --git a/tools/lkl/scripts/checkpatch.sh b/tools/lkl/scripts/checkpatch.sh
new file mode 100755
index 000000000000..0c02ca6b21a2
--- /dev/null
+++ b/tools/lkl/scripts/checkpatch.sh
@@ -0,0 +1,60 @@
+#!/bin/sh -ex
+# SPDX-License-Identifier: GPL-2.0
+
+if [ -z "$origin_master" ]; then
+    origin_master="origin/master"
+fi
+
+UPSTREAM=git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
+LKL=github.com:lkl/linux.git
+
+upstream=`git remote -v | grep $UPSTREAM | cut -f1 | head -n1`
+lkl=`git remote -v | grep $LKL | cut -f1 | head -n1`
+
+if [ -z "$upstream" ]; then
+    git fetch --tags --progress git://$UPSTREAM
+else
+    git fetch --tags $upstream
+fi
+
+if [ -z "$lkl" ]; then
+    git remote add lkl-upstream git@$LKL || true
+    lkl=`git remote -v | grep $LKL | cut -f1 | head -n1`
+fi
+
+if [ -z "$lkl" ]; then
+    echo "can't find lkl remote, quitting"
+    exit 1
+fi
+
+git fetch $lkl
+git fetch --tags $upstream
+
+# find the last upstream tag to avoid checking upstream commits during
+# upstream merges
+tag=`git tag --sort='-*authordate' | grep ^v | head -n1`
+tmp=`mktemp -d`
+
+commits=$(git log --no-merges --pretty=format:%h HEAD ^$lkl/master ^$tag)
+for c in $commits; do
+    git format-patch -1 -o $tmp $c
+done
+
+if [ -z "$c" ]; then
+    echo "there are no commits/patches to check, quitting."
+    rmdir $tmp
+    exit 0
+fi
+
+./scripts/checkpatch.pl --ignore FILE_PATH_CHANGES $tmp/*.patch
+rm $tmp/*.patch
+
+# checkpatch.pl does not know how to deal with 3 way diffs which would
+# be useful to check the conflict resolutions during merges...
+#for c in `git log --merges --pretty=format:%h HEAD ^$origin_master ^$tag`; do
+#    git log --pretty=email $c -1 > $tmp/$c.patch
+#    git diff $c $c^1 $c^2 >> $tmp/$c.patch
+#done
+
+rmdir $tmp
+
diff --git a/tools/lkl/scripts/lkl-jenkins.sh b/tools/lkl/scripts/lkl-jenkins.sh
new file mode 100755
index 000000000000..eaadc6e90143
--- /dev/null
+++ b/tools/lkl/scripts/lkl-jenkins.sh
@@ -0,0 +1,21 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+set -e
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+basedir=$(cd $script_dir/../../..; pwd)
+
+export PATH=$PATH:/sbin
+
+build_and_test()
+{
+    cd $basedir
+    make mrproper
+    cd tools/lkl
+    make clean-conf
+    make -j4
+    make run-tests
+}
+
+build_and_test
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 35/37] um lkl: add CI tests
@ 2019-11-08  5:02     ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, linux-kernel-library, linux-arch,
	Hajime Tazaki, Akira Moroo

We use CircleCI to run the tests, so that regressions are caught before
merging.

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
---
 .circleci/config.yml             | 274 +++++++++++++++++++++++++++++++
 tools/lkl/scripts/checkpatch.sh  |  60 +++++++
 tools/lkl/scripts/lkl-jenkins.sh |  21 +++
 3 files changed, 355 insertions(+)
 create mode 100644 .circleci/config.yml
 create mode 100755 tools/lkl/scripts/checkpatch.sh
 create mode 100755 tools/lkl/scripts/lkl-jenkins.sh

diff --git a/.circleci/config.yml b/.circleci/config.yml
new file mode 100644
index 000000000000..5c7b2fbad703
--- /dev/null
+++ b/.circleci/config.yml
@@ -0,0 +1,274 @@
+version: 2
+general:
+  artifacts:
+
+do_steps: &do_steps
+ steps:
+  - run: echo "$CROSS_COMPILE" > ~/_cross_compile
+  - restore_cache:
+      key: code-tree-shallow-{{ .Environment.CACHE_VERSION }}
+  - run:
+      name: checkout build tree
+      command: |
+        mkdir -p ~/.ssh/
+        ssh-keyscan -H github.com >> ~/.ssh/known_hosts
+        if ! [ -d .git ]; then
+          git clone --depth=1 $CIRCLE_REPOSITORY_URL .;
+        fi
+        if [[ $CIRCLE_BRANCH == pull/* ]]; then
+           git fetch --depth=1 origin $CIRCLE_BRANCH/head;
+        else
+           git fetch --depth=1 origin $CIRCLE_BRANCH;
+        fi
+        git reset --hard $CIRCLE_SHA1
+  - save_cache:
+      key: code-tree-shallow-{{ .Environment.CACHE_VERSION }}-{{ epoch }}
+      paths:
+        - /home/ubuntu/project/.git
+  - run:
+      name: clean
+      command: |
+        make mrproper
+        cd tools/lkl && make clean-conf
+        rm -rf ~/junit
+  - run: mkdir -p /home/ubuntu/.ccache
+  - restore_cache:
+      key: compiler-cache-{{ checksum "~/_cross_compile" }}-{{ .Environment.CACHE_VERSION }}
+  - run:
+      name: build DPDK
+      command: |
+        if [ "$MKARG" = "dpdk=yes" ]; then
+          sudo apt-get update
+          if ! sudo apt-get install -y linux-headers-$(uname -r) ; then
+             cd /lib/modules && sudo ln -sf 4.4.0-97-generic `uname -r` && \
+               cd /home/ubuntu/project
+          fi
+          cd tools/lkl && ./scripts/dpdk-sdk-build.sh;
+        fi
+  - run:
+      name: copy mingw binutils
+      command: |
+        if [ "$CROSS_COMPILE" = "i686-w64-mingw32-" ]; then
+          wget https://github.com/lkl/linux/raw/master/tools/lkl/bin/i686-w64-mingw32-as
+          wget https://github.com/lkl/linux/raw/master/tools/lkl/bin/i686-w64-mingw32-ld
+          wget https://github.com/lkl/linux/raw/master/tools/lkl/bin/i686-w64-mingw32-objcopy
+          sudo cp i686-w64-mingw32-* /usr/bin;
+        elif [ "$CROSS_COMPILE" = "arm-linux-androideabi-" ]; then
+          wget https://github.com/lkl/linux/raw/master/tools/lkl/bin/arm-linux-androideabi-ld.gold
+          sudo cp arm-linux-androideabi-ld.gold /usr/bin/arm-linux-androideabi-ld;
+        fi
+  - run:
+      name: start emulator
+      command: |
+        if [[ $CROSS_COMPILE == *android* ]]; then
+          emulator -avd Nexus5_API24 -no-window -no-audio -no-boot-anim;
+        elif [[ $CROSS_COMPILE == *freebsd* ]]; then
+          cd /home/ubuntu && $QEMU
+        fi
+      background: true
+  - run: cd tools/lkl && make -j8 ${MKARG}
+  - run: mkdir -p ~/destdir && cd tools/lkl && make DESTDIR=~/destdir
+  - save_cache:
+     paths:
+       - /home/ubuntu/.ccache
+     key: compiler-cache-{{ checksum "~/_cross_compile" }}-{{ .Environment.CACHE_VERSION }}-{{ epoch }}
+  - run:
+      name: wait emulator to boot
+      command: |
+        if [[ $CROSS_COMPILE == *android* ]]; then
+          /home/ubuntu/circle-android.sh wait-for-boot;
+        elif [[ $CROSS_COMPILE == *freebsd* ]]; then
+          while ! $MYSSH -o ConnectTimeout=1 exit 2> /dev/null
+          do
+             sleep 5
+          done
+        fi
+  - run:
+      name: run tests
+      command: |
+        mkdir -p ~/junit
+        make -C tools/lkl run-tests tests="--junit-dir ~/junit"
+        find ./tools/lkl/ -type f -name "*.xml" -exec mv {} ~/junit/ \;
+      no_output_timeout: "90m"
+  - store_test_results:
+      path: ~/junit
+  - store_artifacts:
+      path: ~/junit
+
+
+do_uml_steps: &do_uml_steps
+ steps:
+  - run: echo "$CROSS_COMPILE" > ~/_cross_compile
+  - restore_cache:
+      key: code-tree-shallow-{{ .Environment.CACHE_VERSION }}
+  - run:
+      name: checkout build tree
+      command: |
+        mkdir -p ~/.ssh/
+        ssh-keyscan -H github.com >> ~/.ssh/known_hosts
+        if ! [ -d .git ]; then
+          git clone --depth=1 $CIRCLE_REPOSITORY_URL .;
+        fi
+        if [[ $CIRCLE_BRANCH == pull/* ]]; then
+           git fetch --depth=1 origin $CIRCLE_BRANCH/head;
+        else
+           git fetch --depth=1 origin $CIRCLE_BRANCH;
+        fi
+        git reset --hard $CIRCLE_SHA1
+  - save_cache:
+      key: code-tree-shallow-{{ .Environment.CACHE_VERSION }}-{{ epoch }}
+      paths:
+        - /home/ubuntu/project/.git
+  - run: mkdir -p /home/ubuntu/.ccache
+  - restore_cache:
+      key: compiler-cache-{{ checksum "~/_cross_compile" }}-{{ .Environment.CACHE_VERSION }}
+  - run:
+      name: build
+      command: |
+        sudo apt-get update
+        sudo apt-get install -y gcc-multilib g++-multilib
+        make -C tools/lkl/
+        make defconfig ARCH=um SUBARCH=$SUBARCH
+        make ARCH=um SUBARCH=$SUBARCH
+  - save_cache:
+     paths:
+       - /home/ubuntu/.ccache
+     key: compiler-cache-{{ checksum "~/_cross_compile" }}-{{ .Environment.CACHE_VERSION }}-{{ epoch }}
+  - run:
+      name: test
+      command: |
+        # XXX: i386 build doesn't work with the test
+        if [ $CIRCLE_STAGE = "i386_uml" ] || [ $CIRCLE_STAGE = "i386_uml_on_x86_64" ]; then
+          exit 0
+        fi
+        ./linux rootfstype=hostfs ro mem=1g loglevel=10 init="/bin/bash -c exit" || export RETVAL=$?
+        # SIGABRT=6 => 128+6
+        if [ $RETVAL != "134" ]; then
+          exit 1
+        fi
+
+## Customize the test machine
+jobs:
+  x86_64:
+   docker:
+     - image: lkldocker/circleci-x86_64:0.7
+   environment:
+     CROSS_COMPILE: ""
+     MKARG: "dpdk=no"
+   <<: *do_steps
+
+  i386:
+   docker:
+     - image: lkldocker/circleci-i386:0.1
+   environment:
+     CROSS_COMPILE: ""
+   <<: *do_steps
+
+  mingw32:
+   docker:
+     - image: lkldocker/circleci-mingw:0.6
+   environment:
+     CROSS_COMPILE: "i686-w64-mingw32-"
+   <<: *do_steps
+
+  android-arm32:
+   docker:
+     - image: lkldocker/circleci-android-arm32:0.6
+   environment:
+     CROSS_COMPILE: "arm-linux-androideabi-"
+     LKL_ANDROID_TEST: 1
+     ANDROID_SDK_ROOT: /home/ubuntu/android-sdk
+   <<: *do_steps
+
+  android-aarch64:
+   docker:
+     - image: lkldocker/circleci-android-arm64:0.6
+   environment:
+     CROSS_COMPILE: "aarch64-linux-android-"
+     LKL_ANDROID_TEST: 1
+     ANDROID_SDK_ROOT: /home/ubuntu/android-sdk
+   <<: *do_steps
+
+  freebsd11_x86_64:
+   docker:
+     - image: lkldocker/circleci-freebsd11-x86_64:0.4
+   environment:
+     CROSS_COMPILE: "x86_64-pc-freebsd11-"
+   <<: *do_steps
+
+  x86_64_valgrind:
+   docker:
+     - image: lkldocker/circleci-x86_64:0.7
+   environment:
+     CROSS_COMPILE: ""
+     MKARG: "dpdk=no"
+     VALGRIND: 1
+   <<: *do_steps
+
+  x86_64_uml:
+   docker:
+     - image: lkldocker/circleci-x86_64:0.7
+   environment:
+     CROSS_COMPILE: ""
+     TMPDIR: "/tmp" # required for not using /dev/shm
+     SUBARCH: "x86_64"
+   <<: *do_uml_steps
+
+  i386_uml:
+   docker:
+     - image: lkldocker/circleci-i386:0.1
+   environment:
+     CROSS_COMPILE: ""
+     SUBARCH: "i386"
+     TMPDIR: "/tmp" # required for not using /dev/shm
+   <<: *do_uml_steps
+
+  i386_uml_on_x86_64:
+   docker:
+     - image: lkldocker/circleci-x86_64:0.7
+   environment:
+     CROSS_COMPILE: ""
+     TMPDIR: "/tmp" # required for not using /dev/shm
+     SUBARCH: "i386"
+   <<: *do_uml_steps
+
+  checkpatch:
+   docker:
+     - image: lkldocker/circleci:0.5
+   environment:
+   steps:
+     - restore_cache:
+        key: code-tree-full-history-{{ .Environment.CACHE_VERSION }}
+     - checkout
+     - run: sudo pip install ply
+     - run: tools/lkl/scripts/checkpatch.sh
+     - save_cache:
+        key: code-tree-full-history-{{ .Environment.CACHE_VERSION }}-{{ epoch }}
+        paths:
+          - /home/ubuntu/project/.git
+        when: always
+
+workflows:
+  version: 2
+  build:
+    jobs:
+     - x86_64
+     - mingw32
+     - android-arm32
+     - android-aarch64
+     - freebsd11_x86_64
+     - checkpatch
+     - i386
+     - x86_64_uml
+     - i386_uml
+     - i386_uml_on_x86_64
+  nightly:
+    triggers:
+      - schedule:
+          cron: "0 0 * * *"
+          filters:
+            branches:
+              only:
+                - master
+    jobs:
+      - x86_64_valgrind
diff --git a/tools/lkl/scripts/checkpatch.sh b/tools/lkl/scripts/checkpatch.sh
new file mode 100755
index 000000000000..0c02ca6b21a2
--- /dev/null
+++ b/tools/lkl/scripts/checkpatch.sh
@@ -0,0 +1,60 @@
+#!/bin/sh -ex
+# SPDX-License-Identifier: GPL-2.0
+
+if [ -z "$origin_master" ]; then
+    origin_master="origin/master"
+fi
+
+UPSTREAM=git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
+LKL=github.com:lkl/linux.git
+
+upstream=`git remote -v | grep $UPSTREAM | cut -f1 | head -n1`
+lkl=`git remote -v | grep $LKL | cut -f1 | head -n1`
+
+if [ -z "$upstream" ]; then
+    git fetch --tags --progress git://$UPSTREAM
+else
+    git fetch --tags $upstream
+fi
+
+if [ -z "$lkl" ]; then
+    git remote add lkl-upstream git@$LKL || true
+    lkl=`git remote -v | grep $LKL | cut -f1 | head -n1`
+fi
+
+if [ -z "$lkl" ]; then
+    echo "can't find lkl remote, quitting"
+    exit 1
+fi
+
+git fetch $lkl
+git fetch --tags $upstream
+
+# find the last upstream tag to avoid checking upstream commits during
+# upstream merges
+tag=`git tag --sort='-*authordate' | grep ^v | head -n1`
+tmp=`mktemp -d`
+
+commits=$(git log --no-merges --pretty=format:%h HEAD ^$lkl/master ^$tag)
+for c in $commits; do
+    git format-patch -1 -o $tmp $c
+done
+
+if [ -z "$commits" ]; then
+    echo "there are no commits/patches to check, quitting."
+    rmdir $tmp
+    exit 0
+fi
+
+./scripts/checkpatch.pl --ignore FILE_PATH_CHANGES $tmp/*.patch
+rm $tmp/*.patch
+
+# checkpatch.pl does not know how to deal with 3 way diffs which would
+# be useful to check the conflict resolutions during merges...
+#for c in `git log --merges --pretty=format:%h HEAD ^$origin_master ^$tag`; do
+#    git log --pretty=email $c -1 > $tmp/$c.patch
+#    git diff $c $c^1 $c^2 >> $tmp/$c.patch
+#done
+
+rmdir $tmp
+
diff --git a/tools/lkl/scripts/lkl-jenkins.sh b/tools/lkl/scripts/lkl-jenkins.sh
new file mode 100755
index 000000000000..eaadc6e90143
--- /dev/null
+++ b/tools/lkl/scripts/lkl-jenkins.sh
@@ -0,0 +1,21 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+set -e
+
+script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
+basedir=$(cd $script_dir/../../..; pwd)
+
+export PATH=$PATH:/sbin
+
+build_and_test()
+{
+    cd $basedir
+    make mrproper
+    cd tools/lkl
+    make clean-conf
+    make -j4
+    make run-tests
+}
+
+build_and_test
-- 
2.20.1 (Apple Git-117)




^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 36/37] um: use lkl virtio_net_tap device as UML device
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Hajime Tazaki

This also extends support to the virtio-mmio driver, which requires
several additions to the Kbuild files as well.

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
---
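Note on the "veth" parameter added below (the CI change uses
veth0=tap,tap0,0xc803): the following is a minimal, self-contained
sketch of how such a string can be split in place. It is not the code
in lkl_dev.c. struct veth_spec and veth_spec_parse() are hypothetical
names, and treating the trailing <offload> field as optional is an
assumption of this sketch; the help text in the patch lists all three
fields.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the parsed fields; not part of the patch. */
struct veth_spec {
	unsigned long devid;
	char *iftype;
	char *ifparams;
	char *ifoffload;	/* NULL when the offload field is omitted */
	long offload;
};

/*
 * Parse "<N>=<iftype>,<ifparams>[,<offload>]" in place.
 * str is the text following the "veth" prefix, e.g. "0=tap,tap0,0xc803".
 * Returns 0 on success and -EINVAL on a malformed string.
 */
static int veth_spec_parse(char *str, struct veth_spec *out)
{
	char *end;

	out->devid = strtoul(str, &end, 0);
	if (end == str || *end != '=')
		return -EINVAL;

	out->iftype = end + 1;
	out->ifparams = strchr(out->iftype, ',');
	if (!out->ifparams)
		return -EINVAL;
	*out->ifparams++ = '\0';	/* terminate iftype */

	out->ifoffload = strchr(out->ifparams, ',');
	if (out->ifoffload)
		*out->ifoffload++ = '\0';	/* terminate ifparams */
	out->offload = out->ifoffload ? strtol(out->ifoffload, NULL, 0) : 0;

	return 0;
}
```

Every strchr() result is checked before it is written through, so a
string without the last comma is accepted with offload set to 0 rather
than causing a NULL dereference.
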
 .circleci/config.yml             |   2 +-
 arch/um/Kconfig                  |   6 --
 arch/um/Makefile.um              |   3 +
 arch/um/configs/x86_64_defconfig |   5 ++
 arch/um/include/asm/Kbuild       |   1 +
 arch/um/include/asm/io.h         |   4 +
 arch/um/kernel/syscall.c         |  53 ++++++++++++
 arch/um/lkl/include/asm/irq.h    |   2 +
 arch/um/os-Linux/Makefile        |   5 ++
 arch/um/os-Linux/lkl_dev.c       | 134 +++++++++++++++++++++++++++++++
 tools/lkl/lib/Makefile           |  33 ++++++++
 tools/lkl/lib/posix-host.c       |   4 +
 tools/lkl/lib/virtio.c           |  17 +++-
 tools/lkl/lib/virtio.h           |  22 +++++
 tools/lkl/lib/virtio_net.c       |  25 +++++-
 tools/lkl/lib/virtio_net_fd.c    |  22 -----
 tools/lkl/lib/virtio_net_fd.h    |  22 +++++
 17 files changed, 328 insertions(+), 32 deletions(-)
 create mode 100644 arch/um/os-Linux/lkl_dev.c
 create mode 100644 tools/lkl/lib/Makefile

diff --git a/.circleci/config.yml b/.circleci/config.yml
index 5c7b2fbad703..9753543e8198 100644
--- a/.circleci/config.yml
+++ b/.circleci/config.yml
@@ -141,7 +141,7 @@ do_uml_steps: &do_uml_steps
         if [ $CIRCLE_STAGE = "i386_uml" ] || [ $CIRCLE_STAGE = "i386_uml_on_x86_64" ]; then
           exit 0
         fi
-        ./linux rootfstype=hostfs ro mem=1g loglevel=10 init="/bin/bash -c exit" || export RETVAL=$?
+        ./linux rootfstype=hostfs ro mem=1g loglevel=10 veth0=tap,tap0,0xc803 init="/bin/bash -c exit" || export RETVAL=$?
         # SIGABRT=6 => 128+6
         if [ $RETVAL != "134" ]; then
           exit 1
diff --git a/arch/um/Kconfig b/arch/um/Kconfig
index c46bdb2987ce..325a784da776 100644
--- a/arch/um/Kconfig
+++ b/arch/um/Kconfig
@@ -45,9 +45,6 @@ config MMU
 	bool
 	default y
 
-config NO_IOMEM
-	def_bool y
-
 config ISA
 	bool
 
@@ -182,9 +179,6 @@ config MMAPPER
 	  This driver allows a host file to be used as emulated IO memory inside
 	  UML.
 
-config NO_DMA
-	def_bool y
-
 config PGTABLE_LEVELS
 	int
 	default 3 if 3_LEVEL_PGTABLES
diff --git a/arch/um/Makefile.um b/arch/um/Makefile.um
index d54fd387a16f..fc28305c866a 100644
--- a/arch/um/Makefile.um
+++ b/arch/um/Makefile.um
@@ -147,3 +147,6 @@ archclean:
 		-o -name '*.gcov' \) -type f -print | xargs rm -f
 
 export HEADER_ARCH SUBARCH USER_CFLAGS CFLAGS_NO_HARDENING OS DEV_NULL_PATH
+
+core-y		       += $(srctree)/tools/lkl/lib/
+KBUILD_CPPFLAGS += -I$(srctree)/$(ARCH_DIR)/lkl/include -I$(srctree)/$(ARCH_DIR)/
diff --git a/arch/um/configs/x86_64_defconfig b/arch/um/configs/x86_64_defconfig
index 3281d7600225..917982b6cd60 100644
--- a/arch/um/configs/x86_64_defconfig
+++ b/arch/um/configs/x86_64_defconfig
@@ -70,3 +70,8 @@ CONFIG_NLS=y
 CONFIG_DEBUG_INFO=y
 CONFIG_FRAME_WARN=1024
 CONFIG_DEBUG_KERNEL=y
+CONFIG_VIRTIO=y
+CONFIG_VIRTIO_MENU=y
+CONFIG_VIRTIO_MMIO=y
+CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES=y
+CONFIG_VIRTIO_NET=y
diff --git a/arch/um/include/asm/Kbuild b/arch/um/include/asm/Kbuild
index 398006d27e40..f39430ba94d3 100644
--- a/arch/um/include/asm/Kbuild
+++ b/arch/um/include/asm/Kbuild
@@ -5,6 +5,7 @@ generic-y += compat.h
 generic-y += current.h
 generic-y += delay.h
 generic-y += device.h
+generic-y += dma-mapping.h
 generic-y += emergency-restart.h
 generic-y += exec.h
 generic-y += extable.h
diff --git a/arch/um/include/asm/io.h b/arch/um/include/asm/io.h
index 96f77b5232aa..f23700d3c071 100644
--- a/arch/um/include/asm/io.h
+++ b/arch/um/include/asm/io.h
@@ -2,11 +2,15 @@
 #ifndef _ASM_UM_IO_H
 #define _ASM_UM_IO_H
 
+#ifndef CONFIG_HAS_IOMEM
 #define ioremap ioremap
 static inline void __iomem *ioremap(phys_addr_t offset, size_t size)
 {
 	return (void __iomem *)(unsigned long)offset;
 }
+#else
+#include <lkl/include/asm/io.h>
+#endif
 
 #define iounmap iounmap
 static inline void iounmap(void __iomem *addr)
diff --git a/arch/um/kernel/syscall.c b/arch/um/kernel/syscall.c
index eed54c53fbbb..3ebbeb7bab9c 100644
--- a/arch/um/kernel/syscall.c
+++ b/arch/um/kernel/syscall.c
@@ -13,6 +13,7 @@
 #include <asm/mman.h>
 #include <linux/uaccess.h>
 #include <asm/unistd.h>
+#include <linux/platform_device.h>
 
 long old_mmap(unsigned long addr, unsigned long len,
 	      unsigned long prot, unsigned long flags,
@@ -26,3 +27,55 @@ long old_mmap(unsigned long addr, unsigned long len,
  out:
 	return err;
 }
+
+SYSCALL_DEFINE3(virtio_mmio_device_add, long, base, long, size, unsigned int,
+		irq)
+{
+	struct platform_device *pdev;
+	int ret;
+
+	struct resource res[] = {
+		[0] = {
+		       .start = base,
+		       .end = base + size - 1,
+		       .flags = IORESOURCE_MEM,
+		       },
+		[1] = {
+		       .start = irq,
+		       .end = irq,
+		       .flags = IORESOURCE_IRQ,
+		       },
+	};
+
+	pdev = platform_device_alloc("virtio-mmio", PLATFORM_DEVID_AUTO);
+	if (!pdev) {
+		/* pdev is NULL here, so it must not be dereferenced */
+		pr_err("%s: Unable to device alloc for virtio-mmio\n",
+		       __func__);
+		return -ENOMEM;
+	}
+
+	ret = platform_device_add_resources(pdev, res, ARRAY_SIZE(res));
+	if (ret) {
+		dev_err(&pdev->dev,
+			"%s: Unable to add resources for %s%d\n", __func__,
+			pdev->name, pdev->id);
+		goto exit_device_put;
+	}
+
+	ret = platform_device_add(pdev);
+	if (ret < 0) {
+		dev_err(&pdev->dev, "%s: Unable to add %s%d\n", __func__,
+			pdev->name, pdev->id);
+		goto exit_release_pdev;
+	}
+
+	return pdev->id;
+
+exit_release_pdev:
+	platform_device_del(pdev);
+exit_device_put:
+	platform_device_put(pdev);
+
+	return ret;
+}
diff --git a/arch/um/lkl/include/asm/irq.h b/arch/um/lkl/include/asm/irq.h
index 948fc54cb76c..7057bcd73727 100644
--- a/arch/um/lkl/include/asm/irq.h
+++ b/arch/um/lkl/include/asm/irq.h
@@ -2,8 +2,10 @@
 #ifndef _ASM_LKL_IRQ_H
 #define _ASM_LKL_IRQ_H
 
+#ifndef CONFIG_UML
 #define IRQ_STATUS_BITS		(sizeof(long) * 8)
 #define NR_IRQS			((int)(IRQ_STATUS_BITS * IRQ_STATUS_BITS))
+#endif
 
 void run_irqs(void);
 void set_irq_pending(int irq);
diff --git a/arch/um/os-Linux/Makefile b/arch/um/os-Linux/Makefile
index 839915b8c31c..d90d88a2f34e 100644
--- a/arch/um/os-Linux/Makefile
+++ b/arch/um/os-Linux/Makefile
@@ -11,9 +11,14 @@ obj-y = execvp.o file.o helper.o irq.o main.o mem.o process.o \
 	umid.o user_syms.o util.o drivers/ skas/
 
 obj-$(CONFIG_ARCH_REUSE_HOST_VSYSCALL_AREA) += elf_aux.o
+obj-y += lkl_dev.o
+
+CFLAGS_lkl_dev.o:=-I$(srctree)/tools/lkl/include -Wno-undef
 
 USER_OBJS := $(user-objs-y) elf_aux.o execvp.o file.o helper.o irq.o \
 	main.o mem.o process.o registers.o sigio.o signal.o start_up.o time.o \
 	tty.o umid.o util.o
 
+USER_OBJS += lkl_dev.o
+
 include arch/um/scripts/Makefile.rules
diff --git a/arch/um/os-Linux/lkl_dev.c b/arch/um/os-Linux/lkl_dev.c
new file mode 100644
index 000000000000..698062917ed5
--- /dev/null
+++ b/arch/um/os-Linux/lkl_dev.c
@@ -0,0 +1,134 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <stdlib.h>
+#include <string.h>
+#include <init.h>
+#include <os.h>
+#include <kern_util.h>
+#include <errno.h>
+
+#include <lkl.h>
+#include <lkl_host.h>
+
+extern struct lkl_host_operations lkl_host_ops;
+struct lkl_host_operations *lkl_ops = &lkl_host_ops;
+
+static struct lkl_netdev *nd;
+
+int __init uml_netdev_prepare(char *iftype, char *ifparams, char *ifoffload)
+{
+	int offload = 0;
+
+	if (ifoffload)
+		offload = strtol(ifoffload, NULL, 0);
+
+	if ((strcmp(iftype, "tap") == 0)) {
+		nd = lkl_netdev_tap_create(ifparams, offload);
+#ifdef notyet
+	} else if ((strcmp(iftype, "macvtap") == 0)) {
+		nd = lkl_netdev_macvtap_create(ifparams, offload);
+#endif
+	} else {
+		if (offload) {
+			lkl_printf("WARN: %s isn't supported on %s\n",
+				   "LKL_HIJACK_OFFLOAD",
+				   iftype);
+			lkl_printf(
+				"WARN: Disabling offload features.\n");
+		}
+		offload = 0;
+	}
+#ifdef notyet
+	if (strcmp(iftype, "raw") == 0)
+		nd = lkl_netdev_raw_create(ifparams);
+#endif
+
+	return 0;
+}
+
+
+int __init uml_netdev_add(void)
+{
+	if (nd)
+		lkl_netdev_add(nd, NULL);
+
+	return 0;
+}
+__initcall(uml_netdev_add);
+
+static int __init lkl_eth_setup(char *str, int *niu)
+{
+	char *end, *iftype, *ifparams, *ifoffload;
+	int devid, err = -EINVAL;
+
+	/* veth */
+	devid = strtoul(str, &end, 0);
+	if (end == str) {
+		os_warn("Bad device number\n");
+		return err;
+	}
+
+	/* = */
+	str = end;
+	if (*str != '=') {
+		os_warn("Expected '=' after device number\n");
+		return err;
+	}
+	str++;
+
+	/* <iftype> */
+	iftype = str;
+
+	/* <ifparams> */
+	ifparams = strchr(str, ',');
+	if (ifparams == NULL) {
+		os_warn("failed to parse ifparams\n");
+		return -1;
+	}
+	*ifparams = '\0';
+	ifparams++;
+
+	str = ifparams;
+	/* <offload> */
+	ifoffload = strchr(str, ',');
+	if (ifoffload)
+		*ifoffload++ = '\0';
+
+	os_info("str=%s, iftype=%s, ifparams=%s, offload=%s\n",
+		str, iftype, ifparams, ifoffload ? ifoffload : "");
+
+	/* preparation */
+	uml_netdev_prepare(iftype, ifparams, ifoffload);
+
+	return 1;
+}
+
+__uml_setup("veth", lkl_eth_setup,
+"veth[0-9]+=<iftype>,<ifparams>,<offload>\n"
+"    Configure a network device.\n\n"
+);
+
+/* stub functions */
+int lkl_is_running(void)
+{
+	return 1;
+}
+
+
+void lkl_put_irq(int i, const char *user)
+{
+}
+
+/* XXX */
+static int free_irqs[2] = {5, 13};
+int lkl_get_free_irq(const char *user)
+{
+	static int irq_idx;
+	return free_irqs[irq_idx++];
+}
+
+int lkl_trigger_irq(int irq)
+{
+	do_IRQ(irq, NULL);
+	return 0;
+}
diff --git a/tools/lkl/lib/Makefile b/tools/lkl/lib/Makefile
new file mode 100644
index 000000000000..3c35d49843cd
--- /dev/null
+++ b/tools/lkl/lib/Makefile
@@ -0,0 +1,33 @@
+CFLAGS_posix-host.o += -D_FILE_OFFSET_BITS=64 -Wno-error=incompatible-pointer-types
+
+USER_CFLAGS += -I$(srctree)/tools/lkl/include \
+		-Wno-strict-prototypes -Wno-undef \
+		-Wframe-larger-than=20480 -O0 -g
+
+USER_OBJS += fs.o iomem.o net.o jmp_buf.o virtio.o virtio_net.o \
+	 virtio_net_fd.o virtio_net_tap.o utils.o posix-host.o \
+	../../perf/pmu-events/jsmn.o
+
+#obj-y += fs.o
+obj-y += iomem.o
+#obj-y += net.o
+obj-y += jmp_buf.o
+obj-y += posix-host.o
+#obj-$(LKL_HOST_CONFIG_NT) += nt-host.o
+obj-y += utils.o
+#obj-y += virtio_blk.o
+obj-y += virtio.o
+#obj-y += dbg.o
+#obj-y += dbg_handler.o
+obj-y += virtio_net.o
+obj-y += virtio_net_fd.o
+obj-y += virtio_net_tap.o
+#obj-$(LKL_HOST_CONFIG_VIRTIO_NET) += virtio_net_raw.o
+#obj-$(LKL_HOST_CONFIG_VIRTIO_NET_MACVTAP) += virtio_net_macvtap.o
+#obj-$(LKL_HOST_CONFIG_VIRTIO_NET_DPDK) += virtio_net_dpdk.o
+#obj-$(LKL_HOST_CONFIG_VIRTIO_NET_VDE) += virtio_net_vde.o
+#obj-$(LKL_HOST_CONFIG_VIRTIO_NET) += virtio_net_pipe.o
+obj-y += ../../perf/pmu-events/jsmn.o
+#obj-y += config.o
+
+include arch/um/scripts/Makefile.rules
diff --git a/tools/lkl/lib/posix-host.c b/tools/lkl/lib/posix-host.c
index c2b579433b12..4d52b06c9944 100644
--- a/tools/lkl/lib/posix-host.c
+++ b/tools/lkl/lib/posix-host.c
@@ -306,10 +306,12 @@ static void timer_free(void *_timer)
 	timer_delete(timer);
 }
 
+#ifndef __arch_um__
 static void panic(void)
 {
 	assert(0);
 }
+#endif
 
 static long _gettid(void)
 {
@@ -321,7 +323,9 @@ static long _gettid(void)
 }
 
 struct lkl_host_operations lkl_host_ops = {
+#ifndef __arch_um__
 	.panic = panic,
+#endif
 	.thread_create = thread_create,
 	.thread_detach = thread_detach,
 	.thread_exit = thread_exit,
diff --git a/tools/lkl/lib/virtio.c b/tools/lkl/lib/virtio.c
index c5247665482d..a19943c87d95 100644
--- a/tools/lkl/lib/virtio.c
+++ b/tools/lkl/lib/virtio.c
@@ -46,6 +46,12 @@
 		lkl_host_ops.panic();					\
 	} while (0)
 
+#ifdef __arch_um__
+extern unsigned long uml_physmem;
+#else
+static unsigned long uml_physmem;
+#endif
+
 struct virtio_queue {
 	uint32_t num_max;
 	uint32_t num;
@@ -216,7 +222,8 @@ static void add_dev_buf_from_vring_desc(struct virtio_req *req,
 {
 	struct iovec *buf = &req->buf[req->buf_count++];
 
-	buf->iov_base = (void *)(uintptr_t)le64toh(vring_desc->addr);
+	buf->iov_base = (void *)(uintptr_t)le64toh(vring_desc->addr)
+		+ uml_physmem;
 	buf->iov_len = le32toh(vring_desc->len);
 
 	if (!(buf->iov_base && buf->iov_len))
@@ -304,8 +311,10 @@ void virtio_process_queue(struct virtio_dev *dev, uint32_t qidx)
 	if (!q->ready)
 		return;
 
+#ifndef __arch_um__
 	if (dev->ops->acquire_queue)
 		dev->ops->acquire_queue(dev, qidx);
+#endif
 
 	while (q->last_avail_idx != le16toh(q->avail->idx)) {
 		/*
@@ -319,8 +328,10 @@ void virtio_process_queue(struct virtio_dev *dev, uint32_t qidx)
 			virtio_set_avail_event(q, q->avail->idx);
 	}
 
+#ifndef __arch_um__
 	if (dev->ops->release_queue)
 		dev->ops->release_queue(dev, qidx);
+#endif
 }
 
 static inline uint32_t virtio_read_device_features(struct virtio_dev *dev)
@@ -406,7 +417,7 @@ static inline void set_ptr_low(void **ptr, uint32_t val)
 	uint64_t tmp = (uintptr_t)*ptr;
 
 	tmp = (tmp & 0xFFFFFFFF00000000) | val;
-	*ptr = (void *)(long)tmp;
+	*ptr = (void *)(long)tmp + uml_physmem;
 }
 
 static inline void set_ptr_high(void **ptr, uint32_t val)
@@ -579,6 +590,7 @@ int virtio_dev_setup(struct virtio_dev *dev, int queues, int num_max)
 
 int virtio_dev_cleanup(struct virtio_dev *dev)
 {
+#ifndef __arch_um__
 	char devname[100];
 	long fd, ret;
 	long mount_ret;
@@ -622,6 +634,7 @@ int virtio_dev_cleanup(struct virtio_dev *dev)
 	lkl_put_irq(dev->irq, "virtio");
 	unregister_iomem(dev->base);
 	lkl_host_ops.mem_free(dev->queue);
+#endif
 	return 0;
 }
 
diff --git a/tools/lkl/lib/virtio.h b/tools/lkl/lib/virtio.h
index 7427aa8fad79..be06ef09f8b0 100644
--- a/tools/lkl/lib/virtio.h
+++ b/tools/lkl/lib/virtio.h
@@ -87,6 +87,28 @@ void virtio_req_complete(struct virtio_req *req, uint32_t len);
 void virtio_process_queue(struct virtio_dev *dev, uint32_t qidx);
 void virtio_set_queue_max_merge_len(struct virtio_dev *dev, int q, int len);
 
+#ifdef __arch_um__
+//#include <irq_kern.h>
+#include <irq_user.h>
+enum irqreturn {
+	IRQ_HANDLED		= (1 << 0),
+	IRQ_WAKE_THREAD		= (1 << 1),
+};
+
+typedef enum irqreturn irqreturn_t;
+typedef irqreturn_t (*irq_handler_t)(int, void *);
+
+#define IRQF_SHARED		0x00000080
+
+extern int um_request_irq(unsigned int irq, int fd, int type,
+			  irq_handler_t handler,
+			  unsigned long irqflags,  const char *devname,
+			  void *dev_id);
+
+long sys_virtio_mmio_device_add(long base, long size, unsigned int irq);
+#define lkl_sys_virtio_mmio_device_add sys_virtio_mmio_device_add
+#endif /* __arch_um__ */
+
 #define container_of(ptr, type, member) \
 	(type *)((char *)(ptr) - __builtin_offsetof(type, member))
 
diff --git a/tools/lkl/lib/virtio_net.c b/tools/lkl/lib/virtio_net.c
index cd720b363f18..18b69f98087f 100644
--- a/tools/lkl/lib/virtio_net.c
+++ b/tools/lkl/lib/virtio_net.c
@@ -2,6 +2,7 @@
 #include <string.h>
 #include <lkl_host.h>
 #include "virtio.h"
+#include "virtio_net_fd.h"
 #include "endian.h"
 
 #include <lkl/linux/virtio_net.h>
@@ -212,9 +213,23 @@ static struct lkl_mutex **init_queue_locks(int num_queues)
 	return ret;
 }
 
+#ifdef __arch_um__
+static irqreturn_t um_virtio_intr(int irq, void *dev_id)
+{
+	struct virtio_dev *dev = dev_id;
+
+	virtio_process_queue(dev, 0);
+	return IRQ_HANDLED;
+}
+#endif
+
 int lkl_netdev_add(struct lkl_netdev *nd, struct lkl_netdev_args *args)
 {
 	struct virtio_net_dev *dev;
+#ifdef __arch_um__
+	struct lkl_netdev_fd *nd_fd =
+		container_of(nd, struct lkl_netdev_fd, dev);
+#endif
 	int ret = -LKL_ENOMEM;
 
 	dev = lkl_host_ops.mem_alloc(sizeof(*dev));
@@ -252,16 +267,22 @@ int lkl_netdev_add(struct lkl_netdev *nd, struct lkl_netdev_args *args)
 	if (ret)
 		goto out_free;
 
+#ifdef __arch_um__
+	um_request_irq(dev->dev.irq, nd_fd->fd_rx, IRQ_READ, um_virtio_intr,
+		       IRQF_SHARED, "virtio", dev);
+#endif
+
 	/*
 	 * We may receive upto 64KB TSO packet so collect as many descriptors as
 	 * there are available up to 64KB in total len.
 	 */
 	if (dev->dev.device_features & BIT(LKL_VIRTIO_NET_F_MRG_RXBUF))
 		virtio_set_queue_max_merge_len(&dev->dev, RX_QUEUE_IDX, 65536);
-
+#ifndef __arch_um__
 	dev->poll_tid = lkl_host_ops.thread_create(poll_thread, dev);
 	if (dev->poll_tid == 0)
 		goto out_cleanup_dev;
+#endif
 
 	ret = dev_register(dev);
 	if (ret < 0)
@@ -280,6 +301,7 @@ int lkl_netdev_add(struct lkl_netdev *nd, struct lkl_netdev_args *args)
 	return ret;
 }
 
+#ifndef __arch_um__
 /* Return 0 for success, -1 for failure. */
 void lkl_netdev_remove(int id)
 {
@@ -315,6 +337,7 @@ void lkl_netdev_remove(int id)
 	free_queue_locks(dev->queue_locks, NUM_QUEUES);
 	lkl_host_ops.mem_free(dev);
 }
+#endif
 
 void lkl_netdev_free(struct lkl_netdev *nd)
 {
diff --git a/tools/lkl/lib/virtio_net_fd.c b/tools/lkl/lib/virtio_net_fd.c
index f8664455e696..a19193cfeca9 100644
--- a/tools/lkl/lib/virtio_net_fd.c
+++ b/tools/lkl/lib/virtio_net_fd.c
@@ -25,28 +25,6 @@
 #include "virtio.h"
 #include "virtio_net_fd.h"
 
-struct lkl_netdev_fd {
-	struct lkl_netdev dev;
-	/* file-descriptor based device */
-	int fd_rx;
-	int fd_tx;
-	/*
-	 * Controlls the poll mask for fd. Can be acccessed concurrently from
-	 * poll, tx, or rx routines but there is no need for syncronization
-	 * because:
-	 *
-	 * (a) TX and RX routines set different variables so even if they update
-	 * at the same time there is no race condition
-	 *
-	 * (b) Even if poll and TX / RX update at the same time poll cannot
-	 * stall: when poll resets the poll variable we know that TX / RX will
-	 * run which means that eventually the poll variable will be set.
-	 */
-	int poll_tx, poll_rx;
-	/* controle pipe */
-	int pipe[2];
-};
-
 static int fd_net_tx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
 {
 	int ret;
diff --git a/tools/lkl/lib/virtio_net_fd.h b/tools/lkl/lib/virtio_net_fd.h
index 713ba13cca7c..fe6d6d8e3ab4 100644
--- a/tools/lkl/lib/virtio_net_fd.h
+++ b/tools/lkl/lib/virtio_net_fd.h
@@ -4,6 +4,28 @@
 
 struct ifreq;
 
+struct lkl_netdev_fd {
+	struct lkl_netdev dev;
+	/* file-descriptor based device */
+	int fd_rx;
+	int fd_tx;
+	/*
+	 * Controls the poll mask for fd. Can be accessed concurrently from
+	 * poll, tx, or rx routines but there is no need for synchronization
+	 * because:
+	 *
+	 * (a) TX and RX routines set different variables so even if they update
+	 * at the same time there is no race condition
+	 *
+	 * (b) Even if poll and TX / RX update at the same time poll cannot
+	 * stall: when poll resets the poll variable we know that TX / RX will
+	 * run which means that eventually the poll variable will be set.
+	 */
+	int poll_tx, poll_rx;
+	/* control pipe */
+	int pipe[2];
+};
+
 /**
  * lkl_register_netdev_linux_fdnet - register a file descriptor-based network
  * device as a NIC
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 36/37] um: use lkl virtio_net_tap device as UML device
@ 2019-11-08  5:02     ` Hajime Tazaki
  0 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, linux-kernel-library, linux-arch,
	Hajime Tazaki, Akira Moroo

This also extends support to the virtio-mmio driver, which requires
several additions to the Kbuild files as well.

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
---
 .circleci/config.yml             |   2 +-
 arch/um/Kconfig                  |   6 --
 arch/um/Makefile.um              |   3 +
 arch/um/configs/x86_64_defconfig |   5 ++
 arch/um/include/asm/Kbuild       |   1 +
 arch/um/include/asm/io.h         |   4 +
 arch/um/kernel/syscall.c         |  53 ++++++++++++
 arch/um/lkl/include/asm/irq.h    |   2 +
 arch/um/os-Linux/Makefile        |   5 ++
 arch/um/os-Linux/lkl_dev.c       | 134 +++++++++++++++++++++++++++++++
 tools/lkl/lib/Makefile           |  33 ++++++++
 tools/lkl/lib/posix-host.c       |   4 +
 tools/lkl/lib/virtio.c           |  17 +++-
 tools/lkl/lib/virtio.h           |  22 +++++
 tools/lkl/lib/virtio_net.c       |  25 +++++-
 tools/lkl/lib/virtio_net_fd.c    |  22 -----
 tools/lkl/lib/virtio_net_fd.h    |  22 +++++
 17 files changed, 328 insertions(+), 32 deletions(-)
 create mode 100644 arch/um/os-Linux/lkl_dev.c
 create mode 100644 tools/lkl/lib/Makefile

diff --git a/.circleci/config.yml b/.circleci/config.yml
index 5c7b2fbad703..9753543e8198 100644
--- a/.circleci/config.yml
+++ b/.circleci/config.yml
@@ -141,7 +141,7 @@ do_uml_steps: &do_uml_steps
         if [ $CIRCLE_STAGE = "i386_uml" ] || [ $CIRCLE_STAGE = "i386_uml_on_x86_64" ]; then
           exit 0
         fi
-        ./linux rootfstype=hostfs ro mem=1g loglevel=10 init="/bin/bash -c exit" || export RETVAL=$?
+        ./linux rootfstype=hostfs ro mem=1g loglevel=10 veth0=tap,tap0,0xc803 init="/bin/bash -c exit" || export RETVAL=$?
         # SIGABRT=6 => 128+6
         if [ $RETVAL != "134" ]; then
           exit 1
diff --git a/arch/um/Kconfig b/arch/um/Kconfig
index c46bdb2987ce..325a784da776 100644
--- a/arch/um/Kconfig
+++ b/arch/um/Kconfig
@@ -45,9 +45,6 @@ config MMU
 	bool
 	default y
 
-config NO_IOMEM
-	def_bool y
-
 config ISA
 	bool
 
@@ -182,9 +179,6 @@ config MMAPPER
 	  This driver allows a host file to be used as emulated IO memory inside
 	  UML.
 
-config NO_DMA
-	def_bool y
-
 config PGTABLE_LEVELS
 	int
 	default 3 if 3_LEVEL_PGTABLES
diff --git a/arch/um/Makefile.um b/arch/um/Makefile.um
index d54fd387a16f..fc28305c866a 100644
--- a/arch/um/Makefile.um
+++ b/arch/um/Makefile.um
@@ -147,3 +147,6 @@ archclean:
 		-o -name '*.gcov' \) -type f -print | xargs rm -f
 
 export HEADER_ARCH SUBARCH USER_CFLAGS CFLAGS_NO_HARDENING OS DEV_NULL_PATH
+
+core-y		       += $(srctree)/tools/lkl/lib/
+KBUILD_CPPFLAGS += -I$(srctree)/$(ARCH_DIR)/lkl/include -I$(srctree)/$(ARCH_DIR)/
diff --git a/arch/um/configs/x86_64_defconfig b/arch/um/configs/x86_64_defconfig
index 3281d7600225..917982b6cd60 100644
--- a/arch/um/configs/x86_64_defconfig
+++ b/arch/um/configs/x86_64_defconfig
@@ -70,3 +70,8 @@ CONFIG_NLS=y
 CONFIG_DEBUG_INFO=y
 CONFIG_FRAME_WARN=1024
 CONFIG_DEBUG_KERNEL=y
+CONFIG_VIRTIO=y
+CONFIG_VIRTIO_MENU=y
+CONFIG_VIRTIO_MMIO=y
+CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES=y
+CONFIG_VIRTIO_NET=y
diff --git a/arch/um/include/asm/Kbuild b/arch/um/include/asm/Kbuild
index 398006d27e40..f39430ba94d3 100644
--- a/arch/um/include/asm/Kbuild
+++ b/arch/um/include/asm/Kbuild
@@ -5,6 +5,7 @@ generic-y += compat.h
 generic-y += current.h
 generic-y += delay.h
 generic-y += device.h
+generic-y += dma-mapping.h
 generic-y += emergency-restart.h
 generic-y += exec.h
 generic-y += extable.h
diff --git a/arch/um/include/asm/io.h b/arch/um/include/asm/io.h
index 96f77b5232aa..f23700d3c071 100644
--- a/arch/um/include/asm/io.h
+++ b/arch/um/include/asm/io.h
@@ -2,11 +2,15 @@
 #ifndef _ASM_UM_IO_H
 #define _ASM_UM_IO_H
 
+#ifndef CONFIG_HAS_IOMEM
 #define ioremap ioremap
 static inline void __iomem *ioremap(phys_addr_t offset, size_t size)
 {
 	return (void __iomem *)(unsigned long)offset;
 }
+#else
+#include <lkl/include/asm/io.h>
+#endif
 
 #define iounmap iounmap
 static inline void iounmap(void __iomem *addr)
diff --git a/arch/um/kernel/syscall.c b/arch/um/kernel/syscall.c
index eed54c53fbbb..3ebbeb7bab9c 100644
--- a/arch/um/kernel/syscall.c
+++ b/arch/um/kernel/syscall.c
@@ -13,6 +13,7 @@
 #include <asm/mman.h>
 #include <linux/uaccess.h>
 #include <asm/unistd.h>
+#include <linux/platform_device.h>
 
 long old_mmap(unsigned long addr, unsigned long len,
 	      unsigned long prot, unsigned long flags,
@@ -26,3 +27,55 @@ long old_mmap(unsigned long addr, unsigned long len,
  out:
 	return err;
 }
+
+SYSCALL_DEFINE3(virtio_mmio_device_add, long, base, long, size, unsigned int,
+		irq)
+{
+	struct platform_device *pdev;
+	int ret;
+
+	struct resource res[] = {
+		[0] = {
+		       .start = base,
+		       .end = base + size - 1,
+		       .flags = IORESOURCE_MEM,
+		       },
+		[1] = {
+		       .start = irq,
+		       .end = irq,
+		       .flags = IORESOURCE_IRQ,
+		       },
+	};
+
+	pdev = platform_device_alloc("virtio-mmio", PLATFORM_DEVID_AUTO);
+	if (!pdev) {
+		/* pdev is NULL here, so report without dereferencing it */
+		pr_err("%s: Unable to allocate device for virtio-mmio\n",
+		       __func__);
+		return -ENOMEM;
+	}
+
+	ret = platform_device_add_resources(pdev, res, ARRAY_SIZE(res));
+	if (ret) {
+		dev_err(&pdev->dev,
+			"%s: Unable to add resources for %s%d\n", __func__,
+			pdev->name, pdev->id);
+		goto exit_device_put;
+	}
+
+	ret = platform_device_add(pdev);
+	if (ret < 0) {
+		dev_err(&pdev->dev, "%s: Unable to add %s%d\n", __func__,
+			pdev->name, pdev->id);
+		goto exit_release_pdev;
+	}
+
+	return pdev->id;
+
+exit_release_pdev:
+	platform_device_del(pdev);
+exit_device_put:
+	platform_device_put(pdev);
+
+	return ret;
+}
diff --git a/arch/um/lkl/include/asm/irq.h b/arch/um/lkl/include/asm/irq.h
index 948fc54cb76c..7057bcd73727 100644
--- a/arch/um/lkl/include/asm/irq.h
+++ b/arch/um/lkl/include/asm/irq.h
@@ -2,8 +2,10 @@
 #ifndef _ASM_LKL_IRQ_H
 #define _ASM_LKL_IRQ_H
 
+#ifndef CONFIG_UML
 #define IRQ_STATUS_BITS		(sizeof(long) * 8)
 #define NR_IRQS			((int)(IRQ_STATUS_BITS * IRQ_STATUS_BITS))
+#endif
 
 void run_irqs(void);
 void set_irq_pending(int irq);
diff --git a/arch/um/os-Linux/Makefile b/arch/um/os-Linux/Makefile
index 839915b8c31c..d90d88a2f34e 100644
--- a/arch/um/os-Linux/Makefile
+++ b/arch/um/os-Linux/Makefile
@@ -11,9 +11,14 @@ obj-y = execvp.o file.o helper.o irq.o main.o mem.o process.o \
 	umid.o user_syms.o util.o drivers/ skas/
 
 obj-$(CONFIG_ARCH_REUSE_HOST_VSYSCALL_AREA) += elf_aux.o
+obj-y += lkl_dev.o
+
+CFLAGS_lkl_dev.o:=-I$(srctree)/tools/lkl/include -Wno-undef
 
 USER_OBJS := $(user-objs-y) elf_aux.o execvp.o file.o helper.o irq.o \
 	main.o mem.o process.o registers.o sigio.o signal.o start_up.o time.o \
 	tty.o umid.o util.o
 
+USER_OBJS += lkl_dev.o
+
 include arch/um/scripts/Makefile.rules
diff --git a/arch/um/os-Linux/lkl_dev.c b/arch/um/os-Linux/lkl_dev.c
new file mode 100644
index 000000000000..698062917ed5
--- /dev/null
+++ b/arch/um/os-Linux/lkl_dev.c
@@ -0,0 +1,134 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <stdlib.h>
+#include <string.h>
+#include <init.h>
+#include <os.h>
+#include <kern_util.h>
+#include <errno.h>
+
+#include <lkl.h>
+#include <lkl_host.h>
+
+extern struct lkl_host_operations lkl_host_ops;
+struct lkl_host_operations *lkl_ops = &lkl_host_ops;
+
+static struct lkl_netdev *nd;
+
+int __init uml_netdev_prepare(char *iftype, char *ifparams, char *ifoffload)
+{
+	int offload = 0;
+
+	if (ifoffload)
+		offload = strtol(ifoffload, NULL, 0);
+
+	if ((strcmp(iftype, "tap") == 0)) {
+		nd = lkl_netdev_tap_create(ifparams, offload);
+#ifdef notyet
+	} else if ((strcmp(iftype, "macvtap") == 0)) {
+		nd = lkl_netdev_macvtap_create(ifparams, offload);
+#endif
+	} else {
+		if (offload) {
+			lkl_printf("WARN: %s isn't supported on %s\n",
+				   "LKL_HIJACK_OFFLOAD",
+				   iftype);
+			lkl_printf(
+				"WARN: Disabling offload features.\n");
+		}
+		offload = 0;
+	}
+#ifdef notyet
+	if (strcmp(iftype, "raw") == 0)
+		nd = lkl_netdev_raw_create(ifparams);
+#endif
+
+	return 0;
+}
+
+
+int __init uml_netdev_add(void)
+{
+	if (nd)
+		lkl_netdev_add(nd, NULL);
+
+	return 0;
+}
+__initcall(uml_netdev_add);
+
+static int __init lkl_eth_setup(char *str, int *niu)
+{
+	char *end, *iftype, *ifparams, *ifoffload;
+	int devid, err = -EINVAL;
+
+	/* veth */
+	devid = strtoul(str, &end, 0);
+	if (end == str) {
+		os_warn("Bad device number\n");
+		return err;
+	}
+
+	/* = */
+	str = end;
+	if (*str != '=') {
+		os_warn("Expected '=' after device number\n");
+		return err;
+	}
+	str++;
+
+	/* <iftype> */
+	iftype = str;
+
+	/* <ifparams> */
+	ifparams = strchr(str, ',');
+	if (ifparams == NULL) {
+		os_warn("failed to parse ifparams\n");
+		return -1;
+	}
+	*ifparams = '\0';
+	ifparams++;
+
+	str = ifparams;
+	/* <offload> (optional; left NULL when the field is absent) */
+	ifoffload = strchr(str, ',');
+	if (ifoffload)
+		*ifoffload++ = '\0';
+
+	os_info("str=%s, iftype=%s, ifparams=%s, offload=%s\n",
+		str, iftype, ifparams, ifoffload);
+
+	/* preparation */
+	uml_netdev_prepare(iftype, ifparams, ifoffload);
+
+	return 1;
+}
+
+__uml_setup("veth", lkl_eth_setup,
+"veth[0-9]+=<iftype>,<ifparams>,<offload>\n"
+"    Configure a network device.\n\n"
+);
+
+/* stub functions */
+int lkl_is_running(void)
+{
+	return 1;
+}
+
+
+void lkl_put_irq(int i, const char *user)
+{
+}
+
+/* XXX */
+static int free_irqs[2] = {5, 13};
+int lkl_get_free_irq(const char *user)
+{
+	static int irq_idx;
+	return free_irqs[irq_idx++];
+}
+
+int lkl_trigger_irq(int irq)
+{
+	do_IRQ(irq, NULL);
+	return 0;
+}
diff --git a/tools/lkl/lib/Makefile b/tools/lkl/lib/Makefile
new file mode 100644
index 000000000000..3c35d49843cd
--- /dev/null
+++ b/tools/lkl/lib/Makefile
@@ -0,0 +1,33 @@
+CFLAGS_posix-host.o += -D_FILE_OFFSET_BITS=64 -Wno-error=incompatible-pointer-types
+
+USER_CFLAGS += -I$(srctree)/tools/lkl/include \
+		-Wno-strict-prototypes -Wno-undef \
+		-Wframe-larger-than=20480 -O0 -g
+
+USER_OBJS += fs.o iomem.o net.o jmp_buf.o virtio.o virtio_net.o \
+	 virtio_net_fd.o virtio_net_tap.o utils.o posix-host.o \
+	../../perf/pmu-events/jsmn.o
+
+#obj-y += fs.o
+obj-y += iomem.o
+#obj-y += net.o
+obj-y += jmp_buf.o
+obj-y += posix-host.o
+#obj-$(LKL_HOST_CONFIG_NT) += nt-host.o
+obj-y += utils.o
+#obj-y += virtio_blk.o
+obj-y += virtio.o
+#obj-y += dbg.o
+#obj-y += dbg_handler.o
+obj-y += virtio_net.o
+obj-y += virtio_net_fd.o
+obj-y += virtio_net_tap.o
+#obj-$(LKL_HOST_CONFIG_VIRTIO_NET) += virtio_net_raw.o
+#obj-$(LKL_HOST_CONFIG_VIRTIO_NET_MACVTAP) += virtio_net_macvtap.o
+#obj-$(LKL_HOST_CONFIG_VIRTIO_NET_DPDK) += virtio_net_dpdk.o
+#obj-$(LKL_HOST_CONFIG_VIRTIO_NET_VDE) += virtio_net_vde.o
+#obj-$(LKL_HOST_CONFIG_VIRTIO_NET) += virtio_net_pipe.o
+obj-y += ../../perf/pmu-events/jsmn.o
+#obj-y += config.o
+
+include arch/um/scripts/Makefile.rules
diff --git a/tools/lkl/lib/posix-host.c b/tools/lkl/lib/posix-host.c
index c2b579433b12..4d52b06c9944 100644
--- a/tools/lkl/lib/posix-host.c
+++ b/tools/lkl/lib/posix-host.c
@@ -306,10 +306,12 @@ static void timer_free(void *_timer)
 	timer_delete(timer);
 }
 
+#ifndef __arch_um__
 static void panic(void)
 {
 	assert(0);
 }
+#endif
 
 static long _gettid(void)
 {
@@ -321,7 +323,9 @@ static long _gettid(void)
 }
 
 struct lkl_host_operations lkl_host_ops = {
+#ifndef __arch_um__
 	.panic = panic,
+#endif
 	.thread_create = thread_create,
 	.thread_detach = thread_detach,
 	.thread_exit = thread_exit,
diff --git a/tools/lkl/lib/virtio.c b/tools/lkl/lib/virtio.c
index c5247665482d..a19943c87d95 100644
--- a/tools/lkl/lib/virtio.c
+++ b/tools/lkl/lib/virtio.c
@@ -46,6 +46,12 @@
 		lkl_host_ops.panic();					\
 	} while (0)
 
+#ifdef __arch_um__
+extern unsigned long uml_physmem;
+#else
+static unsigned long uml_physmem;
+#endif
+
 struct virtio_queue {
 	uint32_t num_max;
 	uint32_t num;
@@ -216,7 +222,8 @@ static void add_dev_buf_from_vring_desc(struct virtio_req *req,
 {
 	struct iovec *buf = &req->buf[req->buf_count++];
 
-	buf->iov_base = (void *)(uintptr_t)le64toh(vring_desc->addr);
+	buf->iov_base = (void *)(uintptr_t)le64toh(vring_desc->addr)
+		+ uml_physmem;
 	buf->iov_len = le32toh(vring_desc->len);
 
 	if (!(buf->iov_base && buf->iov_len))
@@ -304,8 +311,10 @@ void virtio_process_queue(struct virtio_dev *dev, uint32_t qidx)
 	if (!q->ready)
 		return;
 
+#ifndef __arch_um__
 	if (dev->ops->acquire_queue)
 		dev->ops->acquire_queue(dev, qidx);
+#endif
 
 	while (q->last_avail_idx != le16toh(q->avail->idx)) {
 		/*
@@ -319,8 +328,10 @@ void virtio_process_queue(struct virtio_dev *dev, uint32_t qidx)
 			virtio_set_avail_event(q, q->avail->idx);
 	}
 
+#ifndef __arch_um__
 	if (dev->ops->release_queue)
 		dev->ops->release_queue(dev, qidx);
+#endif
 }
 
 static inline uint32_t virtio_read_device_features(struct virtio_dev *dev)
@@ -406,7 +417,7 @@ static inline void set_ptr_low(void **ptr, uint32_t val)
 	uint64_t tmp = (uintptr_t)*ptr;
 
 	tmp = (tmp & 0xFFFFFFFF00000000) | val;
-	*ptr = (void *)(long)tmp;
+	*ptr = (void *)(long)tmp + uml_physmem;
 }
 
 static inline void set_ptr_high(void **ptr, uint32_t val)
@@ -579,6 +590,7 @@ int virtio_dev_setup(struct virtio_dev *dev, int queues, int num_max)
 
 int virtio_dev_cleanup(struct virtio_dev *dev)
 {
+#ifndef __arch_um__
 	char devname[100];
 	long fd, ret;
 	long mount_ret;
@@ -622,6 +634,7 @@ int virtio_dev_cleanup(struct virtio_dev *dev)
 	lkl_put_irq(dev->irq, "virtio");
 	unregister_iomem(dev->base);
 	lkl_host_ops.mem_free(dev->queue);
+#endif
 	return 0;
 }
 
diff --git a/tools/lkl/lib/virtio.h b/tools/lkl/lib/virtio.h
index 7427aa8fad79..be06ef09f8b0 100644
--- a/tools/lkl/lib/virtio.h
+++ b/tools/lkl/lib/virtio.h
@@ -87,6 +87,28 @@ void virtio_req_complete(struct virtio_req *req, uint32_t len);
 void virtio_process_queue(struct virtio_dev *dev, uint32_t qidx);
 void virtio_set_queue_max_merge_len(struct virtio_dev *dev, int q, int len);
 
+#ifdef __arch_um__
+//#include <irq_kern.h>
+#include <irq_user.h>
+enum irqreturn {
+	IRQ_HANDLED		= (1 << 0),
+	IRQ_WAKE_THREAD		= (1 << 1),
+};
+
+typedef enum irqreturn irqreturn_t;
+typedef irqreturn_t (*irq_handler_t)(int, void *);
+
+#define IRQF_SHARED		0x00000080
+
+extern int um_request_irq(unsigned int irq, int fd, int type,
+			  irq_handler_t handler,
+			  unsigned long irqflags,  const char *devname,
+			  void *dev_id);
+
+long sys_virtio_mmio_device_add(long base, long size, unsigned int irq);
+#define lkl_sys_virtio_mmio_device_add sys_virtio_mmio_device_add
+#endif /* __arch_um__ */
+
 #define container_of(ptr, type, member) \
 	(type *)((char *)(ptr) - __builtin_offsetof(type, member))
 
diff --git a/tools/lkl/lib/virtio_net.c b/tools/lkl/lib/virtio_net.c
index cd720b363f18..18b69f98087f 100644
--- a/tools/lkl/lib/virtio_net.c
+++ b/tools/lkl/lib/virtio_net.c
@@ -2,6 +2,7 @@
 #include <string.h>
 #include <lkl_host.h>
 #include "virtio.h"
+#include "virtio_net_fd.h"
 #include "endian.h"
 
 #include <lkl/linux/virtio_net.h>
@@ -212,9 +213,23 @@ static struct lkl_mutex **init_queue_locks(int num_queues)
 	return ret;
 }
 
+#ifdef __arch_um__
+static irqreturn_t um_virtio_intr(int irq, void *dev_id)
+{
+	struct virtio_dev *dev = dev_id;
+
+	virtio_process_queue(dev, 0);
+	return IRQ_HANDLED;
+}
+#endif
+
 int lkl_netdev_add(struct lkl_netdev *nd, struct lkl_netdev_args *args)
 {
 	struct virtio_net_dev *dev;
+#ifdef __arch_um__
+	struct lkl_netdev_fd *nd_fd =
+		container_of(nd, struct lkl_netdev_fd, dev);
+#endif
 	int ret = -LKL_ENOMEM;
 
 	dev = lkl_host_ops.mem_alloc(sizeof(*dev));
@@ -252,16 +267,22 @@ int lkl_netdev_add(struct lkl_netdev *nd, struct lkl_netdev_args *args)
 	if (ret)
 		goto out_free;
 
+#ifdef __arch_um__
+	um_request_irq(dev->dev.irq, nd_fd->fd_rx, IRQ_READ, um_virtio_intr,
+		       IRQF_SHARED, "virtio", dev);
+#endif
+
 	/*
 	 * We may receive upto 64KB TSO packet so collect as many descriptors as
 	 * there are available up to 64KB in total len.
 	 */
 	if (dev->dev.device_features & BIT(LKL_VIRTIO_NET_F_MRG_RXBUF))
 		virtio_set_queue_max_merge_len(&dev->dev, RX_QUEUE_IDX, 65536);
-
+#ifndef __arch_um__
 	dev->poll_tid = lkl_host_ops.thread_create(poll_thread, dev);
 	if (dev->poll_tid == 0)
 		goto out_cleanup_dev;
+#endif
 
 	ret = dev_register(dev);
 	if (ret < 0)
@@ -280,6 +301,7 @@ int lkl_netdev_add(struct lkl_netdev *nd, struct lkl_netdev_args *args)
 	return ret;
 }
 
+#ifndef __arch_um__
 /* Return 0 for success, -1 for failure. */
 void lkl_netdev_remove(int id)
 {
@@ -315,6 +337,7 @@ void lkl_netdev_remove(int id)
 	free_queue_locks(dev->queue_locks, NUM_QUEUES);
 	lkl_host_ops.mem_free(dev);
 }
+#endif
 
 void lkl_netdev_free(struct lkl_netdev *nd)
 {
diff --git a/tools/lkl/lib/virtio_net_fd.c b/tools/lkl/lib/virtio_net_fd.c
index f8664455e696..a19193cfeca9 100644
--- a/tools/lkl/lib/virtio_net_fd.c
+++ b/tools/lkl/lib/virtio_net_fd.c
@@ -25,28 +25,6 @@
 #include "virtio.h"
 #include "virtio_net_fd.h"
 
-struct lkl_netdev_fd {
-	struct lkl_netdev dev;
-	/* file-descriptor based device */
-	int fd_rx;
-	int fd_tx;
-	/*
-	 * Controlls the poll mask for fd. Can be acccessed concurrently from
-	 * poll, tx, or rx routines but there is no need for syncronization
-	 * because:
-	 *
-	 * (a) TX and RX routines set different variables so even if they update
-	 * at the same time there is no race condition
-	 *
-	 * (b) Even if poll and TX / RX update at the same time poll cannot
-	 * stall: when poll resets the poll variable we know that TX / RX will
-	 * run which means that eventually the poll variable will be set.
-	 */
-	int poll_tx, poll_rx;
-	/* controle pipe */
-	int pipe[2];
-};
-
 static int fd_net_tx(struct lkl_netdev *nd, struct iovec *iov, int cnt)
 {
 	int ret;
diff --git a/tools/lkl/lib/virtio_net_fd.h b/tools/lkl/lib/virtio_net_fd.h
index 713ba13cca7c..fe6d6d8e3ab4 100644
--- a/tools/lkl/lib/virtio_net_fd.h
+++ b/tools/lkl/lib/virtio_net_fd.h
@@ -4,6 +4,28 @@
 
 struct ifreq;
 
+struct lkl_netdev_fd {
+	struct lkl_netdev dev;
+	/* file-descriptor based device */
+	int fd_rx;
+	int fd_tx;
+	/*
+	 * Controlls the poll mask for fd. Can be acccessed concurrently from
+	 * poll, tx, or rx routines but there is no need for syncronization
+	 * because:
+	 *
+	 * (a) TX and RX routines set different variables so even if they update
+	 * at the same time there is no race condition
+	 *
+	 * (b) Even if poll and TX / RX update at the same time poll cannot
+	 * stall: when poll resets the poll variable we know that TX / RX will
+	 * run which means that eventually the poll variable will be set.
+	 */
+	int poll_tx, poll_rx;
+	/* controle pipe */
+	int pipe[2];
+};
+
 /**
  * lkl_register_netdev_linux_fdnet - register a file descriptor-based network
  * device as a NIC
-- 
2.20.1 (Apple Git-117)
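
For readers who want to try the "veth<n>=<iftype>,<ifparams>,<offload>"
syntax outside of UML, here is a minimal user-space sketch of the same
parsing steps. It is not the code from the patch: struct veth_opt and
parse_veth_opt() are hypothetical names, the "veth" prefix is assumed to
have been stripped by the caller already, and the third field is treated
as optional (an assumption, based on uml_netdev_prepare() checking
ifoffload for NULL).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type; the patch uses separate char pointers instead. */
struct veth_opt {
	unsigned long devid;
	char *iftype;
	char *ifparams;
	char *ifoffload;	/* NULL when the third field is absent */
};

/*
 * Parse "<n>=<iftype>,<ifparams>[,<offload>]" in place.
 * The string is modified: separators are overwritten with '\0'.
 * Returns 0 on success, -1 on malformed input.
 */
static int parse_veth_opt(char *str, struct veth_opt *out)
{
	char *end;

	/* device number, followed by '=' */
	out->devid = strtoul(str, &end, 0);
	if (end == str || *end != '=')
		return -1;
	out->iftype = end + 1;

	/* <ifparams> is mandatory */
	out->ifparams = strchr(out->iftype, ',');
	if (!out->ifparams)
		return -1;
	*out->ifparams++ = '\0';

	/* <offload> is optional in this sketch */
	out->ifoffload = strchr(out->ifparams, ',');
	if (out->ifoffload)
		*out->ifoffload++ = '\0';

	return 0;
}
```

With the string "0=tap,tap0,0xc803" this yields devid 0, iftype "tap",
ifparams "tap0" and ifoffload "0xc803", which can then be converted with
strtol(..., 0) as uml_netdev_prepare() does.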


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um


^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [RFC v2 37/37] um: add lkl virtio-blk device
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  5:02     ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-08  5:02 UTC (permalink / raw)
  To: linux-um
  Cc: Octavian Purdila, Akira Moroo, linux-kernel-library, linux-arch,
	Hajime Tazaki

UML can now use a virtio-blk device, configured with 'vubd0=<filename>',
over the virtio-mmio driver.

Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
---
 .circleci/config.yml             |  4 ++-
 arch/um/configs/x86_64_defconfig |  1 +
 arch/um/os-Linux/lkl_dev.c       | 56 +++++++++++++++++++++++++++++++-
 tools/lkl/lib/Makefile           |  6 ++--
 4 files changed, 62 insertions(+), 5 deletions(-)
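
A note on the CI check in the .circleci hunk: it expects the UML binary
to end with status 134, which is how bash reports a process killed by
SIGABRT (128 + signal number, and SIGABRT is 6 on Linux). A small
user-space sketch of where that number comes from, using a child that
calls abort(); the function name is hypothetical and this is only an
illustration, not part of the patch:

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that aborts, then compute the value a bash-like shell
 * would place in $?: the exit code for a normal exit, or
 * 128 + signal number when the child was killed by a signal.
 * Returns -1 if fork() or waitpid() fails.
 */
static int shell_status_of_aborting_child(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		abort();	/* raises SIGABRT in the child */

	if (waitpid(pid, &status, 0) < 0)
		return -1;
	if (WIFSIGNALED(status))
		return 128 + WTERMSIG(status);
	return WEXITSTATUS(status);
}
```

On Linux this returns 134, the value the script compares RETVAL against.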

diff --git a/.circleci/config.yml b/.circleci/config.yml
index 9753543e8198..f2fe39fc2bee 100644
--- a/.circleci/config.yml
+++ b/.circleci/config.yml
@@ -141,7 +141,9 @@ do_uml_steps: &do_uml_steps
         if [ $CIRCLE_STAGE = "i386_uml" ] || [ $CIRCLE_STAGE = "i386_uml_on_x86_64" ]; then
           exit 0
         fi
-        ./linux rootfstype=hostfs ro mem=1g loglevel=10 veth0=tap,tap0,0xc803 init="/bin/bash -c exit" || export RETVAL=$?
+        dd if=/dev/zero of=disk.img bs=1024 count=20480
+        mkfs.ext4 disk.img
+        ./linux rootfstype=hostfs ro mem=1g loglevel=10 veth0=tap,tap0,0xc803 vubd0=disk.img init='/bin/bash -x -c "mount -t ext4 /dev/vda /mnt ; ls -l /mnt/; ip addr ; exit"' || export RETVAL=$?
         # SIGABRT=6 => 128+6
         if [ $RETVAL != "134" ]; then
           exit 1
diff --git a/arch/um/configs/x86_64_defconfig b/arch/um/configs/x86_64_defconfig
index 917982b6cd60..e5b7c048a701 100644
--- a/arch/um/configs/x86_64_defconfig
+++ b/arch/um/configs/x86_64_defconfig
@@ -75,3 +75,4 @@ CONFIG_VIRTIO_MENU=y
 CONFIG_VIRTIO_MMIO=y
 CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES=y
 CONFIG_VIRTIO_NET=y
+CONFIG_VIRTIO_BLK=y
\ No newline at end of file
diff --git a/arch/um/os-Linux/lkl_dev.c b/arch/um/os-Linux/lkl_dev.c
index 698062917ed5..e08f113dfc0b 100644
--- a/arch/um/os-Linux/lkl_dev.c
+++ b/arch/um/os-Linux/lkl_dev.c
@@ -6,6 +6,7 @@
 #include <os.h>
 #include <kern_util.h>
 #include <errno.h>
+#include <fcntl.h>
 
 #include <lkl.h>
 #include <lkl_host.h>
@@ -14,6 +15,7 @@ extern struct lkl_host_operations lkl_host_ops;
 struct lkl_host_operations *lkl_ops = &lkl_host_ops;
 
 static struct lkl_netdev *nd;
+static struct lkl_disk disk;
 
 int __init uml_netdev_prepare(char *iftype, char *ifparams, char *ifoffload)
 {
@@ -108,13 +110,65 @@ __uml_setup("veth", lkl_eth_setup,
 "    Configure a network device.\n\n"
 );
 
+int __init uml_blkdev_add(void)
+{
+	int disk_id = 0;
+
+	if (disk.fd)
+		disk_id = lkl_disk_add(&disk);
+
+	if (disk_id < 0)
+		return -1;
+
+	return 0;
+}
+__initcall(uml_blkdev_add);
+
+static int __init lkl_ubd_setup(char *str, int *niu)
+{
+	char *end, *fname;
+	int devid, err = -EINVAL;
+
+	/* vubd */
+	devid = strtoul(str, &end, 0);
+	if (end == str) {
+		os_warn("Bad device number\n");
+		return err;
+	}
+
+	/* = */
+	str = end;
+	if (*str != '=') {
+		os_warn("Expected '=' after device number\n");
+		return err;
+	}
+	str++;
+
+	/* <filename> */
+	fname = str;
+
+	os_info("fname=%s\n", fname);
+	/* create */
+	disk.fd = open(fname, O_RDWR);
+	if (disk.fd < 0)
+		return -1;
+
+	disk.ops = NULL;
+
+	return 1;
+}
+__uml_setup("vubd", lkl_ubd_setup,
+"vubd<n>=<filename>\n"
+"    Configure a block device.\n\n"
+);
+
+
 /* stub functions */
 int lkl_is_running(void)
 {
 	return 1;
 }
 
-
 void lkl_put_irq(int i, const char *user)
 {
 }
diff --git a/tools/lkl/lib/Makefile b/tools/lkl/lib/Makefile
index 3c35d49843cd..be6cb4b8f4ec 100644
--- a/tools/lkl/lib/Makefile
+++ b/tools/lkl/lib/Makefile
@@ -4,9 +4,9 @@ USER_CFLAGS += -I$(srctree)/tools/lkl/include \
 		-Wno-strict-prototypes -Wno-undef \
 		-Wframe-larger-than=20480 -O0 -g
 
-USER_OBJS += fs.o iomem.o net.o jmp_buf.o virtio.o virtio_net.o \
+USER_OBJS += iomem.o jmp_buf.o virtio.o virtio_net.o \
 	 virtio_net_fd.o virtio_net_tap.o utils.o posix-host.o \
-	../../perf/pmu-events/jsmn.o
+	 virtio_blk.o ../../perf/pmu-events/jsmn.o
 
 #obj-y += fs.o
 obj-y += iomem.o
@@ -15,7 +15,7 @@ obj-y += jmp_buf.o
 obj-y += posix-host.o
 #obj-$(LKL_HOST_CONFIG_NT) += nt-host.o
 obj-y += utils.o
-#obj-y += virtio_blk.o
+obj-y += virtio_blk.o
 obj-y += virtio.o
 #obj-y += dbg.o
 #obj-y += dbg_handler.o
-- 
2.20.1 (Apple Git-117)
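
One observation on uml_blkdev_add(): it uses "disk.fd != 0" to decide
whether a disk was configured, which relies on the static struct being
zero-initialized and on open() never returning descriptor 0. A possible
alternative, sketched below in plain user-space C, is an explicit -1
sentinel. struct disk_cfg, DISK_UNSET and the two helpers are
hypothetical names for illustration and are not part of LKL.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for struct lkl_disk; only the fd is modelled. */
struct disk_cfg {
	int fd;
};

/* Sentinel meaning "no backing file has been opened yet". */
#define DISK_UNSET (-1)

/* Open the backing file read-write; on failure the config is untouched. */
static int disk_open(struct disk_cfg *d, const char *fname)
{
	int fd = open(fname, O_RDWR);

	if (fd < 0)
		return -1;
	d->fd = fd;
	return 0;
}

/* Any non-negative descriptor, including 0, counts as configured. */
static int disk_is_configured(const struct disk_cfg *d)
{
	return d->fd >= 0;
}
```

The struct would be initialized with "{ .fd = DISK_UNSET }" so that an
unconfigured disk and a disk backed by descriptor 0 are distinguishable.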

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* Re: [RFC v2 00/37] Unifying LKL into UML
  2019-11-08  5:02   ` Hajime Tazaki
@ 2019-11-08  9:13     ` Anton Ivanov
  -1 siblings, 0 replies; 206+ messages in thread
From: Anton Ivanov @ 2019-11-08  9:13 UTC (permalink / raw)
  To: Hajime Tazaki, linux-um
  Cc: Octavian Purdila, linux-kernel-library, linux-arch, Akira Moroo



On 08/11/2019 05:02, Hajime Tazaki wrote:
> This RFC patchset asks for opinions on whether the LKL code is suitable
> for integration into the UML code base.  We welcome any kind of feedback
> from your reviews.  A number of commits should also be sent to other
> mailing lists for review; we will do that later, once they have been
> discussed on this mailing list.
> 
> # Sorry for the long list of patches: we can make it smaller by including
>    only the basic set of LKL (e.g., removing foreign OS support) if you
>    wish.
> 
> 
> rfc v2:
> - use UMMODE instead of SUBARCH to switch between UML and LKL
> - the tools/lkl directory is still there. I confirmed we can move it under
>    arch/um (e.g., arch/um/lkl/hosts).  I will move it if this is preferable.
> - drop several patches that touched non-UML directories
> - drop several patches which are not required
> - refine commit logs
> - update documentation
> 
> 
> 
> 
> LKL (Linux Kernel Library) aims to allow reusing the Linux kernel code
> as extensively as possible with minimal effort and reduced maintenance
> overhead.
> 
> Examples of how LKL can be used are: creating userspace applications
> (running on Linux and other operating systems) that can read or write Linux
> filesystems or can use the Linux networking stack, creating kernel drivers
> for other operating systems that can read Linux filesystems, adding
> bootloader support for reading/writing Linux filesystems, etc.
> 
> With LKL, the kernel code is compiled into an object file that can be
> directly linked by applications. The API offered by LKL is based on the
> Linux system call interface.
> 
> LKL was originally implemented as an architecture port in arch/lkl, but this
> series of commits tries to integrate it into arch/um as one of the modes
> of UML.  This was discussed in the RFC email thread for LKL (*1).
> 
> The latest LKL version can be found at https://github.com/lkl/linux
> 
> Milestone
> =========
> This patchset is just a first step toward upstreaming the *library mode*
> of the Linux kernel; we think several steps are needed to reach our goal,
> as described below.
> 
> 1. Put LKL code under arch/um (arch/um/lkl), and build it in a
> separate way from UML.
> 2. Share common parts of implementation between UML and LKL.
> 3. Reimplement UML features with LKL API (if we wish)
> 
> For step 1, we added LKL as one of the UMMODE values to reduce the
> integration effort (make ARCH=um UMMODE=library).  Modifications to the
> existing UML code are kept to a minimum.
> 
> The RFC patches also include a bit of step 2, as a proof that the code can
> be shared.  For this, we took the virtio device code of LKL and used it
> from UML by enabling the virtio-mmio driver in the UML code.
> 
> 
> 
> Building LKL, the host library and LKL applications
> ===================================================
> 
> % cd tools/lkl
> % make
> 
> will build LKL as an object file, install it in tools/lkl/lib together with
> the header files in tools/lkl/include, and then build the host library,
> tests and a few example applications:
> 
> * tests/boot - a simple application that uses LKL and exercises the basic
> LKL APIs
> 
> * tests/net-test - a simple application that uses the network features of
> LKL and exercises the basic network-related APIs
> 
> * fs2tar - a tool that converts a filesystem image to a tar archive
> 
> * cptofs/cpfromfs - a tool that copies files to/from a filesystem image
> 
> % make run-tests
> 
> should run the above `tests/boot` and `tests/net-test` and report errors if
> there are any.
> 
> Supported hosts
> ===============
> 
> Currently LKL supports POSIX and Windows userspace applications. New hosts
> can be added relatively easily if the host supports gcc and GNU ld. Previous
> versions of LKL supported Windows kernel and Haiku kernel hosts, and we
> also have WIP patches (not included in this RFC) for the rump-hypercall
> interface, used in UEFI, as well as for macOS userspace (part of POSIX?).
> 
> There is also a musl-libc port for LKL, which might be of interest to some
> folks.
> 
> 
> Further reading about LKL
> =========================
> 
> - Discussion in a GitHub LKL issue
> https://github.com/lkl/linux/issues/304
> 
> - LKL (an article)
> https://www.researchgate.net/profile/Nicolae_Tapus2/publication/224164682_LKL_The_Linux_kernel_library/links/02bfe50fd921ab4f7c000000.pdf
> 
> *1 RFC email to LKML (back in 2015)
> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1012277.html
> 
> 
> 
> Please review the following changes for suitability for inclusion. If you have
> any objections or suggestions for improvement, please respond to the patches. If
> you agree with the changes, please provide your Acked-by.
> 
> The following changes since commit 73625ed66389d4c620520058d828f43a93ab4d0c:
> 
>    um: irq: Fix LAST_IRQ usage in init_IRQ() (2019-09-16 08:38:58 +0200)
> 
> are available in the Git repository at:
> 
>    git://github.com/thehajime/linux 61b15bfb52c7f1f066685c90a1cfe8346b3faec9
>    https://github.com/thehajime/linux/tree/upstream-to-uml-5.5-rc1-v2
> 
> Andreas Abel (1):
>    kallsyms: Add a config option to select section for kallsyms
> 
> Hajime Tazaki (6):
>    lkl: add system call hijack support
>    scripts: revert CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX patches
>    lkl: Android ARM (arm/arm64) support
>    um lkl: add CI tests
>    um: use lkl virtio_net_tap device as UML device
>    um: add lkl virtio-blk device
> 
> Octavian Purdila (29):
>    asm-generic: atomic64: allow using generic atomic64 on 64bit platforms
>    arch: add __SYSCALL_DEFINE_ARCH
>    lkl: architecture skeleton for Linux kernel library
>    lkl: host interface
>    lkl: memory handling
>    lkl: kernel threads support
>    lkl: interrupt support
>    lkl: system call interface and application API
>    lkl: timers, time and delay support
>    lkl: memory mapped I/O support
>    lkl: basic kernel console support
>    lkl: initialization and cleanup
>    lkl: plug in the build system
>    lkl tools: skeleton for host side library, tests and tools
>    lkl tools: host lib: add utilities functions
>    lkl tools: host lib: memory mapped I/O helpers
>    lkl tools: host lib: virtio devices
>    lkl tools: host lib: virtio block device
>    lkl tools: host lib: filesystem helpers
>    lkl tools: host lib: posix host operations
>    lkl tools: "boot" test
>    lkl tools: tool that reads/writes to/from a filesystem image
>    lkl tools: tool that converts a filesystem image to tar
>    lkl tools: virtio: add network device support
>    checkpatch: avoid showing BIT_ULL warnings for tools/ files
>    lkl tools: add lklfuse
>    lkl: add documentation
>    lkl: add support for Windows hosts
>    lkl tools: add support for Windows host
> 
> Thomas Liebetraut (1):
>    tools: Add the lkl host library to the common tools Makefile
> 
>   .circleci/config.yml                       | 276 ++++++
>   Documentation/virt/uml/lkl.txt             | 453 ++++++++++
>   MAINTAINERS                                |   8 +
>   Makefile                                   |   4 +-
>   arch/Kconfig                               |   6 +
>   arch/um/Kconfig                            |  32 +-
>   arch/um/Makefile                           | 151 +---
>   arch/um/Makefile.um                        | 152 ++++
>   arch/um/configs/x86_64_defconfig           |   6 +
>   arch/um/include/asm/Kbuild                 |   1 +
>   arch/um/include/asm/io.h                   |   4 +
>   arch/um/kernel/syscall.c                   |  53 ++
>   arch/um/lkl/.gitignore                     |   2 +
>   arch/um/lkl/Kconfig                        |  91 ++
>   arch/um/lkl/Kconfig.debug                  |   0
>   arch/um/lkl/Makefile                       |  69 ++
>   arch/um/lkl/auto.conf                      |   1 +
>   arch/um/lkl/configs/lkl_defconfig          |  91 ++
>   arch/um/lkl/include/asm/Kbuild             |  80 ++
>   arch/um/lkl/include/asm/bitsperlong.h      |  11 +
>   arch/um/lkl/include/asm/byteorder.h        |   7 +
>   arch/um/lkl/include/asm/cpu.h              |  14 +
>   arch/um/lkl/include/asm/elf.h              |  15 +
>   arch/um/lkl/include/asm/host_ops.h         |  10 +
>   arch/um/lkl/include/asm/io.h               | 104 +++
>   arch/um/lkl/include/asm/irq.h              |  15 +
>   arch/um/lkl/include/asm/mutex.h            |   7 +
>   arch/um/lkl/include/asm/page.h             |  14 +
>   arch/um/lkl/include/asm/pgtable.h          |  62 ++
>   arch/um/lkl/include/asm/processor.h        |  60 ++
>   arch/um/lkl/include/asm/ptrace.h           |  25 +
>   arch/um/lkl/include/asm/sched.h            |  23 +
>   arch/um/lkl/include/asm/setup.h            |   7 +
>   arch/um/lkl/include/asm/syscalls.h         |  18 +
>   arch/um/lkl/include/asm/syscalls_32.h      |  43 +
>   arch/um/lkl/include/asm/thread_info.h      |  70 ++
>   arch/um/lkl/include/asm/tlb.h              |  12 +
>   arch/um/lkl/include/asm/uaccess.h          |  64 ++
>   arch/um/lkl/include/asm/unistd.h           |  29 +
>   arch/um/lkl/include/asm/unistd_32.h        |  31 +
>   arch/um/lkl/include/asm/vmlinux.lds.h      |  14 +
>   arch/um/lkl/include/asm/xor.h              |   9 +
>   arch/um/lkl/include/system/stdarg.h        |   2 +
>   arch/um/lkl/include/uapi/asm/Kbuild        |   9 +
>   arch/um/lkl/include/uapi/asm/bitsperlong.h |  13 +
>   arch/um/lkl/include/uapi/asm/byteorder.h   |  11 +
>   arch/um/lkl/include/uapi/asm/host_ops.h    | 153 ++++
>   arch/um/lkl/include/uapi/asm/irq.h         |  36 +
>   arch/um/lkl/include/uapi/asm/sigcontext.h  |  16 +
>   arch/um/lkl/include/uapi/asm/siginfo.h     |  11 +
>   arch/um/lkl/include/uapi/asm/swab.h        |  11 +
>   arch/um/lkl/include/uapi/asm/syscalls.h    | 348 ++++++++
>   arch/um/lkl/include/uapi/asm/unistd.h      |  18 +
>   arch/um/lkl/kernel/Makefile                |   4 +
>   arch/um/lkl/kernel/asm-offsets.c           |   2 +
>   arch/um/lkl/kernel/console.c               |  42 +
>   arch/um/lkl/kernel/cpu.c                   | 223 +++++
>   arch/um/lkl/kernel/irq.c                   | 193 +++++
>   arch/um/lkl/kernel/misc.c                  |  60 ++
>   arch/um/lkl/kernel/setup.c                 | 193 +++++
>   arch/um/lkl/kernel/syscalls.c              | 246 ++++++
>   arch/um/lkl/kernel/syscalls_32.c           | 159 ++++
>   arch/um/lkl/kernel/threads.c               | 227 +++++
>   arch/um/lkl/kernel/time.c                  | 145 ++++
>   arch/um/lkl/kernel/vmlinux.lds.S           |  51 ++
>   arch/um/lkl/mm/Makefile                    |   1 +
>   arch/um/lkl/mm/bootmem.c                   |  66 ++
>   arch/um/lkl/scripts/headers_install.py     | 195 +++++
>   arch/um/os-Linux/Makefile                  |   5 +
>   arch/um/os-Linux/lkl_dev.c                 | 188 +++++
>   certs/system_certificates.S                |  16 +-
>   include/asm-generic/atomic64.h             |   2 +
>   include/asm-generic/export.h               |  34 +-
>   include/asm-generic/vmlinux.lds.h          | 279 ++++---
>   include/linux/compiler_attributes.h        |   4 +
>   include/linux/export.h                     |  23 +-
>   include/linux/linkage.h                    |  12 +-
>   include/linux/syscalls.h                   |   6 +
>   init/Kconfig                               |  12 +
>   lib/.gitignore                             |   2 +
>   lib/raid6/.gitignore                       |   1 +
>   scripts/.gitignore                         |   2 +
>   scripts/Makefile.build                     |   9 +-
>   scripts/adjust_autoksyms.sh                |   6 +
>   scripts/basic/.gitignore                   |   1 +
>   scripts/checkpatch.pl                      |  13 +-
>   scripts/depmod.sh                          |  25 +-
>   scripts/genksyms/genksyms.c                |  11 +-
>   scripts/kallsyms.c                         |  54 +-
>   scripts/kconfig/.gitignore                 |   1 +
>   scripts/link-vmlinux.sh                    |  10 +
>   scripts/mod/.gitignore                     |   1 +
>   scripts/mod/modpost.c                      |  30 +-
>   tools/Makefile                             |  11 +-
>   tools/lkl/.gitignore                       |  15 +
>   tools/lkl/Build                            |   6 +
>   tools/lkl/Makefile                         | 130 +++
>   tools/lkl/Makefile.autoconf                | 114 +++
>   tools/lkl/Targets                          |  25 +
>   tools/lkl/bin/lkl-hijack.sh                |  23 +
>   tools/lkl/cptofs.c                         | 635 ++++++++++++++
>   tools/lkl/fs2tar.c                         | 410 +++++++++
>   tools/lkl/include/.gitignore               |   1 +
>   tools/lkl/include/lkl.h                    | 928 +++++++++++++++++++++
>   tools/lkl/include/lkl_config.h             |  61 ++
>   tools/lkl/include/lkl_host.h               | 160 ++++
>   tools/lkl/include/mingw32/sys/socket.h     |   4 +
>   tools/lkl/lib/.gitignore                   |   3 +
>   tools/lkl/lib/Build                        |  26 +
>   tools/lkl/lib/Makefile                     |  33 +
>   tools/lkl/lib/config.c                     | 793 ++++++++++++++++++
>   tools/lkl/lib/dbg.c                        | 300 +++++++
>   tools/lkl/lib/dbg_handler.c                |  44 +
>   tools/lkl/lib/endian.h                     |  31 +
>   tools/lkl/lib/fs.c                         | 433 ++++++++++
>   tools/lkl/lib/hijack/Build                 |   4 +
>   tools/lkl/lib/hijack/hijack.c              | 620 ++++++++++++++
>   tools/lkl/lib/hijack/init.c                | 252 ++++++
>   tools/lkl/lib/hijack/init.h                |   8 +
>   tools/lkl/lib/hijack/xlate.c               | 613 ++++++++++++++
>   tools/lkl/lib/hijack/xlate.h               |  13 +
>   tools/lkl/lib/iomem.c                      |  88 ++
>   tools/lkl/lib/iomem.h                      |  15 +
>   tools/lkl/lib/jmp_buf.c                    |  14 +
>   tools/lkl/lib/jmp_buf.h                    |   8 +
>   tools/lkl/lib/net.c                        | 818 ++++++++++++++++++
>   tools/lkl/lib/nt-host.c                    | 375 +++++++++
>   tools/lkl/lib/posix-host.c                 | 439 ++++++++++
>   tools/lkl/lib/utils.c                      | 266 ++++++
>   tools/lkl/lib/virtio.c                     | 644 ++++++++++++++
>   tools/lkl/lib/virtio.h                     | 115 +++
>   tools/lkl/lib/virtio_blk.c                 | 132 +++
>   tools/lkl/lib/virtio_net.c                 | 345 ++++++++
>   tools/lkl/lib/virtio_net_dpdk.c            | 480 +++++++++++
>   tools/lkl/lib/virtio_net_fd.c              | 195 +++++
>   tools/lkl/lib/virtio_net_fd.h              |  50 ++
>   tools/lkl/lib/virtio_net_macvtap.c         |  32 +
>   tools/lkl/lib/virtio_net_pipe.c            |  76 ++
>   tools/lkl/lib/virtio_net_raw.c             |  94 +++
>   tools/lkl/lib/virtio_net_tap.c             | 111 +++
>   tools/lkl/lib/virtio_net_vde.c             | 168 ++++
>   tools/lkl/lklfuse.c                        | 658 +++++++++++++++
>   tools/lkl/scripts/checkpatch.sh            |  60 ++
>   tools/lkl/scripts/lkl-jenkins.sh           |  21 +
>   tools/lkl/tests/Build                      |   3 +
>   tools/lkl/tests/boot.c                     | 562 +++++++++++++
>   tools/lkl/tests/boot.sh                    |   9 +
>   tools/lkl/tests/cla.c                      | 159 ++++
>   tools/lkl/tests/cla.h                      |  33 +
>   tools/lkl/tests/disk.c                     | 189 +++++
>   tools/lkl/tests/disk.sh                    |  70 ++
>   tools/lkl/tests/hijack-test.sh             | 760 +++++++++++++++++
>   tools/lkl/tests/lklfuse.sh                 | 110 +++
>   tools/lkl/tests/net-setup.sh               | 134 +++
>   tools/lkl/tests/net-test.c                 | 317 +++++++
>   tools/lkl/tests/net.sh                     | 186 +++++
>   tools/lkl/tests/run.py                     | 182 ++++
>   tools/lkl/tests/run_netperf.sh             |  98 +++
>   tools/lkl/tests/tap13.py                   | 209 +++++
>   tools/lkl/tests/test.c                     | 126 +++
>   tools/lkl/tests/test.h                     |  72 ++
>   tools/lkl/tests/test.sh                    | 240 ++++++
>   tools/lkl/tests/valgrind.supp              |  85 ++
>   tools/lkl/tests/valgrind2xunit.py          |  69 ++
>   usr/initramfs_data.S                       |   4 +-
>   165 files changed, 19489 insertions(+), 354 deletions(-)
>   create mode 100644 .circleci/config.yml
>   create mode 100644 Documentation/virt/uml/lkl.txt
>   create mode 100644 arch/um/Makefile.um
>   create mode 100644 arch/um/lkl/.gitignore
>   create mode 100644 arch/um/lkl/Kconfig
>   create mode 100644 arch/um/lkl/Kconfig.debug
>   create mode 100644 arch/um/lkl/Makefile
>   create mode 100644 arch/um/lkl/auto.conf
>   create mode 100644 arch/um/lkl/configs/lkl_defconfig
>   create mode 100644 arch/um/lkl/include/asm/Kbuild
>   create mode 100644 arch/um/lkl/include/asm/bitsperlong.h
>   create mode 100644 arch/um/lkl/include/asm/byteorder.h
>   create mode 100644 arch/um/lkl/include/asm/cpu.h
>   create mode 100644 arch/um/lkl/include/asm/elf.h
>   create mode 100644 arch/um/lkl/include/asm/host_ops.h
>   create mode 100644 arch/um/lkl/include/asm/io.h
>   create mode 100644 arch/um/lkl/include/asm/irq.h
>   create mode 100644 arch/um/lkl/include/asm/mutex.h
>   create mode 100644 arch/um/lkl/include/asm/page.h
>   create mode 100644 arch/um/lkl/include/asm/pgtable.h
>   create mode 100644 arch/um/lkl/include/asm/processor.h
>   create mode 100644 arch/um/lkl/include/asm/ptrace.h
>   create mode 100644 arch/um/lkl/include/asm/sched.h
>   create mode 100644 arch/um/lkl/include/asm/setup.h
>   create mode 100644 arch/um/lkl/include/asm/syscalls.h
>   create mode 100644 arch/um/lkl/include/asm/syscalls_32.h
>   create mode 100644 arch/um/lkl/include/asm/thread_info.h
>   create mode 100644 arch/um/lkl/include/asm/tlb.h
>   create mode 100644 arch/um/lkl/include/asm/uaccess.h
>   create mode 100644 arch/um/lkl/include/asm/unistd.h
>   create mode 100644 arch/um/lkl/include/asm/unistd_32.h
>   create mode 100644 arch/um/lkl/include/asm/vmlinux.lds.h
>   create mode 100644 arch/um/lkl/include/asm/xor.h
>   create mode 100644 arch/um/lkl/include/system/stdarg.h
>   create mode 100644 arch/um/lkl/include/uapi/asm/Kbuild
>   create mode 100644 arch/um/lkl/include/uapi/asm/bitsperlong.h
>   create mode 100644 arch/um/lkl/include/uapi/asm/byteorder.h
>   create mode 100644 arch/um/lkl/include/uapi/asm/host_ops.h
>   create mode 100644 arch/um/lkl/include/uapi/asm/irq.h
>   create mode 100644 arch/um/lkl/include/uapi/asm/sigcontext.h
>   create mode 100644 arch/um/lkl/include/uapi/asm/siginfo.h
>   create mode 100644 arch/um/lkl/include/uapi/asm/swab.h
>   create mode 100644 arch/um/lkl/include/uapi/asm/syscalls.h
>   create mode 100644 arch/um/lkl/include/uapi/asm/unistd.h
>   create mode 100644 arch/um/lkl/kernel/Makefile
>   create mode 100644 arch/um/lkl/kernel/asm-offsets.c
>   create mode 100644 arch/um/lkl/kernel/console.c
>   create mode 100644 arch/um/lkl/kernel/cpu.c
>   create mode 100644 arch/um/lkl/kernel/irq.c
>   create mode 100644 arch/um/lkl/kernel/misc.c
>   create mode 100644 arch/um/lkl/kernel/setup.c
>   create mode 100644 arch/um/lkl/kernel/syscalls.c
>   create mode 100644 arch/um/lkl/kernel/syscalls_32.c
>   create mode 100644 arch/um/lkl/kernel/threads.c
>   create mode 100644 arch/um/lkl/kernel/time.c
>   create mode 100644 arch/um/lkl/kernel/vmlinux.lds.S
>   create mode 100644 arch/um/lkl/mm/Makefile
>   create mode 100644 arch/um/lkl/mm/bootmem.c
>   create mode 100755 arch/um/lkl/scripts/headers_install.py
>   create mode 100644 arch/um/os-Linux/lkl_dev.c
>   create mode 100644 tools/lkl/.gitignore
>   create mode 100644 tools/lkl/Build
>   create mode 100644 tools/lkl/Makefile
>   create mode 100644 tools/lkl/Makefile.autoconf
>   create mode 100644 tools/lkl/Targets
>   create mode 100755 tools/lkl/bin/lkl-hijack.sh
>   create mode 100644 tools/lkl/cptofs.c
>   create mode 100644 tools/lkl/fs2tar.c
>   create mode 100644 tools/lkl/include/.gitignore
>   create mode 100644 tools/lkl/include/lkl.h
>   create mode 100644 tools/lkl/include/lkl_config.h
>   create mode 100644 tools/lkl/include/lkl_host.h
>   create mode 100644 tools/lkl/include/mingw32/sys/socket.h
>   create mode 100644 tools/lkl/lib/.gitignore
>   create mode 100644 tools/lkl/lib/Build
>   create mode 100644 tools/lkl/lib/Makefile
>   create mode 100644 tools/lkl/lib/config.c
>   create mode 100644 tools/lkl/lib/dbg.c
>   create mode 100644 tools/lkl/lib/dbg_handler.c
>   create mode 100644 tools/lkl/lib/endian.h
>   create mode 100644 tools/lkl/lib/fs.c
>   create mode 100644 tools/lkl/lib/hijack/Build
>   create mode 100644 tools/lkl/lib/hijack/hijack.c
>   create mode 100644 tools/lkl/lib/hijack/init.c
>   create mode 100644 tools/lkl/lib/hijack/init.h
>   create mode 100644 tools/lkl/lib/hijack/xlate.c
>   create mode 100644 tools/lkl/lib/hijack/xlate.h
>   create mode 100644 tools/lkl/lib/iomem.c
>   create mode 100644 tools/lkl/lib/iomem.h
>   create mode 100644 tools/lkl/lib/jmp_buf.c
>   create mode 100644 tools/lkl/lib/jmp_buf.h
>   create mode 100644 tools/lkl/lib/net.c
>   create mode 100644 tools/lkl/lib/nt-host.c
>   create mode 100644 tools/lkl/lib/posix-host.c
>   create mode 100644 tools/lkl/lib/utils.c
>   create mode 100644 tools/lkl/lib/virtio.c
>   create mode 100644 tools/lkl/lib/virtio.h
>   create mode 100644 tools/lkl/lib/virtio_blk.c
>   create mode 100644 tools/lkl/lib/virtio_net.c
>   create mode 100644 tools/lkl/lib/virtio_net_dpdk.c
>   create mode 100644 tools/lkl/lib/virtio_net_fd.c
>   create mode 100644 tools/lkl/lib/virtio_net_fd.h
>   create mode 100644 tools/lkl/lib/virtio_net_macvtap.c
>   create mode 100644 tools/lkl/lib/virtio_net_pipe.c
>   create mode 100644 tools/lkl/lib/virtio_net_raw.c
>   create mode 100644 tools/lkl/lib/virtio_net_tap.c
>   create mode 100644 tools/lkl/lib/virtio_net_vde.c
>   create mode 100644 tools/lkl/lklfuse.c
>   create mode 100755 tools/lkl/scripts/checkpatch.sh
>   create mode 100755 tools/lkl/scripts/lkl-jenkins.sh
>   create mode 100644 tools/lkl/tests/Build
>   create mode 100644 tools/lkl/tests/boot.c
>   create mode 100755 tools/lkl/tests/boot.sh
>   create mode 100644 tools/lkl/tests/cla.c
>   create mode 100644 tools/lkl/tests/cla.h
>   create mode 100644 tools/lkl/tests/disk.c
>   create mode 100755 tools/lkl/tests/disk.sh
>   create mode 100755 tools/lkl/tests/hijack-test.sh
>   create mode 100755 tools/lkl/tests/lklfuse.sh
>   create mode 100644 tools/lkl/tests/net-setup.sh
>   create mode 100644 tools/lkl/tests/net-test.c
>   create mode 100755 tools/lkl/tests/net.sh
>   create mode 100755 tools/lkl/tests/run.py
>   create mode 100755 tools/lkl/tests/run_netperf.sh
>   create mode 100644 tools/lkl/tests/tap13.py
>   create mode 100644 tools/lkl/tests/test.c
>   create mode 100644 tools/lkl/tests/test.h
>   create mode 100644 tools/lkl/tests/test.sh
>   create mode 100644 tools/lkl/tests/valgrind.supp
>   create mode 100755 tools/lkl/tests/valgrind2xunit.py
> 

I am reading the patch set and I have a recurring question as I read it.
It applies to IRQs, memory-mapped I/O, timers, devices, etc.

The question is: "What is the underlying requirement to replace the
existing UML code for the library?"

For example, timers in UML have now been moved to the underlying POSIX
timer calls, and that can probably work on any system that offers them.
If there is some presentation, documentation or other material that goes
through the actual choices, it would be very helpful to read it.

The same question applies the other way around too. I like the hostops
approach; we can probably adopt some of that in UML proper to make it
more portable and to make alternative implementations of the underlying
host-side operations easier.
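
To make the hostops idea concrete, here is a minimal sketch in C of what
such a table could look like. It is only an illustration under stated
assumptions: struct host_ops, its members, posix_host_ops and
host_delay_ns() are hypothetical names, not the interface from the patch
set (which appears to live in arch/um/lkl/include/uapi/asm/host_ops.h) and
not existing UML code. The point is that host-independent code calls
through the table only, so a different host can supply a different table.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical host-operations table; all names are illustrative only. */
struct host_ops {
	void *(*mem_alloc)(size_t size);
	void (*mem_free)(void *ptr);
	unsigned long long (*time_ns)(void);	/* monotonic time in ns */
	void (*print)(const char *msg);
};

/* POSIX implementation of the time hook, based on clock_gettime(). */
static unsigned long long posix_time_ns(void)
{
	struct timespec ts;
	int ret = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(ret == 0);
	(void)ret;	/* keep the build quiet when NDEBUG is defined */
	return (unsigned long long)ts.tv_sec * 1000000000ULL +
	       (unsigned long long)ts.tv_nsec;
}

/* POSIX implementation of the print hook. */
static void posix_print(const char *msg)
{
	fputs(msg, stdout);
}

/* One table per host; this one covers a POSIX userspace host. */
const struct host_ops posix_host_ops = {
	.mem_alloc = malloc,
	.mem_free = free,
	.time_ns = posix_time_ns,
	.print = posix_print,
};

/*
 * Host-independent code: busy-waits for at least 'ns' nanoseconds using
 * nothing but the table, and returns the time actually spent.
 */
unsigned long long host_delay_ns(const struct host_ops *ops,
				 unsigned long long ns)
{
	unsigned long long start = ops->time_ns();
	unsigned long long now = start;

	while (now - start < ns)
		now = ops->time_ns();
	return now - start;
}
```

A caller would pass &posix_host_ops to host_delay_ns(); a Windows or
other host would provide its own table with the same members, and the
callers would stay unchanged.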

-- 
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/

* Re: [RFC v2 00/37] Unifying LKL into UML
@ 2019-11-08  9:13     ` Anton Ivanov
  0 siblings, 0 replies; 206+ messages in thread
From: Anton Ivanov @ 2019-11-08  9:13 UTC (permalink / raw)
  To: Hajime Tazaki, linux-um
  Cc: Octavian Purdila, linux-kernel-library, Akira Moroo, linux-arch



On 08/11/2019 05:02, Hajime Tazaki wrote:
> This RFC patchset asks for opinions on whether the LKL code is suitable
> to integrate into the UML code base.  We welcome any kind of feedback from
> your reviews.  A number of the commits should also be sent for review to
> other mailing lists; we will do that later, once they have been discussed
> on this mailing list.
> 
> # sorry for the long list of patches: we can make it smaller by only
>    including a basic set of LKL (e.g., removing foreign OS support, etc.) if
>    you wish.
> 
> 
> rfc v2:
> - use UMMODE instead of SUBARCH to switch between UML and LKL
> - tools/lkl directory is still there. I confirmed we can move it under
>    arch/um (e.g., arch/um/lkl/hosts).  I will move it IF this is preferable.
> - drop several patches that touched non-UML directories
> - drop several patches which are not required
> - refine commit logs
> - update the documentation
> 
> 
> 
> 
> LKL (Linux Kernel Library) aims to allow reusing the Linux kernel code
> as extensively as possible with minimal effort and reduced maintenance
> overhead.
> 
> Examples of how LKL can be used are: creating userspace applications
> (running on Linux and other operating systems) that can read or write Linux
> filesystems or can use the Linux networking stack, creating kernel drivers
> for other operating systems that can read Linux filesystems, bootloader
> support for reading/writing Linux filesystems, etc.
> 
> With LKL, the kernel code is compiled into an object file that can be
> directly linked by applications. The API offered by LKL is based on the
> Linux system call interface.
> 
> LKL was originally implemented as an architecture port in arch/lkl, but this
> series of commits tries to integrate it into arch/um as one of the modes
> of UML.  This was discussed in the RFC email thread for LKL (*1).
> 
> The latest LKL version can be found at https://github.com/lkl/linux
> 
> Milestone
> =========
> These patches are just a first step toward upstreaming the *library mode*
> of the Linux kernel; we think several steps are needed to reach our goal,
> as described below.
> 
> 1. Put LKL code under arch/um (arch/um/lkl), and build it in a
> separate way from UML.
> 2. Share common parts of implementation between UML and LKL.
> 3. Reimplement UML features with LKL API (if we wish)
> 
> For step 1, we added LKL as one of the UMMODEs to reduce the integration
> effort (make ARCH=um UMMODE=library).  Modifications to the existing UML
> code are kept to a minimum.
> 
> The RFC patches also include a bit of step 2, as a proof that the code can
> be shared.  For this, we took the virtio device code of LKL and used it
> from UML by enabling the virtio-mmio driver in the UML code.
> 
> 
> 
> Building LKL, the host library and LKL applications
> ===================================================
> 
> % cd tools/lkl
> % make
> 
> will build LKL as an object file, install it in tools/lkl/lib together
> with the header files in tools/lkl/include, and then build the host library,
> tests and a few example applications:
> 
> * tests/boot - a simple application that uses LKL and exercises the basic
> LKL APIs
> 
> * tests/net-test - a simple application that uses the network feature of
> LKL and exercises the basic network-related APIs
> 
> * fs2tar - a tool that converts a filesystem image to a tar archive
> 
> * cptofs/cpfromfs - a tool that copies files to/from a filesystem image
> 
> % make run-tests
> 
> should run the above `tests/boot` and `tests/net-test` and report errors if
> there are any.
> 
> Supported hosts
> ===============
> 
> Currently LKL supports POSIX and Windows userspace applications. New hosts
> can be added relatively easily if the host supports gcc and GNU ld. Previous
> versions of LKL supported Windows kernel and Haiku kernel hosts, and we
> also have WIP patches (not included in this RFC) for the rump-hypercall
> interface, used in UEFI, as well as macOS userspace (part of POSIX?).
> 
> There is also a musl-libc port for LKL, which might be of interest to some
> folks.
> 
> 
> Further reading about LKL
> =========================
> 
> - Discussion in github LKL issue
> https://github.com/lkl/linux/issues/304
> 
> - LKL (an article)
> https://www.researchgate.net/profile/Nicolae_Tapus2/publication/224164682_LKL_The_Linux_kernel_library/links/02bfe50fd921ab4f7c000000.pdf
> 
> *1 RFC email to LKML (back in 2015)
> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1012277.html
> 
> 
> 
> Please review the following changes for suitability for inclusion. If you have
> any objections or suggestions for improvement, please respond to the patches. If
> you agree with the changes, please provide your Acked-by.
> 
> The following changes since commit 73625ed66389d4c620520058d828f43a93ab4d0c:
> 
>    um: irq: Fix LAST_IRQ usage in init_IRQ() (2019-09-16 08:38:58 +0200)
> 
> are available in the Git repository at:
> 
>    git://github.com/thehajime/linux 61b15bfb52c7f1f066685c90a1cfe8346b3faec9
>    https://github.com/thehajime/linux/tree/upstream-to-uml-5.5-rc1-v2
> 
> Andreas Abel (1):
>    kallsyms: Add a config option to select section for kallsyms
> 
> Hajime Tazaki (6):
>    lkl: add system call hijack support
>    scripts: revert CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX patches
>    lkl: Android ARM (arm/arm64) support
>    um lkl: add CI tests
>    um: use lkl virtio_net_tap device as UML device
>    um: add lkl virtio-blk device
> 
> Octavian Purdila (29):
>    asm-generic: atomic64: allow using generic atomic64 on 64bit platforms
>    arch: add __SYSCALL_DEFINE_ARCH
>    lkl: architecture skeleton for Linux kernel library
>    lkl: host interface
>    lkl: memory handling
>    lkl: kernel threads support
>    lkl: interrupt support
>    lkl: system call interface and application API
>    lkl: timers, time and delay support
>    lkl: memory mapped I/O support
>    lkl: basic kernel console support
>    lkl: initialization and cleanup
>    lkl: plug in the build system
>    lkl tools: skeleton for host side library, tests and tools
>    lkl tools: host lib: add utilities functions
>    lkl tools: host lib: memory mapped I/O helpers
>    lkl tools: host lib: virtio devices
>    lkl tools: host lib: virtio block device
>    lkl tools: host lib: filesystem helpers
>    lkl tools: host lib: posix host operations
>    lkl tools: "boot" test
>    lkl tools: tool that reads/writes to/from a filesystem image
>    lkl tools: tool that converts a filesystem image to tar
>    lkl tools: virtio: add network device support
>    checkpatch: avoid showing BIT_ULL warnings for tools/ files
>    lkl tools: add lklfuse
>    lkl: add documentation
>    lkl: add support for Windows hosts
>    lkl tools: add support for Windows host
> 
> Thomas Liebetraut (1):
>    tools: Add the lkl host library to the common tools Makefile
> 
>   .circleci/config.yml                       | 276 ++++++
>   Documentation/virt/uml/lkl.txt             | 453 ++++++++++
>   MAINTAINERS                                |   8 +
>   Makefile                                   |   4 +-
>   arch/Kconfig                               |   6 +
>   arch/um/Kconfig                            |  32 +-
>   arch/um/Makefile                           | 151 +---
>   arch/um/Makefile.um                        | 152 ++++
>   arch/um/configs/x86_64_defconfig           |   6 +
>   arch/um/include/asm/Kbuild                 |   1 +
>   arch/um/include/asm/io.h                   |   4 +
>   arch/um/kernel/syscall.c                   |  53 ++
>   arch/um/lkl/.gitignore                     |   2 +
>   arch/um/lkl/Kconfig                        |  91 ++
>   arch/um/lkl/Kconfig.debug                  |   0
>   arch/um/lkl/Makefile                       |  69 ++
>   arch/um/lkl/auto.conf                      |   1 +
>   arch/um/lkl/configs/lkl_defconfig          |  91 ++
>   arch/um/lkl/include/asm/Kbuild             |  80 ++
>   arch/um/lkl/include/asm/bitsperlong.h      |  11 +
>   arch/um/lkl/include/asm/byteorder.h        |   7 +
>   arch/um/lkl/include/asm/cpu.h              |  14 +
>   arch/um/lkl/include/asm/elf.h              |  15 +
>   arch/um/lkl/include/asm/host_ops.h         |  10 +
>   arch/um/lkl/include/asm/io.h               | 104 +++
>   arch/um/lkl/include/asm/irq.h              |  15 +
>   arch/um/lkl/include/asm/mutex.h            |   7 +
>   arch/um/lkl/include/asm/page.h             |  14 +
>   arch/um/lkl/include/asm/pgtable.h          |  62 ++
>   arch/um/lkl/include/asm/processor.h        |  60 ++
>   arch/um/lkl/include/asm/ptrace.h           |  25 +
>   arch/um/lkl/include/asm/sched.h            |  23 +
>   arch/um/lkl/include/asm/setup.h            |   7 +
>   arch/um/lkl/include/asm/syscalls.h         |  18 +
>   arch/um/lkl/include/asm/syscalls_32.h      |  43 +
>   arch/um/lkl/include/asm/thread_info.h      |  70 ++
>   arch/um/lkl/include/asm/tlb.h              |  12 +
>   arch/um/lkl/include/asm/uaccess.h          |  64 ++
>   arch/um/lkl/include/asm/unistd.h           |  29 +
>   arch/um/lkl/include/asm/unistd_32.h        |  31 +
>   arch/um/lkl/include/asm/vmlinux.lds.h      |  14 +
>   arch/um/lkl/include/asm/xor.h              |   9 +
>   arch/um/lkl/include/system/stdarg.h        |   2 +
>   arch/um/lkl/include/uapi/asm/Kbuild        |   9 +
>   arch/um/lkl/include/uapi/asm/bitsperlong.h |  13 +
>   arch/um/lkl/include/uapi/asm/byteorder.h   |  11 +
>   arch/um/lkl/include/uapi/asm/host_ops.h    | 153 ++++
>   arch/um/lkl/include/uapi/asm/irq.h         |  36 +
>   arch/um/lkl/include/uapi/asm/sigcontext.h  |  16 +
>   arch/um/lkl/include/uapi/asm/siginfo.h     |  11 +
>   arch/um/lkl/include/uapi/asm/swab.h        |  11 +
>   arch/um/lkl/include/uapi/asm/syscalls.h    | 348 ++++++++
>   arch/um/lkl/include/uapi/asm/unistd.h      |  18 +
>   arch/um/lkl/kernel/Makefile                |   4 +
>   arch/um/lkl/kernel/asm-offsets.c           |   2 +
>   arch/um/lkl/kernel/console.c               |  42 +
>   arch/um/lkl/kernel/cpu.c                   | 223 +++++
>   arch/um/lkl/kernel/irq.c                   | 193 +++++
>   arch/um/lkl/kernel/misc.c                  |  60 ++
>   arch/um/lkl/kernel/setup.c                 | 193 +++++
>   arch/um/lkl/kernel/syscalls.c              | 246 ++++++
>   arch/um/lkl/kernel/syscalls_32.c           | 159 ++++
>   arch/um/lkl/kernel/threads.c               | 227 +++++
>   arch/um/lkl/kernel/time.c                  | 145 ++++
>   arch/um/lkl/kernel/vmlinux.lds.S           |  51 ++
>   arch/um/lkl/mm/Makefile                    |   1 +
>   arch/um/lkl/mm/bootmem.c                   |  66 ++
>   arch/um/lkl/scripts/headers_install.py     | 195 +++++
>   arch/um/os-Linux/Makefile                  |   5 +
>   arch/um/os-Linux/lkl_dev.c                 | 188 +++++
>   certs/system_certificates.S                |  16 +-
>   include/asm-generic/atomic64.h             |   2 +
>   include/asm-generic/export.h               |  34 +-
>   include/asm-generic/vmlinux.lds.h          | 279 ++++---
>   include/linux/compiler_attributes.h        |   4 +
>   include/linux/export.h                     |  23 +-
>   include/linux/linkage.h                    |  12 +-
>   include/linux/syscalls.h                   |   6 +
>   init/Kconfig                               |  12 +
>   lib/.gitignore                             |   2 +
>   lib/raid6/.gitignore                       |   1 +
>   scripts/.gitignore                         |   2 +
>   scripts/Makefile.build                     |   9 +-
>   scripts/adjust_autoksyms.sh                |   6 +
>   scripts/basic/.gitignore                   |   1 +
>   scripts/checkpatch.pl                      |  13 +-
>   scripts/depmod.sh                          |  25 +-
>   scripts/genksyms/genksyms.c                |  11 +-
>   scripts/kallsyms.c                         |  54 +-
>   scripts/kconfig/.gitignore                 |   1 +
>   scripts/link-vmlinux.sh                    |  10 +
>   scripts/mod/.gitignore                     |   1 +
>   scripts/mod/modpost.c                      |  30 +-
>   tools/Makefile                             |  11 +-
>   tools/lkl/.gitignore                       |  15 +
>   tools/lkl/Build                            |   6 +
>   tools/lkl/Makefile                         | 130 +++
>   tools/lkl/Makefile.autoconf                | 114 +++
>   tools/lkl/Targets                          |  25 +
>   tools/lkl/bin/lkl-hijack.sh                |  23 +
>   tools/lkl/cptofs.c                         | 635 ++++++++++++++
>   tools/lkl/fs2tar.c                         | 410 +++++++++
>   tools/lkl/include/.gitignore               |   1 +
>   tools/lkl/include/lkl.h                    | 928 +++++++++++++++++++++
>   tools/lkl/include/lkl_config.h             |  61 ++
>   tools/lkl/include/lkl_host.h               | 160 ++++
>   tools/lkl/include/mingw32/sys/socket.h     |   4 +
>   tools/lkl/lib/.gitignore                   |   3 +
>   tools/lkl/lib/Build                        |  26 +
>   tools/lkl/lib/Makefile                     |  33 +
>   tools/lkl/lib/config.c                     | 793 ++++++++++++++++++
>   tools/lkl/lib/dbg.c                        | 300 +++++++
>   tools/lkl/lib/dbg_handler.c                |  44 +
>   tools/lkl/lib/endian.h                     |  31 +
>   tools/lkl/lib/fs.c                         | 433 ++++++++++
>   tools/lkl/lib/hijack/Build                 |   4 +
>   tools/lkl/lib/hijack/hijack.c              | 620 ++++++++++++++
>   tools/lkl/lib/hijack/init.c                | 252 ++++++
>   tools/lkl/lib/hijack/init.h                |   8 +
>   tools/lkl/lib/hijack/xlate.c               | 613 ++++++++++++++
>   tools/lkl/lib/hijack/xlate.h               |  13 +
>   tools/lkl/lib/iomem.c                      |  88 ++
>   tools/lkl/lib/iomem.h                      |  15 +
>   tools/lkl/lib/jmp_buf.c                    |  14 +
>   tools/lkl/lib/jmp_buf.h                    |   8 +
>   tools/lkl/lib/net.c                        | 818 ++++++++++++++++++
>   tools/lkl/lib/nt-host.c                    | 375 +++++++++
>   tools/lkl/lib/posix-host.c                 | 439 ++++++++++
>   tools/lkl/lib/utils.c                      | 266 ++++++
>   tools/lkl/lib/virtio.c                     | 644 ++++++++++++++
>   tools/lkl/lib/virtio.h                     | 115 +++
>   tools/lkl/lib/virtio_blk.c                 | 132 +++
>   tools/lkl/lib/virtio_net.c                 | 345 ++++++++
>   tools/lkl/lib/virtio_net_dpdk.c            | 480 +++++++++++
>   tools/lkl/lib/virtio_net_fd.c              | 195 +++++
>   tools/lkl/lib/virtio_net_fd.h              |  50 ++
>   tools/lkl/lib/virtio_net_macvtap.c         |  32 +
>   tools/lkl/lib/virtio_net_pipe.c            |  76 ++
>   tools/lkl/lib/virtio_net_raw.c             |  94 +++
>   tools/lkl/lib/virtio_net_tap.c             | 111 +++
>   tools/lkl/lib/virtio_net_vde.c             | 168 ++++
>   tools/lkl/lklfuse.c                        | 658 +++++++++++++++
>   tools/lkl/scripts/checkpatch.sh            |  60 ++
>   tools/lkl/scripts/lkl-jenkins.sh           |  21 +
>   tools/lkl/tests/Build                      |   3 +
>   tools/lkl/tests/boot.c                     | 562 +++++++++++++
>   tools/lkl/tests/boot.sh                    |   9 +
>   tools/lkl/tests/cla.c                      | 159 ++++
>   tools/lkl/tests/cla.h                      |  33 +
>   tools/lkl/tests/disk.c                     | 189 +++++
>   tools/lkl/tests/disk.sh                    |  70 ++
>   tools/lkl/tests/hijack-test.sh             | 760 +++++++++++++++++
>   tools/lkl/tests/lklfuse.sh                 | 110 +++
>   tools/lkl/tests/net-setup.sh               | 134 +++
>   tools/lkl/tests/net-test.c                 | 317 +++++++
>   tools/lkl/tests/net.sh                     | 186 +++++
>   tools/lkl/tests/run.py                     | 182 ++++
>   tools/lkl/tests/run_netperf.sh             |  98 +++
>   tools/lkl/tests/tap13.py                   | 209 +++++
>   tools/lkl/tests/test.c                     | 126 +++
>   tools/lkl/tests/test.h                     |  72 ++
>   tools/lkl/tests/test.sh                    | 240 ++++++
>   tools/lkl/tests/valgrind.supp              |  85 ++
>   tools/lkl/tests/valgrind2xunit.py          |  69 ++
>   usr/initramfs_data.S                       |   4 +-
>   165 files changed, 19489 insertions(+), 354 deletions(-)
>   create mode 100644 .circleci/config.yml
>   create mode 100644 Documentation/virt/uml/lkl.txt
>   create mode 100644 arch/um/Makefile.um
>   create mode 100644 arch/um/lkl/.gitignore
>   create mode 100644 arch/um/lkl/Kconfig
>   create mode 100644 arch/um/lkl/Kconfig.debug
>   create mode 100644 arch/um/lkl/Makefile
>   create mode 100644 arch/um/lkl/auto.conf
>   create mode 100644 arch/um/lkl/configs/lkl_defconfig
>   create mode 100644 arch/um/lkl/include/asm/Kbuild
>   create mode 100644 arch/um/lkl/include/asm/bitsperlong.h
>   create mode 100644 arch/um/lkl/include/asm/byteorder.h
>   create mode 100644 arch/um/lkl/include/asm/cpu.h
>   create mode 100644 arch/um/lkl/include/asm/elf.h
>   create mode 100644 arch/um/lkl/include/asm/host_ops.h
>   create mode 100644 arch/um/lkl/include/asm/io.h
>   create mode 100644 arch/um/lkl/include/asm/irq.h
>   create mode 100644 arch/um/lkl/include/asm/mutex.h
>   create mode 100644 arch/um/lkl/include/asm/page.h
>   create mode 100644 arch/um/lkl/include/asm/pgtable.h
>   create mode 100644 arch/um/lkl/include/asm/processor.h
>   create mode 100644 arch/um/lkl/include/asm/ptrace.h
>   create mode 100644 arch/um/lkl/include/asm/sched.h
>   create mode 100644 arch/um/lkl/include/asm/setup.h
>   create mode 100644 arch/um/lkl/include/asm/syscalls.h
>   create mode 100644 arch/um/lkl/include/asm/syscalls_32.h
>   create mode 100644 arch/um/lkl/include/asm/thread_info.h
>   create mode 100644 arch/um/lkl/include/asm/tlb.h
>   create mode 100644 arch/um/lkl/include/asm/uaccess.h
>   create mode 100644 arch/um/lkl/include/asm/unistd.h
>   create mode 100644 arch/um/lkl/include/asm/unistd_32.h
>   create mode 100644 arch/um/lkl/include/asm/vmlinux.lds.h
>   create mode 100644 arch/um/lkl/include/asm/xor.h
>   create mode 100644 arch/um/lkl/include/system/stdarg.h
>   create mode 100644 arch/um/lkl/include/uapi/asm/Kbuild
>   create mode 100644 arch/um/lkl/include/uapi/asm/bitsperlong.h
>   create mode 100644 arch/um/lkl/include/uapi/asm/byteorder.h
>   create mode 100644 arch/um/lkl/include/uapi/asm/host_ops.h
>   create mode 100644 arch/um/lkl/include/uapi/asm/irq.h
>   create mode 100644 arch/um/lkl/include/uapi/asm/sigcontext.h
>   create mode 100644 arch/um/lkl/include/uapi/asm/siginfo.h
>   create mode 100644 arch/um/lkl/include/uapi/asm/swab.h
>   create mode 100644 arch/um/lkl/include/uapi/asm/syscalls.h
>   create mode 100644 arch/um/lkl/include/uapi/asm/unistd.h
>   create mode 100644 arch/um/lkl/kernel/Makefile
>   create mode 100644 arch/um/lkl/kernel/asm-offsets.c
>   create mode 100644 arch/um/lkl/kernel/console.c
>   create mode 100644 arch/um/lkl/kernel/cpu.c
>   create mode 100644 arch/um/lkl/kernel/irq.c
>   create mode 100644 arch/um/lkl/kernel/misc.c
>   create mode 100644 arch/um/lkl/kernel/setup.c
>   create mode 100644 arch/um/lkl/kernel/syscalls.c
>   create mode 100644 arch/um/lkl/kernel/syscalls_32.c
>   create mode 100644 arch/um/lkl/kernel/threads.c
>   create mode 100644 arch/um/lkl/kernel/time.c
>   create mode 100644 arch/um/lkl/kernel/vmlinux.lds.S
>   create mode 100644 arch/um/lkl/mm/Makefile
>   create mode 100644 arch/um/lkl/mm/bootmem.c
>   create mode 100755 arch/um/lkl/scripts/headers_install.py
>   create mode 100644 arch/um/os-Linux/lkl_dev.c
>   create mode 100644 tools/lkl/.gitignore
>   create mode 100644 tools/lkl/Build
>   create mode 100644 tools/lkl/Makefile
>   create mode 100644 tools/lkl/Makefile.autoconf
>   create mode 100644 tools/lkl/Targets
>   create mode 100755 tools/lkl/bin/lkl-hijack.sh
>   create mode 100644 tools/lkl/cptofs.c
>   create mode 100644 tools/lkl/fs2tar.c
>   create mode 100644 tools/lkl/include/.gitignore
>   create mode 100644 tools/lkl/include/lkl.h
>   create mode 100644 tools/lkl/include/lkl_config.h
>   create mode 100644 tools/lkl/include/lkl_host.h
>   create mode 100644 tools/lkl/include/mingw32/sys/socket.h
>   create mode 100644 tools/lkl/lib/.gitignore
>   create mode 100644 tools/lkl/lib/Build
>   create mode 100644 tools/lkl/lib/Makefile
>   create mode 100644 tools/lkl/lib/config.c
>   create mode 100644 tools/lkl/lib/dbg.c
>   create mode 100644 tools/lkl/lib/dbg_handler.c
>   create mode 100644 tools/lkl/lib/endian.h
>   create mode 100644 tools/lkl/lib/fs.c
>   create mode 100644 tools/lkl/lib/hijack/Build
>   create mode 100644 tools/lkl/lib/hijack/hijack.c
>   create mode 100644 tools/lkl/lib/hijack/init.c
>   create mode 100644 tools/lkl/lib/hijack/init.h
>   create mode 100644 tools/lkl/lib/hijack/xlate.c
>   create mode 100644 tools/lkl/lib/hijack/xlate.h
>   create mode 100644 tools/lkl/lib/iomem.c
>   create mode 100644 tools/lkl/lib/iomem.h
>   create mode 100644 tools/lkl/lib/jmp_buf.c
>   create mode 100644 tools/lkl/lib/jmp_buf.h
>   create mode 100644 tools/lkl/lib/net.c
>   create mode 100644 tools/lkl/lib/nt-host.c
>   create mode 100644 tools/lkl/lib/posix-host.c
>   create mode 100644 tools/lkl/lib/utils.c
>   create mode 100644 tools/lkl/lib/virtio.c
>   create mode 100644 tools/lkl/lib/virtio.h
>   create mode 100644 tools/lkl/lib/virtio_blk.c
>   create mode 100644 tools/lkl/lib/virtio_net.c
>   create mode 100644 tools/lkl/lib/virtio_net_dpdk.c
>   create mode 100644 tools/lkl/lib/virtio_net_fd.c
>   create mode 100644 tools/lkl/lib/virtio_net_fd.h
>   create mode 100644 tools/lkl/lib/virtio_net_macvtap.c
>   create mode 100644 tools/lkl/lib/virtio_net_pipe.c
>   create mode 100644 tools/lkl/lib/virtio_net_raw.c
>   create mode 100644 tools/lkl/lib/virtio_net_tap.c
>   create mode 100644 tools/lkl/lib/virtio_net_vde.c
>   create mode 100644 tools/lkl/lklfuse.c
>   create mode 100755 tools/lkl/scripts/checkpatch.sh
>   create mode 100755 tools/lkl/scripts/lkl-jenkins.sh
>   create mode 100644 tools/lkl/tests/Build
>   create mode 100644 tools/lkl/tests/boot.c
>   create mode 100755 tools/lkl/tests/boot.sh
>   create mode 100644 tools/lkl/tests/cla.c
>   create mode 100644 tools/lkl/tests/cla.h
>   create mode 100644 tools/lkl/tests/disk.c
>   create mode 100755 tools/lkl/tests/disk.sh
>   create mode 100755 tools/lkl/tests/hijack-test.sh
>   create mode 100755 tools/lkl/tests/lklfuse.sh
>   create mode 100644 tools/lkl/tests/net-setup.sh
>   create mode 100644 tools/lkl/tests/net-test.c
>   create mode 100755 tools/lkl/tests/net.sh
>   create mode 100755 tools/lkl/tests/run.py
>   create mode 100755 tools/lkl/tests/run_netperf.sh
>   create mode 100644 tools/lkl/tests/tap13.py
>   create mode 100644 tools/lkl/tests/test.c
>   create mode 100644 tools/lkl/tests/test.h
>   create mode 100644 tools/lkl/tests/test.sh
>   create mode 100644 tools/lkl/tests/valgrind.supp
>   create mode 100755 tools/lkl/tests/valgrind2xunit.py
> 

I am reading the patch set and I have a recurring question as I read it.
It applies to IRQ, mmap IO, timers, devices, etc.

The question is: "What is the underlying requirement to replace the
existing UML code for the library?"

For example, timers in UML have been moved to underlying POSIX timer
calls, which can probably work on any system that offers them. If there
is a presentation, documentation, or similar material that goes through
the actual choices, it would be very helpful.

The same question applies the other way around too. I like the hostops 
approach, we can probably adopt some of that in UML proper to make it 
more portable and easier to have alternative implementations for the 
underlying host side operations.

-- 
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/

_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um


^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 00/37] Unifying LKL into UML
  2019-11-08  9:13     ` Anton Ivanov
@ 2019-11-08 11:17       ` Octavian Purdila
  -1 siblings, 0 replies; 206+ messages in thread
From: Octavian Purdila @ 2019-11-08 11:17 UTC (permalink / raw)
  To: Anton Ivanov
  Cc: Hajime Tazaki, linux-um, linux-kernel-library, linux-arch, Akira Moroo

On Fri, Nov 8, 2019 at 11:13 AM Anton Ivanov
<anton.ivanov@cambridgegreys.com> wrote:
>
<snip>

Hi Anton,

> I am reading the patch set and I have a recurring question as I read it.
> It applies to IRQ, mmap IO, timers, devices, etc.
>
> The question is: "What is the underlying requirement to replace the
> existing UML code for the library?"
>
> For example, timers in UML have been moved to underlying POSIX timer
> calls, which can probably work on any system that offers them. If there
> is a presentation, documentation, or similar material that goes through
> the actual choices, it would be very helpful.
>

This (old) paper should help with some of the rationale and design
decisions behind LKL:

https://www.researchgate.net/publication/224164682_LKL_The_Linux_kernel_library

> The same question applies the other way around too. I like the hostops
> approach, we can probably adopt some of that in UML proper to make it
> more portable and easier to have alternative implementations for the
> underlying host side operations.
>

The host ops part is not properly explained in the paper, as it evolved
over time (they are called native operations there), so I will try to
give a high-level overview here.

In order to make it easier to compile LKL applications for different
targets (OSes, architectures, user/kernel) we decided to use a two-step
build process: a kernel build and a host target (+apps) build.
This helped us reduce intrusions in the kernel build system while
allowing us to support the various requirements a host target has. As
part of the first step, an lkl.o object and a set of processed kernel
uapi headers (to avoid conflicts with host Linux kernel headers) are
generated. These are then used to build a library (liblkl.so) for a
specific target. This is where the host ops are compiled in together
with the lkl.o object. This is the reason for the split between arch/lkl
(kernel) and tools/lkl (host target).
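
As an illustration only, the host ops idea described above can be
sketched as a table of function pointers that the host target fills in
and the kernel side calls through. This is not code from the patch set:
struct demo_host_ops, demo_start_kernel() and demo_kernel_log() are
hypothetical names, and the real table has more members.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical, heavily reduced host operations table. The real LKL
 * structure has more members and different names.
 */
struct demo_host_ops {
	void (*print)(const char *str, int len);
	void *(*mem_alloc)(unsigned long size);
	void (*mem_free)(void *ptr);
};

/* Host side (the tools/lkl half of the split): a POSIX-style backend. */
static void posix_print(const char *str, int len)
{
	fwrite(str, 1, (size_t)len, stdout);
}

static void *posix_mem_alloc(unsigned long size)
{
	return malloc(size);
}

static void posix_mem_free(void *ptr)
{
	free(ptr);
}

const struct demo_host_ops demo_posix_host_ops = {
	.print = posix_print,
	.mem_alloc = posix_mem_alloc,
	.mem_free = posix_mem_free,
};

/* Kernel side: only ever reaches the host through the table. */
static const struct demo_host_ops *host;

/* Returns 0 on success, -1 if the table is missing an operation. */
int demo_start_kernel(const struct demo_host_ops *ops)
{
	if (!ops || !ops->print || !ops->mem_alloc || !ops->mem_free)
		return -1;
	host = ops;
	return 0;
}

/* Prints msg plus a newline via the host; returns bytes written or -1. */
int demo_kernel_log(const char *msg)
{
	size_t len = strlen(msg);
	char *buf;

	assert(host != NULL); /* demo_start_kernel() must have succeeded */
	buf = host->mem_alloc(len + 1);
	if (!buf)
		return -1;
	memcpy(buf, msg, len);
	buf[len] = '\n';
	host->print(buf, (int)len + 1);
	host->mem_free(buf);
	return (int)len + 1;
}
```

A different target (for example an NT host) would supply its own table
with the same members, and the kernel-side code would stay unchanged.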

I think this build split is the biggest challenge in integrating LKL
with UML and I think once this is resolved it will be much easier to
merge the rest.

I am curious to learn what you think about such a split build for UML
and if there is any low-hanging fruit we could start from.

Thanks,
Tavi

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 03/37] lkl: architecture skeleton for Linux kernel library
  2019-11-08  5:02     ` Hajime Tazaki
@ 2019-11-25 22:00       ` Richard Weinberger
  -1 siblings, 0 replies; 206+ messages in thread
From: Richard Weinberger @ 2019-11-25 22:00 UTC (permalink / raw)
  To: Hajime Tazaki
  Cc: linux-um, Linux-Arch, Patrick Collins, Levente Kurusa,
	Matthieu Coudron, Conrad Meyer, Octavian Purdila, Jens Staal,
	Motomu Utsumi, Lai Jiangshan, Akira Moroo, Petros Angelatos,
	Yuan Liu, Xiao Jia, Mark Stillwell, linux-kernel-library,
	Pierre-Hugues Husson, Michael Zimmermann, Luca Dariz,
	Edison M . Castro

On Fri, Nov 8, 2019 at 6:03 AM Hajime Tazaki <thehajime@gmail.com> wrote:
>
> From: Octavian Purdila <tavi.purdila@gmail.com>
>
> Adds the LKL Kconfig, vmlinux linker script, basic architecture
> headers and miscellaneous basic functions or stubs such as
> dump_stack(), show_regs() and cpuinfo proc ops.
>
> The headers we introduce in this patch are simple wrappers to the
> asm-generic headers or stubs for things we don't support, such as
> ptrace, DMA, signals, ELF handling and low level processor operations.
>
> The kernel configuration is automatically updated to reflect the
> endianness of the host, 64bit support or the output format for
> vmlinux's linker script. We do this by looking at the ld's default
> output format.
>
> Signed-off-by: Andreas Abel <aabel@google.com>
> Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
> Signed-off-by: Edison M. Castro <edisonmcastro@hotmail.com>
> Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
> Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
> Signed-off-by: Jens Staal <staal1978@gmail.com>
> Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
> Signed-off-by: Levente Kurusa <levex@linux.com>
> Signed-off-by: Luca Dariz <luca.dariz@gmail.com>
> Signed-off-by: Mark Stillwell <mark@stillwell.me>
> Signed-off-by: Matthieu Coudron <mattator@gmail.com>
> Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
> Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
> Signed-off-by: Patrick Collins <pscollins@google.com>
> Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
> Signed-off-by: Pierre-Hugues Husson <phh@phh.me>
> Signed-off-by: Xiao Jia <xiaoj@google.com>
> Signed-off-by: Yuan Liu <liuyuan@google.com>
> Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>

Can we please have this Signed-off-by chain cleaned up?
Please see Documentation/process/submitting-patches.rst.

> ---
>  MAINTAINERS                                |   8 +
>  arch/um/lkl/.gitignore                     |   2 +
>  arch/um/lkl/Kconfig                        |  95 ++++++
>  arch/um/lkl/Kconfig.debug                  |   0
>  arch/um/lkl/configs/lkl_defconfig          |  91 ++++++
>  arch/um/lkl/include/asm/Kbuild             |  80 +++++
>  arch/um/lkl/include/asm/bitsperlong.h      |  11 +
>  arch/um/lkl/include/asm/byteorder.h        |   7 +
>  arch/um/lkl/include/asm/cpu.h              |  14 +
>  arch/um/lkl/include/asm/elf.h              |  15 +
>  arch/um/lkl/include/asm/mutex.h            |   7 +
>  arch/um/lkl/include/asm/processor.h        |  60 ++++
>  arch/um/lkl/include/asm/ptrace.h           |  25 ++
>  arch/um/lkl/include/asm/sched.h            |  23 ++
>  arch/um/lkl/include/asm/syscalls.h         |  18 ++
>  arch/um/lkl/include/asm/syscalls_32.h      |  43 +++
>  arch/um/lkl/include/asm/tlb.h              |  12 +
>  arch/um/lkl/include/asm/uaccess.h          |  64 ++++
>  arch/um/lkl/include/asm/unistd_32.h        |  31 ++
>  arch/um/lkl/include/asm/vmlinux.lds.h      |  14 +
>  arch/um/lkl/include/asm/xor.h              |   9 +
>  arch/um/lkl/include/uapi/asm/Kbuild        |   9 +
>  arch/um/lkl/include/uapi/asm/bitsperlong.h |  13 +
>  arch/um/lkl/include/uapi/asm/byteorder.h   |  11 +
>  arch/um/lkl/include/uapi/asm/siginfo.h     |  11 +
>  arch/um/lkl/include/uapi/asm/swab.h        |  11 +
>  arch/um/lkl/include/uapi/asm/syscalls.h    | 348 +++++++++++++++++++++

I think this is the first big thing that needs unification.

In UML we try hard to re-use headers from x86.
We also have some headers in arch/x86/um/.

LKL should do the same, or at least try hard to avoid duplication.

-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 01/37] asm-generic: atomic64: allow using generic atomic64 on 64bit platforms
  2019-11-08  5:02     ` Hajime Tazaki
@ 2019-11-25 22:02       ` Richard Weinberger
  -1 siblings, 0 replies; 206+ messages in thread
From: Richard Weinberger @ 2019-11-25 22:02 UTC (permalink / raw)
  To: Hajime Tazaki
  Cc: linux-um, Octavian Purdila, linux-kernel-library, Linux-Arch,
	Akira Moroo

On Fri, Nov 8, 2019 at 6:03 AM Hajime Tazaki <thehajime@gmail.com> wrote:
>
> From: Octavian Purdila <tavi.purdila@gmail.com>
>
> With CONFIG_64BIT enabled, the generic atomic64 ops selected by
> CONFIG_GENERIC_ATOMIC64 do not compile because of a type conflict with
> the atomic64_t defined in linux/types.h.
>
> This commit fixes the issue and allows using the generic atomic64 ops.

Hmm, why is this specific to LKL?
This needs review from core developers.

> Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
> ---
>  include/asm-generic/atomic64.h | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/include/asm-generic/atomic64.h b/include/asm-generic/atomic64.h
> index 370f01d4450f..9b15847baae5 100644
> --- a/include/asm-generic/atomic64.h
> +++ b/include/asm-generic/atomic64.h
> @@ -9,9 +9,11 @@
>  #define _ASM_GENERIC_ATOMIC64_H
>  #include <linux/types.h>
>
> +#ifndef CONFIG_64BIT
>  typedef struct {
>         s64 counter;
>  } atomic64_t;
> +#endif
>
>  #define ATOMIC64_INIT(i)       { (i) }
>
> --
> 2.20.1 (Apple Git-117)
>
>
> _______________________________________________
> linux-um mailing list
> linux-um@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-um



-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 02/37] arch: add __SYSCALL_DEFINE_ARCH
  2019-11-08  5:02     ` Hajime Tazaki
@ 2019-11-25 22:02       ` Richard Weinberger
  -1 siblings, 0 replies; 206+ messages in thread
From: Richard Weinberger @ 2019-11-25 22:02 UTC (permalink / raw)
  To: Hajime Tazaki
  Cc: linux-um, Octavian Purdila, linux-kernel-library, Linux-Arch,
	Akira Moroo

On Fri, Nov 8, 2019 at 6:03 AM Hajime Tazaki <thehajime@gmail.com> wrote:
>
> From: Octavian Purdila <tavi.purdila@gmail.com>
>
> This allows the architecture code to process the system call
> definitions. It is used by LKL to create strongly typed function
> definitions for system calls.
>
> Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
> ---
>  include/linux/syscalls.h | 6 ++++++

Same here, core developers need to agree on this.

>  1 file changed, 6 insertions(+)
>
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index 88145da7d140..77e52fe19923 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -203,9 +203,14 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
>  }
>  #endif
>
> +#ifndef __SYSCALL_DEFINE_ARCH
> +#define __SYSCALL_DEFINE_ARCH(x, sname, ...)
> +#endif
> +
>  #ifndef SYSCALL_DEFINE0
>  #define SYSCALL_DEFINE0(sname)                                 \
>         SYSCALL_METADATA(_##sname, 0);                          \
> +       __SYSCALL_DEFINE_ARCH(0, _##sname);                     \
>         asmlinkage long sys_##sname(void);                      \
>         ALLOW_ERROR_INJECTION(sys_##sname, ERRNO);              \
>         asmlinkage long sys_##sname(void)
> @@ -222,6 +227,7 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
>
>  #define SYSCALL_DEFINEx(x, sname, ...)                         \
>         SYSCALL_METADATA(sname, x, __VA_ARGS__)                 \
> +       __SYSCALL_DEFINE_ARCH(x, sname, __VA_ARGS__)            \
>         __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
>
>  #define __PROTECT(...) asmlinkage_protect(__VA_ARGS__)
> --
> 2.20.1 (Apple Git-117)
>
>
> _______________________________________________
> linux-um mailing list
> linux-um@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-um
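
The hook pattern in the quoted diff (a macro that defaults to nothing
unless the architecture defines it first) can be sketched in plain C as
follows. This is a reduced illustration under stated assumptions, not
the LKL implementation: all DEMO_*/demo_* names are hypothetical, and
only a one-argument variant is shown.

```c
#include <assert.h>

enum { NR_twice, NR_negate };

/* Untyped entry point, defined at the bottom of this sketch. */
long demo_syscall(int nr, long arg);

/*
 * "Architecture" override of the hook: emit a typed wrapper for every
 * system call definition that passes through DEMO_SYSCALL_DEFINE1().
 */
#define DEMO_DEFINE_ARCH(name, type, arg)			\
	static inline long demo_sys_##name(type arg)		\
	{							\
		return demo_syscall(NR_##name, (long)(arg));	\
	}

/* Generic code: the hook expands to nothing unless it was overridden. */
#ifndef DEMO_DEFINE_ARCH
#define DEMO_DEFINE_ARCH(name, type, arg)
#endif

#define DEMO_SYSCALL_DEFINE1(name, type, arg)	\
	DEMO_DEFINE_ARCH(name, type, arg)	\
	long sys_##name(type arg)

/* Two toy "system calls" written against the generic macro. */
DEMO_SYSCALL_DEFINE1(twice, int, x)
{
	return 2L * x;
}

DEMO_SYSCALL_DEFINE1(negate, long, x)
{
	return -x;
}

/* Dispatch by number; returns -1 for numbers this sketch does not know. */
long demo_syscall(int nr, long arg)
{
	assert(nr >= 0); /* negative numbers are a caller bug here */
	switch (nr) {
	case NR_twice:
		return sys_twice((int)arg);
	case NR_negate:
		return sys_negate(arg);
	default:
		return -1;
	}
}
```

A caller can then use demo_sys_twice(21) and get compile-time checking
of the argument type, instead of passing untyped longs to the
dispatcher directly.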



-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 17/37] lkl tools: host lib: virtio devices
  2019-11-08  5:02     ` Hajime Tazaki
@ 2019-11-25 22:07       ` Richard Weinberger
  -1 siblings, 0 replies; 206+ messages in thread
From: Richard Weinberger @ 2019-11-25 22:07 UTC (permalink / raw)
  To: Hajime Tazaki
  Cc: linux-um, Linux-Arch, Conrad Meyer, Octavian Purdila,
	Akira Moroo, Yuan Liu, Patrick Collins, linux-kernel-library,
	Michael Zimmermann

On Fri, Nov 8, 2019 at 6:03 AM Hajime Tazaki <thehajime@gmail.com> wrote:
>
> From: Octavian Purdila <tavi.purdila@gmail.com>
>
> Add helpers for implementing host virtio devices. It uses the memory
> mapped I/O helpers to interact with the Linux MMIO virtio transport
> driver and offers support to setup and add a new virtio device,
> dispatch requests from the incoming queues as well as support for
> completing requests.
>
> All added virtio devices are stored in lkl_virtio_devs as strings, per
> the Linux MMIO virtio transport driver command line specification.

Did you check out arch/um/drivers/virtio_uml.c?
Why is this driver needed?

Virtio support is rather new in UML; we definitely need a common
code base for LKL and UML with regard to virtio.
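
For reference, the MMIO virtio transport accepts device descriptions on
the kernel command line in the form
virtio_mmio.device=<size>@<baseaddr>:<irq>[:<id>]. Below is a minimal
sketch of how a host library might build one such string. The helper
name and the example values are hypothetical and not taken from the
patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one device entry in the virtio_mmio.device= syntax:
 *   <size>@<baseaddr>:<irq>
 * (the optional :<id> suffix is omitted here).
 * Returns the number of characters written, or -1 if buf is too small.
 * Hypothetical helper for illustration only.
 */
static int format_virtio_mmio_dev(char *buf, size_t len,
                                  unsigned long size,
                                  unsigned long base, int irq)
{
    int n;

    assert(buf != NULL && len > 0);
    n = snprintf(buf, len, "virtio_mmio.device=%lu@0x%lx:%d",
                 size, base, irq);
    if (n < 0 || (size_t)n >= len)
        return -1;  /* output error or truncated string */
    return n;
}
```

A caller would append one such entry per device to the command line it
passes to the kernel at start-up.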

-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 05/37] lkl: memory handling
  2019-11-08  5:02     ` Hajime Tazaki
@ 2019-11-25 22:10       ` Richard Weinberger
  -1 siblings, 0 replies; 206+ messages in thread
From: Richard Weinberger @ 2019-11-25 22:10 UTC (permalink / raw)
  To: Hajime Tazaki
  Cc: linux-um, Linux-Arch, Levente Kurusa, Octavian Purdila,
	Akira Moroo, Yuan Liu, linux-kernel-library

On Fri, Nov 8, 2019 at 6:03 AM Hajime Tazaki <thehajime@gmail.com> wrote:
>
> From: Octavian Purdila <tavi.purdila@gmail.com>
>
> LKL is a non MMU architecture and hence there is not much work left to
> do other than initializing the boot allocator and providing the page
> and page table definitions.
>
> The backstore memory is allocated via a host operation and the memory
> size to be used is specified when the kernel is started, in the
> lkl_start_kernel call.
>
> Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
> Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
> Signed-off-by: Levente Kurusa <levex@linux.com>
> Signed-off-by: Yuan Liu <liuyuan@google.com>
> Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
> ---
>  arch/um/lkl/include/asm/page.h          | 14 ++++++
>  arch/um/lkl/include/asm/pgtable.h       | 62 +++++++++++++++++++++++
>  arch/um/lkl/include/uapi/asm/host_ops.h |  5 ++
>  arch/um/lkl/mm/bootmem.c                | 66 +++++++++++++++++++++++++

This is also something which needs unification with UML.
UML in NOMMU mode would be LKL then...

--
Thanks,
//richard

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 07/37] lkl: interrupt support
  2019-11-08  5:02     ` Hajime Tazaki
@ 2019-11-25 22:13       ` Richard Weinberger
  -1 siblings, 0 replies; 206+ messages in thread
From: Richard Weinberger @ 2019-11-25 22:13 UTC (permalink / raw)
  To: Hajime Tazaki
  Cc: linux-um, Linux-Arch, Octavian Purdila, Akira Moroo,
	linux-kernel-library, Michael Zimmermann

On Fri, Nov 8, 2019 at 6:03 AM Hajime Tazaki <thehajime@gmail.com> wrote:
>
> From: Octavian Purdila <tavi.purdila@gmail.com>
>
> Add APIs that allows the host to reserve and free and interrupt number
> and also to trigger an interrupt.
>
> The trigger operation will simply store the interrupt data in
> queue. The interrupt handler is run later, at the first opportunity it
> has to avoid races with any kernel threads.
>
> Currently, interrupts are run on the first interrupt enable operation
> if interrupts are disabled and if we are not already in interrupt
> context.
>
> When triggering an interrupt, it uses GCC's built-in functions for
> atomic memory access to synchronize and simple boolean flags.
>
> Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
> Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
> Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
> ---
>  arch/um/lkl/include/asm/irq.h             |  13 ++
>  arch/um/lkl/include/uapi/asm/irq.h        |  36 ++++
>  arch/um/lkl/include/uapi/asm/sigcontext.h |  16 ++
>  arch/um/lkl/kernel/irq.c                  | 193 ++++++++++++++++++++++

Like I said before, this is also something to unify with UML.
I'm aware that this is easily said, but we cannot have too much duplication.

Feel free to ask if UML internals give you a headache. :-)
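
To make the scheme in the quoted commit message concrete: the trigger
only records the interrupt, and handlers run later when interrupts get
enabled. A minimal userspace sketch follows. It is not the LKL code;
all names are hypothetical, and it uses a pending bitmask (which
coalesces repeated triggers) instead of a queue carrying interrupt
data.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_IRQS 32

static unsigned long pending;   /* bit n set => IRQ n was triggered */
static bool irqs_enabled;       /* emulated interrupt-enable flag */
static bool in_irq;             /* true while handlers are running */
static int handled[NR_IRQS];    /* demo "handler": counts deliveries */

/* Host side: record the interrupt atomically; never run a handler here. */
static void trigger_irq(int irq)
{
    assert(irq >= 0 && irq < NR_IRQS);
    __sync_fetch_and_or(&pending, 1UL << irq);
}

/* Deliver everything recorded so far, unless already in IRQ context. */
static void run_pending_irqs(void)
{
    unsigned long batch;
    int irq;

    if (in_irq)
        return;
    in_irq = true;
    /* Atomically grab and clear the pending set; loop until it is empty. */
    while ((batch = __sync_fetch_and_and(&pending, 0UL)) != 0) {
        for (irq = 0; irq < NR_IRQS; irq++)
            if (batch & (1UL << irq))
                handled[irq]++;
    }
    in_irq = false;
}

static void irq_disable(void)
{
    irqs_enabled = false;
}

/* Pending interrupts are delivered on the enable operation. */
static void irq_enable(void)
{
    irqs_enabled = true;
    run_pending_irqs();
}
```

In this sketch a trigger that arrives while interrupts are already
enabled stays pending until the next irq_enable() call; a real
implementation would have to decide when to deliver it.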

--
Thanks,
//richard

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 17/37] lkl tools: host lib: virtio devices
  2019-11-25 22:07       ` Richard Weinberger
@ 2019-11-26  8:43         ` Johannes Berg
  -1 siblings, 0 replies; 206+ messages in thread
From: Johannes Berg @ 2019-11-26  8:43 UTC (permalink / raw)
  To: Richard Weinberger, Hajime Tazaki
  Cc: Linux-Arch, Conrad Meyer, Octavian Purdila, linux-um,
	Akira Moroo, linux-kernel-library, Patrick Collins,
	Michael Zimmermann, Yuan Liu

On Mon, 2019-11-25 at 23:07 +0100, Richard Weinberger wrote:
> On Fri, Nov 8, 2019 at 6:03 AM Hajime Tazaki <thehajime@gmail.com> wrote:
> > From: Octavian Purdila <tavi.purdila@gmail.com>
> > 
> > Add helpers for implementing host virtio devices. It uses the memory
> > mapped I/O helpers to interact with the Linux MMIO virtio transport
> > driver and offers support to setup and add a new virtio device,
> > dispatch requests from the incoming queues as well as support for
> > completing requests.
> > 
> > All added virtio devices are stored in lkl_virtio_devs as strings, per
> > the Linux MMIO virtio transport driver command line specification.
> 
> Did you checkout arch/um/drivers/virtio_uml.c?
> Why is this driver needed?

This isn't really a driver, this is virtio *device-side* code. Our
virtio_uml is *guest-side* code, and only speaks vhost-user.

I'm not sure how MMIO devices could possibly work though, does LKL
intercept MMIO somehow?

johannes

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 17/37] lkl tools: host lib: virtio devices
  2019-11-26  8:43         ` Johannes Berg
@ 2019-11-26  8:50           ` Richard Weinberger
  -1 siblings, 0 replies; 206+ messages in thread
From: Richard Weinberger @ 2019-11-26  8:50 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Hajime Tazaki, linux-arch, cem, tavi purdila, linux-um,
	retrage01, linux-kernel-library, pscollins, sigmaepsilon92,
	liuyuan

----- Original Message -----
> From: "Johannes Berg" <johannes@sipsolutions.net>
> To: "Richard Weinberger" <richard.weinberger@gmail.com>, "Hajime Tazaki" <thehajime@gmail.com>
> CC: "linux-arch" <linux-arch@vger.kernel.org>, "cem" <cem@freebsd.org>, "tavi purdila" <tavi.purdila@gmail.com>,
> "linux-um" <linux-um@lists.infradead.org>, "retrage01" <retrage01@gmail.com>, linux-kernel-library@freelists.org,
> "pscollins" <pscollins@google.com>, "sigmaepsilon92" <sigmaepsilon92@gmail.com>, "liuyuan" <liuyuan@google.com>
> Sent: Tuesday, 26 November 2019 09:43:36
> Subject: Re: [RFC v2 17/37] lkl tools: host lib: virtio devices

> On Mon, 2019-11-25 at 23:07 +0100, Richard Weinberger wrote:
>> On Fri, Nov 8, 2019 at 6:03 AM Hajime Tazaki <thehajime@gmail.com> wrote:
>> > From: Octavian Purdila <tavi.purdila@gmail.com>
>> > 
>> > Add helpers for implementing host virtio devices. It uses the memory
>> > mapped I/O helpers to interact with the Linux MMIO virtio transport
>> > driver and offers support to setup and add a new virtio device,
>> > dispatch requests from the incoming queues as well as support for
>> > completing requests.
>> > 
>> > All added virtio devices are stored in lkl_virtio_devs as strings, per
>> > the Linux MMIO virtio transport driver command line specification.
>> 
>> Did you checkout arch/um/drivers/virtio_uml.c?
>> Why is this driver needed?
> 
> This isn't really a driver, this is virtio *device-side* code. Our
> virtio_uml is *guest-side* code, and only speaks vhost-user.

Sorry, bad wording on my side. By "driver" I meant a kernel component.
 
> I'm not sure how MMIO devices could possibly work though, does LKL
> intercept MMIO somehow?

My point is that UML and LKL should try to use the same concept/code
regarding virtio. At the end of the day, both use virtual devices which use
facilities from the host.
If this is really not possible, it needs a good explanation.

Thanks,
//richard

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 17/37] lkl tools: host lib: virtio devices
  2019-11-26  8:50           ` Richard Weinberger
@ 2019-11-26  8:52             ` Johannes Berg
  -1 siblings, 0 replies; 206+ messages in thread
From: Johannes Berg @ 2019-11-26  8:52 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: Hajime Tazaki, linux-arch, cem, tavi purdila, linux-um,
	retrage01, linux-kernel-library, pscollins, sigmaepsilon92,
	liuyuan

On Tue, 2019-11-26 at 09:50 +0100, Richard Weinberger wrote:

> > This isn't really a driver, this is virtio *device-side* code. Our
> > virtio_uml is *guest-side* code, and only speaks vhost-user.
> 
> Sorry, bad wording from my side. I meant with "driver" a kernel component.

Yeah, though arguably this isn't a kernel component, this is a
hypervisor component... We just link them both into the same binary due
to the way UML/LKL work.
 
> > I'm not sure how MMIO devices could possibly work though, does LKL
> > intercept MMIO somehow?
> 
> My point is that UML and LKL should try to do use the same concept/code
> regarding virtio. At the end of day both use virtual devices which use
> facilities from the host.
> If this is really not possible it needs a good explanation.

I think it isn't possible, unless you use vhost-user over a unix domain
socket internally to talk between the kernel (virtio_uml) and hypervisor
(device) components.

In virtio_uml, the device implementation is assumed to be a separate
process with a vhost-user connection. Here in LKL, the virtio device is
part of the "hypervisor", i.e. in the same process.
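
To illustrate the same-process case: a device model that lives in the
same binary could be reached through plain function calls instead of a
socket, provided the MMIO accessors are ordinary functions rather than
trapping loads and stores. A minimal sketch under that assumption
follows. All names are hypothetical and this is not the LKL
implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Callbacks a device model supplies for its register window. */
struct mmio_ops {
    uint32_t (*read)(void *dev, size_t offset);
    void (*write)(void *dev, size_t offset, uint32_t value);
};

struct mmio_region {
    uintptr_t base;             /* first emulated address */
    size_t size;                /* length of the window in bytes */
    const struct mmio_ops *ops; /* device callbacks */
    void *dev;                  /* opaque device state */
};

#define MAX_REGIONS 4
static struct mmio_region regions[MAX_REGIONS];
static int nr_regions;

/* Register a device window; returns 0 on success, -1 if the table is full. */
static int mmio_register(uintptr_t base, size_t size,
                         const struct mmio_ops *ops, void *dev)
{
    if (nr_regions == MAX_REGIONS)
        return -1;
    regions[nr_regions++] = (struct mmio_region){ base, size, ops, dev };
    return 0;
}

/* Look up the region that contains addr, or NULL if none does. */
static struct mmio_region *mmio_find(uintptr_t addr)
{
    for (int i = 0; i < nr_regions; i++)
        if (addr >= regions[i].base &&
            addr - regions[i].base < regions[i].size)
            return &regions[i];
    return NULL;
}

/* Accessors the "kernel side" would call in place of a real load/store. */
static uint32_t mmio_readl(uintptr_t addr)
{
    struct mmio_region *r = mmio_find(addr);

    assert(r != NULL);
    return r->ops->read(r->dev, addr - r->base);
}

static void mmio_writel(uintptr_t addr, uint32_t value)
{
    struct mmio_region *r = mmio_find(addr);

    assert(r != NULL);
    r->ops->write(r->dev, addr - r->base, value);
}

/* Toy device: one 32-bit scratch register, whatever the offset. */
struct toy_dev {
    uint32_t reg;
};

static uint32_t toy_read(void *dev, size_t offset)
{
    (void)offset;
    return ((struct toy_dev *)dev)->reg;
}

static void toy_write(void *dev, size_t offset, uint32_t value)
{
    (void)offset;
    ((struct toy_dev *)dev)->reg = value;
}

static const struct mmio_ops toy_ops = { toy_read, toy_write };
```

With virtio_uml, by contrast, the equivalent of toy_read() and
toy_write() would live in another process behind the vhost-user
connection.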

johannes

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 17/37] lkl tools: host lib: virtio devices
  2019-11-26  8:52             ` Johannes Berg
@ 2019-11-26 10:09               ` Richard Weinberger
  -1 siblings, 0 replies; 206+ messages in thread
From: Richard Weinberger @ 2019-11-26 10:09 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Hajime Tazaki, linux-arch, cem, tavi purdila, linux-um,
	retrage01, linux-kernel-library, pscollins, sigmaepsilon92,
	liuyuan

----- Original Message -----
>> My point is that UML and LKL should try to do use the same concept/code
>> regarding virtio. At the end of day both use virtual devices which use
>> facilities from the host.
>> If this is really not possible it needs a good explanation.
> 
> I think it isn't possible, unless you use vhost-user over a unix domain
> socket internally to talk between the kernel (virtio_uml) and hypervisor
> (device) components.
> 
> In virtio_uml, the device implementation is assumed to be a separate
> process with a vhost-user connection. Here in LKL, the virtio device is
> part of the "hypervisor", i.e. in the same process.

Exactly. Currently UML and LKL solve the same things differently, but do we need to?

If we fail to agree at such a high level, it might make sense to reevaluate the option
of not merging UML and LKL at all.
But this is beyond my power to decide and something I'd like to avoid.

Thanks,
//richard

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 17/37] lkl tools: host lib: virtio devices
  2019-11-26 10:09               ` Richard Weinberger
@ 2019-11-26 10:16                 ` Johannes Berg
  -1 siblings, 0 replies; 206+ messages in thread
From: Johannes Berg @ 2019-11-26 10:16 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: Hajime Tazaki, linux-arch, cem, tavi purdila, linux-um,
	retrage01, linux-kernel-library, pscollins, sigmaepsilon92,
	liuyuan

On Tue, 2019-11-26 at 11:09 +0100, Richard Weinberger wrote:
> ----- Original Message -----
> > > My point is that UML and LKL should try to do use the same concept/code
> > > regarding virtio. At the end of day both use virtual devices which use
> > > facilities from the host.
> > > If this is really not possible it needs a good explanation.
> > 
> > I think it isn't possible, unless you use vhost-user over a unix domain
> > socket internally to talk between the kernel (virtio_uml) and hypervisor
> > (device) components.
> > 
> > In virtio_uml, the device implementation is assumed to be a separate
> > process with a vhost-user connection. Here in LKL, the virtio device is
> > part of the "hypervisor", i.e. in the same process.
> 
> Exactly, currently UML and LKL solve same things differently, but do we need to?

It's not the same thing though :-)

UML right now doesn't have or support virtio devices in the built-in
hypervisor; what we wanted to use virtio for was explicitly the
vhost-user devices.

LKL clearly wants to have device implementations in the hypervisor,
perhaps for networking or console etc.? That _might_ be useful since it
makes the device implementation more general, unlike the UML approach
where all devices come with a kernel- and user-side and are special
drivers in the kernel, vs. general virtio drivers.

Now, arguably, since UML has all these already, a combined UML/LKL
doesn't actually *need* any virtio devices, since all (or at least most)
of the things that could be covered by virtio today are already covered
by UML devices (block, net, console, random).

I'd probably say then that this can be removed from an initial "minimum
viable product" of LKL, since once merged with UML you get the devices
from that. Later, we could decide that UML devices actually are better
done as virtio, and support something like this.

johannes

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 17/37] lkl tools: host lib: virtio devices
@ 2019-11-26 10:42                   ` Octavian Purdila
  0 siblings, 0 replies; 206+ messages in thread
From: Octavian Purdila @ 2019-11-26 10:42 UTC (permalink / raw)
  To: Johannes Berg
  Cc: linux-arch, cem, Richard Weinberger, linux-um, retrage01,
	liuyuan, pscollins, linux-kernel-library, sigmaepsilon92,
	Hajime Tazaki

On Tue, Nov 26, 2019 at 12:16 PM Johannes Berg
<johannes@sipsolutions.net> wrote:
>
> On Tue, 2019-11-26 at 11:09 +0100, Richard Weinberger wrote:
> > ----- Original Message -----
> > > > My point is that UML and LKL should try to do use the same concept/code
> > > > regarding virtio. At the end of day both use virtual devices which use
> > > > facilities from the host.
> > > > If this is really not possible it needs a good explanation.
> > >
> > > I think it isn't possible, unless you use vhost-user over a unix domain
> > > socket internally to talk between the kernel (virtio_uml) and hypervisor
> > > (device) components.
> > >
> > > In virtio_uml, the device implementation is assumed to be a separate
> > > process with a vhost-user connection. Here in LKL, the virtio device is
> > > part of the "hypervisor", i.e. in the same process.
> >
> > Exactly, currently UML and LKL solve same things differently, but do we need to?
>
> It's not the same thing though :-)
>
> UML right now doesn't have or support virtio devices in the built-in
> hypervisor, what we wanted to use virtio for was explicitly for the
> vhost-user devices.
>
> LKL clearly wants to have device implementations in the hypervisor,
> perhaps for networking or console etc.? That _might_ be useful since it
> makes the device implementation more general, unlike the UML approach
> where all devices come with a kernel- and user-side and are special
> drivers in the kernel, vs. general virtio drivers.
>

That is correct. Initially we used the same UML model, with dedicated
drivers for LKL, and later switched to using the built-in virtio
drivers (so far for network and block devices).

> Now, arguably, since UML has all these already a combined UML/LKL
> doesn't actually *need* any virtio devices, since all (or at least most)
> of the things that could be covered by virtio today are already covered
> by UML devices (block, net, console, random).
>
> I'd probably say then that this can be removed from an initial "minimum
> viable product" of LKL, since once merged with UML you get the devices
> from that. Later, we could decide that UML devices actually are better
> done as virtio, and support something like this.
>

I agree, I think it make sense to drop these since the problem of
dedicated vs generic / virtio drivers are orthogonal with regard to
UML and LKL unification and can later be worked on.

_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 17/37] lkl tools: host lib: virtio devices
  2019-11-26 10:42                   ` Octavian Purdila
@ 2019-11-26 10:49                     ` Anton Ivanov
  -1 siblings, 0 replies; 206+ messages in thread
From: Anton Ivanov @ 2019-11-26 10:49 UTC (permalink / raw)
  To: Octavian Purdila, Johannes Berg
  Cc: linux-arch, cem, Richard Weinberger, linux-um, retrage01,
	liuyuan, pscollins, linux-kernel-library, sigmaepsilon92,
	Hajime Tazaki



On 26/11/2019 10:42, Octavian Purdila wrote:
> On Tue, Nov 26, 2019 at 12:16 PM Johannes Berg
> <johannes@sipsolutions.net> wrote:
>>
>> On Tue, 2019-11-26 at 11:09 +0100, Richard Weinberger wrote:
>>> ----- Original Message -----
>>>>> My point is that UML and LKL should try to do use the same concept/code
>>>>> regarding virtio. At the end of day both use virtual devices which use
>>>>> facilities from the host.
>>>>> If this is really not possible it needs a good explanation.
>>>>
>>>> I think it isn't possible, unless you use vhost-user over a unix domain
>>>> socket internally to talk between the kernel (virtio_uml) and hypervisor
>>>> (device) components.
>>>>
>>>> In virtio_uml, the device implementation is assumed to be a separate
>>>> process with a vhost-user connection. Here in LKL, the virtio device is
>>>> part of the "hypervisor", i.e. in the same process.
>>>
>>> Exactly, currently UML and LKL solve same things differently, but do we need to?
>>
>> It's not the same thing though :-)
>>
>> UML right now doesn't have or support virtio devices in the built-in
>> hypervisor, what we wanted to use virtio for was explicitly for the
>> vhost-user devices.
>>
>> LKL clearly wants to have device implementations in the hypervisor,
>> perhaps for networking or console etc.? That _might_ be useful since it
>> makes the device implementation more general, unlike the UML approach
>> where all devices come with a kernel- and user-side and are special
>> drivers in the kernel, vs. general virtio drivers.
>>
> 
> That is correct. Initially we used the same UML model, with dedicated
> drivers for LKL, and later switched to using the built-in virtio
> drivers (so far for network and block devices).
> 
>> Now, arguably, since UML has all these already a combined UML/LKL
>> doesn't actually *need* any virtio devices, since all (or at least most)
>> of the things that could be covered by virtio today are already covered
>> by UML devices (block, net, console, random).
>>
>> I'd probably say then that this can be removed from an initial "minimum
>> viable product" of LKL, since once merged with UML you get the devices
>> from that. Later, we could decide that UML devices actually are better
>> done as virtio, and support something like this.
>>
> 
> I agree, I think it makes sense to drop these, since the question of
> dedicated vs. generic/virtio drivers is orthogonal to UML and LKL
> unification and can be worked on later.

This brings us back to the interrupt controller as noted by Richard earlier.

UML devices depend heavily on file I/O as the IRQ trigger paradigm, and
they need an interrupt controller that has an I/O event feed into it. I
did not see that in LKL on first read.

So as a first step we should get it to work with the existing UML IRQ
controller and whatever incremental patches are needed on top of that.
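
To illustrate the "file I/O as IRQ trigger" model, here is a minimal
userspace sketch. It is not the UML code: the names irq_fd_entry,
irq_register_fd, irq_poll_once and demo_handler are hypothetical, and
poll(2) is used only for brevity. The point is just that readiness of a
host file descriptor is what gets turned into an interrupt delivery.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_IRQ_FDS 8

typedef void (*irq_handler_t)(int irq, int fd);

/* One host file descriptor bound to one (hypothetical) IRQ number. */
struct irq_fd_entry {
	int fd;
	int irq;
	irq_handler_t handler;
};

static struct irq_fd_entry irq_fds[MAX_IRQ_FDS];
static size_t irq_fd_count;
static int last_irq = -1;

/* Bind fd readiness to an IRQ line; returns 0 on success, -1 if full. */
static int irq_register_fd(int fd, int irq, irq_handler_t handler)
{
	assert(handler != NULL);
	if (irq_fd_count >= MAX_IRQ_FDS)
		return -1;
	irq_fds[irq_fd_count++] = (struct irq_fd_entry){ fd, irq, handler };
	return 0;
}

/*
 * Poll every registered descriptor once and run the handler of each
 * readable one. Returns the number of IRQs delivered, or -1 on error.
 */
static int irq_poll_once(int timeout_ms)
{
	struct pollfd pfds[MAX_IRQ_FDS];
	int delivered = 0;
	size_t i;

	for (i = 0; i < irq_fd_count; i++) {
		pfds[i].fd = irq_fds[i].fd;
		pfds[i].events = POLLIN;
		pfds[i].revents = 0;
	}

	if (poll(pfds, irq_fd_count, timeout_ms) < 0)
		return -1;

	for (i = 0; i < irq_fd_count; i++) {
		if (pfds[i].revents & POLLIN) {
			irq_fds[i].handler(irq_fds[i].irq, irq_fds[i].fd);
			delivered++;
		}
	}
	return delivered;
}

/* Example handler: consume one byte so the fd stops being readable. */
static void demo_handler(int irq, int fd)
{
	char c;

	if (read(fd, &c, 1) == 1)
		last_irq = irq;
}
```

A device backend would register its descriptor once and the "controller"
loop would call irq_poll_once() repeatedly; a device model that lives in
the same process and never touches an fd has nothing to feed into such a
loop, which is the gap noted above.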

-- 
Anton R. Ivanov
https://www.kot-begemot.co.uk/

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 03/37] lkl: architecture skeleton for Linux kernel library
  2019-11-25 22:00       ` Richard Weinberger
@ 2019-11-26 11:42         ` Octavian Purdila
  -1 siblings, 0 replies; 206+ messages in thread
From: Octavian Purdila @ 2019-11-26 11:42 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: Hajime Tazaki, linux-um, Linux-Arch, Patrick Collins,
	Levente Kurusa, Matthieu Coudron, Conrad Meyer, Jens Staal,
	Motomu Utsumi, Lai Jiangshan, Akira Moroo, Petros Angelatos,
	Yuan Liu, Xiao Jia, Mark Stillwell, linux-kernel-library,
	Pierre-Hugues Husson, Michael Zimmermann, Luca Dariz,
	Edison M . Castro

On Tue, Nov 26, 2019 at 12:00 AM Richard Weinberger
<richard.weinberger@gmail.com> wrote:
>
> On Fri, Nov 8, 2019 at 6:03 AM Hajime Tazaki <thehajime@gmail.com> wrote:
> >
> > From: Octavian Purdila <tavi.purdila@gmail.com>
> >
> > Adds the LKL Kconfig, vmlinux linker script, basic architecture
> > headers and miscellaneous basic functions or stubs such as
> > dump_stack(), show_regs() and cpuinfo proc ops.
> >
> > The headers we introduce in this patch are simple wrappers to the
> > asm-generic headers or stubs for things we don't support, such as
> > ptrace, DMA, signals, ELF handling and low level processor operations.
> >
> > The kernel configuration is automatically updated to reflect the
> > endianness of the host, 64bit support or the output format for
> > vmlinux's linker script. We do this by looking at the ld's default
> > output format.
> >
> > Signed-off-by: Andreas Abel <aabel@google.com>
> > Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
> > Signed-off-by: Edison M. Castro <edisonmcastro@hotmail.com>
> > Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
> > Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
> > Signed-off-by: Jens Staal <staal1978@gmail.com>
> > Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
> > Signed-off-by: Levente Kurusa <levex@linux.com>
> > Signed-off-by: Luca Dariz <luca.dariz@gmail.com>
> > Signed-off-by: Mark Stillwell <mark@stillwell.me>
> > Signed-off-by: Matthieu Coudron <mattator@gmail.com>
> > Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
> > Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
> > Signed-off-by: Patrick Collins <pscollins@google.com>
> > Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
> > Signed-off-by: Pierre-Hugues Husson <phh@phh.me>
> > Signed-off-by: Xiao Jia <xiaoj@google.com>
> > Signed-off-by: Yuan Liu <liuyuan@google.com>
> > Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
>
> Can we please have this chain cleaned up?
> Please see process/submitting-patches.rst.
>
> > ---
> >  MAINTAINERS                                |   8 +
> >  arch/um/lkl/.gitignore                     |   2 +
> >  arch/um/lkl/Kconfig                        |  95 ++++++
> >  arch/um/lkl/Kconfig.debug                  |   0
> >  arch/um/lkl/configs/lkl_defconfig          |  91 ++++++
> >  arch/um/lkl/include/asm/Kbuild             |  80 +++++
> >  arch/um/lkl/include/asm/bitsperlong.h      |  11 +
> >  arch/um/lkl/include/asm/byteorder.h        |   7 +
> >  arch/um/lkl/include/asm/cpu.h              |  14 +
> >  arch/um/lkl/include/asm/elf.h              |  15 +
> >  arch/um/lkl/include/asm/mutex.h            |   7 +
> >  arch/um/lkl/include/asm/processor.h        |  60 ++++
> >  arch/um/lkl/include/asm/ptrace.h           |  25 ++
> >  arch/um/lkl/include/asm/sched.h            |  23 ++
> >  arch/um/lkl/include/asm/syscalls.h         |  18 ++
> >  arch/um/lkl/include/asm/syscalls_32.h      |  43 +++
> >  arch/um/lkl/include/asm/tlb.h              |  12 +
> >  arch/um/lkl/include/asm/uaccess.h          |  64 ++++
> >  arch/um/lkl/include/asm/unistd_32.h        |  31 ++
> >  arch/um/lkl/include/asm/vmlinux.lds.h      |  14 +
> >  arch/um/lkl/include/asm/xor.h              |   9 +
> >  arch/um/lkl/include/uapi/asm/Kbuild        |   9 +
> >  arch/um/lkl/include/uapi/asm/bitsperlong.h |  13 +
> >  arch/um/lkl/include/uapi/asm/byteorder.h   |  11 +
> >  arch/um/lkl/include/uapi/asm/siginfo.h     |  11 +
> >  arch/um/lkl/include/uapi/asm/swab.h        |  11 +
> >  arch/um/lkl/include/uapi/asm/syscalls.h    | 348 +++++++++++++++++++++
>
> I think this is the first big thing which needs a unification.
>
> In UML we try hard to re-use headers from x86.
> We also have some headers in arch/x86/um/.
>
> LKL should do the same. At least try hard to avoid duplication.
>

In LKL we tried to avoid coupling the kernel build part to a
particular architecture, to make it easier to port (to different
arches, but also to other OSes or special environments [1][2]).
That is the main reason for having two build steps, one for the kernel
proper and one for the host. It is also why the host part was placed
in tools/lkl: to make it clear that it is not part of the kernel proper.

I think this is one of the biggest differences between UML and LKL, and
it would be helpful to get feedback on what people think of a
potential similar split for UML.
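
As a rough illustration of such a split, the kernel side can reach the
host only through a table of callbacks that the host-side build
supplies. The sketch below is hypothetical: struct host_ops,
kernel_start, posix_print and posix_host_ops are invented names for
this example and are not the LKL interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback table: the only way the kernel side reaches the host. */
struct host_ops {
	void (*print)(const char *msg);
};

/* "Kernel proper": built once, with no direct libc or OS calls. */
static const struct host_ops *ops;

static int kernel_start(const struct host_ops *host)
{
	if (host == NULL || host->print == NULL)
		return -1;
	ops = host;
	ops->print("kernel: started\n");
	return 0;
}

/* Host side: a POSIX-flavoured implementation, built as a separate step. */
static char host_log[64];

static void posix_print(const char *msg)
{
	assert(msg != NULL);
	/* Keep a copy of the last message so callers can inspect it. */
	snprintf(host_log, sizeof(host_log), "%s", msg);
	fputs(msg, stdout);
}

static const struct host_ops posix_host_ops = {
	.print = posix_print,
};
```

Porting to another OS or environment would then mean providing a
different table, without rebuilding the kernel-side object.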

[1] https://www.haiku-os.org/blog/lucian/2010-07-08_booting_lkl_inside_haiku/
[2] https://github.com/lkl/lkl-ntk-driver-poc

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 01/37] asm-generic: atomic64: allow using generic atomic64 on 64bit platforms
  2019-11-25 22:02       ` Richard Weinberger
@ 2019-11-26 14:02         ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-26 14:02 UTC (permalink / raw)
  To: richard.weinberger
  Cc: linux-um, tavi.purdila, linux-kernel-library, linux-arch, retrage01


On Tue, 26 Nov 2019 07:02:04 +0900,
Richard Weinberger wrote:
> 
> On Fri, Nov 8, 2019 at 6:03 AM Hajime Tazaki <thehajime@gmail.com> wrote:
> >
> > From: Octavian Purdila <tavi.purdila@gmail.com>
> >
> > With CONFIG_64BIT enabled, atomic64 via CONFIG_GENERIC_ATOMIC64 options
> > are not compiled due to type conflict of atomic64_t defined in
> > linux/type.h.
> >
> > This commit fixes the issue and allow using generic atomic64 ops.
> 
> Hmm, why is this specific to LKL?

Currently, LKL is the only user that selects GENERIC_ATOMIC64
(lib/atomic64.c) in a CONFIG_64BIT environment, so nothing in the
current tree hits this issue.

If you manually add `select GENERIC_ATOMIC64` to UML's Kconfig and build
it on a 64-bit host, the same problem happens.
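
To make the conflict concrete, here is a minimal userspace sketch (not
kernel code). The two typedef blocks are stand-ins for linux/types.h,
which defines atomic64_t when CONFIG_64BIT is set, and for
asm-generic/atomic64.h with the guard from the patch quoted below;
atomic64_read_sketch is a hypothetical helper added only for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the Kconfig symbol; assumed set, as on a 64-bit host. */
#define CONFIG_64BIT 1

typedef int64_t s64;

/* Stand-in for linux/types.h, which defines atomic64_t under CONFIG_64BIT. */
#ifdef CONFIG_64BIT
typedef struct {
	s64 counter;
} atomic64_t;
#endif

/*
 * Stand-in for asm-generic/atomic64.h with the guard from the patch.
 * Drop the #ifndef/#endif pair and the compiler rejects the second
 * typedef: two anonymous structs are distinct, conflicting types.
 */
#ifndef CONFIG_64BIT
typedef struct {
	s64 counter;
} atomic64_t;
#endif

#define ATOMIC64_INIT(i) { (i) }

/* Plain, non-atomic read; enough to show that the type is usable. */
static inline s64 atomic64_read_sketch(const atomic64_t *v)
{
	assert(v != NULL);
	return v->counter;
}
```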

> This need a review from core developers.

I will explicitly Cc the maintainers (ATOMIC INFRASTRUCTURE) from the next
round.

Thanks,

-- Hajime

> > Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
> > ---
> >  include/asm-generic/atomic64.h | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/include/asm-generic/atomic64.h b/include/asm-generic/atomic64.h
> > index 370f01d4450f..9b15847baae5 100644
> > --- a/include/asm-generic/atomic64.h
> > +++ b/include/asm-generic/atomic64.h
> > @@ -9,9 +9,11 @@
> >  #define _ASM_GENERIC_ATOMIC64_H
> >  #include <linux/types.h>
> >
> > +#ifndef CONFIG_64BIT
> >  typedef struct {
> >         s64 counter;
> >  } atomic64_t;
> > +#endif
> >
> >  #define ATOMIC64_INIT(i)       { (i) }

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 03/37] lkl: architecture skeleton for Linux kernel library
  2019-11-25 22:00       ` Richard Weinberger
@ 2019-11-26 14:17         ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-26 14:17 UTC (permalink / raw)
  To: richard.weinberger
  Cc: linux-um, linux-arch, pscollins, levex, mattator, cem,
	tavi.purdila, staal1978, motomuman, jiangshanlai, retrage01,
	petrosagg, liuyuan, xiaoj, mark, linux-kernel-library, phh,
	sigmaepsilon92, luca.dariz, edisonmcastro


On Tue, 26 Nov 2019 07:00:33 +0900,
Richard Weinberger wrote:
> 
(snip)
> >
> > Signed-off-by: Andreas Abel <aabel@google.com>
> > Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
> > Signed-off-by: Edison M. Castro <edisonmcastro@hotmail.com>
> > Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
> > Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
> > Signed-off-by: Jens Staal <staal1978@gmail.com>
> > Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
> > Signed-off-by: Levente Kurusa <levex@linux.com>
> > Signed-off-by: Luca Dariz <luca.dariz@gmail.com>
> > Signed-off-by: Mark Stillwell <mark@stillwell.me>
> > Signed-off-by: Matthieu Coudron <mattator@gmail.com>
> > Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
> > Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
> > Signed-off-by: Patrick Collins <pscollins@google.com>
> > Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
> > Signed-off-by: Pierre-Hugues Husson <phh@phh.me>
> > Signed-off-by: Xiao Jia <xiaoj@google.com>
> > Signed-off-by: Yuan Liu <liuyuan@google.com>
> > Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
> 
> Can we please have this chain cleaned up?
> Please see process/submitting-patches.rst.

By "this chain", do you mean the long list of Signed-off-by lines, or
something else?

We were trying to put all of the contributors on the list.  I could not
work out from process/submitting-patches.rst which part is inappropriate.

If you could be more specific, it would definitely be helpful.
# sorry to disturb you...

By the way, we currently have more than 15 patches, some of which I may
need to drop for the first step.


-- Hajime

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 03/37] lkl: architecture skeleton for Linux kernel library
  2019-11-26 14:17         ` Hajime Tazaki
@ 2019-11-26 16:02           ` Richard Weinberger
  -1 siblings, 0 replies; 206+ messages in thread
From: Richard Weinberger @ 2019-11-26 16:02 UTC (permalink / raw)
  To: Hajime Tazaki
  Cc: linux-um, linux-arch, pscollins, levex, mattator, cem,
	tavi purdila, staal1978, motomuman, jiangshanlai, retrage01,
	petrosagg, liuyuan, xiaoj, mark, linux-kernel-library, phh,
	sigmaepsilon92, luca dariz, edisonmcastro

----- Original Message -----
> From: "Hajime Tazaki" <thehajime@gmail.com>
> To: "Richard Weinberger" <richard.weinberger@gmail.com>
> CC: "linux-um" <linux-um@lists.infradead.org>, "linux-arch" <linux-arch@vger.kernel.org>, "pscollins"
> <pscollins@google.com>, "levex" <levex@linux.com>, "mattator" <mattator@gmail.com>, "cem" <cem@freebsd.org>, "tavi
> purdila" <tavi.purdila@gmail.com>, "staal1978" <staal1978@gmail.com>, "motomuman" <motomuman@gmail.com>, "jiangshanlai"
> <jiangshanlai@gmail.com>, "retrage01" <retrage01@gmail.com>, "petrosagg" <petrosagg@gmail.com>, "liuyuan"
> <liuyuan@google.com>, "xiaoj" <xiaoj@google.com>, "mark" <mark@stillwell.me>, "linux-kernel-library"
> <linux-kernel-library@freelists.org>, "phh" <phh@phh.me>, "sigmaepsilon92" <sigmaepsilon92@gmail.com>, "luca dariz"
> <luca.dariz@gmail.com>, "edisonmcastro" <edisonmcastro@hotmail.com>
> Sent: Tuesday, 26 November 2019 15:17:26
> Subject: Re: [RFC v2 03/37] lkl: architecture skeleton for Linux kernel library

> On Tue, 26 Nov 2019 07:00:33 +0900,
> Richard Weinberger wrote:
>> 
> (snip)
>> >
>> > Signed-off-by: Andreas Abel <aabel@google.com>
>> > Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
>> > Signed-off-by: Edison M. Castro <edisonmcastro@hotmail.com>
>> > Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
>> > Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
>> > Signed-off-by: Jens Staal <staal1978@gmail.com>
>> > Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
>> > Signed-off-by: Levente Kurusa <levex@linux.com>
>> > Signed-off-by: Luca Dariz <luca.dariz@gmail.com>
>> > Signed-off-by: Mark Stillwell <mark@stillwell.me>
>> > Signed-off-by: Matthieu Coudron <mattator@gmail.com>
>> > Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
>> > Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
>> > Signed-off-by: Patrick Collins <pscollins@google.com>
>> > Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
>> > Signed-off-by: Pierre-Hugues Husson <phh@phh.me>
>> > Signed-off-by: Xiao Jia <xiaoj@google.com>
>> > Signed-off-by: Yuan Liu <liuyuan@google.com>
>> > Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
>> 
>> Can we please have this chain cleaned up?
>> Please see process/submitting-patches.rst.
> 
> Do you mean "this chain" by the long list of Signed-off-by lines, or
> something else ?

The long list is rather unusual.
 
> We were trying to put all of contributors on the list.  I was failed to
> interpret process/submitting-patches.rst on which part is not appropriate.

If every contributor is also a co-author, okay. But having such a long
list of authors is still a little odd.

Thanks,
//richard

_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 17/37] lkl tools: host lib: virtio devices
  2019-11-26 10:42                   ` Octavian Purdila
@ 2019-11-26 16:04                     ` Richard Weinberger
  -1 siblings, 0 replies; 206+ messages in thread
From: Richard Weinberger @ 2019-11-26 16:04 UTC (permalink / raw)
  To: tavi purdila
  Cc: Johannes Berg, Hajime Tazaki, linux-arch, cem, linux-um,
	retrage01, linux-kernel-library, pscollins, sigmaepsilon92,
	liuyuan, anton ivanov

----- Original Message -----
> From: "tavi purdila" <tavi.purdila@gmail.com>
> To: "Johannes Berg" <johannes@sipsolutions.net>
> CC: "richard" <richard@nod.at>, "Hajime Tazaki" <thehajime@gmail.com>, "linux-arch" <linux-arch@vger.kernel.org>, "cem"
> <cem@freebsd.org>, "linux-um" <linux-um@lists.infradead.org>, "retrage01" <retrage01@gmail.com>, "linux-kernel-library"
> <linux-kernel-library@freelists.org>, "pscollins" <pscollins@google.com>, "sigmaepsilon92" <sigmaepsilon92@gmail.com>,
> "liuyuan" <liuyuan@google.com>
> Sent: Tuesday, 26 November 2019 11:42:01
> Subject: Re: [RFC v2 17/37] lkl tools: host lib: virtio devices

> On Tue, Nov 26, 2019 at 12:16 PM Johannes Berg
> <johannes@sipsolutions.net> wrote:
>>
>> On Tue, 2019-11-26 at 11:09 +0100, Richard Weinberger wrote:
>> > ----- Original Message -----
>> > > > My point is that UML and LKL should try to do use the same concept/code
>> > > > regarding virtio. At the end of day both use virtual devices which use
>> > > > facilities from the host.
>> > > > If this is really not possible it needs a good explanation.
>> > >
>> > > I think it isn't possible, unless you use vhost-user over a unix domain
>> > > socket internally to talk between the kernel (virtio_uml) and hypervisor
>> > > (device) components.
>> > >
>> > > In virtio_uml, the device implementation is assumed to be a separate
>> > > process with a vhost-user connection. Here in LKL, the virtio device is
>> > > part of the "hypervisor", i.e. in the same process.
>> >
>> > Exactly, currently UML and LKL solve same things differently, but do we need to?
>>
>> It's not the same thing though :-)
>>
>> UML right now doesn't have or support virtio devices in the built-in
>> hypervisor, what we wanted to use virtio for was explicitly for the
>> vhost-user devices.
>>
>> LKL clearly wants to have device implementations in the hypervisor,
>> perhaps for networking or console etc.? That _might_ be useful since it
>> makes the device implementation more general, unlike the UML approach
>> where all devices come with a kernel- and user-side and are special
>> drivers in the kernel, vs. general virtio drivers.
>>
> 
> That is correct. Initially we used the same UML model, with dedicated
> drivers for LKL, and later switched to using the built-in virtio
> drivers (so far for network and block devices).

Can you please explain in a bit more detail why UML's net or block drivers
are not usable for LKL?
What is missing?

Performance numbers would also be nice to have.
Anton did great work on improving UML's drivers.

Thanks,
//richard

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 17/37] lkl tools: host lib: virtio devices
  2019-11-26 10:49                     ` Anton Ivanov
@ 2019-11-27  4:06                       ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-27  4:06 UTC (permalink / raw)
  To: anton.ivanov
  Cc: tavi.purdila, johannes, linux-arch, cem, richard, linux-um,
	retrage01, linux-kernel-library, pscollins, sigmaepsilon92,
	liuyuan


On Tue, 26 Nov 2019 19:49:48 +0900,
Anton Ivanov wrote:
> 
> 
> 
> On 26/11/2019 10:42, Octavian Purdila wrote:
> > On Tue, Nov 26, 2019 at 12:16 PM Johannes Berg
> > <johannes@sipsolutions.net> wrote:
> >> 
> >> On Tue, 2019-11-26 at 11:09 +0100, Richard Weinberger wrote:
> >>> ----- Original Message -----
> >>>>> My point is that UML and LKL should try to do use the same concept/code
> >>>>> regarding virtio. At the end of day both use virtual devices which use
> >>>>> facilities from the host.
> >>>>> If this is really not possible it needs a good explanation.
> >>>> 
> >>>> I think it isn't possible, unless you use vhost-user over a unix domain
> >>>> socket internally to talk between the kernel (virtio_uml) and hypervisor
> >>>> (device) components.
> >>>> 
> >>>> In virtio_uml, the device implementation is assumed to be a separate
> >>>> process with a vhost-user connection. Here in LKL, the virtio device is
> >>>> part of the "hypervisor", i.e. in the same process.
> >>> 
> >>> Exactly, currently UML and LKL solve same things differently, but do we need to?
> >> 
> >> It's not the same thing though :-)
> >> 
> >> UML right now doesn't have or support virtio devices in the built-in
> >> hypervisor, what we wanted to use virtio for was explicitly for the
> >> vhost-user devices.
> >> 
> >> LKL clearly wants to have device implementations in the hypervisor,
> >> perhaps for networking or console etc.? That _might_ be useful since it
> >> makes the device implementation more general, unlike the UML approach
> >> where all devices come with a kernel- and user-side and are special
> >> drivers in the kernel, vs. general virtio drivers.
> >> 
> > 
> > That is correct. Initially we used the same UML model, with dedicated
> > drivers for LKL, and later switched to using the built-in virtio
> > drivers (so far for network and block devices).
> > 
> >> Now, arguably, since UML has all these already a combined UML/LKL
> >> doesn't actually *need* any virtio devices, since all (or at least most)
> >> of the things that could be covered by virtio today are already covered
> >> by UML devices (block, net, console, random).
> >> 
> >> I'd probably say then that this can be removed from an initial "minimum
> >> viable product" of LKL, since once merged with UML you get the devices
> >> from that. Later, we could decide that UML devices actually are better
> >> done as virtio, and support something like this.
> >> 
> > 
> > I agree, I think it make sense to drop these since the problem of
> > dedicated vs generic / virtio drivers are orthogonal with regard to
> > UML and LKL unification and can later be worked on.
> 
> This brings us back to the interrupt controller as noted by Richard earlier.
> 
> UML devices are heavily dependent on the file io as an IRQ trigger
> paradigm and they need an interrupt controller which has an IO event
> feed into it. I did not see that in LKL on first read.

Indeed, the current interrupt model in LKL is not tied to file I/O
events delivered by the epoll family, as UML's is.  Instead, LKL raises
an interrupt by calling lkl_trigger_irq() explicitly at the relevant
places, so we would need to adapt this somehow if LKL is to use UML
devices.
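
One possible shape for such an adaptation, as a minimal sketch only: a
host-side loop waits on file descriptors with epoll and turns each
readiness event into an explicit IRQ-trigger call.  Everything named
demo_* below is hypothetical; demo_trigger_irq() is a stand-in that
merely counts calls where a real port would call something like
lkl_trigger_irq(), and none of this is taken from the LKL or UML trees.

```c
#include <sys/epoll.h>

#define DEMO_NR_IRQS 32

/* Hypothetical stand-in for an IRQ-injection call such as
 * lkl_trigger_irq(): it only counts how often each line was raised. */
int demo_irq_count[DEMO_NR_IRQS];

void demo_trigger_irq(int irq)
{
	if (irq >= 0 && irq < DEMO_NR_IRQS)
		demo_irq_count[irq]++;
}

/* Watch fd for readability; the IRQ number travels in the event data. */
int demo_bind_fd_to_irq(int epfd, int fd, int irq)
{
	struct epoll_event ev;

	ev.events = EPOLLIN;
	ev.data.u32 = (unsigned int)irq;
	return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Wait once and raise one IRQ per ready descriptor.
 * Returns the number of events handled, or -1 on error. */
int demo_poll_once(int epfd, int timeout_ms)
{
	struct epoll_event evs[8];
	int n = epoll_wait(epfd, evs, 8, timeout_ms);

	for (int i = 0; i < n; i++)
		demo_trigger_irq((int)evs[i].data.u32);
	return n;
}
```

The sketch uses level-triggered epoll, so a descriptor keeps raising its
IRQ on every poll until the device model drains it; whether that matches
what the UML IRQ controller expects would need to be checked against the
UML code.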

> So as a first step we should get it to work with existing UML IRQ
> controller and whatever incremental patches are needed on top of that.

I understand and agree with the comments from all of you.

If implementing LKL on top of the existing UML devices is the right
direction, I will try this approach to verify whether it is doable.

-- Hajime

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 17/37] lkl tools: host lib: virtio devices
  2019-11-26 16:04                     ` Richard Weinberger
@ 2019-11-27  4:08                       ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-27  4:08 UTC (permalink / raw)
  To: richard
  Cc: tavi.purdila, johannes, linux-arch, cem, linux-um, retrage01,
	linux-kernel-library, pscollins, sigmaepsilon92, liuyuan,
	anton.ivanov


Hello,

On Wed, 27 Nov 2019 01:04:55 +0900,
Richard Weinberger wrote:

> >> On Tue, 2019-11-26 at 11:09 +0100, Richard Weinberger wrote:
> >> > ----- Original Message -----
> >> > > > My point is that UML and LKL should try to do use the same concept/code
> >> > > > regarding virtio. At the end of day both use virtual devices which use
> >> > > > facilities from the host.
> >> > > > If this is really not possible it needs a good explanation.
> >> > >
> >> > > I think it isn't possible, unless you use vhost-user over a unix domain
> >> > > socket internally to talk between the kernel (virtio_uml) and hypervisor
> >> > > (device) components.
> >> > >
> >> > > In virtio_uml, the device implementation is assumed to be a separate
> >> > > process with a vhost-user connection. Here in LKL, the virtio device is
> >> > > part of the "hypervisor", i.e. in the same process.
> >> >
> >> > Exactly, currently UML and LKL solve same things differently, but do we need to?
> >>
> >> It's not the same thing though :-)
> >>
> >> UML right now doesn't have or support virtio devices in the built-in
> >> hypervisor, what we wanted to use virtio for was explicitly for the
> >> vhost-user devices.
> >>
> >> LKL clearly wants to have device implementations in the hypervisor,
> >> perhaps for networking or console etc.? That _might_ be useful since it
> >> makes the device implementation more general, unlike the UML approach
> >> where all devices come with a kernel- and user-side and are special
> >> drivers in the kernel, vs. general virtio drivers.
> >>
> > 
> > That is correct. Initially we used the same UML model, with dedicated
> > drivers for LKL, and later switched to using the built-in virtio
> > drivers (so far for network and block devices).
> 
> Can you please point out a little further why UML's net or block drivers
> are not usable for LKL?

I think we can use them (but I need to check).

LKL may use UML's drivers, and UML can also use LKL's devices/drivers
(as my 36/37 and 37/37 patches do, though those patches do not yet
handle IRQs carefully).

> What is missing?

As Anton mentioned, IRQ handling at least needs to be considered in
LKL.  I need to check, but there may be other factors.

> Performance numbers would also be nice to have.
> Anton did great work on improving UML's drivers.

Performance improvement techniques (bulk operations, offload, etc.) are
applicable to both.  Like UML, LKL can use TSO/checksum offload when its
virtio-net devices are configured for it.
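
To make the offload point concrete, here is a small sketch of how a
virtio-net feature mask for checksum and TSO offload could be composed
and sanity-checked.  The VIRTIO_NET_F_* bit numbers follow the virtio
network device specification; the demo_* helpers are hypothetical, and
how LKL's host library actually exposes this configuration is not shown
in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers from the virtio network device specification. */
#define VIRTIO_NET_F_CSUM        0	/* device handles partial csum */
#define VIRTIO_NET_F_GUEST_CSUM  1	/* driver handles partial csum */
#define VIRTIO_NET_F_GUEST_TSO4  7	/* driver can receive TSOv4 */
#define VIRTIO_NET_F_GUEST_TSO6  8	/* driver can receive TSOv6 */
#define VIRTIO_NET_F_HOST_TSO4  11	/* device can receive TSOv4 */
#define VIRTIO_NET_F_HOST_TSO6  12	/* device can receive TSOv6 */

#define DEMO_BIT(n) (UINT64_C(1) << (n))

/* Build a feature mask; TSO implies checksum offload because the
 * specification makes the TSO bits depend on the checksum bits. */
uint64_t demo_offload_features(bool csum, bool tso)
{
	uint64_t f = 0;

	if (csum || tso)
		f |= DEMO_BIT(VIRTIO_NET_F_CSUM) |
		     DEMO_BIT(VIRTIO_NET_F_GUEST_CSUM);
	if (tso)
		f |= DEMO_BIT(VIRTIO_NET_F_HOST_TSO4) |
		     DEMO_BIT(VIRTIO_NET_F_HOST_TSO6) |
		     DEMO_BIT(VIRTIO_NET_F_GUEST_TSO4) |
		     DEMO_BIT(VIRTIO_NET_F_GUEST_TSO6);
	return f;
}

/* Check the dependency rules: HOST_TSO* needs CSUM and GUEST_TSO*
 * needs GUEST_CSUM. */
bool demo_features_valid(uint64_t f)
{
	uint64_t host_tso = DEMO_BIT(VIRTIO_NET_F_HOST_TSO4) |
			    DEMO_BIT(VIRTIO_NET_F_HOST_TSO6);
	uint64_t guest_tso = DEMO_BIT(VIRTIO_NET_F_GUEST_TSO4) |
			     DEMO_BIT(VIRTIO_NET_F_GUEST_TSO6);

	if ((f & host_tso) && !(f & DEMO_BIT(VIRTIO_NET_F_CSUM)))
		return false;
	if ((f & guest_tso) && !(f & DEMO_BIT(VIRTIO_NET_F_GUEST_CSUM)))
		return false;
	return true;
}
```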


-- Hajime

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 02/37] arch: add __SYSCALL_DEFINE_ARCH
  2019-11-25 22:02       ` Richard Weinberger
@ 2019-11-27  4:15         ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-27  4:15 UTC (permalink / raw)
  To: richard.weinberger
  Cc: linux-um, tavi.purdila, linux-kernel-library, linux-arch, retrage01


On Tue, 26 Nov 2019 07:02:54 +0900,
Richard Weinberger wrote:
> 
> On Fri, Nov 8, 2019 at 6:03 AM Hajime Tazaki <thehajime@gmail.com> wrote:
> >
> > From: Octavian Purdila <tavi.purdila@gmail.com>
> >
> > This allows the architecture code to process the system call
> > definitions. It is used by LKL to create strong typed function
> > definitions for system calls.
> >
> > Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
> > ---
> >  include/linux/syscalls.h | 6 ++++++
> 
> Same here, core developers need to agree on this.

Okay, I'll also Cc linux-api@ from the next round.

-- Hajime
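
For readers following the __SYSCALL_DEFINE_ARCH discussion, the idea of
an architecture hook that sees every system call definition can be
illustrated with a deliberately simplified sketch.  The DEMO_* macros
and demo_* functions are hypothetical, fixed at two arguments, and are
not the macros from the patch; they only show how such a hook can emit a
strongly typed entry point next to the implementation.

```c
/* "Architecture" side: the hook emits a typed, externally visible
 * wrapper that forwards to the implementation. */
#define DEMO_SYSCALL_DEFINE_ARCH2(name, t1, a1, t2, a2)		\
	static long demo_impl_##name(t1 a1, t2 a2);		\
	long demo_sys_##name(t1 a1, t2 a2)			\
	{							\
		return demo_impl_##name(a1, a2);		\
	}

/* "Generic" side: architectures that define no hook get an empty one. */
#ifndef DEMO_SYSCALL_DEFINE_ARCH2
#define DEMO_SYSCALL_DEFINE_ARCH2(name, t1, a1, t2, a2)
#endif

/* The definition macro invokes the hook, then opens the function whose
 * body follows at the use site. */
#define DEMO_SYSCALL_DEFINE2(name, t1, a1, t2, a2)		\
	DEMO_SYSCALL_DEFINE_ARCH2(name, t1, a1, t2, a2)		\
	static long demo_impl_##name(t1 a1, t2 a2)

/* Use site: one definition yields both demo_impl_add() and the typed
 * wrapper demo_sys_add(int, int). */
DEMO_SYSCALL_DEFINE2(add, int, a, int, b)
{
	return a + b;
}
```

A caller can then use demo_sys_add(2, 3) with full compile-time type
checking of the arguments, which is the property the patch description
refers to as strongly typed function definitions.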

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 17/37] lkl tools: host lib: virtio devices
  2019-11-27  4:08                       ` Hajime Tazaki
@ 2019-11-27 14:28                         ` Richard Weinberger
  -1 siblings, 0 replies; 206+ messages in thread
From: Richard Weinberger @ 2019-11-27 14:28 UTC (permalink / raw)
  To: Hajime Tazaki
  Cc: tavi purdila, Johannes Berg, linux-arch, cem, linux-um,
	retrage01, linux-kernel-library, pscollins, sigmaepsilon92,
	liuyuan, anton ivanov

----- Original Message -----
>> Can you please point out a little further why UML's net or block drivers
>> are not usable for LKL?
> 
> I think we can do it (but need to check).
> 
> LKL may use UML's drivers, and UML can also use LKL's devices/drivers
> (as my 36/37 and 37/37 patches do, though the patches has no careful
> consideration on IRQ handling).

Of course. So please don't get me wrong, I don't want LKL to become
UML. I hope that UML can also benefit from LKL.

Thanks,
//richard

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 17/37] lkl tools: host lib: virtio devices
  2019-11-27 14:28                         ` Richard Weinberger
@ 2019-11-28  9:53                           ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2019-11-28  9:53 UTC (permalink / raw)
  To: richard
  Cc: tavi.purdila, johannes, linux-arch, cem, linux-um, retrage01,
	linux-kernel-library, pscollins, sigmaepsilon92, liuyuan,
	anton.ivanov


On Wed, 27 Nov 2019 23:28:37 +0900,
Richard Weinberger wrote:
> 
> ----- Original Message -----
> >> Can you please point out a little further why UML's net or block drivers
> >> are not usable for LKL?
> > 
> > I think we can do it (but need to check).
> > 
> > LKL may use UML's drivers, and UML can also use LKL's devices/drivers
> > (as my 36/37 and 37/37 patches do, though the patches has no careful
> > consideration on IRQ handling).
> 
> Of course. So please don't get me wrong, I don't want LKL to become
> UML. I hope that UML can also benefit from LKL.

I understand; let me play with the UML code for a while.

-- Hajime

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 21/37] lkl tools: "boot" test
  2019-11-08  5:02     ` Hajime Tazaki
  (?)
@ 2020-01-23 19:33     ` Brendan Higgins
  2020-01-24  4:32         ` Hajime Tazaki
  2020-03-02 19:51         ` Luis Chamberlain
  -1 siblings, 2 replies; 206+ messages in thread
From: Brendan Higgins @ 2020-01-23 19:33 UTC (permalink / raw)
  To: kunit-dev, Hajime Tazaki, Octavian Purdila, Luis Chamberlain,
	David Gow, Aleksa Sarai
  Cc: Brendan Higgins, linux-um, linux-arch, Patrick Collins,
	Conrad Meyer, Motomu Utsumi, Lai Jiangshan, Akira Moroo,
	Petros Angelatos, Yuan Liu, Thomas Liebetraut, Mark Stillwell,
	David Disseldorp, linux-kernel-library, Luca Dariz

> Add a simple LKL test application that starts the kernel and performs
> simple tests that minimally exercise the LKL API.
> 
> Signed-off-by: Conrad Meyer <cem@FreeBSD.org>
> Signed-off-by: David Disseldorp <ddiss@suse.de>
> Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
> Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
> Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
> Signed-off-by: Luca Dariz <luca.dariz@gmail.com>
> Signed-off-by: Mark Stillwell <mark@stillwell.me>
> Signed-off-by: Motomu Utsumi <motomuman@gmail.com>
> Signed-off-by: Patrick Collins <pscollins@google.com>
> Signed-off-by: Petros Angelatos <petrosagg@gmail.com>
> Signed-off-by: Thomas Liebetraut <thomas@tommie-lie.de>
> Signed-off-by: Yuan Liu <liuyuan@google.com>
> Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
> ---
>  tools/lkl/.gitignore              |   5 +
>  tools/lkl/Makefile                |   5 +-
>  tools/lkl/Targets                 |   3 +
>  tools/lkl/tests/Build             |   3 +
>  tools/lkl/tests/boot.c            | 562 ++++++++++++++++++++++++++++++
>  tools/lkl/tests/boot.sh           |   9 +
>  tools/lkl/tests/cla.c             | 159 +++++++++
>  tools/lkl/tests/cla.h             |  33 ++
>  tools/lkl/tests/disk.c            | 189 ++++++++++
>  tools/lkl/tests/disk.sh           |  61 ++++
>  tools/lkl/tests/run.py            | 182 ++++++++++
>  tools/lkl/tests/tap13.py          | 209 +++++++++++
>  tools/lkl/tests/test.c            | 126 +++++++
>  tools/lkl/tests/test.h            |  72 ++++
>  tools/lkl/tests/test.sh           | 179 ++++++++++
>  tools/lkl/tests/valgrind.supp     |  85 +++++
>  tools/lkl/tests/valgrind2xunit.py |  69 ++++
>  17 files changed, 1950 insertions(+), 1 deletion(-)
>  create mode 100644 tools/lkl/tests/Build
>  create mode 100644 tools/lkl/tests/boot.c
>  create mode 100755 tools/lkl/tests/boot.sh
>  create mode 100644 tools/lkl/tests/cla.c
>  create mode 100644 tools/lkl/tests/cla.h
>  create mode 100644 tools/lkl/tests/disk.c
>  create mode 100755 tools/lkl/tests/disk.sh
>  create mode 100755 tools/lkl/tests/run.py
>  create mode 100644 tools/lkl/tests/tap13.py
>  create mode 100644 tools/lkl/tests/test.c
>  create mode 100644 tools/lkl/tests/test.h
>  create mode 100644 tools/lkl/tests/test.sh
>  create mode 100644 tools/lkl/tests/valgrind.supp
>  create mode 100755 tools/lkl/tests/valgrind2xunit.py
> 
> diff --git a/tools/lkl/.gitignore b/tools/lkl/.gitignore
> index 1aed58bfe171..4e08254dbd46 100644
> --- a/tools/lkl/.gitignore
> +++ b/tools/lkl/.gitignore
> @@ -2,3 +2,8 @@ Makefile.conf
>  include/lkl_autoconf.h
>  tests/autoconf.sh
>  bin/stat
> +tests/net-test
> +tests/disk
> +tests/boot
> +tests/valgrind*.xml
> +*.pyc
> diff --git a/tools/lkl/Makefile b/tools/lkl/Makefile
> index 6d6d2cead03f..9a55df5064e4 100644
> --- a/tools/lkl/Makefile
> +++ b/tools/lkl/Makefile
> @@ -116,8 +116,11 @@ programs_install: $(progs-y:%=$(OUTPUT)%$(EXESUF))
>  install: headers_install libraries_install programs_install
>  
>  
> +run-tests:
> +	./tests/run.py $(tests)
> +
>  FORCE: ;
> -.PHONY: all clean FORCE
> +.PHONY: all clean FORCE run-tests
>  .PHONY: headers_install libraries_install programs_install install
>  .NOTPARALLEL : lib/lkl.o
>  .SECONDARY:
> diff --git a/tools/lkl/Targets b/tools/lkl/Targets
> index 24c985e64638..a9f74c3cc8fb 100644
> --- a/tools/lkl/Targets
> +++ b/tools/lkl/Targets
> @@ -1,3 +1,6 @@
>  libs-y += lib/liblkl
>  
> +progs-y += tests/boot
> +progs-y += tests/disk
> +progs-y += tests/net-test
>  
> diff --git a/tools/lkl/tests/Build b/tools/lkl/tests/Build
> new file mode 100644
> index 000000000000..ace86a3d3438
> --- /dev/null
> +++ b/tools/lkl/tests/Build
> @@ -0,0 +1,3 @@
> +boot-y += boot.o test.o
> +disk-y += disk.o cla.o test.o
> +net-test-y += net-test.o cla.o test.o
> diff --git a/tools/lkl/tests/boot.c b/tools/lkl/tests/boot.c
> new file mode 100644
> index 000000000000..b021e9540147
> --- /dev/null
> +++ b/tools/lkl/tests/boot.c
> @@ -0,0 +1,562 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <stdio.h>
> +#include <unistd.h>
> +#include <string.h>
> +#include <time.h>
> +#include <stdlib.h>
> +#include <stdint.h>
> +#include <lkl.h>
> +#include <lkl_host.h>
> +
> +#include <sys/stat.h>
> +#include <fcntl.h>
> +#if defined(__FreeBSD__)
> +#include <net/if.h>
> +#include <sys/ioctl.h>
> +#elif __linux
> +#include <sys/epoll.h>
> +#include <sys/ioctl.h>
> +#elif __MINGW32__
> +#include <windows.h>
> +#endif
> +
> +#include "test.h"
> +
> +#ifndef __MINGW32__
> +#define sleep_ns 87654321
> +int lkl_test_nanosleep(void)
> +{
> +	struct lkl_timespec ts = {
> +		.tv_sec = 0,
> +		.tv_nsec = sleep_ns,
> +	};
> +	struct timespec start, stop;
> +	long delta;
> +	long ret;
> +
> +	clock_gettime(CLOCK_MONOTONIC, &start);
> +	ret = lkl_sys_nanosleep((struct __lkl__kernel_timespec *)&ts, NULL);
> +	clock_gettime(CLOCK_MONOTONIC, &stop);
> +
> +	delta = 1e9*(stop.tv_sec - start.tv_sec) +
> +		(stop.tv_nsec - start.tv_nsec);
> +
> +	lkl_test_logf("sleep %ld, expected sleep %d\n", delta, sleep_ns);
> +
> +	if (ret == 0 && delta > sleep_ns * 0.9)
> +		return TEST_SUCCESS;
> +
> +	return TEST_FAILURE;
> +}
> +#endif
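
A side note on the acceptance rule above: the test only enforces a lower
bound (at least 90% of the requested time), presumably because the upper
bound depends on host scheduling. A minimal host-side sketch of the same
check, in Python rather than against LKL. measure_sleep() and
slept_long_enough() are made-up names, and keeping the delta in integer
nanoseconds is my choice, not something the patch does:

```python
import time

SLEEP_NS = 87654321  # same value as sleep_ns in boot.c


def measure_sleep(sleep_ns=SLEEP_NS):
    """Sleep for sleep_ns and return the elapsed monotonic time in ns."""
    start = time.monotonic_ns()
    time.sleep(sleep_ns / 1e9)
    stop = time.monotonic_ns()
    return stop - start  # integer nanoseconds, no float rounding


def slept_long_enough(delta_ns, sleep_ns=SLEEP_NS):
    """Mirror the test's acceptance rule: more than 90% of the request."""
    return delta_ns > sleep_ns * 0.9


if __name__ == "__main__":
    delta = measure_sleep()
    print("sleep %d, expected sleep %d" % (delta, SLEEP_NS))
    print("ok" if slept_long_enough(delta) else "not ok")
```
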
> +
> +LKL_TEST_CALL(getpid, lkl_sys_getpid, 1)
> +
> +void check_latency(long (*f)(void), long *min, long *max, long *avg)
> +{
> +	int i;
> +	unsigned long long start, stop, sum = 0;
> +	static const int count = 1000;
> +	long delta;
> +
> +	*min = 1000000000;
> +	*max = -1;
> +
> +	for (i = 0; i < count; i++) {
> +		start = lkl_host_ops.time();
> +		f();
> +		stop = lkl_host_ops.time();
> +		delta = stop - start;
> +		if (*min > delta)
> +			*min = delta;
> +		if (*max < delta)
> +			*max = delta;
> +		sum += delta;
> +	}
> +	*avg = sum / count;
> +}
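
For anyone skimming, the measurement above is a plain min/max/avg loop
around a timer. A rough Python equivalent that runs on the host, assuming
time.perf_counter_ns() as a stand-in for lkl_host_ops.time(). Seeding the
minimum with None instead of a large constant is my variation:

```python
import os
import time


def check_latency(f, count=1000):
    """Call f() count times; return (min, max, avg) latency in ns."""
    lo, hi, total = None, -1, 0
    for _ in range(count):
        start = time.perf_counter_ns()
        f()
        delta = time.perf_counter_ns() - start
        lo = delta if lo is None else min(lo, delta)
        hi = max(hi, delta)
        total += delta
    return lo, hi, total // count  # integer average, like sum / count in C


if __name__ == "__main__":
    lo, hi, avg = check_latency(os.getpid)
    print("native avg/min/max: %d/%d/%d" % (avg, lo, hi))
```
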
> +
> +static long native_getpid(void)
> +{
> +#ifdef __MINGW32__
> +	GetCurrentProcessId();
> +#else
> +	getpid();
> +#endif
> +	return 0;
> +}
> +
> +int lkl_test_syscall_latency(void)
> +{
> +	long min, max, avg;
> +
> +	lkl_test_logf("avg/min/max: ");
> +
> +	check_latency(lkl_sys_getpid, &min, &max, &avg);
> +
> +	lkl_test_logf("lkl:%ld/%ld/%ld ", avg, min, max);
> +
> +	check_latency(native_getpid, &min, &max, &avg);
> +
> +	lkl_test_logf("native:%ld/%ld/%ld\n", avg, min, max);
> +
> +	return TEST_SUCCESS;
> +}
> +
> +#define access_rights 0721
> +
> +LKL_TEST_CALL(creat, lkl_sys_creat, 0, "/file", access_rights)
> +LKL_TEST_CALL(close, lkl_sys_close, 0, 0);
> +LKL_TEST_CALL(failopen, lkl_sys_open, -LKL_ENOENT, "/file2", 0, 0);
> +LKL_TEST_CALL(umask, lkl_sys_umask, 022,  0777);
> +LKL_TEST_CALL(umask2, lkl_sys_umask, 0777, 0);
> +LKL_TEST_CALL(open, lkl_sys_open, 0, "/file", LKL_O_RDWR, 0);
> +static const char wrbuf[] = "test";
> +LKL_TEST_CALL(write, lkl_sys_write, sizeof(wrbuf), 0, wrbuf, sizeof(wrbuf));
> +LKL_TEST_CALL(lseek_cur, lkl_sys_lseek, sizeof(wrbuf), 0, 0, LKL_SEEK_CUR);
> +LKL_TEST_CALL(lseek_end, lkl_sys_lseek, sizeof(wrbuf), 0, 0, LKL_SEEK_END);
> +LKL_TEST_CALL(lseek_set, lkl_sys_lseek, 0, 0, 0, LKL_SEEK_SET);
> +
> +int lkl_test_read(void)
> +{
> +	char buf[10] = { 0, };
> +	long ret;
> +
> +	ret = lkl_sys_read(0, buf, sizeof(buf));
> +
> +	lkl_test_logf("lkl_sys_read=%ld buf=%s\n", ret, buf);
> +
> +	if (ret == sizeof(wrbuf) && !strcmp(wrbuf, buf))
> +		return TEST_SUCCESS;
> +
> +	return TEST_FAILURE;
> +}

These tests make me think that LKL could be very useful for KUnit and
testing syscalls.

Luis and I had been talking about writing KUnit tests for syscalls to
validate that syscalls conform to the expected behavior; however,
calling syscalls from the kernel obviously has issues.

On the other hand, testing syscalls from userspace on a booted kernel
is something that we do and that needs to be done; however, this too
has some issues. Writing and running tests in userspace on a booted
kernel is not as easy as writing and running tests in the kernel. Also,
even though some end-to-end syscall tests are necessary, not every
syscall test must be end-to-end, especially those that only try to
exercise the entire syscall contract.

I think it looks like LKL might be able to help us square that circle.
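
To make "exercise the syscall contract" a bit more concrete, here is a
rough sketch that replays the file sequence from boot.c (create, write,
lseek, read, fstat) against the host kernel through Python's thin os.*
wrappers. It is only an illustration: it runs on the host, not under LKL
or KUnit, and check_file_contract() is a name I made up.

```python
import os
import stat
import tempfile

WRBUF = b"test\0"      # 5 bytes, like sizeof(wrbuf) in boot.c
ACCESS_RIGHTS = 0o721  # same mode as access_rights in boot.c


def check_file_contract(directory):
    """Run the create/write/lseek/read/fstat sequence; return the results."""
    path = os.path.join(directory, "file")
    old_umask = os.umask(0)  # keep the mode bits unmasked, as umask2 does
    try:
        fd = os.open(path, os.O_CREAT | os.O_RDWR | os.O_TRUNC, ACCESS_RIGHTS)
    finally:
        os.umask(old_umask)
    try:
        results = {
            "write": os.write(fd, WRBUF),
            "lseek_cur": os.lseek(fd, 0, os.SEEK_CUR),
            "lseek_end": os.lseek(fd, 0, os.SEEK_END),
            "lseek_set": os.lseek(fd, 0, os.SEEK_SET),
            "read": os.read(fd, 10),
        }
        st = os.fstat(fd)
        results["size"] = st.st_size
        results["is_regular"] = stat.S_ISREG(st.st_mode)
        results["mode"] = stat.S_IMODE(st.st_mode)
    finally:
        os.close(fd)
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(check_file_contract(tmp))
```

Each value in the returned dict corresponds to one expected value in the
LKL_TEST_CALL() lines quoted above, which is the kind of per-call
expectation a contract test would assert on.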

Hajime (and other LKL people):

What is the current status of this patchset? I have not seen any
activity for a couple of months.

Luis,

Does this roughly match what you had in mind for the syscall testing?
I think this looks pretty close. You should be able to fully test the
contract here using KUnit. Is there anyone else you think would be
interested in this?

In any case, I am excited about this. Please keep me posted in the
future!

Cheers

> +int lkl_test_fstat(void)
> +{
> +	struct lkl_stat stat;
> +	long ret;
> +
> +	ret = lkl_sys_fstat(0, &stat);
> +
> +	lkl_test_logf("lkl_sys_fstat=%ld mode=%o size=%zd\n", ret, stat.st_mode,
> +		      stat.st_size);
> +
> +	if (ret == 0 && stat.st_size == sizeof(wrbuf) &&
> +	    stat.st_mode == (access_rights | LKL_S_IFREG))
> +		return TEST_SUCCESS;
> +
> +	return TEST_FAILURE;
> +}
> +
> +LKL_TEST_CALL(mkdir, lkl_sys_mkdir, 0, "/mnt", access_rights)
> +
> +int lkl_test_stat(void)
> +{
> +	struct lkl_stat stat;
> +	long ret;
> +
> +	ret = lkl_sys_stat("/mnt", &stat);
> +
> +	lkl_test_logf("lkl_sys_stat(\"/mnt\")=%ld mode=%o\n", ret,
> +		      stat.st_mode);
> +
> +	if (ret == 0 && stat.st_mode == (access_rights | LKL_S_IFDIR))
> +		return TEST_SUCCESS;
> +
> +	return TEST_FAILURE;
> +}
> +
> +static int lkl_test_pipe2(void)
> +{
> +	int pipe_fds[2];
> +	int READ_IDX = 0, WRITE_IDX = 1;
> +	static const char msg[] = "Hello world!";
> +	char str[20];
> +	int msg_len_bytes = strlen(msg) + 1;
> +	int cmp_res;
> +	long ret;
> +
> +	ret = lkl_sys_pipe2(pipe_fds, LKL_O_NONBLOCK);
> +	if (ret) {
> +		lkl_test_logf("pipe2: %s\n", lkl_strerror(ret));
> +		return TEST_FAILURE;
> +	}
> +
> +	ret = lkl_sys_write(pipe_fds[WRITE_IDX], msg, msg_len_bytes);
> +	if (ret != msg_len_bytes) {
> +		if (ret < 0)
> +			lkl_test_logf("write error: %s\n", lkl_strerror(ret));
> +		else
> +			lkl_test_logf("short write: %ld\n", ret);
> +		return TEST_FAILURE;
> +	}
> +
> +	ret = lkl_sys_read(pipe_fds[READ_IDX], str, msg_len_bytes);
> +	if (ret != msg_len_bytes) {
> +		if (ret < 0)
> +			lkl_test_logf("read error: %s\n", lkl_strerror(ret));
> +		else
> +			lkl_test_logf("short read: %ld\n", ret);
> +		return TEST_FAILURE;
> +	}
> +
> +	cmp_res = memcmp(msg, str, msg_len_bytes);
> +	if (cmp_res) {
> +		lkl_test_logf("memcmp failed: %d\n", cmp_res);
> +		return TEST_FAILURE;
> +	}
> +
> +	ret = lkl_sys_close(pipe_fds[0]);
> +	if (ret) {
> +		lkl_test_logf("close error: %s\n", lkl_strerror(ret));
> +		return TEST_FAILURE;
> +	}
> +
> +	ret = lkl_sys_close(pipe_fds[1]);
> +	if (ret) {
> +		lkl_test_logf("close error: %s\n", lkl_strerror(ret));
> +		return TEST_FAILURE;
> +	}
> +
> +	return TEST_SUCCESS;
> +}
> +
> +static int lkl_test_epoll(void)
> +{
> +	int epoll_fd, pipe_fds[2];
> +	int READ_IDX = 0, WRITE_IDX = 1;
> +	struct lkl_epoll_event wait_on, read_result;
> +	static const char msg[] = "Hello world!";
> +	long ret;
> +
> +	memset(&wait_on, 0, sizeof(wait_on));
> +	memset(&read_result, 0, sizeof(read_result));
> +
> +	ret = lkl_sys_pipe2(pipe_fds, LKL_O_NONBLOCK);
> +	if (ret) {
> +		lkl_test_logf("pipe2 error: %s\n", lkl_strerror(ret));
> +		return TEST_FAILURE;
> +	}
> +
> +	epoll_fd = lkl_sys_epoll_create(1);
> +	if (epoll_fd < 0) {
> +		lkl_test_logf("epoll_create error: %s\n", lkl_strerror(epoll_fd));
> +		return TEST_FAILURE;
> +	}
> +
> +	wait_on.events = LKL_POLLIN | LKL_POLLOUT;
> +	wait_on.data = pipe_fds[READ_IDX];
> +
> +	ret = lkl_sys_epoll_ctl(epoll_fd, LKL_EPOLL_CTL_ADD, pipe_fds[READ_IDX],
> +				&wait_on);
> +	if (ret < 0) {
> +		lkl_test_logf("epoll_ctl error: %s\n", lkl_strerror(ret));
> +		return TEST_FAILURE;
> +	}
> +
> +	/* Shouldn't be ready before we have written something */
> +	ret = lkl_sys_epoll_wait(epoll_fd, &read_result, 1, 0);
> +	if (ret != 0) {
> +		if (ret < 0)
> +			lkl_test_logf("epoll_wait error: %s\n",
> +				      lkl_strerror(ret));
> +		else
> +			lkl_test_logf("epoll_wait: bad event: 0x%lx\n", ret);
> +		return TEST_FAILURE;
> +	}
> +
> +	ret = lkl_sys_write(pipe_fds[WRITE_IDX], msg, strlen(msg) + 1);
> +	if (ret < 0) {
> +		lkl_test_logf("write error: %s\n", lkl_strerror(ret));
> +		return TEST_FAILURE;
> +	}
> +
> +	/* We expect exactly 1 fd to be ready immediately */
> +	ret = lkl_sys_epoll_wait(epoll_fd, &read_result, 1, 0);
> +	if (ret != 1) {
> +		if (ret < 0)
> +			lkl_test_logf("epoll_wait error: %s\n",
> +				      lkl_strerror(ret));
> +		else
> +			lkl_test_logf("epoll_wait: bad ev no %ld\n", ret);
> +		return TEST_FAILURE;
> +	}
> +
> +	/* Already tested reading from pipe2 so no need to do it
> +	 * here
> +	 */
> +
> +	return TEST_SUCCESS;
> +}
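
The behaviour this test pins down is: the read end of a pipe is not
ready before anything is written, and exactly one event is reported
after a write. A small host-side sketch of the same expectation,
assuming Linux and Python's select.epoll (pipe_readiness() is a
hypothetical helper, and it registers for EPOLLIN only):

```python
import os
import select


def pipe_readiness(msg=b"Hello world!\0"):
    """Return (events_before_write, events_after_write) for a pipe."""
    read_fd, write_fd = os.pipe()
    ep = select.epoll()
    try:
        ep.register(read_fd, select.EPOLLIN)
        before = ep.poll(0)  # timeout 0: report what is ready, never block
        os.write(write_fd, msg)
        after = ep.poll(0)
        # Keep only the event masks; fd numbers differ between runs.
        return [ev for _, ev in before], [ev for _, ev in after]
    finally:
        ep.close()
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(pipe_readiness())
```
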
> +
> +LKL_TEST_CALL(chdir_proc, lkl_sys_chdir, 0, "proc");
> +
> +static int dir_fd;
> +
> +static int lkl_test_open_cwd(void)
> +{
> +	dir_fd = lkl_sys_open(".", LKL_O_RDONLY | LKL_O_DIRECTORY, 0);
> +	if (dir_fd < 0) {
> +		lkl_test_logf("failed to open current directory: %s\n",
> +			      lkl_strerror(dir_fd));
> +		return TEST_FAILURE;
> +	}
> +
> +	return TEST_SUCCESS;
> +}
> +
> +/* column where to insert a line break for the list file tests below. */
> +#define COL_LINE_BREAK 70
> +
> +static int lkl_test_getdents64(void)
> +{
> +	long ret;
> +	char buf[1024], *pos;
> +	struct lkl_linux_dirent64 *de;
> +	int wr;
> +
> +	de = (struct lkl_linux_dirent64 *)buf;
> +	ret = lkl_sys_getdents64(dir_fd, de, sizeof(buf));
> +
> +	wr = lkl_test_logf("%d ", dir_fd);
> +
> +	if (ret < 0)
> +		return TEST_FAILURE;
> +
> +	for (pos = buf; pos - buf < ret; pos += de->d_reclen) {
> +		de = (struct lkl_linux_dirent64 *)pos;
> +
> +		wr += lkl_test_logf("%s ", de->d_name);
> +		if (wr >= COL_LINE_BREAK) {
> +			lkl_test_logf("\n");
> +			wr = 0;
> +		}
> +	}
> +
> +	return TEST_SUCCESS;
> +}
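
The loop above walks variable-length records: each entry carries its own
length in d_reclen, so the cursor advances by that amount rather than by
a fixed struct size. A self-contained sketch of the same walk over a
synthetic buffer, assuming the usual linux_dirent64 header (u64 d_ino,
s64 d_off, u16 d_reclen, u8 d_type, then the name) and little-endian
byte order. pack_dirent() and walk_dirents() are illustrative names, and
the 8-byte record alignment in pack_dirent() is an assumption:

```python
import struct

# d_ino, d_off, d_reclen, d_type: 19 bytes, no padding with "<"
HEADER = struct.Struct("<QqHB")


def pack_dirent(ino, off, dtype, name):
    """Build one synthetic record, padded to an 8-byte boundary."""
    raw = name.encode() + b"\0"
    reclen = (HEADER.size + len(raw) + 7) & ~7
    body = raw.ljust(reclen - HEADER.size, b"\0")
    return HEADER.pack(ino, off, reclen, dtype) + body


def walk_dirents(buf):
    """Return the names found in buf, stepping by each record's d_reclen."""
    names = []
    pos = 0
    while pos < len(buf):
        _ino, _off, reclen, _dtype = HEADER.unpack_from(buf, pos)
        if reclen == 0:
            raise ValueError("zero-length record at offset %d" % pos)
        name = buf[pos + HEADER.size:pos + reclen].split(b"\0", 1)[0]
        names.append(name.decode())
        pos += reclen
    return names


if __name__ == "__main__":
    sample = b"".join(pack_dirent(i, i, 4, n)
                      for i, n in enumerate([".", "..", "cpuinfo"], 1))
    print(" ".join(walk_dirents(sample)))
```
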
> +
> +LKL_TEST_CALL(close_dir_fd, lkl_sys_close, 0, dir_fd);
> +LKL_TEST_CALL(chdir_root, lkl_sys_chdir, 0, "/");
> +LKL_TEST_CALL(mount_fs_proc, lkl_mount_fs, 0, "proc");
> +LKL_TEST_CALL(umount_fs_proc, lkl_umount_timeout, 0, "proc", 0, 1000);
> +LKL_TEST_CALL(lo_ifup, lkl_if_up, 0, 1);
> +
> +static int lkl_test_mutex(void)
> +{
> +	long ret = TEST_SUCCESS;
> +	/*
> +	 * Can't do much to verify that this works, so we'll just let Valgrind
> +	 * warn us on CI if we've made bad memory accesses.
> +	 */
> +
> +	struct lkl_mutex *mutex;
> +
> +	mutex = lkl_host_ops.mutex_alloc(0);
> +	lkl_host_ops.mutex_lock(mutex);
> +	lkl_host_ops.mutex_unlock(mutex);
> +	lkl_host_ops.mutex_free(mutex);
> +
> +	mutex = lkl_host_ops.mutex_alloc(1);
> +	lkl_host_ops.mutex_lock(mutex);
> +	lkl_host_ops.mutex_lock(mutex);
> +	lkl_host_ops.mutex_unlock(mutex);
> +	lkl_host_ops.mutex_unlock(mutex);
> +	lkl_host_ops.mutex_free(mutex);
> +
> +	return ret;
> +}
> +
> +static int lkl_test_semaphore(void)
> +{
> +	long ret = TEST_SUCCESS;
> +	/*
> +	 * Can't do much to verify that this works, so we'll just let Valgrind
> +	 * warn us on CI if we've made bad memory accesses.
> +	 */
> +
> +	struct lkl_sem *sem = lkl_host_ops.sem_alloc(1);
> +
> +	lkl_host_ops.sem_down(sem);
> +	lkl_host_ops.sem_up(sem);
> +	lkl_host_ops.sem_free(sem);
> +
> +	return ret;
> +}
> +
> +static int lkl_test_gettid(void)
> +{
> +	long tid = lkl_host_ops.gettid();
> +
> +	lkl_test_logf("%ld", tid);
> +
> +	/* As far as I know, thread IDs are non-zero on all reasonable
> +	 * systems.
> +	 */
> +	if (tid)
> +		return TEST_SUCCESS;
> +	else
> +		return TEST_FAILURE;
> +}
> +
> +static void test_thread(void *data)
> +{
> +	int *pipe_fds = (int *) data;
> +	char tmp[LKL_PIPE_BUF+1];
> +	int ret;
> +
> +	ret = lkl_sys_read(pipe_fds[0], tmp, sizeof(tmp));
> +	if (ret < 0)
> +		lkl_test_logf("%s: %s\n", __func__, lkl_strerror(ret));
> +}
> +
> +static int lkl_test_syscall_thread(void)
> +{
> +	int pipe_fds[2];
> +	char tmp[LKL_PIPE_BUF+1];
> +	long ret;
> +	lkl_thread_t tid;
> +
> +	ret = lkl_sys_pipe2(pipe_fds, 0);
> +	if (ret) {
> +		lkl_test_logf("pipe2: %s\n", lkl_strerror(ret));
> +		return TEST_FAILURE;
> +	}
> +
> +	ret = lkl_sys_fcntl(pipe_fds[0], LKL_F_SETPIPE_SZ, 1);
> +	if (ret < 0) {
> +		lkl_test_logf("fcntl setpipe_sz: %s\n", lkl_strerror(ret));
> +		return TEST_FAILURE;
> +	}
> +
> +	tid = lkl_host_ops.thread_create(test_thread, pipe_fds);
> +	if (!tid) {
> +		lkl_test_logf("failed to create thread\n");
> +		return TEST_FAILURE;
> +	}
> +
> +	ret = lkl_sys_write(pipe_fds[1], tmp, sizeof(tmp));
> +	if (ret != sizeof(tmp)) {
> +		if (ret < 0)
> +			lkl_test_logf("write error: %s\n", lkl_strerror(ret));
> +		else
> +			lkl_test_logf("short write: %ld\n", ret);
> +		return TEST_FAILURE;
> +	}
> +
> +	ret = lkl_host_ops.thread_join(tid);
> +	if (ret) {
> +		lkl_test_logf("failed to join thread\n");
> +		return TEST_FAILURE;
> +	}
> +
> +	return TEST_SUCCESS;
> +}
> +
> +#ifndef __MINGW32__
> +static void thread_get_pid(void *unused)
> +{
> +	lkl_sys_getpid();
> +}
> +
> +static int lkl_test_many_syscall_threads(void)
> +{
> +	lkl_thread_t tid;
> +	int count = 65, ret;
> +
> +	while (--count > 0) {
> +		tid = lkl_host_ops.thread_create(thread_get_pid, NULL);
> +		if (!tid) {
> +			lkl_test_logf("failed to create thread\n");
> +			return TEST_FAILURE;
> +		}
> +
> +		ret = lkl_host_ops.thread_join(tid);
> +		if (ret) {
> +			lkl_test_logf("failed to join thread\n");
> +			return TEST_FAILURE;
> +		}
> +	}
> +
> +	return TEST_SUCCESS;
> +}
> +#endif
> +
> +static void thread_quit_immediately(void *unused)
> +{
> +}
> +
> +static int lkl_test_join(void)
> +{
> +	lkl_thread_t tid = lkl_host_ops.thread_create(thread_quit_immediately,
> +						      NULL);
> +	int ret = lkl_host_ops.thread_join(tid);
> +
> +	if (ret == 0) {
> +		lkl_test_logf("joined %ld\n", tid);
> +		return TEST_SUCCESS;
> +	}
> +
> +	lkl_test_logf("failed joining %ld\n", tid);
> +	return TEST_FAILURE;
> +}
> +
> +LKL_TEST_CALL(start_kernel, lkl_start_kernel, 0, &lkl_host_ops,
> +	     "mem=16M loglevel=8");
> +LKL_TEST_CALL(stop_kernel, lkl_sys_halt, 0);
> +
> +struct lkl_test tests[] = {
> +	LKL_TEST(mutex),
> +	LKL_TEST(semaphore),
> +	LKL_TEST(join),
> +	LKL_TEST(start_kernel),
> +	LKL_TEST(getpid),
> +	LKL_TEST(syscall_latency),
> +	LKL_TEST(umask),
> +	LKL_TEST(umask2),
> +	LKL_TEST(creat),
> +	LKL_TEST(close),
> +	LKL_TEST(failopen),
> +	LKL_TEST(open),
> +	LKL_TEST(write),
> +	LKL_TEST(lseek_cur),
> +	LKL_TEST(lseek_end),
> +	LKL_TEST(lseek_set),
> +	LKL_TEST(read),
> +	LKL_TEST(fstat),
> +	LKL_TEST(mkdir),
> +	LKL_TEST(stat),
> +#ifndef __MINGW32__
> +	LKL_TEST(nanosleep),
> +#endif
> +	LKL_TEST(pipe2),
> +	LKL_TEST(epoll),
> +	LKL_TEST(mount_fs_proc),
> +	LKL_TEST(chdir_proc),
> +	LKL_TEST(open_cwd),
> +	LKL_TEST(getdents64),
> +	LKL_TEST(close_dir_fd),
> +	LKL_TEST(chdir_root),
> +	LKL_TEST(umount_fs_proc),
> +	LKL_TEST(lo_ifup),
> +	LKL_TEST(gettid),
> +	LKL_TEST(syscall_thread),
> +	/*
> +	 * Wine has an issue where the FlsCallback is not called when
> +	 * the thread terminates which makes testing the automatic
> +	 * syscall threads cleanup impossible under wine.
> +	 */
> +#ifndef __MINGW32__
> +	LKL_TEST(many_syscall_threads),
> +#endif
> +	LKL_TEST(stop_kernel),
> +};
> +
> +int main(int argc, const char **argv)
> +{
> +	lkl_host_ops.print = lkl_test_log;
> +
> +	return lkl_test_run(tests, sizeof(tests)/sizeof(struct lkl_test),
> +			    "boot");
> +}
> diff --git a/tools/lkl/tests/boot.sh b/tools/lkl/tests/boot.sh
> new file mode 100755
> index 000000000000..d985c04b0ac1
> --- /dev/null
> +++ b/tools/lkl/tests/boot.sh
> @@ -0,0 +1,9 @@
> +#!/usr/bin/env bash
> +# SPDX-License-Identifier: GPL-2.0
> +
> +script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
> +source $script_dir/test.sh
> +
> +lkl_test_plan 1 "boot"
> +lkl_test_run 1
> +lkl_test_exec $script_dir/boot
> diff --git a/tools/lkl/tests/cla.c b/tools/lkl/tests/cla.c
> new file mode 100644
> index 000000000000..a34badeb5f06
> --- /dev/null
> +++ b/tools/lkl/tests/cla.c
> @@ -0,0 +1,159 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <stdio.h>
> +#include <string.h>
> +#include <errno.h>
> +#include <stdlib.h>
> +#ifdef __MINGW32__
> +#include <winsock2.h>
> +#else
> +#include <sys/socket.h>
> +#include <netinet/in.h>
> +#include <arpa/inet.h>
> +#endif
> +
> +#include "cla.h"
> +
> +static int cl_arg_parse_bool(struct cl_arg *arg, const char *value)
> +{
> +	*((int *)arg->store) = 1;
> +	return 0;
> +}
> +
> +static int cl_arg_parse_str(struct cl_arg *arg, const char *value)
> +{
> +	*((const char **)arg->store) = value;
> +	return 0;
> +}
> +
> +static int cl_arg_parse_int(struct cl_arg *arg, const char *value)
> +{
> +	errno = 0;
> +	*((int *)arg->store) = strtol(value, NULL, 0);
> +	return errno ? -1 : 0;
> +}
> +
> +static int cl_arg_parse_str_set(struct cl_arg *arg, const char *value)
> +{
> +	const char **set = arg->set;
> +	int i;
> +
> +	for (i = 0; set[i] != NULL; i++) {
> +		if (strcmp(set[i], value) == 0) {
> +			*((int *)arg->store) = i;
> +			return 0;
> +		}
> +	}
> +
> +	return -1;
> +}
> +
> +static int cl_arg_parse_ipv4(struct cl_arg *arg, const char *value)
> +{
> +	unsigned int addr;
> +
> +	if (!value)
> +		return -1;
> +
> +	addr = inet_addr(value);
> +	if (addr == INADDR_NONE)
> +		return -1;
> +	*((unsigned int *)arg->store) = addr;
> +	return 0;
> +}
> +
> +static cl_arg_parser_t parsers[] = {
> +	[CL_ARG_BOOL] = cl_arg_parse_bool,
> +	[CL_ARG_INT] = cl_arg_parse_int,
> +	[CL_ARG_STR] = cl_arg_parse_str,
> +	[CL_ARG_STR_SET] = cl_arg_parse_str_set,
> +	[CL_ARG_IPV4] = cl_arg_parse_ipv4,
> +};
> +
> +static struct cl_arg *find_short_arg(char name, struct cl_arg *args)
> +{
> +	struct cl_arg *arg;
> +
> +	for (arg = args; arg->short_name != 0; arg++) {
> +		if (arg->short_name == name)
> +			return arg;
> +	}
> +
> +	return NULL;
> +}
> +
> +static struct cl_arg *find_long_arg(const char *name, struct cl_arg *args)
> +{
> +	struct cl_arg *arg;
> +
> +	for (arg = args; arg->long_name; arg++) {
> +		if (strcmp(arg->long_name, name) == 0)
> +			return arg;
> +	}
> +
> +	return NULL;
> +}
> +
> +static void print_help(struct cl_arg *args)
> +{
> +	struct cl_arg *arg;
> +
> +	fprintf(stderr, "usage:\n");
> +	for (arg = args; arg->long_name; arg++) {
> +		fprintf(stderr, "-%c, --%-20s %s", arg->short_name,
> +			arg->long_name, arg->help);
> +		if (arg->type == CL_ARG_STR_SET) {
> +			const char **set = arg->set;
> +
> +			fprintf(stderr, " [ ");
> +			while (*set != NULL)
> +				fprintf(stderr, "%s ", *(set++));
> +			fprintf(stderr, "]");
> +		}
> +		fprintf(stderr, "\n");
> +	}
> +}
> +
> +int parse_args(int argc, const char **argv, struct cl_arg *args)
> +{
> +	int i;
> +
> +	for (i = 1; i < argc; i++) {
> +		struct cl_arg *arg = NULL;
> +		cl_arg_parser_t parser;
> +
> +		if (argv[i][0] == '-') {
> +			if (argv[i][1] != '-')
> +				arg = find_short_arg(argv[i][1], args);
> +			else
> +				arg = find_long_arg(&argv[i][2], args);
> +		}
> +
> +		if (!arg) {
> +			fprintf(stderr, "unknown option '%s'\n", argv[i]);
> +			print_help(args);
> +			return -1;
> +		}
> +
> +		if (arg->type == CL_ARG_USER || arg->type >= CL_ARG_END)
> +			parser = arg->parser;
> +		else
> +			parser = parsers[arg->type];
> +
> +		if (!parser) {
> +			fprintf(stderr, "can't parse --'%s'/-'%c'\n",
> +				arg->long_name, arg->short_name);
> +			return -1;
> +		}
> +
> +		if (parser(arg, argv[i + 1]) < 0) {
> +			fprintf(stderr, "can't parse '%s'\n", argv[i]);
> +			print_help(args);
> +			return -1;
> +		}
> +
> +		if (arg->has_arg)
> +			i++;
> +	}
> +
> +	return 0;
> +}
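
For reference, the dispatch in parse_args() boils down to: look the
option up by short or long name, hand the following token to the
matching parser, and skip that token when the option takes a value. A
compact Python sketch of that flow follows. The spec table mirrors the
one in disk.c, but the dict layout, find_spec() and the explicit
"missing value" check are my assumptions rather than anything in the
patch:

```python
SPECS = [
    # Mirrors the option table in disk.c: long name, short name, parser.
    {"long_name": "disk", "short_name": "d", "has_arg": True, "parse": str},
    # int(v, 0) accepts 0x prefixes; similar in spirit to strtol base 0,
    # but not identical (Python rejects a leading-zero "010").
    {"long_name": "partition", "short_name": "P", "has_arg": True,
     "parse": lambda v: int(v, 0)},
    {"long_name": "type", "short_name": "t", "has_arg": True, "parse": str},
]


def find_spec(token, specs):
    """Match '--long' against long_name and '-x' against short_name."""
    if token.startswith("--"):
        return next((s for s in specs if s["long_name"] == token[2:]), None)
    if token.startswith("-") and len(token) > 1:
        return next((s for s in specs if s["short_name"] == token[1]), None)
    return None


def parse_args(argv, specs):
    """Return {long_name: value}; raise ValueError on any bad input."""
    out = {}
    i = 1  # argv[0] is the program name
    while i < len(argv):
        spec = find_spec(argv[i], specs)
        if spec is None:
            raise ValueError("unknown option %r" % argv[i])
        if spec["has_arg"]:
            if i + 1 >= len(argv):
                raise ValueError("missing value for %r" % argv[i])
            out[spec["long_name"]] = spec["parse"](argv[i + 1])
            i += 2  # consume the option and its value
        else:
            out[spec["long_name"]] = True
            i += 1
    return out


if __name__ == "__main__":
    print(parse_args(["disk", "-d", "disk.img", "-t", "ext4"], SPECS))
```
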
> diff --git a/tools/lkl/tests/cla.h b/tools/lkl/tests/cla.h
> new file mode 100644
> index 000000000000..f8369be02e5a
> --- /dev/null
> +++ b/tools/lkl/tests/cla.h
> @@ -0,0 +1,33 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _LKL_TEST_CLA_H
> +#define _LKL_TEST_CLA_H
> +
> +enum cl_arg_type {
> +	CL_ARG_USER = 0,
> +	CL_ARG_BOOL,
> +	CL_ARG_INT,
> +	CL_ARG_STR,
> +	CL_ARG_STR_SET,
> +	CL_ARG_IPV4,
> +	CL_ARG_END,
> +};
> +
> +struct cl_arg;
> +
> +typedef int (*cl_arg_parser_t)(struct cl_arg *arg, const char *value);
> +
> +struct cl_arg {
> +	const char *long_name;
> +	char short_name;
> +	const char *help;
> +	int has_arg;
> +	enum cl_arg_type type;
> +	void *store;
> +	void *set;
> +	cl_arg_parser_t parser;
> +};
> +
> +int parse_args(int argc, const char **argv, struct cl_arg *args);
> +
> +
> +#endif /* _LKL_TEST_CLA_H */
> diff --git a/tools/lkl/tests/disk.c b/tools/lkl/tests/disk.c
> new file mode 100644
> index 000000000000..0aa039876b54
> --- /dev/null
> +++ b/tools/lkl/tests/disk.c
> @@ -0,0 +1,189 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <stdio.h>
> +#include <unistd.h>
> +#include <string.h>
> +#include <time.h>
> +#include <stdlib.h>
> +#include <stdint.h>
> +#include <lkl.h>
> +#include <lkl_host.h>
> +#ifndef __MINGW32__
> +#include <sys/stat.h>
> +#include <fcntl.h>
> +#include <sys/ioctl.h>
> +#else
> +#include <windows.h>
> +#endif
> +
> +#include "test.h"
> +#include "cla.h"
> +
> +static struct {
> +	int printk;
> +	const char *disk;
> +	const char *fstype;
> +	int partition;
> +} cla;
> +
> +struct cl_arg args[] = {
> +	{"disk", 'd', "disk file to use", 1, CL_ARG_STR, &cla.disk},
> +	{"partition", 'P', "partition to mount", 1, CL_ARG_INT, &cla.partition},
> +	{"type", 't', "filesystem type", 1, CL_ARG_STR, &cla.fstype},
> +	{0},
> +};
> +
> +
> +static struct lkl_disk disk;
> +static int disk_id = -1;
> +
> +int lkl_test_disk_add(void)
> +{
> +#ifdef __MINGW32__
> +	disk.handle = CreateFile(cla.disk, GENERIC_READ | GENERIC_WRITE,
> +			       0, NULL, OPEN_EXISTING, 0, NULL);
> +	if (disk.handle == INVALID_HANDLE_VALUE)
> +#else
> +	disk.fd = open(cla.disk, O_RDWR);
> +	if (disk.fd < 0)
> +#endif
> +		goto out_unlink;
> +
> +	disk.ops = NULL;
> +
> +	disk_id = lkl_disk_add(&disk);
> +	if (disk_id < 0)
> +		goto out_close;
> +
> +	goto out;
> +
> +out_close:
> +#ifdef __MINGW32__
> +	CloseHandle(disk.handle);
> +#else
> +	close(disk.fd);
> +#endif
> +
> +out_unlink:
> +#ifdef __MINGW32__
> +	DeleteFile(cla.disk);
> +#else
> +	unlink(cla.disk);
> +#endif
> +
> +out:
> +	lkl_test_logf("disk fd/handle %x disk_id %d", disk.fd, disk_id);
> +
> +	if (disk_id >= 0)
> +		return TEST_SUCCESS;
> +
> +	return TEST_FAILURE;
> +}
> +
> +int lkl_test_disk_remove(void)
> +{
> +	int ret;
> +
> +	ret = lkl_disk_remove(disk);
> +
> +#ifdef __MINGW32__
> +	CloseHandle(disk.handle);
> +#else
> +	close(disk.fd);
> +#endif
> +
> +	if (ret == 0)
> +		return TEST_SUCCESS;
> +
> +	return TEST_FAILURE;
> +}
> +
> +
> +static char mnt_point[32];
> +
> +LKL_TEST_CALL(mount_dev, lkl_mount_dev, 0, disk_id, cla.partition, cla.fstype,
> +	      0, NULL, mnt_point, sizeof(mnt_point))
> +
> +static int lkl_test_umount_dev(void)
> +{
> +	long ret, ret2;
> +
> +	ret = lkl_sys_chdir("/");
> +
> +	ret2 = lkl_umount_dev(disk_id, cla.partition, 0, 1000);
> +
> +	lkl_test_logf("%ld %ld", ret, ret2);
> +
> +	if (!ret && !ret2)
> +		return TEST_SUCCESS;
> +
> +	return TEST_FAILURE;
> +}
> +
> +struct lkl_dir *dir;
> +
> +static int lkl_test_opendir(void)
> +{
> +	int err;
> +
> +	dir = lkl_opendir(mnt_point, &err);
> +
> +	lkl_test_logf("lkl_opendir(%s) = %d %s\n", mnt_point, err,
> +		      lkl_strerror(err));
> +
> +	if (err == 0)
> +		return TEST_SUCCESS;
> +
> +	return TEST_FAILURE;
> +}
> +
> +static int lkl_test_readdir(void)
> +{
> +	struct lkl_linux_dirent64 *de = lkl_readdir(dir);
> +	int wr = 0;
> +
> +	while (de) {
> +		wr += lkl_test_logf("%s ", de->d_name);
> +		if (wr >= 70) {
> +			lkl_test_logf("\n");
> +			wr = 0;
> +			break;
> +		}
> +		de = lkl_readdir(dir);
> +	}
> +
> +	if (lkl_errdir(dir) == 0)
> +		return TEST_SUCCESS;
> +
> +	return TEST_FAILURE;
> +}
> +
> +LKL_TEST_CALL(closedir, lkl_closedir, 0, dir);
> +LKL_TEST_CALL(chdir_mnt_point, lkl_sys_chdir, 0, mnt_point);
> +LKL_TEST_CALL(start_kernel, lkl_start_kernel, 0, &lkl_host_ops,
> +	     "mem=16M loglevel=8");
> +LKL_TEST_CALL(stop_kernel, lkl_sys_halt, 0);
> +
> +struct lkl_test tests[] = {
> +	LKL_TEST(disk_add),
> +	LKL_TEST(start_kernel),
> +	LKL_TEST(mount_dev),
> +	LKL_TEST(chdir_mnt_point),
> +	LKL_TEST(opendir),
> +	LKL_TEST(readdir),
> +	LKL_TEST(closedir),
> +	LKL_TEST(umount_dev),
> +	LKL_TEST(stop_kernel),
> +	LKL_TEST(disk_remove),
> +
> +};
> +
> +int main(int argc, const char **argv)
> +{
> +	if (parse_args(argc, argv, args) < 0)
> +		return -1;
> +
> +	lkl_host_ops.print = lkl_test_log;
> +
> +	return lkl_test_run(tests, sizeof(tests)/sizeof(struct lkl_test),
> +			    "disk %s", cla.fstype);
> +}
> diff --git a/tools/lkl/tests/disk.sh b/tools/lkl/tests/disk.sh
> new file mode 100755
> index 000000000000..9bdcb16f2d5c
> --- /dev/null
> +++ b/tools/lkl/tests/disk.sh
> @@ -0,0 +1,61 @@
> +#!/usr/bin/env bash
> +# SPDX-License-Identifier: GPL-2.0
> +
> +script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
> +
> +source $script_dir/test.sh
> +
> +function prepfs()
> +{
> +    set -e
> +
> +    file=`mktemp`
> +
> +    dd if=/dev/zero of=$file bs=1024 count=204800
> +
> +    yes | mkfs.$1 $file
> +
> +    if ! [ -z $BSD_WDIR ]; then
> +        $MYSSH mkdir -p $BSD_WDIR
> +        ssh_copy $file $BSD_WDIR
> +        rm $file
> +        file=$BSD_WDIR/$(basename $file)
> +    fi
> +
> +    export_vars file
> +}
> +
> +function cleanfs()
> +{
> +    set -e
> +
> +    if ! [ -z $BSD_WDIR ]; then
> +        $MYSSH rm $1
> +        $MYSSH rm $BSD_WDIR/disk
> +    else
> +        rm $1
> +    fi
> +}
> +
> +if [ "$1" = "-t" ]; then
> +    shift
> +    fstype=$1
> +    shift
> +fi
> +
> +if [ -z "$fstype" ]; then
> +    fstype="ext4"
> +fi
> +
> +if [ -z $(which mkfs.$fstype) ]; then
> +    lkl_test_plan 0 "disk $fstype"
> +    echo "no mkfs.$fstype command"
> +    exit 0
> +fi
> +
> +lkl_test_plan 1 "disk $fstype"
> +lkl_test_run 1 prepfs $fstype
> +lkl_test_exec $script_dir/disk -d $file -t $fstype $@
> +lkl_test_plan 1 "disk $fstype"
> +lkl_test_run 1 cleanfs $file
> +
> diff --git a/tools/lkl/tests/run.py b/tools/lkl/tests/run.py
> new file mode 100755
> index 000000000000..8fea72686a7a
> --- /dev/null
> +++ b/tools/lkl/tests/run.py
> @@ -0,0 +1,182 @@
> +#!/usr/bin/env python
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; version 2 of the License
> +#
> +# Author: Octavian Purdila <tavi@cs.pub.ro>
> +#
> +
> +from __future__ import print_function
> +
> +import argparse
> +import os
> +import subprocess
> +import sys
> +import tap13
> +import xml.etree.ElementTree as ET
> +
> +from junit_xml import TestSuite, TestCase
> +
> +
> +class Reporter(tap13.Reporter):
> +    def start(self, obj):
> +        if type(obj) is tap13.Test:
> +            if obj.result == "*":
> +                end='\r'
> +            else:
> +                end='\n'
> +            print("  TEST       %-8s %.50s" %
> +                  (obj.result, obj.description + " " + obj.comment), end=end)
> +
> +        elif type(obj) is tap13.Suite:
> +            if obj.tests_planned == 0:
> +                status = "skip"
> +            else:
> +                status = ""
> +            print("  SUITE      %-8s %s" % (status, obj.name))
> +
> +    def end(self, obj):
> +        if type(obj) is tap13.Test:
> +            if obj.result != "ok":
> +                try:
> +                    print(obj.yaml["log"], end='')
> +                except:
> +                    None
> +
> +
> +mydir=os.path.dirname(os.path.realpath(__file__))
> +
> +tests = [
> +    'boot.sh',
> +    'disk.sh -t ext4',
> +    'disk.sh -t vfat',
> +    'net.sh -b loopback',
> +    'net.sh -b tap',
> +    'net.sh -b pipe',
> +    'net.sh -b raw',
> +    'net.sh -b macvtap',
> +    'lklfuse.sh -t ext4',
> +    'lklfuse.sh -t vfat',
> +    'hijack-test.sh'
> +]
> +
> +parser = argparse.ArgumentParser(description='LKL test runner')
> +parser.add_argument('tests', nargs='?', action='append',
> +                    help='tests to run %s' % tests)
> +parser.add_argument('--junit-dir',
> +                    help='directory where to store the junit suites')
> +parser.add_argument('--gdb', action='store_true', default=False,
> +                    help='run simple tests under gdb; implies --pass-through')
> +parser.add_argument('--pass-through', action='store_true',  default=False,
> +                    help='run the test without interpreting the test output')
> +parser.add_argument('--valgrind', action='store_true', default=False,
> +                    help='run simple tests under valgrind')
> +
> +args = parser.parse_args()
> +if args.tests == [None]:
> +    args.tests = tests
> +
> +if args.gdb:
> +    args.pass_through=True
> +    os.environ['GDB']="yes"
> +
> +if args.valgrind:
> +    os.environ['VALGRIND']="yes"
> +
> +tap = tap13.Parser(Reporter())
> +
> +os.environ['PATH'] += ":" + mydir
> +
> +exit_code = 0
> +
> +for t in args.tests:
> +    if not t:
> +        continue
> +    if args.pass_through:
> +        print(t)
> +        if subprocess.call(t, shell=True) != 0:
> +            exit_code = 1
> +    else:
> +        p = subprocess.Popen(t, shell=True, stdout=subprocess.PIPE)
> +        tap.parse(p.stdout)
> +
> +if args.pass_through:
> +    sys.exit(exit_code)
> +
> +suites_count = 0
> +tests_total = 0
> +tests_not_ok = 0
> +tests_ok = 0
> +tests_skip = 0
> +val_errs = 0
> +val_fails = 0
> +val_skips = 0
> +
> +for s in tap.run.suites:
> +
> +    junit_tests = []
> +    suites_count += 1
> +
> +    for t in s.tests:
> +        try:
> +            secs = t.yaml["time_us"] / 1000000.0
> +        except Exception:
> +            secs = 0
> +        try:
> +            log = t.yaml['log']
> +        except Exception:
> +            log = ""
> +
> +        jt = TestCase(t.description, elapsed_sec=secs, stdout=log)
> +        if t.result == 'skip':
> +            jt.add_skipped_info(output=log)
> +        elif t.result == 'not ok':
> +            jt.add_error_info(output=log)
> +
> +        junit_tests.append(jt)
> +
> +        tests_total += 1
> +        if t.result == "ok":
> +            tests_ok += 1
> +        elif t.result == "not ok":
> +            tests_not_ok += 1
> +            exit_code = 1
> +        elif t.result == "skip":
> +            tests_skip += 1
> +
> +    if args.junit_dir:
> +        js = TestSuite(s.name, junit_tests)
> +        with open(os.path.join(args.junit_dir, os.path.basename(s.name) + '.xml'), 'w') as f:
> +            js.to_file(f, [js])
> +
> +        if os.getenv('VALGRIND') is not None:
> +            val_xml = 'valgrind-%s.xml' % os.path.basename(s.name).replace(' ','-')
> +            # skipped tests don't generate xml file
> +            if os.path.exists(val_xml) is False:
> +                continue
> +
> +            cmd = 'mv %s %s' % (val_xml, args.junit_dir)
> +            subprocess.call(cmd, shell=True, )
> +
> +            cmd = mydir + '/valgrind2xunit.py ' + val_xml
> +            subprocess.call(cmd, shell=True, cwd=args.junit_dir)
> +
> +            # count valgrind results
> +            doc = ET.parse(os.path.join(args.junit_dir, 'valgrind-%s_xunit.xml' \
> +                                        % (os.path.basename(s.name).replace(' ','-'))))
> +            ts = doc.getroot()
> +            val_errs += int(ts.get('errors'))
> +            val_fails += int(ts.get('failures'))
> +            val_skips += int(ts.get('skip'))
> +
> +print("Summary: %d suites run, %d tests, %d ok, %d not ok, %d skipped" %
> +      (suites_count, tests_total, tests_ok, tests_not_ok, tests_skip))
> +
> +if os.getenv('VALGRIND') is not None:
> +    print(" valgrind (memcheck): %d failures, %d skipped" % (val_fails, val_skips))
> +    if val_errs or val_fails:
> +        exit_code = 1
> +
> +sys.exit(exit_code)
> diff --git a/tools/lkl/tests/tap13.py b/tools/lkl/tests/tap13.py
> new file mode 100644
> index 000000000000..65c73cda7ca1
> --- /dev/null
> +++ b/tools/lkl/tests/tap13.py
> @@ -0,0 +1,209 @@
> +#!/usr/bin/env python
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; version 2 of the License
> +#
> +# Author: Octavian Purdila <tavi@cs.pub.ro>
> +#
> +# Based on TAP13:
> +#
> +# Copyright 2013, Red Hat, Inc.
> +# Author: Josef Skladanka <jskladan@redhat.com>
> +#
> +from __future__ import print_function
> +
> +import re
> +import sys
> +import yamlish
> +
> +
> +class Reporter(object):
> +
> +    def start(self, obj):
> +        pass
> +
> +    def end(self, obj):
> +        pass
> +
> +
> +class Test(object):
> +    def __init__(self, reporter, result, id, description=None, directive=None,
> +                 comment=None):
> +        self.reporter = reporter
> +        self.result = result
> +        if directive:
> +            self.result = directive.lower()
> +        if id:
> +            self.id = int(id)
> +        else:
> +            self.id = None
> +        if description:
> +            self.description = description
> +        else:
> +            self.description = ""
> +        if comment:
> +            self.comment = "# " + comment
> +        else:
> +            self.comment = ""
> +        self.yaml = None
> +        self._yaml_buffer = None
> +        self.diagnostics = []
> +
> +        self.reporter.start(self)
> +
> +    def end(self):
> +        if not self.yaml:
> +            self.yaml = yamlish.load(self._yaml_buffer)
> +            self.reporter.end(self)
> +
> +
> +class Suite(object):
> +    def __init__(self, reporter, start, end, explanation):
> +        self.reporter = reporter
> +        self.tests = []
> +        self.name = explanation
> +        self.tests_planned = int(end)
> +
> +        self.__tests_counter = 0
> +        self.__tests_base = 0
> +
> +        self.reporter.start(self)
> +
> +    def newTest(self, args):
> +        try:
> +            self.tests[-1].end()
> +        except IndexError:
> +            pass
> +
> +        if 'id' not in args or not args['id']:
> +            args['id'] = self.__tests_counter
> +        else:
> +            args['id'] = int(args['id']) + self.__tests_base
> +
> +        if args['id'] < self.__tests_counter:
> +            print("error: bad test id %d, fixing it" % (args['id']))
> +            args['id'] = self.__tests_counter
> +        # according to TAP13 specs, missing tests must be handled as 'not ok'
> +        # here we add the missing tests in sequence
> +        while args['id'] > (self.__tests_counter + 1):
> +            comment = 'test %d not present' % self.__tests_counter
> +            self.tests.append(Test(self.reporter, 'not ok',
> +                                   self.__tests_counter, comment=comment))
> +            self.__tests_counter += 1
> +
> +        if args['id'] == self.__tests_counter:
> +            if args['directive']:
> +                self.test().result = args['directive'].lower()
> +            else:
> +                self.test().result = args['result']
> +            self.reporter.start(self.test())
> +        else:
> +            self.tests.append(Test(self.reporter, **args))
> +            self.__tests_counter += 1
> +
> +    def test(self):
> +        return self.tests[-1]
> +
> +    def end(self, name, planned):
> +        if name == self.name:
> +            self.tests_planned += int(planned)
> +            self.__tests_base = self.__tests_counter
> +            return False
> +        try:
> +            self.test().end()
> +        except IndexError:
> +            pass
> +        if len(self.tests) != self.tests_planned:
> +            for i in range(len(self.tests), self.tests_planned):
> +                self.tests.append(Test(self.reporter, 'not ok', i+1,
> +                                       comment='test not present'))
> +        return True
> +
> +
> +class Run(object):
> +
> +    def __init__(self, reporter):
> +        self.reporter = reporter
> +        self.suites = []
> +
> +    def suite(self):
> +        return self.suites[-1]
> +
> +    def test(self):
> +        return self.suites[-1].tests[-1]
> +
> +    def newSuite(self, args):
> +        new = False
> +        try:
> +            if self.suite().end(args['explanation'], args['end']):
> +                new = True
> +        except IndexError:
> +            new = True
> +        if new:
> +            self.suites.append(Suite(self.reporter, **args))
> +
> +    def newTest(self, args):
> +        self.suite().newTest(args)
> +
> +
> +class Parser(object):
> +    RE_PLAN = re.compile(r"^\s*(?P<start>\d+)\.\.(?P<end>\d+)\s*(#\s*(?P<explanation>.*))?\s*$")
> +    RE_TEST_LINE = re.compile(r"^\s*(?P<result>(not\s+)?ok|[*]+)\s*(?P<id>\d+)?\s*(?P<description>[^#]+)?\s*(#\s*(?P<directive>TODO|SKIP)?\s*(?P<comment>.+)?)?\s*$",  re.IGNORECASE)
> +    RE_EXPLANATION = re.compile(r"^\s*#\s*(?P<explanation>.+)?\s*$")
> +    RE_YAMLISH_START = re.compile(r"^\s*---.*$")
> +    RE_YAMLISH_END = re.compile(r"^\s*\.\.\.\s*$")
> +
> +    def __init__(self, reporter):
> +        self.seek_test = False
> +        self.in_test = False
> +        self.in_yaml = False
> +        self.run = Run(reporter)
> +
> +    def parse(self, source):
> +        # to avoid input buffering
> +        while True:
> +            line = source.readline()
> +            if not line:
> +                break
> +
> +            if self.in_yaml:
> +                if Parser.RE_YAMLISH_END.match(line):
> +                    self.run.test()._yaml_buffer.append(line.strip())
> +                    self.in_yaml = False
> +                else:
> +                    self.run.test()._yaml_buffer.append(line.rstrip())
> +                continue
> +
> +            line = line.strip()
> +
> +            if self.in_test:
> +                if Parser.RE_EXPLANATION.match(line):
> +                    self.run.test().diagnostics.append(line)
> +                    continue
> +                if Parser.RE_YAMLISH_START.match(line):
> +                    self.run.test()._yaml_buffer = [line.strip()]
> +                    self.in_yaml = True
> +                    continue
> +
> +            m = Parser.RE_PLAN.match(line)
> +            if m:
> +                self.seek_test = True
> +                args = m.groupdict()
> +                self.run.newSuite(args)
> +                continue
> +
> +            if self.seek_test:
> +                m = Parser.RE_TEST_LINE.match(line)
> +                if m:
> +                    args = m.groupdict()
> +                    self.run.newTest(args)
> +                    self.in_test = True
> +                    continue
> +
> +            print(line)
> +        try:
> +            self.run.suite().end(None, 0)
> +        except IndexError:
> +            pass
> diff --git a/tools/lkl/tests/test.c b/tools/lkl/tests/test.c
> new file mode 100644
> index 000000000000..3e334d106c48
> --- /dev/null
> +++ b/tools/lkl/tests/test.c
> @@ -0,0 +1,126 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <stdio.h>
> +#include <stdarg.h>
> +#include <time.h>
> +
> +#include "test.h"
> +
> +/* circular log buffer */
> +
> +static char log_buf[0x10000];
> +static char *head = log_buf, *tail = log_buf;
> +
> +static inline void advance(char **ptr)
> +{
> +	if ((unsigned int)(*ptr - log_buf) >= sizeof(log_buf) - 1)
> +		*ptr = log_buf;
> +	else
> +		*ptr = *ptr + 1;
> +}
> +
> +static void log_char(char c)
> +{
> +	*tail = c;
> +	advance(&tail);
> +	if (tail == head)
> +		advance(&head);
> +}
> +
> +static void print_log(void)
> +{
> +	char last;
> +
> +	printf(" log: |\n");
> +	last = '\n';
> +	while (head != tail) {
> +		if (last == '\n')
> +			printf("  ");
> +		last = *head;
> +		putchar(last);
> +		advance(&head);
> +	}
> +	if (last != '\n')
> +		putchar('\n');
> +}
> +
> +int lkl_test_run(const struct lkl_test *tests, int nr, const char *fmt, ...)
> +{
> +	int i, ret, status = TEST_SUCCESS;
> +	clock_t start, stop;
> +	char name[1024];
> +	va_list args;
> +
> +	va_start(args, fmt);
> +	vsnprintf(name, sizeof(name), fmt, args);
> +	va_end(args);
> +
> +	printf("1..%d # %s\n", nr, name);
> +	for (i = 1; i <= nr; i++) {
> +		const struct lkl_test *t = &tests[i-1];
> +		unsigned long delta_us;
> +
> +		printf("* %d %s\n", i, t->name);
> +		fflush(stdout);
> +
> +		start = clock();
> +
> +		ret = t->fn(t->arg1, t->arg2, t->arg3);
> +
> +		stop = clock();
> +
> +		switch (ret) {
> +		case TEST_SUCCESS:
> +			printf("ok %d %s\n", i, t->name);
> +			break;
> +		case TEST_SKIP:
> +			printf("ok %d %s # SKIP\n", i, t->name);
> +			break;
> +		case TEST_BAILOUT:
> +			status = TEST_BAILOUT;
> +			/* fall through */
> +		case TEST_FAILURE:
> +		default:
> +			if (status != TEST_BAILOUT)
> +				status = TEST_FAILURE;
> +			printf("not ok %d %s\n", i, t->name);
> +		}
> +
> +		printf(" ---\n");
> +		delta_us = (stop - start) * 1000000 / CLOCKS_PER_SEC;
> +		printf(" time_us: %lu\n", delta_us);
> +		print_log();
> +		printf(" ...\n");
> +
> +		if (status == TEST_BAILOUT) {
> +			printf("Bail out!\n");
> +			return TEST_FAILURE;
> +		}
> +
> +		fflush(stdout);
> +	}
> +
> +	return status;
> +}
> +
> +
> +void lkl_test_log(const char *str, int len)
> +{
> +	while (len--)
> +		log_char(*(str++));
> +}
> +
> +int lkl_test_logf(const char *fmt, ...)
> +{
> +	char tmp[1024], *c;
> +	va_list args;
> +	unsigned int n;
> +
> +	va_start(args, fmt);
> +	n = vsnprintf(tmp, sizeof(tmp), fmt, args);
> +	va_end(args);
> +
> +	for (c = tmp; *c != 0; c++)
> +		log_char(*c);
> +
> +	return n > sizeof(tmp) ? sizeof(tmp) : n;
> +}
> diff --git a/tools/lkl/tests/test.h b/tools/lkl/tests/test.h
> new file mode 100644
> index 000000000000..f63ad6d419cb
> --- /dev/null
> +++ b/tools/lkl/tests/test.h
> @@ -0,0 +1,72 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _LKL_TEST_H
> +#define _LKL_TEST_H
> +
> +#define TEST_SUCCESS	0
> +#define TEST_FAILURE	1
> +#define TEST_SKIP	2
> +#define TEST_TODO	3
> +#define TEST_BAILOUT	4
> +
> +struct lkl_test {
> +	const char *name;
> +	int (*fn)();
> +	void *arg1, *arg2, *arg3;
> +};
> +
> +/**
> + * Simple wrapper to initialize a test entry.
> + * @name - test name; the test function is assumed to be named lkl_test_@name
> + * @vargs - arguments to be passed to the function
> + */
> +#define LKL_TEST(name, ...) { #name, lkl_test_##name, __VA_ARGS__ }
> +
> +/**
> + * lkl_test_run - run a test suite
> + *
> + * @tests - the list of tests to run
> + * @nr - number of tests
> + * @fmt - format string to be used for suite name
> + */
> +int lkl_test_run(const struct lkl_test *tests, int nr, const char *fmt, ...);
> +
> +/**
> + * lkl_test_log - store a string in the test log buffer
> + * @str - the string to log (need not be NUL-terminated)
> + * @len - the string length
> + */
> +void lkl_test_log(const char *str, int len);
> +
> +/**
> + * lkl_test_logf - printf like function to store into the test log buffer
> + * @fmt - printf format string
> + * @vargs - arguments to the format string
> + */
> +int lkl_test_logf(const char *fmt, ...) __attribute__((format(printf, 1, 2)));
> +
> +/**
> + * LKL_TEST_CALL - create a test function for an LKL call
> + *
> + * The test function will be named lkl_test_@name and will return
> + * TEST_SUCCESS if the called function returns @expect. Otherwise
> + * it will return TEST_FAILURE.
> + *
> + * @name - test name; must be unique because it is part of the
> + * test function name (lkl_test_@name)
> + * @call - function to call
> + * @expect - expected return value for success
> + * @args - arguments to pass to the LKL call
> + */
> +#define LKL_TEST_CALL(name, call, expect, ...)				\
> +	static int lkl_test_##name(void)				\
> +	{								\
> +		long ret;						\
> +									\
> +		ret = call(__VA_ARGS__);				\
> +		lkl_test_logf("%s(%s) = %ld %s\n", #call, #__VA_ARGS__, \
> +			ret, ret < 0 ? lkl_strerror(ret) : "");		\
> +		return (ret == expect) ? TEST_SUCCESS : TEST_FAILURE;	\
> +	}
> +
> +
> +#endif /* _LKL_TEST_H */
> diff --git a/tools/lkl/tests/test.sh b/tools/lkl/tests/test.sh
> new file mode 100644
> index 000000000000..1a5619aed735
> --- /dev/null
> +++ b/tools/lkl/tests/test.sh
> @@ -0,0 +1,179 @@
> +#!/usr/bin/env bash
> +# SPDX-License-Identifier: GPL-2.0
> +
> +script_dir=$(cd $(dirname ${BASH_SOURCE:-$0}); pwd)
> +basedir=$(cd $script_dir/..; pwd)
> +source ${script_dir}/autoconf.sh
> +
> +TEST_SUCCESS=0
> +TEST_FAILURE=1
> +TEST_SKIP=113
> +TEST_TODO=114
> +TEST_BAILOUT=115
> +
> +print_log()
> +{
> +    echo " log: |"
> +    while IFS= read -r line; do
> +        echo "  $line"
> +    done < "$1"
> +}
> +
> +export_vars()
> +{
> +    if [ -z "$var_file" ]; then
> +        return
> +    fi
> +
> +    for i in $@; do
> +        echo "$i=${!i}" >> $var_file
> +    done
> +}
> +
> +lkl_test_run()
> +{
> +    log_file=$(mktemp)
> +    export var_file=$(mktemp)
> +
> +    tid=$1 && shift && tname=$@
> +
> +    echo "* $tid $tname"
> +
> +    start=$(date '+%s%9N')
> +    # run in a separate shell to avoid -e terminating us
> +    $@ 2>&1 | strings >$log_file
> +    exit=${PIPESTATUS[0]}
> +    stop=$(date '+%s%9N')
> +
> +    case $exit in
> +    $TEST_SUCCESS)
> +        echo "ok $tid $tname"
> +        ;;
> +    $TEST_SKIP)
> +        echo "ok $tid $tname # SKIP"
> +        ;;
> +    $TEST_BAILOUT)
> +        echo "not ok $tid $tname"
> +        echo "Bail out!"
> +        ;;
> +    $TEST_FAILURE|*)
> +        echo "not ok $tid $tname"
> +        ;;
> +    esac
> +
> +    delta=$(((stop-start)/1000))
> +
> +    echo " ---"
> +    echo " time_us: $delta"
> +    print_log $log_file
> +    echo -e " ..."
> +
> +    rm $log_file
> +    . $var_file
> +    rm $var_file
> +
> +    return $exit
> +}
> +
> +lkl_test_plan()
> +{
> +    echo "1..$1 # $2"
> +    export suite_name="${2// /\-}"
> +}
> +
> +lkl_test_exec()
> +{
> +    local SUDO=""
> +    local WRAPPER=""
> +
> +    if [ "$1" = "sudo" ]; then
> +        SUDO=sudo
> +        shift
> +    fi
> +
> +    local file=$1
> +    shift
> +
> +    if [ -n "$LKL_HOST_CONFIG_NT" ]; then
> +        file=$file.exe
> +    fi
> +
> +    if file $file | grep ARM; then
> +        WRAPPER="qemu-arm-static"
> +    elif file $file | grep "FreeBSD" ; then
> +        ssh_copy "$file" $BSD_WDIR
> +        if [ -n "$SUDO" ]; then
> +            SUDO=""
> +        fi
> +        WRAPPER="$MYSSH $SU"
> +        # ssh will mess up with pipes ('|') so, escape the pipe char.
> +        args="${@//\|/\\\|}"
> +        set - $BSD_WDIR/$(basename $file) $args
> +        file=""
> +    elif [ -n "$GDB" ]; then
> +        WRAPPER="gdb"
> +        args="$@"
> +        set - -ex "run $args" -ex quit $file
> +        file=""
> +    elif [ -n "$VALGRIND" ]; then
> +        WRAPPER="valgrind --suppressions=$script_dir/valgrind.supp \
> +                  --leak-check=full --show-leak-kinds=all --xml=yes \
> +                  --xml-file=valgrind-$suite_name.xml"
> +    fi
> +
> +    $SUDO $WRAPPER $file "$@"
> +}
> +
> +lkl_test_cmd()
> +{
> +    local WRAPPER=""
> +
> +    if [ -z "$QUIET" ]; then
> +        SHOPTS="-x"
> +    fi
> +
> +    if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
> +        WRAPPER="$MYSSH $SU"
> +    fi
> +
> +    echo "$@" | $WRAPPER sh $SHOPTS
> +}
> +
> +# XXX: $MYSSH and $MYSCP are defined in a circleci docker image.
> +# see the definitions in lkl/lkl-docker:circleci/freebsd11/Dockerfile
> +ssh_push()
> +{
> +    while [ -n "$1" ]; do
> +        if [[ "$1" = *.sh ]]; then
> +            type="script"
> +        else
> +            type="file"
> +        fi
> +
> +        dir=$(dirname $1)
> +        $MYSSH mkdir -p $BSD_WDIR/$dir
> +
> +        $MYSCP -P 7722 -r $basedir/$1 root@localhost:$BSD_WDIR/$dir
> +        if [ "$type" = "script" ]; then
> +            $MYSSH chmod a+x $BSD_WDIR/$1
> +        fi
> +
> +        shift
> +    done
> +}
> +
> +ssh_copy()
> +{
> +    $MYSCP -P 7722 -r $1 root@localhost:$2
> +}
> +
> +lkl_test_bsd_cleanup()
> +{
> +    $MYSSH rm -rf $BSD_WDIR
> +}
> +
> +if [ -n "$LKL_HOST_CONFIG_BSD" ]; then
> +    trap lkl_test_bsd_cleanup EXIT
> +    export BSD_WDIR=/root/lkl
> +    $MYSSH mkdir -p $BSD_WDIR
> +fi
> diff --git a/tools/lkl/tests/valgrind.supp b/tools/lkl/tests/valgrind.supp
> new file mode 100644
> index 000000000000..5ce717d759fc
> --- /dev/null
> +++ b/tools/lkl/tests/valgrind.supp
> @@ -0,0 +1,85 @@
> +{
> +   <unfinished timer 1>
> +   Memcheck:Leak
> +   match-leak-kinds: possible
> +   ...
> +   fun:pthread_create@@GLIBC_2.2.5
> +   fun:__start_helper_thread
> +   fun:__pthread_once_slow
> +   fun:timer_create@@GLIBC_2.3.3
> +   fun:timer_alloc
> +   fun:clockevent_set_state_oneshot
> +   ...
> +   fun:__clockevents_switch_state
> +   fun:clockevents_switch_state
> +   fun:tick_setup_periodic
> +   ...
> +}
> +
> +{
> +   <pid1 kernel thread>
> +   Memcheck:Leak
> +   match-leak-kinds: possible
> +   ...
> +   fun:thread_create
> +   fun:copy_thread
> +   fun:copy_thread_tls
> +   ...
> +   fun:rest_init
> +   fun:start_kernel
> +   fun:lkl_run_kernel
> +}
> +
> +{
> +   <xfs uninitialized buf error: delete this once upstream is fixed>
> +   Memcheck:Value8
> +   fun:crc32_body
> +   fun:crc32_le_generic
> +   fun:__crc32c_le
> +   fun:chksum_update
> +   fun:crypto_shash_update
> +   fun:crc32c
> +   fun:xlog_cksum
> +}
> +
> +{
> +   <xfs pwrite64 issue: delete this once upstream is fixed>
> +   Memcheck:Param
> +   pwrite64(buf)
> +   ...
> +   fun:blk_request
> +   fun:blk_enqueue
> +   fun:virtio_process_one
> +   fun:virtio_process_queue
> +   fun:virtio_write
> +   fun:__raw_writel
> +   fun:writel
> +   fun:vm_notify
> +   fun:virtqueue_notify
> +   fun:virtio_queue_rq
> +   fun:blk_mq_dispatch_rq_list
> +   fun:blk_mq_sched_dispatch_requests
> +}
> +
> +{
> +   <virtio_net_pipe xmits>
> +   Memcheck:Param
> +   writev(vector[...])
> +   ...
> +   fun:fd_net_tx
> +   fun:net_enqueue
> +   fun:virtio_process_one
> +   fun:virtio_process_queue
> +   fun:virtio_write
> +   fun:__raw_writel
> +   fun:writel
> +   fun:vm_notify
> +   fun:virtqueue_notify
> +   fun:virtqueue_kick
> +   fun:start_xmit
> +   fun:__netdev_start_xmit
> +   fun:netdev_start_xmit
> +   fun:xmit_one
> +   fun:dev_hard_start_xmit
> +   fun:sch_direct_xmit
> +}
> \ No newline at end of file
> diff --git a/tools/lkl/tests/valgrind2xunit.py b/tools/lkl/tests/valgrind2xunit.py
> new file mode 100755
> index 000000000000..ab7c12b83377
> --- /dev/null
> +++ b/tools/lkl/tests/valgrind2xunit.py
> @@ -0,0 +1,69 @@
> +#!/usr/bin/env python
> +# SPDX-License-Identifier: GPL-2.0
> +
> +##
> +## Downloaded from
> +## http://humdi.net/wiki/tips/valgrind-to-xunit-xml-converter
> +##
> +
> +import xml.etree.ElementTree as ET
> +import sys
> +import os
> +
> +fname = 'valgrind.xml'
> +if len(sys.argv) > 1:
> +    fname = sys.argv[1]
> +
> +doc = ET.parse(fname)
> +errors = doc.findall('.//error')
> +
> +out = open(os.path.splitext(os.path.basename(fname))[0]+'_xunit.xml',"w")
> +out.write('<?xml version="1.0" encoding="UTF-8"?>\n')
> +out.write('<testsuite name="valgrind" tests="'+str(len(errors))+'" errors="0" failures="'+str(len(errors))+'" skip="0">\n')
> +errorcount=0
> +for error in errors:
> +    errorcount=errorcount+1
> +
> +    kind = error.find('kind')
> +    what = error.find('what')
> +    if what is None:
> +        what = error.find('xwhat/text')
> +
> +    stack = error.find('stack')
> +    frames = stack.findall('frame')
> +
> +    for frame in frames:
> +        fi = frame.find('file')
> +        li = frame.find('line')
> +        if fi is not None and li is not None:
> +            break
> +
> +    if fi is not None and li is not None:
> +        out.write('    <testcase classname="ValgrindMemoryCheck" name="Memory check '+str(errorcount)+' ('+kind.text+', '+fi.text+':'+li.text+')" time="0">\n')
> +    else:
> +        out.write('    <testcase classname="ValgrindMemoryCheck" name="Memory check '+str(errorcount)+' ('+kind.text+')" time="0">\n')
> +    out.write('        <error type="'+kind.text+'">\n')
> +    out.write('  '+what.text+'\n\n')
> +
> +    for frame in frames:
> +        ip = frame.find('ip')
> +        fn = frame.find('fn')
> +        fi = frame.find('file')
> +        li = frame.find('line')
> +
> +        if fn is None:
> +            bodytext = '(unresolved symbol)'
> +        else:
> +            bodytext = fn.text
> +        bodytext = bodytext.replace("&","&amp;")
> +        bodytext = bodytext.replace("<","&lt;")
> +        bodytext = bodytext.replace(">","&gt;")
> +        if fi is not None and li is not None:
> +            out.write('  '+ip.text+': '+bodytext+' ('+fi.text+':'+li.text+')\n')
> +        else:
> +            out.write('  '+ip.text+': '+bodytext+'\n')
> +
> +    out.write('        </error>\n')
> +    out.write('    </testcase>\n')
> +out.write('</testsuite>\n')
> +out.close()
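
The TAP stream that lkl_test_run() prints (a plan line "1..N # name", a
"* i name" marker when a test starts, then "ok"/"not ok" followed by a
YAML-ish block) and that tap13.Parser consumes can be illustrated with a
small standalone sketch. The two regular expressions are copied from the
Parser class in the patch above; the sample output, the summarize()
helper and its simplified YAML skipping are assumptions made for
illustration only and are not part of the patchset.

```python
import re

# Regexes copied from the Parser class in tap13.py (see the patch above).
RE_PLAN = re.compile(
    r"^\s*(?P<start>\d+)\.\.(?P<end>\d+)\s*(#\s*(?P<explanation>.*))?\s*$")
RE_TEST_LINE = re.compile(
    r"^\s*(?P<result>(not\s+)?ok|[*]+)\s*(?P<id>\d+)?\s*"
    r"(?P<description>[^#]+)?\s*"
    r"(#\s*(?P<directive>TODO|SKIP)?\s*(?P<comment>.+)?)?\s*$",
    re.IGNORECASE)

# Hypothetical output, shaped like what lkl_test_run() prints.
SAMPLE = """\
1..3 # boot
* 1 start_kernel
ok 1 start_kernel
 ---
 time_us: 120
 log: |
  lkl_start_kernel(...) = 0
 ...
* 2 read
not ok 2 read
* 3 mount
ok 3 mount # SKIP
"""


def summarize(text):
    """Return (suite name, {test id: final result}) for one TAP suite.

    Simplified on purpose: YAML blocks are skipped rather than parsed,
    and only a single suite is handled.
    """
    suite = None
    results = {}
    in_yaml = False
    for line in text.splitlines():
        stripped = line.strip()
        if in_yaml:
            # Everything up to the "..." terminator belongs to the YAML block.
            if stripped == "...":
                in_yaml = False
            continue
        if stripped.startswith("---"):
            in_yaml = True
            continue
        m = RE_PLAN.match(stripped)
        if m:
            suite = m.group("explanation")
            continue
        m = RE_TEST_LINE.match(stripped)
        if m and m.group("id"):
            # A directive (SKIP/TODO) takes precedence over ok/not ok;
            # the later result line overrides the earlier "*" marker.
            result = (m.group("directive") or m.group("result")).lower()
            results[int(m.group("id"))] = result
    return suite, results


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

The "*" line only marks that a test has started; the result recorded for
an id is whatever the later "ok"/"not ok" line (or its SKIP/TODO
directive) says, which mirrors how Suite.newTest() in the patch updates
an already-seen test id.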

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 21/37] lkl tools: "boot" test
  2020-01-23 19:33     ` Brendan Higgins
@ 2020-01-24  4:32         ` Hajime Tazaki
  2020-03-02 19:51         ` Luis Chamberlain
  1 sibling, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2020-01-24  4:32 UTC (permalink / raw)
  To: brendanhiggins
  Cc: kunit-dev, tavi.purdila, mcgrof, davidgow, cyphar, linux-um,
	linux-arch, pscollins, cem, motomuman, jiangshanlai, retrage01,
	petrosagg, liuyuan, thomas, mark, ddiss, linux-kernel-library,
	luca.dariz


Hello Brendan,

On Fri, 24 Jan 2020 04:33:15 +0900,
Brendan Higgins wrote:

> > +int lkl_test_read(void)
> > +{
> > +	char buf[10] = { 0, };
> > +	long ret;
> > +
> > +	ret = lkl_sys_read(0, buf, sizeof(buf));
> > +
> > +	lkl_test_logf("lkl_sys_read=%ld buf=%s\n", ret, buf);
> > +
> > +	if (ret == sizeof(wrbuf) && !strcmp(wrbuf, buf))
> > +		return TEST_SUCCESS;
> > +
> > +	return TEST_FAILURE;
> > +}
> 
> These tests make me think that LKL could be very useful for KUnit and
> testing syscalls.
> 
> Luis and I had been talking about writing KUnit tests for syscalls to
> validate that syscalls conform to the expected behavior; however,
> calling syscalls from the kernel obviously has issues.
> 
> On the other hand, testing syscalls from a userspace on a booted kernel
> is something that we do and something that needs to be done; however,
> this too has some issues. Writing and running tests in userspace on a
> booted kernel is not as easy as being able to write and run tests in the
> kernel. Also, even though some syscall end-to-end tests are necessary,
> not all syscall tests must be end-to-end tests, especially those which
> are only trying to exercise the entire syscall contract.
> 
> I think it looks like LKL might be able to help us square that circle.

That's good to know :)

> Hajime (and other LKL people):
> 
> What is the current status of this patchset? I have not seen any
> activity for a couple months.

I was a bit busy over the year-end period, but I have recently resumed
work on the patchset to address the comments received during the
discussion.

> Luis,
> 
> Does this kind of match what you were thinking with the syscall testing?
> I think this looks pretty close. You should be able to fully test the
> contract here using KUnit. Is there anyone else you think would be
> interested in this?
> 
> In any case, I am excited about this. Please keep me posted in the
> future!

I hope to send the v3 patches soon.
Thanks for your interest!

-- Hajime

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 03/37] lkl: architecture skeleton for Linux kernel library
  2019-11-26 16:02           ` Richard Weinberger
@ 2020-02-05  7:37             ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2020-02-05  7:37 UTC (permalink / raw)
  To: richard
  Cc: linux-um, linux-arch, pscollins, levex, mattator, cem,
	tavi.purdila, staal1978, motomuman, jiangshanlai, retrage01,
	petrosagg, liuyuan, mark, linux-kernel-library, phh,
	sigmaepsilon92, luca.dariz, edisonmcastro


On Wed, 27 Nov 2019 01:02:12 +0900,
Richard Weinberger wrote:

(snip)
> > Do you mean "this chain" by the long list of Signed-off-by lines, or
> > something else ?
> 
> The long list is rather unusual.
>  
> > We were trying to put all of contributors on the list.  I was failed to
> > interpret process/submitting-patches.rst on which part is not appropriate.
> 
> If every contributor is also a Co-Author. Okay. But having such a long
> list of authors is still a little odd.

I've replaced most of the Signed-off-by: tags with Cc: in the v3 patches.

-- Hajime

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 05/37] lkl: memory handling
  2019-11-25 22:10       ` Richard Weinberger
@ 2020-02-05  7:38         ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2020-02-05  7:38 UTC (permalink / raw)
  To: richard.weinberger
  Cc: linux-um, linux-arch, levex, tavi.purdila, retrage01, liuyuan,
	linux-kernel-library


On Tue, 26 Nov 2019 07:10:28 +0900,
Richard Weinberger wrote:
> 
> On Fri, Nov 8, 2019 at 6:03 AM Hajime Tazaki <thehajime@gmail.com> wrote:
> >
> > From: Octavian Purdila <tavi.purdila@gmail.com>
> >
> > LKL is a non MMU architecture and hence there is not much work left to
> > do other than initializing the boot allocator and providing the page
> > and page table definitions.
> >
> > The backstore memory is allocated via a host operation and the memory
> > size to be used is specified when the kernel is started, in the
> > lkl_start_kernel call.
> >
> > Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
> > Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
> > Signed-off-by: Levente Kurusa <levex@linux.com>
> > Signed-off-by: Yuan Liu <liuyuan@google.com>
> > Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
> > ---
> >  arch/um/lkl/include/asm/page.h          | 14 ++++++
> >  arch/um/lkl/include/asm/pgtable.h       | 62 +++++++++++++++++++++++
> >  arch/um/lkl/include/uapi/asm/host_ops.h |  5 ++
> >  arch/um/lkl/mm/bootmem.c                | 66 +++++++++++++++++++++++++
> 
> This is also something which needs unification with UML.
> UML in NOMMU mode would be LKL then...

For now, I have left this part as is; switching LKL to MMU mode would
make it harder to support the range of underlying environments
(non-Linux hosts, non-x86 subarchs).

If you have any suggestions (such as adding text to the docs), they
would definitely be helpful.
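
As a rough illustration of the design in the quoted patch description, the flow can be sketched in C as follows. This is only a sketch under stated assumptions: `demo_host_ops`, `demo_bootmem`, `demo_start_kernel`, and `demo_boot_alloc` are hypothetical names chosen for illustration, not the actual LKL or UML interfaces. The only facts taken from the description are that the backing memory comes from a host operation and that its size is given when the kernel is started.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical host operations table: the host supplies raw memory. */
struct demo_host_ops {
	void *(*mem_alloc)(size_t size);
	void (*mem_free)(void *ptr);
};

/* Hypothetical boot-time state: one flat region, no address translation. */
struct demo_bootmem {
	uint8_t *base;
	size_t size;
	size_t used;
};

/* Ask the host for mem_size bytes and set up a trivial bump allocator. */
static int demo_start_kernel(const struct demo_host_ops *ops, size_t mem_size,
			     struct demo_bootmem *bm)
{
	assert(ops && ops->mem_alloc && bm);
	bm->base = ops->mem_alloc(mem_size);
	if (!bm->base)
		return -1;
	bm->size = mem_size;
	bm->used = 0;
	return 0;
}

/* Hand out chunks at aligned addresses; align must be a power of two. */
static void *demo_boot_alloc(struct demo_bootmem *bm, size_t len, size_t align)
{
	uintptr_t cur = (uintptr_t)bm->base + bm->used;
	uintptr_t aligned = (cur + align - 1) & ~((uintptr_t)align - 1);
	size_t start = (size_t)(aligned - (uintptr_t)bm->base);

	if (start > bm->size || len > bm->size - start)
		return NULL;
	bm->used = start + len;
	return bm->base + start;
}

/* A POSIX-like host could simply back the region with malloc/free. */
static const struct demo_host_ops demo_posix_ops = {
	.mem_alloc = malloc,
	.mem_free = free,
};
```

Because the region is one flat buffer owned by the host, nothing here depends on host page tables, which is the portability argument made above for staying in no-MMU mode.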

-- Hajime

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 07/37] lkl: interrupt support
  2019-11-25 22:13       ` Richard Weinberger
@ 2020-02-05  7:38         ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2020-02-05  7:38 UTC (permalink / raw)
  To: richard.weinberger
  Cc: linux-um, linux-arch, tavi.purdila, retrage01,
	linux-kernel-library, sigmaepsilon92


On Tue, 26 Nov 2019 07:13:55 +0900,
Richard Weinberger wrote:
> 
> On Fri, Nov 8, 2019 at 6:03 AM Hajime Tazaki <thehajime@gmail.com> wrote:
> >
> > From: Octavian Purdila <tavi.purdila@gmail.com>
> >
> > Add APIs that allow the host to reserve and free an interrupt number
> > and also to trigger an interrupt.
> >
> > The trigger operation will simply store the interrupt data in a
> > queue. The interrupt handler is run later, at the first opportunity it
> > has, to avoid races with any kernel threads.
> >
> > Currently, interrupts are run on the first interrupt enable operation
> > if interrupts are disabled and if we are not already in interrupt
> > context.
> >
> > When triggering an interrupt, it uses GCC's built-in functions for
> > atomic memory access to synchronize and simple boolean flags.
> >
> > Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
> > Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
> > Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
> > ---
> >  arch/um/lkl/include/asm/irq.h             |  13 ++
> >  arch/um/lkl/include/uapi/asm/irq.h        |  36 ++++
> >  arch/um/lkl/include/uapi/asm/sigcontext.h |  16 ++
> >  arch/um/lkl/kernel/irq.c                  | 193 ++++++++++++++++++++++
> 
> Like I said before, this is also something to unify with UML.
> I'm aware that this is easily said but we cannot have too much duplication.
> 
> Feel free to ask if UML internals give you headache. :-)

As with the nommu implementation, I have left this part as is.

Triggering interrupts from fd events (delivered by epoll & co.) is the
hard part of implementing host-independent interrupts for LKL.  OTOH,
the v3 patchset shows that it is feasible to use UML drivers with the
LKL interrupt facility.

I may also need more time to evaluate and find the right direction,
though.  Your comments are always welcome.
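
As a rough illustration of the deferred delivery described in the quoted commit message, a minimal C sketch follows. It is not the code from arch/um/lkl/kernel/irq.c: every `demo_*` name is hypothetical, and the pending "queue" is simplified to a bitmask, so repeated triggers of the same interrupt coalesce into one delivery (an assumption of this sketch). What it does take from the description is the shape: the trigger only records the interrupt using GCC's atomic built-ins, and handlers run on the disabled-to-enabled transition when not already in interrupt context.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NR_IRQS 32

typedef void (*demo_irq_handler)(int irq);

static demo_irq_handler demo_handlers[DEMO_NR_IRQS];
static unsigned int demo_pending;	/* one bit per queued interrupt */
static bool demo_irqs_enabled;		/* simple boolean flags, as described */
static bool demo_in_irq;

/* Example handler that just counts deliveries per interrupt number. */
static int demo_count[DEMO_NR_IRQS];
static void demo_count_handler(int irq)
{
	demo_count[irq]++;
}

/* Host side: only record the interrupt; no handler runs here. */
static void demo_trigger_irq(int irq)
{
	assert(irq >= 0 && irq < DEMO_NR_IRQS);
	__sync_fetch_and_or(&demo_pending, 1u << irq);
}

/* Run everything queued so far, unless a handler is already running. */
static void demo_run_pending(void)
{
	unsigned int pending;
	int irq;

	if (demo_in_irq)
		return;
	demo_in_irq = true;
	/* Atomically take and clear the pending set, then dispatch it. */
	while ((pending = __sync_fetch_and_and(&demo_pending, 0)) != 0) {
		for (irq = 0; irq < DEMO_NR_IRQS; irq++)
			if ((pending & (1u << irq)) && demo_handlers[irq])
				demo_handlers[irq](irq);
	}
	demo_in_irq = false;
}

static void demo_irq_disable(void)
{
	demo_irqs_enabled = false;
}

/* Kernel side: the disabled -> enabled transition delivers the queue. */
static void demo_irq_enable(void)
{
	bool was_enabled = demo_irqs_enabled;

	demo_irqs_enabled = true;
	if (!was_enabled)
		demo_run_pending();
}
```

Nothing in this scheme depends on file descriptors or epoll, which is why the trigger side can be driven from any host; the open question in the discussion above is how to reconcile it with UML's fd-event-based interrupts.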

-- Hajime

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 07/37] lkl: interrupt support
  2020-02-05  7:38         ` Hajime Tazaki
@ 2020-02-05 10:49           ` Anton Ivanov
  -1 siblings, 0 replies; 206+ messages in thread
From: Anton Ivanov @ 2020-02-05 10:49 UTC (permalink / raw)
  To: Hajime Tazaki, richard.weinberger
  Cc: linux-arch, tavi.purdila, linux-um, retrage01,
	linux-kernel-library, sigmaepsilon92



On 05/02/2020 07:38, Hajime Tazaki wrote:
> 
> On Tue, 26 Nov 2019 07:13:55 +0900,
> Richard Weinberger wrote:
>>
>> On Fri, Nov 8, 2019 at 6:03 AM Hajime Tazaki <thehajime@gmail.com> wrote:
>>>
>>> From: Octavian Purdila <tavi.purdila@gmail.com>
>>>
>>> Add APIs that allow the host to reserve and free an interrupt number
>>> and also to trigger an interrupt.
>>>
>>> The trigger operation will simply store the interrupt data in a
>>> queue. The interrupt handler is run later, at the first opportunity it
>>> has, to avoid races with any kernel threads.
>>>
>>> Currently, interrupts are run on the first interrupt enable operation
>>> if interrupts are disabled and if we are not already in interrupt
>>> context.
>>>
>>> When triggering an interrupt, it uses GCC's built-in functions for
>>> atomic memory access to synchronize and simple boolean flags.
>>>
>>> Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
>>> Signed-off-by: Michael Zimmermann <sigmaepsilon92@gmail.com>
>>> Signed-off-by: Octavian Purdila <tavi.purdila@gmail.com>
>>> ---
>>>   arch/um/lkl/include/asm/irq.h             |  13 ++
>>>   arch/um/lkl/include/uapi/asm/irq.h        |  36 ++++
>>>   arch/um/lkl/include/uapi/asm/sigcontext.h |  16 ++
>>>   arch/um/lkl/kernel/irq.c                  | 193 ++++++++++++++++++++++
>>
>> Like I said before, this is also something to unify with UML.
>> I'm aware that this is easily said but we cannot have too much duplication.
>>
>> Feel free to ask if UML internals give you headache. :-)
> 
> As with the nommu implementation, I have left this part as is.
> 
> Triggering interrupts from fd events (delivered by epoll & co.) is the
> hard part of implementing host-independent interrupts for LKL.  OTOH,
> the v3 patchset shows that it is feasible to use UML drivers with the
> LKL interrupt facility.

Make sure you are testing with the vector network devices; the legacy
ones are scheduled to be obsoleted at some point.

I know this will cause a headache on non-Linux hosts. I am happy to write
wrappers/emulators for recvmmsg/sendmmsg so that these build on systems
which do not support them.
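
As a rough illustration of what such a wrapper could look like, here is a minimal C sketch that emulates the batched calls with a loop over `sendmsg()`/`recvmsg()`. It is not code from the patchset or from UML: `demo_mmsghdr`, `demo_sendmmsg`, and `demo_recvmmsg` are hypothetical names, the struct is a local stand-in for `struct mmsghdr`, and the timeout argument of `recvmmsg()` is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Local stand-in for struct mmsghdr, for hosts whose libc lacks it. */
struct demo_mmsghdr {
	struct msghdr msg_hdr;
	unsigned int msg_len;	/* bytes transferred for this entry */
};

/*
 * Emulate sendmmsg(): send up to vlen messages one by one.
 * Returns the number of messages sent, or -1 if the first one failed.
 */
static int demo_sendmmsg(int fd, struct demo_mmsghdr *vec, unsigned int vlen,
			 int flags)
{
	unsigned int i;

	assert(vec != NULL);
	for (i = 0; i < vlen; i++) {
		ssize_t n = sendmsg(fd, &vec[i].msg_hdr, flags);

		if (n < 0)
			return i ? (int)i : -1;
		vec[i].msg_len = (unsigned int)n;
	}
	return (int)i;
}

/*
 * Emulate recvmmsg() without its timeout argument: receive up to vlen
 * messages one by one. Returns the number received, or -1 if the first
 * receive failed.
 */
static int demo_recvmmsg(int fd, struct demo_mmsghdr *vec, unsigned int vlen,
			 int flags)
{
	unsigned int i;

	assert(vec != NULL);
	for (i = 0; i < vlen; i++) {
		ssize_t n = recvmsg(fd, &vec[i].msg_hdr, flags);

		if (n < 0)
			return i ? (int)i : -1;
		vec[i].msg_len = (unsigned int)n;
	}
	return (int)i;
}
```

A fallback like this keeps callers source-compatible but issues one system call per message, so it would not be expected to deliver the batching benefit that motivates the vector drivers on Linux.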

Brgds,


> 
> I may also need more time to evaluate and find the right direction,
> though.  Your comments are always welcome.
> 
> -- Hajime

-- 
Anton R. Ivanov
https://www.kot-begemot.co.uk/

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 07/37] lkl: interrupt support
  2020-02-05 10:49           ` Anton Ivanov
@ 2020-02-05 14:24             ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2020-02-05 14:24 UTC (permalink / raw)
  To: anton.ivanov
  Cc: richard.weinberger, linux-arch, tavi.purdila, linux-um,
	retrage01, linux-kernel-library, sigmaepsilon92


Hello Anton,

On Wed, 05 Feb 2020 19:49:37 +0900,
Anton Ivanov wrote:

> >>>   arch/um/lkl/include/asm/irq.h             |  13 ++
> >>>   arch/um/lkl/include/uapi/asm/irq.h        |  36 ++++
> >>>   arch/um/lkl/include/uapi/asm/sigcontext.h |  16 ++
> >>>   arch/um/lkl/kernel/irq.c                  | 193 ++++++++++++++++++++++
> >> 
> >> Like I said before, this is also something to unify with UML.
> >> I'm aware that this is easily said but we cannot have too much duplication.
> >> 
> >> Feel free to ask if UML internals give you headache. :-)
> > 
> > As with the nommu implementation, I have left this part as is.
> > 
> > Triggering interrupts from fd events (delivered by epoll & co.) is the
> > hard part of implementing host-independent interrupts for LKL.  OTOH,
> > the v3 patchset shows that it is feasible to use UML drivers with the
> > LKL interrupt facility.
> 
> Make sure you are testing with the vector network devices; the
> legacy ones are scheduled to be obsoleted at some point.

I was aware of the commit that obsoletes several backends in favor of
the vector device, but I did not include it in the patchset or the
tests.  I will try to do so in the next round.

> I know this will cause a headache on non-Linux hosts. I am happy to
> write wrappers/emulators for recvmmsg/sendmmsg so that these build on
> systems which do not support them.

If UML is going to be extended to support non-Linux hosts, then yes,
those kinds of wrappers will be helpful.

Right now, the patchset only focuses on x86 hosts, so this can be
postponed.

-- Hajime

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 07/37] lkl: interrupt support
  2020-02-05 14:24             ` Hajime Tazaki
@ 2020-02-18  8:18               ` Hajime Tazaki
  -1 siblings, 0 replies; 206+ messages in thread
From: Hajime Tazaki @ 2020-02-18  8:18 UTC (permalink / raw)
  To: anton.ivanov
  Cc: richard.weinberger, linux-arch, tavi.purdila, linux-um,
	retrage01, linux-kernel-library, sigmaepsilon92


Hello,

On Wed, 05 Feb 2020 23:24:29 +0900,
Hajime Tazaki wrote:

> > Make sure you are testing with the vector network devices; the
> > legacy ones are scheduled to be obsoleted at some point.
> 
> I was aware of the commit that obsoletes several backends in favor of
> the vector device, but I did not include it in the patchset or the
> tests.  I will try to do so in the next round.


So I added vector device support and tested it with the tap backend.
Here are the numbers for the various configurations that the v3+
patches support.

Disclaimer: the experiment is immature and not apples-to-apples in many
respects, so this result only presents one of the parameter sets that I
tried.  I will update and clean it up later if there is interest.

- testbed

               +--docker0--+
               |           |     10GbE
 netperf +---tap0        eth0 +==========+ eth0 +---+ netserver
 (client)                (ixgbe)          (ixgbe)

<-- Linux box (4.18.5) -->              <-- Linux box (4.17.19) -->

- setup
The client side (netperf) was varied across different net
drivers/devices.  TSO and tx/rx checksum offload are enabled where
possible.

- netperf result, 10-second runs (Mbps)

                      | TCP_STREAM | TCP_MAERTS
----------------------+------------+-----------
UMMODE_LIB  (um-tap)  |    2290.42 |       1.04
UMMODE_LIB  (vec-tap) |    3699.98 |    5682.40
UMMODE_LIB  (virtio)  |    8029.13 |    9384.78
UMMODE_KERN (um-tap)  |    2233.17 |       7.85
UMMODE_KERN (vec-tap) |    5527.37 |    9414.00

# UMMODE_LIB (virtio) isn't included in the v3 patches.

The full output log is here:
https://gist.github.com/thehajime/a71878cccf7830a23a23f8f8e8cc8753

The result for UMMODE_LIB (vec-tap) is not stable: it sometimes shows
over 8 Gbps (TCP_STREAM), but most of the time it is lower.

Still, I suppose UMMODE_LIB with the vector driver isn't that bad,
though there is still a gap to UMMODE_KERN (vector).
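
To put a number on that gap, the figures from the table can be compared directly. The following is only a small C sketch of the arithmetic; `demo_result`, `demo_results`, and `demo_ratio` are hypothetical names, and the values are copied from the single run reported above, so the ratios carry the same caveats as the disclaimer.

```c
#include <assert.h>

struct demo_result {
	const char *config;
	double tcp_stream;	/* Mbps */
	double tcp_maerts;	/* Mbps */
};

/* Values copied from the netperf table in this message. */
static const struct demo_result demo_results[] = {
	{ "UMMODE_LIB (um-tap)",   2290.42,    1.04 },
	{ "UMMODE_LIB (vec-tap)",  3699.98, 5682.40 },
	{ "UMMODE_LIB (virtio)",   8029.13, 9384.78 },
	{ "UMMODE_KERN (um-tap)",  2233.17,    7.85 },
	{ "UMMODE_KERN (vec-tap)", 5527.37, 9414.00 },
};

/* Throughput a expressed as a fraction of throughput b. */
static double demo_ratio(double a, double b)
{
	assert(b > 0.0);
	return a / b;
}
```

For example, `demo_ratio(demo_results[1].tcp_stream, demo_results[4].tcp_stream)` compares the two vec-tap rows: for this run, UMMODE_LIB reaches roughly 67% of UMMODE_KERN on TCP_STREAM and roughly 60% on TCP_MAERTS.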


-- Hajime

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 21/37] lkl tools: "boot" test
  2020-01-23 19:33     ` Brendan Higgins
@ 2020-03-02 19:51         ` Luis Chamberlain
  2020-03-02 19:51         ` Luis Chamberlain
  1 sibling, 0 replies; 206+ messages in thread
From: Luis Chamberlain @ 2020-03-02 19:51 UTC (permalink / raw)
  To: Brendan Higgins
  Cc: kunit-dev, Hajime Tazaki, Octavian Purdila, David Gow,
	Aleksa Sarai, linux-um, linux-arch, Patrick Collins,
	Conrad Meyer, Motomu Utsumi, Lai Jiangshan, Akira Moroo,
	Petros Angelatos, Yuan Liu, Thomas Liebetraut, Mark Stillwell,
	David Disseldorp, linux-kernel-library, Luca Dariz

On Thu, Jan 23, 2020 at 11:33:15AM -0800, Brendan Higgins wrote:
> Luis,
> 
> Does this kind of match what you were thinking with the syscall testing?

Without looking too deeply into the code, it seems to be the case.
Are you going to expose / port kunit to tools/ to allow userspace
to run kunit tests?

> In any case, I am excited about this. Please keep me posted in the
> future!

Yes please Cc me too.

  Luis

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [RFC v2 21/37] lkl tools: "boot" test
  2020-03-02 19:51         ` Luis Chamberlain
@ 2020-03-02 22:25           ` Brendan Higgins
  -1 siblings, 0 replies; 206+ messages in thread
From: Brendan Higgins @ 2020-03-02 22:25 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: KUnit Development, Hajime Tazaki, Octavian Purdila, David Gow,
	Aleksa Sarai, linux-um, linux-arch, Patrick Collins,
	Conrad Meyer, Motomu Utsumi, Lai Jiangshan, Akira Moroo,
	Petros Angelatos, Yuan Liu, Thomas Liebetraut, Mark Stillwell,
	David Disseldorp, linux-kernel-library, Luca Dariz

On Mon, Mar 2, 2020 at 11:51 AM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Thu, Jan 23, 2020 at 11:33:15AM -0800, Brendan Higgins wrote:
> > Luis,
> >
> > Does this kind of match what you were thinking with the syscall testing?
>
> Without looking too deeply into the code, it seems to be the case.
> Are you going to expose / port kunit to tools/ to allow userspace
> to run kunit tests?

Yes, I am thinking about this as a distinct possibility. I think it
only really makes sense in the context of syscall testing, probably
via LKL (assuming this works), but for general end-to-end testing, I
think kselftest has that area covered and users would be better served
by just integrating KUnit with kselftest.

> > In any case, I am excited about this. Please keep me posted in the
> > future!
>
> Yes please Cc me too.
>
>   Luis

^ permalink raw reply	[flat|nested] 206+ messages in thread

end of thread, other threads:[~2020-03-02 22:25 UTC | newest]

Thread overview: 206+ messages
2019-10-23  4:37 [RFC PATCH 00/47] Unifying LKL into UML Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 01/47] asm-generic: atomic64: allow using generic atomic64 on 64bit platforms Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 02/47] kbuild: allow architectures to automatically define kconfig symbols Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 03/47] lkl: architecture skeleton for Linux kernel library Hajime Tazaki
2019-10-25 21:40   ` Richard Weinberger
2019-10-27  2:36     ` Hajime Tazaki
2019-10-29  4:04       ` Lai Jiangshan
2019-10-29  7:13         ` Hajime Tazaki
2019-10-29  7:57           ` Johannes Berg
2019-10-29  8:15             ` Richard Weinberger
2019-10-30  3:19             ` Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 04/47] lkl: host interface Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 05/47] lkl: memory handling Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 06/47] lkl: kernel threads support Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 07/47] lkl: interrupt support Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 08/47] lkl: system call interface and application API Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 09/47] lkl: timers, time and delay support Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 10/47] lkl: memory mapped I/O support Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 11/47] lkl: basic kernel console support Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 12/47] lkl: initialization and cleanup Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 13/47] lkl: plug in the build system Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 14/47] lkl tools: skeleton for host side library, tests and tools Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 15/47] lkl tools: host lib: add utilities functions Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 16/47] lkl tools: host lib: memory mapped I/O helpers Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 17/47] lkl tools: host lib: virtio devices Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 18/47] lkl tools: host lib: virtio block device Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 19/47] lkl tools: host lib: filesystem helpers Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 20/47] lkl tools: host lib: posix host operations Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 21/47] lkl tools: "boot" test Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 22/47] lkl tools: tool that converts a filesystem image to tar Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 23/47] lkl tools: tool that reads/writes to/from a filesystem image Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 24/47] lkl tools: virtio: add network device support Hajime Tazaki
2019-10-23  4:37 ` [RFC PATCH 25/47] lkl: add support for Windows hosts Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 26/47] lkl tools: add support for Windows host Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 27/47] lkl: Android ARM (arm/arm64) support Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 28/47] lkl tools: add lklfuse Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 29/47] lkl: add initial system call hijack support (a.k.a. NUSE of libos) Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 30/47] lkl: add documentation Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 31/47] cpu: add cpu_yield_to_irqs Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 32/47] tools: Add the lkl host library to the common tools Makefile Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 33/47] signal: use CONFIG_X86_32 instead of __i386__ Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 34/47] arch: add __SYSCALL_DEFINE_ARCH Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 35/47] xfs: support for non-mmu architectures Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 36/47] checkpatch: avoid showing BIT_ULL warnings for tools/ files Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 37/47] Revert "vmlinux.lds.h: remove stale <linux/export.h> include" Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 38/47] Revert "export.h: remove code for prefixing symbols with underscore" Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 39/47] Revert "linux/linkage.h: replace VMLINUX_SYMBOL_STR() with __stringify()" Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 40/47] Revert "vmlinux.lds.h: remove no-op macro VMLINUX_SYMBOL()" Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 41/47] Revert "kbuild: remove CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX" Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 42/47] Revert "kallsyms: remove symbol prefix support" Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 43/47] kallsyms: Add a config option to select section for kallsyms Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 44/47] um lkl: use ARCH=um SUBARCH=lkl for tools/lkl Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 45/47] um lkl: add CI tests Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 46/47] um: use lkl virtio_net_tap device as UML device Hajime Tazaki
2019-10-23  4:38 ` [RFC PATCH 47/47] um: add lkl virtio-blk device Hajime Tazaki
2019-10-25 21:34 ` [RFC PATCH 00/47] Unifying LKL into UML Richard Weinberger
2019-10-25 21:34   ` Richard Weinberger
2019-10-27  2:34   ` Hajime Tazaki
2019-10-27  2:34     ` Hajime Tazaki
2019-10-29  7:57 ` Johannes Berg
2019-10-29 15:45   ` Hajime Tazaki
2019-11-08  5:02 ` [RFC v2 00/37] " Hajime Tazaki
2019-11-08  5:02   ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 01/37] asm-generic: atomic64: allow using generic atomic64 on 64bit platforms Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-25 22:02     ` Richard Weinberger
2019-11-25 22:02       ` Richard Weinberger
2019-11-26 14:02       ` Hajime Tazaki
2019-11-26 14:02         ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 02/37] arch: add __SYSCALL_DEFINE_ARCH Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-25 22:02     ` Richard Weinberger
2019-11-25 22:02       ` Richard Weinberger
2019-11-27  4:15       ` Hajime Tazaki
2019-11-27  4:15         ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 03/37] lkl: architecture skeleton for Linux kernel library Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-25 22:00     ` Richard Weinberger
2019-11-25 22:00       ` Richard Weinberger
2019-11-26 11:42       ` Octavian Purdila
2019-11-26 11:42         ` Octavian Purdila
2019-11-26 14:17       ` Hajime Tazaki
2019-11-26 14:17         ` Hajime Tazaki
2019-11-26 16:02         ` Richard Weinberger
2019-11-26 16:02           ` Richard Weinberger
2020-02-05  7:37           ` Hajime Tazaki
2020-02-05  7:37             ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 04/37] lkl: host interface Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 05/37] lkl: memory handling Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-25 22:10     ` Richard Weinberger
2019-11-25 22:10       ` Richard Weinberger
2020-02-05  7:38       ` Hajime Tazaki
2020-02-05  7:38         ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 06/37] lkl: kernel threads support Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 07/37] lkl: interrupt support Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-25 22:13     ` Richard Weinberger
2019-11-25 22:13       ` Richard Weinberger
2020-02-05  7:38       ` Hajime Tazaki
2020-02-05  7:38         ` Hajime Tazaki
2020-02-05 10:49         ` Anton Ivanov
2020-02-05 10:49           ` Anton Ivanov
2020-02-05 14:24           ` Hajime Tazaki
2020-02-05 14:24             ` Hajime Tazaki
2020-02-18  8:18             ` Hajime Tazaki
2020-02-18  8:18               ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 08/37] lkl: system call interface and application API Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 09/37] lkl: timers, time and delay support Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 10/37] lkl: memory mapped I/O support Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 11/37] lkl: basic kernel console support Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 12/37] lkl: initialization and cleanup Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 13/37] lkl: plug in the build system Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 14/37] lkl tools: skeleton for host side library, tests and tools Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 15/37] lkl tools: host lib: add utilities functions Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 16/37] lkl tools: host lib: memory mapped I/O helpers Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 17/37] lkl tools: host lib: virtio devices Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-25 22:07     ` Richard Weinberger
2019-11-25 22:07       ` Richard Weinberger
2019-11-26  8:43       ` Johannes Berg
2019-11-26  8:43         ` Johannes Berg
2019-11-26  8:50         ` Richard Weinberger
2019-11-26  8:50           ` Richard Weinberger
2019-11-26  8:52           ` Johannes Berg
2019-11-26  8:52             ` Johannes Berg
2019-11-26 10:09             ` Richard Weinberger
2019-11-26 10:09               ` Richard Weinberger
2019-11-26 10:16               ` Johannes Berg
2019-11-26 10:16                 ` Johannes Berg
2019-11-26 10:42                 ` Octavian Purdila
2019-11-26 10:42                   ` Octavian Purdila
2019-11-26 10:49                   ` Anton Ivanov
2019-11-26 10:49                     ` Anton Ivanov
2019-11-27  4:06                     ` Hajime Tazaki
2019-11-27  4:06                       ` Hajime Tazaki
2019-11-26 16:04                   ` Richard Weinberger
2019-11-26 16:04                     ` Richard Weinberger
2019-11-27  4:08                     ` Hajime Tazaki
2019-11-27  4:08                       ` Hajime Tazaki
2019-11-27 14:28                       ` Richard Weinberger
2019-11-27 14:28                         ` Richard Weinberger
2019-11-28  9:53                         ` Hajime Tazaki
2019-11-28  9:53                           ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 18/37] lkl tools: host lib: virtio block device Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 19/37] lkl tools: host lib: filesystem helpers Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 20/37] lkl tools: host lib: posix host operations Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 21/37] lkl tools: "boot" test Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2020-01-23 19:33     ` Brendan Higgins
2020-01-24  4:32       ` Hajime Tazaki
2020-01-24  4:32         ` Hajime Tazaki
2020-03-02 19:51       ` Luis Chamberlain
2020-03-02 19:51         ` Luis Chamberlain
2020-03-02 22:25         ` Brendan Higgins
2020-03-02 22:25           ` Brendan Higgins
2019-11-08  5:02   ` [RFC v2 22/37] lkl tools: tool that reads/writes to/from a filesystem image Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 23/37] lkl tools: tool that converts a filesystem image to tar Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 24/37] lkl tools: virtio: add network device support Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 25/37] checkpatch: avoid showing BIT_ULL warnings for tools/ files Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 26/37] tools: Add the lkl host library to the common tools Makefile Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 27/37] lkl tools: add lklfuse Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 28/37] lkl: add system call hijack support Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 29/37] lkl: add documentation Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 30/37] scripts: revert CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX patches Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 31/37] lkl: add support for Windows hosts Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 32/37] lkl tools: add support for Windows host Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 33/37] kallsyms: Add a config option to select section for kallsyms Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 34/37] lkl: Android ARM (arm/arm64) support Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 35/37] um lkl: add CI tests Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 36/37] um: use lkl virtio_net_tap device as UML device Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  5:02   ` [RFC v2 37/37] um: add lkl virtio-blk device Hajime Tazaki
2019-11-08  5:02     ` Hajime Tazaki
2019-11-08  9:13   ` [RFC v2 00/37] Unifying LKL into UML Anton Ivanov
2019-11-08  9:13     ` Anton Ivanov
2019-11-08 11:17     ` Octavian Purdila
2019-11-08 11:17       ` Octavian Purdila
